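distribution
Definition
bar plots for distribution
data8 example
compound way
seaborn example

    import seaborn as sns

    # df is assumed to be an existing DataFrame with a column named 'variable'
    # count plot: one bar per distinct value
    sns.countplot(x='variable', data=df)
    # rug plot: one tick per observation
    sns.rugplot(x='variable', data=df, color='black')

plotly example
Handling outliers and modes
density curve
Use the density curve to spot the peaks (modes).
box plot

    import seaborn as sns
    sns.boxplot(x='variable', data=df)

violin plot
Compared with a box plot, the width of a violin plot is meaningful: it shows the estimated density at each value.

    import seaborn as sns
    sns.violinplot(x='variable', data=df)

Handling overplotting
random jitter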
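Random jitter can be sketched without any plotting library: each point is shifted by a small random offset so that identical values stop landing on exactly the same spot. The helper name `add_jitter` and its `width` and `seed` parameters are assumptions made for illustration and do not come from these notes. seaborn's `stripplot` has a `jitter` argument that applies the same idea at plot time.

```python
import random

def add_jitter(values, width=0.2, seed=None):
    """Return a copy of `values` with uniform noise in [-width, width] added.

    Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    return [v + rng.uniform(-width, width) for v in values]

# Discrete ratings overlap heavily when plotted as-is.
ratings = [1, 1, 1, 2, 2, 3, 3, 3, 3]
jittered = add_jitter(ratings, width=0.2, seed=0)

# Each point moves by at most `width`, so it stays close to its original value.
for original, moved in zip(ratings, jittered):
    print(original, round(moved, 3))
```

A small `width` relative to the gap between distinct values keeps the groups visually separate while still spreading out the duplicates.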
text data
python string methods
.replace(str1, str2)
.split('/')
regex
See the regular-expression notes under Tools.
| (alternation) has low precedence. An exercise for the curious: https://alf.nu/RegexGolf
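Because `|` binds more loosely than concatenation and anchors, an unparenthesized alternation splits the whole pattern in two. The sketch below, using made-up sample strings, shows the difference a group makes:

```python
import re

# Without a group, `|` splits the whole pattern:
# r'^cat|dog$' means (^cat) OR (dog$), not ^(cat|dog)$.
loose = re.compile(r'^cat|dog$')
strict = re.compile(r'^(?:cat|dog)$')

samples = ['cat', 'dog', 'catalog', 'hotdog']

loose_hits = [s for s in samples if loose.search(s)]
strict_hits = [s for s in samples if strict.search(s)]

print(loose_hits)   # ['cat', 'dog', 'catalog', 'hotdog']
print(strict_hits)  # ['cat', 'dog']
```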
python re
.sub()
The pattern (r"......") is a raw string, which means Python does not interpret backslashes in it as escape characters. E.g. the text \section is written "\\section" as a regular string and r"\section" as a raw string.
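A short sketch of both points: the regular and raw spellings produce the same string, and `re.sub(pattern, repl, string)` replaces every match. The sample text is invented for illustration.

```python
import re

# Same string, two spellings: the regular literal doubles the backslash.
regular = '\\section'
raw = r'\section'
print(regular == raw)  # True
print(len(raw))        # 8

# re.sub(pattern, repl, string) replaces every non-overlapping match.
text = 'Room 101, floor 3'
masked = re.sub(r'\d+', '#', text)
print(masked)  # Room #, floor #
```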
.findall(pattern, string)

    import re

    # \b\w{3}\b matches whole words of exactly three word characters
    pattern = r'\b\w{3}\b'
    string = 'The quick brown fox jumps over the lazy dog'
    matches = re.findall(pattern, string)
    print(matches)

Output:

    ['The', 'fox', 'the', 'dog']
Discussion 2: Pandas Practice
We will begin our discussion of Pandas. You will practice:
Selecting columns
Filtering with boolean conditions
Counting with value_counts

    import pandas as pd
    import numpy as np

Pandas Practice
In the first Pandas question, we will be working with the elections dataset from lecture.
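The three operations listed above can be tried on a tiny hand-made DataFrame before touching the real file. The rows and the name `toy` below are made up for illustration and are not taken from the elections dataset.

```python
import pandas as pd

# Made-up rows, for illustration only.
toy = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "group": ["x", "y", "x", "x"],
    "score": [10, 25, 40, 5],
})

# Selecting columns: a list of labels returns a DataFrame.
subset = toy[["name", "score"]]

# Filtering with a boolean condition: keep rows where the mask is True.
high = toy[toy["score"] > 8]

# Counting with value_counts: occurrences of each distinct value.
counts = toy["group"].value_counts()

print(subset.shape)            # (4, 2)
print(high["name"].tolist())   # ['a', 'b', 'c']
print(counts.to_dict())        # {'x': 3, 'y': 1}
```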
    elections = pd.read_csv("elections.csv")  # read the elections data into a pandas DataFrame
    elections.head(5)

       Year       Candidate                  Party  Popular vote Result    %
    0  1824  Andrew Jackson  Democratic-Republican        151271   loss  57.