1. Course overview
2. Introduction to the pandas library
3. The Series type in pandas
4. The DataFrame type in pandas
5. Data type operations in pandas
6. Data type arithmetic in pandas
7. Unit summary
Source: [Python Data Analysis and Presentation (Python数据分析与展示), MOOC, Beijing Institute of Technology](https://www.bilibili.com/video/av10101509/?from=search&seid=8584212945516406240#page=35)
Last updated: 2018-01-29
1. Course overview
2. Introduction to the pandas library
2.1 Importing the pandas library
A quick pandas test
In the output, the left column (0-19) is the index and the right column holds the values.
import pandas as pd
d=pd.Series(range(20))
d
Out[15]:
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
11 11
12 12
13 13
14 14
15 15
16 16
17 17
18 18
19 19
dtype: int64
d.cumsum()
Out[16]:
0 0
1 1
2 3
3 6
4 10
5 15
6 21
7 28
8 36
9 45
10 55
11 66
12 78
13 91
14 105
15 120
16 136
17 153
18 171
19 190
dtype: int64
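As a cross-check of the cumsum() output above, the running total at position k should equal 0 + 1 + ... + k, which is k(k+1)/2. A minimal sketch of that check; the names `running` and `closed_form` are chosen here for illustration and do not come from the course:

```python
import pandas as pd

d = pd.Series(range(20))
running = d.cumsum()            # running[k] = 0 + 1 + ... + k

# Closed form for the same sum, computed element-wise on the same index
closed_form = d * (d + 1) // 2

print((running == closed_form).all())
print(running.iloc[-1])         # last running total
```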
2.2 Understanding the pandas library
3. The Series type in pandas
3.1 The Series type
import pandas as pd
a=pd.Series([9,8,7,6])
a
Out[19]:
0 9
1 8
2 7
3 6
dtype: int64
Custom index
import pandas as pd
a=pd.Series([9,8,7,6],index=["a","b","c","d"])
a
Out[22]:
a 9
b 8
c 7
d 6
dtype: int64
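- Creating from a scalar value
Note: the slide says index cannot be omitted. The index itself is still required, but the `index=` keyword can be dropped by passing the labels positionally: a=pd.Series(25,["a","b","c"])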
import pandas as pd
a=pd.Series(25,index=["a","b","c"])
a
Out[25]:
a 25
b 25
c 25
dtype: int64
- Creating from a dict
import pandas as pd
a=pd.Series({"a":9,"b":8,"c":7})
a
Out[30]:
a 9
b 8
c 7
dtype: int64
import pandas as pd
e=pd.Series({"a":9,"b":8,"c":7},index=["c","a","b","d"])
e
Out[33]:
c 7.0
a 9.0
b 8.0
d NaN
dtype: float64
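The result above changes to float64 because the label "d" has no entry in the dict and is filled with NaN, which is a floating-point value. A small sketch of how one might detect and fill that gap; the fill value 0 and the name `filled` are arbitrary choices for illustration:

```python
import pandas as pd

e = pd.Series({"a": 9, "b": 8, "c": 7}, index=["c", "a", "b", "d"])

print(e.isna())                        # True only for the missing label "d"

# Replace NaN with a chosen value, then convert back to an integer dtype
filled = e.fillna(0).astype("int64")
print(filled)
```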
- Creating from an ndarray
import numpy as np
import pandas as pd
n=pd.Series(np.arange(5))
n
Out[36]:
0 0
1 1
2 2
3 3
4 4
dtype: int32
import numpy as np
import pandas as pd
m=pd.Series(np.arange(5),index=np.arange(9,4,-1))
m
Out[41]:
9 0
8 1
7 2
6 3
5 4
dtype: int32
- Summary of the Series type
3.2 Basic operations on Series
- A Series consists of two parts: index and values
import pandas as pd
b=pd.Series([9,8,7,6],["a","b","c","d"])
b.index
Out[44]: Index(['a', 'b', 'c', 'd'], dtype='object')
b.values
Out[45]: array([9, 8, 7, 6], dtype=int64)
b["b"]
Out[46]: 8
b[1]
Out[47]: 8
b[["c","d",0]]
Out[49]:
c 7.0
d 6.0
0 NaN
dtype: float64
b[["c","d","a"]]
Out[50]:
c 7
d 6
a 9
dtype: int64
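The session above lets `[]` guess between labels and positions. Newer pandas releases are stricter here: as far as I know, a list that contains a missing label raises KeyError instead of returning NaN. A hedged sketch of the explicit alternatives follows: `.loc` for labels, `.iloc` for positions, and `reindex` when missing labels should turn into NaN. The variable names are illustrative only:

```python
import pandas as pd

b = pd.Series([9, 8, 7, 6], index=["a", "b", "c", "d"])

by_label = b.loc["b"]                    # label-based lookup
by_position = b.iloc[1]                  # position-based lookup
subset = b.loc[["c", "d", "a"]]          # existing labels, in the requested order
with_missing = b.reindex(["c", "d", 0])  # label 0 does not exist, so it becomes NaN

print(by_label, by_position)
print(subset)
print(with_missing)
```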
- Series operations resemble ndarray operations
import pandas as pd
b=pd.Series([9,8,7,6],["a","b","c","d"])
b
Out[52]:
a 9
b 8
c 7
d 6
dtype: int64
b[3]
Out[53]: 6
b[:3]
Out[54]:
a 9
b 8
c 7
dtype: int64
b[b>b.median()]
Out[55]:
a 9
b 8
dtype: int64
np.exp(b)
Out[56]:
a 8103.083928
b 2980.957987
c 1096.633158
d 403.428793
dtype: float64
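- Series operations resemble Python dict operations
The `in` operator tests the custom index labels, so `0 in b` asks whether 0 is one of them.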
import pandas as pd
b=pd.Series([9,8,7,6],["a","b","c","d"])
b["b"]
Out[59]: 8
"c" in b
Out[60]: True
0 in b
Out[61]: False
b.get("f",100)
Out[62]: 100
# Look up the value for index label "f" in b; that label does not exist, so the default 100 is returned.
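The dict-like behaviour shown above can be summarised in one short sketch. The point worth noticing is that `in` looks at index labels, not values, so testing for a value needs `.values`:

```python
import pandas as pd

b = pd.Series([9, 8, 7, 6], index=["a", "b", "c", "d"])

print("c" in b)           # membership is tested against the index labels
print(9 in b)             # 9 is a value, not a label
print(9 in b.values)      # test the values explicitly
print(b.get("f", 100))    # missing label: the default is returned
print(list(b.keys()))     # keys() returns the index
print(b.to_dict())        # convert to a plain Python dict
```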
3.3 Index alignment in Series operations
import pandas as pd
a=pd.Series([1,2,3],["c","d","e"])
b=pd.Series([9,8,7,6],["a","b","c","d"])
a+b
Out[69]:
a NaN
b NaN
c 8.0
d 8.0
e NaN
dtype: float64
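In the sum above, values are matched by index label rather than by position, and any label present on only one side yields NaN. If a missing operand should count as 0 instead, `Series.add` with `fill_value` is one option. A minimal sketch, with `aligned_sum` and `filled_sum` as illustrative names:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["c", "d", "e"])
b = pd.Series([9, 8, 7, 6], index=["a", "b", "c", "d"])

aligned_sum = a + b                   # NaN where a label exists on one side only
filled_sum = a.add(b, fill_value=0)   # treat the missing operand as 0

print(aligned_sum)
print(filled_sum)
```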
3.4 The name attribute of Series
import pandas as pd
b=pd.Series([9,8,7,6],["a","b","c","d"])
b.name
b.name="Series object"
b.index.name="index column"
b
Out[76]:
index column
a 9
b 8
c 7
d 6
Name: Series object, dtype: int64
3.5 Modifying a Series
import pandas as pd
b=pd.Series([9,8,7,6],["a","b","c","d"])
b["a"]=15
b.name="Series"
b
Out[81]:
a 15
b 8
c 7
d 6
Name: Series, dtype: int64
b.name="New Series"
b[["b","c"]]=20
b
Out[84]:
a 15
b 20
c 20
d 6
Name: New Series, dtype: int64
3.6 Summary of the Series type
4. The DataFrame type in pandas
4.1 The DataFrame type
- From a two-dimensional ndarray object
import pandas as pd
import numpy as np
d=pd.DataFrame(np.arange(10).reshape(2,5))
d
Out[88]:
0 1 2 3 4
0 0 1 2 3 4
1 5 6 7 8 9
- From a dict of one-dimensional ndarrays/lists/dicts/tuples or Series
1) Creating from a dict of one-dimensional objects (Series in this example)
import pandas as pd
dt={"one":pd.Series(([1,2,3]),index=["a","b","c"]),"two":pd.Series(([9,8,7,6]),index=["a","b","c","d"])}
d=pd.DataFrame(dt)
d
Out[92]:
one two
a 1.0 9
b 2.0 8
c 3.0 7
d NaN 6
pd.DataFrame(dt,index=["b","c","d"],columns=["two","three"])
Out[93]:
two three
b 8 NaN
c 7 NaN
d 6 NaN
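When a DataFrame is built from a dict of Series, the row index is the union of the Series indexes and each dict key becomes a column; positions with no data are NaN. A short sketch for inspecting the result, reusing the `dt` dict from the session above:

```python
import pandas as pd

dt = {
    "one": pd.Series([1, 2, 3], index=["a", "b", "c"]),
    "two": pd.Series([9, 8, 7, 6], index=["a", "b", "c", "d"]),
}
d = pd.DataFrame(dt)

print(d.index.tolist())     # union of the two Series indexes
print(d.columns.tolist())   # the dict keys
print(d["two"])             # a single column is a Series
print(d.loc["c"])           # a single row, selected by label
```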
2) Creating from a dict of lists
import pandas as pd
d1={"one":[1,2,3,4],"two":[9,8,7,6]}
d=pd.DataFrame(d1,index=["a","b","c","d"])
d
Out[100]:
one two
a 1 9
b 2 8
c 3 7
d 4 6