Plotting functions in pandas 0.23.4, Anaconda

Author: LeeMin_Z | Published: 2018-08-20 23:05


Contents:

Environment setup
Line plots
Bar plots
Histograms
Density plots
Bimodal normal distribution
Scatter plots

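Takeaways:

The official manual is 2,000+ pages of English whose details change often. It is hard to read cover to cover, and there is no need to: search the official PDF documentation when a question comes up. The functions and charts exist to present data well, so what matters more is understanding what each chart type shows and which parameters are important.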

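1. Environment setup

Raw matplotlib code gets long; the pandas plotting functions are wrappers that save typing.

Sure enough, halfway through the book the author says, in effect, "this book is out of date, go read the material on the pandas website!" (jaw drops).

Upgrade to the latest release

The commands below were run in the Anaconda distribution.

# The installed version turns out to be an old one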

lee>conda list pandas
# packages in environment at C:\Users\****\Anaconda3:

# Name                    Version                   Build  Channel
pandas                    0.20.3           py36hce827b7_2

# Upgrade it

lee> conda update pandas

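Download the latest release documentation (PDF) from the official site and discover, with some despair, that it runs to 2,573 pages.

Accept that it cannot be read in full: search by keyword when needed and read a little more each time.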

release.png

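Note: for live plotting, start IPython from the Anaconda prompt with ipython --pylab. The sessions below also assume that import pandas as pd and from pandas import Series, DataFrame have been run; np and plt are provided by --pylab.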
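Outside ipython --pylab, the names used in the sessions below have to be imported by hand. A minimal sketch of an equivalent script preamble follows; the Agg backend and the script form are my assumptions for a run without a display, not part of the original workflow.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; drop for interactive use

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas import DataFrame, Series

# Confirm which pandas is actually imported after `conda update pandas`.
installed_version = pd.__version__
print("pandas", installed_version)
```

If the printed version is still the old one, the update most likely went into a different conda environment than the one running the script.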

2. Line plots

In [2]: s = Series(np.random.randn(10).cumsum(),index=np.arange(0,100,10))

In [3]: s
Out[3]:
0     0.630734
10   -0.497936
20    0.499530
30   -0.242562
40    0.479425
50    2.252005
60    3.065480
70    1.579776
80    0.616986
90    2.368518
dtype: float64

In [4]: s.plot()
Out[4]: <matplotlib.axes._subplots.AxesSubplot at 0x451e694f28>

plot.png

In [5]: df = DataFrame(np.random.randn(10,4).cumsum(0),
   ...: columns=['A','B','C','D'],
   ...: index=np.arange(0,100,10))

In [6]: df.plot()
Out[6]: <matplotlib.axes._subplots.AxesSubplot at 0x451f98d6a0>

zx2.png
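The two sessions above can also be run as a standalone script. The sketch below repeats the DataFrame example with a fixed seed and shows the subplots option, which draws one panel per column; the seed, the title text, and the Agg backend are my assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run

import numpy as np
import pandas as pd

rng = np.random.RandomState(0)  # fixed seed so reruns draw the same lines
df = pd.DataFrame(rng.randn(10, 4).cumsum(0),
                  columns=["A", "B", "C", "D"],
                  index=np.arange(0, 100, 10))

# One Axes with four lines, as df.plot() does in the session above.
ax = df.plot(title="cumulative sums")
n_lines = len(ax.get_lines())

# One panel per column, all sharing the x axis.
axes = df.plot(subplots=True, sharex=True, figsize=(6, 8))
n_panels = len(axes)
```

The index becomes the x axis and each column becomes one line, which is why a cumulative sum per column gives four random walks.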

3. Bar plots

Vertical bar plot

In [29]: data = Series(np.random.rand(16),index=['a', 'b', 'c', 'd', 'e', 'f',
    ...: 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p'])

In [30]: data
Out[30]:
a    0.653354
b    0.388024
c    0.341464
d    0.275227
e    0.968719
f    0.085227
g    0.496338
h    0.276607
i    0.302645
j    0.954232
k    0.293769
l    0.423546
m    0.400934
n    0.397526
o    0.849696
p    0.269723
dtype: float64

In [32]: data.plot(kind='bar',color='k',alpha=0.3)
Out[32]: <matplotlib.axes._subplots.AxesSubplot at 0x45247e16d8>

bar1.png

Horizontal bar plot

In [34]: data.plot(kind='barh',color='k',alpha=0.3)
Out[34]: <matplotlib.axes._subplots.AxesSubplot at 0x452484a240>

barh.png
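To compare the two orientations side by side, both calls can draw into one figure through the ax argument. This is a sketch under assumptions: the seed and the two-row layout are mine, and the data is random like in the session above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)
data = pd.Series(rng.rand(16), index=list("abcdefghijklmnop"))

fig, axes = plt.subplots(2, 1)
# Same keyword arguments as in the sessions above; only `kind` differs.
data.plot(kind="bar", ax=axes[0], color="k", alpha=0.3)
data.plot(kind="barh", ax=axes[1], color="k", alpha=0.3)

# For the vertical plot, each bar's height is the corresponding Series value.
bar_heights = [patch.get_height() for patch in axes[0].patches]
```

In both cases the Series index supplies the tick labels; kind only decides which axis they land on.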

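Sorted horizontal bar plot (sort() and order() no longer work in pandas 0.23.4; use sort_values() instead). Here result is the median nutrient value per food group from the book's USDA food database example.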

In [54]: result['Zinc, Zn'].sort_values()
Out[54]:
fgroup
Fats and Oils                        0.020
Beverages                            0.040
Fruits and Fruit Juices              0.100
Soups, Sauces, and Gravies           0.200
Vegetables and Vegetable Products    0.330
Sweets                               0.360
Baby Foods                           0.590
Meals, Entrees, and Sidedishes       0.630
Baked Products                       0.660
Finfish and Shellfish Products       0.670
Restaurant Foods                     0.800
Ethnic Foods                         1.045
Cereal Grains and Pasta              1.090
Legumes and Legume Products          1.140
Fast Foods                           1.250
Dairy and Egg Products               1.390
Snacks                               1.470
Sausages and Luncheon Meats          2.130
Pork Products                        2.320
Poultry Products                     2.500
Spices and Herbs                     2.750
Breakfast Cereals                    2.885
Nut and Seed Products                3.290
Lamb, Veal, and Game Products        3.940
Beef Products                        5.390
Name: value, dtype: float64

In [55]: result['Zinc, Zn'].sort_values().plot(kind='barh')
Out[55]: <matplotlib.axes._subplots.AxesSubplot at 0xea2e812710>

sort_values_barh.png
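A small sketch of what sort_values() does, using four of the values printed above as a hypothetical subset (the subset and its shuffled order are my choice):

```python
import pandas as pd

# Hypothetical subset: four of the medians printed above, in shuffled order.
zinc = pd.Series(
    [5.39, 0.04, 1.47, 0.36],
    index=["Beef Products", "Beverages", "Snacks", "Sweets"],
    name="value",
)

ascending = zinc.sort_values()                  # smallest first (the default)
descending = zinc.sort_values(ascending=False)  # largest first

print(ascending)
```

sort_values() returns a new Series and leaves zinc untouched. With kind='barh' the first row is drawn at the bottom, so the ascending order puts the largest bar on top.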

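Grouped bar plot

The command from the book collapses everything into a single column. With attribute access, tips.size resolves to the DataFrame's size attribute (the total number of elements, 1708 here) rather than to the column named size.

# Wrong: everything collapses into one column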

In [2]: tips = pd.read_csv('ch08/tips.csv')

In [3]: party_counts = pd.crosstab(tips.day,tips.size)

In [4]: party_counts
Out[4]:
col_0  1708
day
Fri      19
Sat      87
Sun      76
Thur     62

# Correct: select the columns with brackets

In [5]: party_counts = pd.crosstab(tips['day'],tips['size'])

In [6]: party_counts
Out[6]:
size  1   2   3   4  5  6
day
Fri   1  16   1   1  0  0
Sat   2  53  18  13  1  0
Sun   0  39  15  18  3  1
Thur  1  48   4   5  1  3

In [8]: party_counts.plot(kind='bar')
Out[8]: <matplotlib.axes._subplots.AxesSubplot at 0xe2f2f46160>

bar3.png
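The difference between the two spellings is easy to reproduce on a tiny table. The six-row DataFrame below is a hypothetical miniature of tips.csv with only the two columns involved.

```python
import pandas as pd

# Hypothetical miniature of tips.csv: only the two columns that matter here.
tips = pd.DataFrame({
    "day": ["Fri", "Sat", "Sat", "Sun", "Sun", "Thur"],
    "size": [2, 2, 3, 4, 2, 2],
})

element_count = tips.size   # DataFrame attribute: rows * columns
size_column = tips["size"]  # the actual column, selected by label

# Cross-tabulate day against party size using bracket selection.
party_counts = pd.crosstab(tips["day"], tips["size"])
print(element_count)
print(party_counts)
```

Any column whose name collides with a DataFrame attribute or method (size, shape, count, and so on) has the same problem, so bracket selection is the safer habit.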

Bar plot normalized to proportions (each row sums to 1)

In [9]: party_pcts = party_counts.div(party_counts.sum(1).astype(float),axis=0)

In [10]: party_pcts.plot(kind='bar',stacked = True)
Out[10]: <matplotlib.axes._subplots.AxesSubplot at 0xe2f3eea320>

bar4.png
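The normalization can be checked by hand. The sketch below rebuilds the party_counts table printed above and divides each row by its total; building the table from literals instead of reading tips.csv is my shortcut.

```python
import numpy as np
import pandas as pd

# The crosstab printed above, typed in as literals.
party_counts = pd.DataFrame(
    [[1, 16, 1, 1, 0, 0],
     [2, 53, 18, 13, 1, 0],
     [0, 39, 15, 18, 3, 1],
     [1, 48, 4, 5, 1, 3]],
    index=pd.Index(["Fri", "Sat", "Sun", "Thur"], name="day"),
    columns=pd.Index([1, 2, 3, 4, 5, 6], name="size"),
)

row_totals = party_counts.sum(axis=1)              # parties per day
party_pcts = party_counts.div(row_totals, axis=0)  # divide each row by its total

print(party_pcts.round(3))
```

axis=0 tells div to align row_totals with the row labels, so every day is divided by its own total. Under Python 3 the division already yields floats, so the astype(float) from the book is harmless but not required.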

4. Histograms

In [13]: tips['tips_pct'] = tips['tip'] / tips['total_bill']

In [14]: tips['tips_pct'].hist(bins=50)
Out[14]: <matplotlib.axes._subplots.AxesSubplot at 0xe2f682e978>

hist1.png
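What bins=50 means can be read back from the plot itself. The sketch below uses made-up bills and tips in place of tips.csv (the value ranges and the seed are assumptions) and inspects the bars that hist draws.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)
# Made-up bills and tips, standing in for tips.csv.
tips = pd.DataFrame({"total_bill": rng.uniform(10, 50, size=244)})
tips["tip"] = tips["total_bill"] * rng.uniform(0.05, 0.30, size=244)
tips["tips_pct"] = tips["tip"] / tips["total_bill"]

fig, ax = plt.subplots()
tips["tips_pct"].hist(bins=50, ax=ax)

# Each bar is one bin; bar heights are counts and add up to the row count.
bar_heights = [patch.get_height() for patch in ax.patches]
print(len(bar_heights), sum(bar_heights))
```

The value range is split into 50 equal-width bins and every row falls into exactly one of them, which is why the heights sum to the number of rows.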

5. Density plots

Kernel density estimation (KDE)

In [18]: tips['tips_pct'].plot(kind='kde')
Out[18]: <matplotlib.axes._subplots.AxesSubplot at 0xe2fa6dd7b8>

kde1.png
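To show what a KDE curve is, here is a hand-written NumPy sketch: put a small Gaussian bump on every sample and average them. This is an illustration of the idea, not the code pandas runs (pandas relies on SciPy for this plot); the bandwidth rule, the stand-in data, and the function name gaussian_kde_1d are my assumptions.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average a Gaussian bump centred on every sample, evaluated on grid."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # z[i, j] is the scaled distance from grid point i to sample j.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernels.mean(axis=1) / bandwidth

rng = np.random.RandomState(0)
samples = rng.normal(0.16, 0.06, size=244)  # made-up stand-in for tips_pct

# Assumption: a Scott-style rule of thumb for the bandwidth.
bandwidth = samples.std(ddof=1) * len(samples) ** (-1.0 / 5.0)
grid = np.linspace(samples.min() - 5 * bandwidth,
                   samples.max() + 5 * bandwidth, 500)
density = gaussian_kde_1d(samples, grid, bandwidth)

# A density should integrate to roughly 1 over its support.
area = float((density * (grid[1] - grid[0])).sum())
print(round(area, 3))
```

A smaller bandwidth gives a spikier curve that follows individual samples; a larger one gives a smoother curve that can blur real structure.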

6. Bimodal normal distribution

In [23]: comp1 = np.random.normal(0,1,size=200)

In [24]: comp2 = np.random.normal(10,2,size = 200)

In [25]: values = Series(np.concatenate([comp1,comp2]))

In [27]: values.hist(bins=100,alpha=0.3,color='k',normed = True)
Out[27]: <matplotlib.axes._subplots.AxesSubplot at 0xe2fa8f9780>

In [28]: values.plot(kind='kde',style='g--')
Out[28]: <matplotlib.axes._subplots.AxesSubplot at 0xe2fa8f9780>

double_normal.png
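The normed=True argument in the session above was deprecated in Matplotlib 2.1 in favor of density=True and later removed, so on a newer Matplotlib the call needs the newer spelling. A sketch with that one change, assuming a fixed seed and a headless backend; the KDE overlay line is unchanged and left out here because it needs SciPy.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)
comp1 = rng.normal(0, 1, size=200)    # first peak, around 0
comp2 = rng.normal(10, 2, size=200)   # second peak, around 10
values = pd.Series(np.concatenate([comp1, comp2]))

fig, ax = plt.subplots()
# density=True rescales the bars so that their total area is 1.
values.hist(bins=100, alpha=0.3, color="k", density=True, ax=ax)

bar_area = sum(p.get_height() * p.get_width() for p in ax.patches)
print(round(bar_area, 6))
```

Rescaling to unit area is what lets the histogram and a KDE curve share one y axis, since the KDE is also a density.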

7. Scatter plots

In [29]: macro = pd.read_csv('ch08/macrodata.csv')

In [30]: data = macro[['cpi','m1','tbilrate','unemp']]

In [31]: trans_data = np.log(data).diff().dropna()

In [32]: trans_data[-5:]
Out[32]:
          cpi        m1  tbilrate     unemp
198 -0.007904  0.045361 -0.396881  0.105361
199 -0.021979  0.066753 -2.277267  0.139762
200  0.002340  0.010286  0.606136  0.160343
201  0.008419  0.037461 -0.200671  0.127339
202  0.008894  0.012202 -0.405465  0.042560

In [33]: plt.scatter(trans_data['m1'],trans_data['unemp'])
Out[33]: <matplotlib.collections.PathCollection at 0xe2fafd7710>

In [34]: plt.title('changes in log %s vs. log %s' % ('m1','unemp'))
Out[34]: Text(0.5,1,'changes in log m1 vs. log unemp')

scatter.png
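The transformation in In [31] takes the log of each series and then the difference between consecutive rows, which approximates the relative change per period. A sketch on made-up numbers (the two columns and their values are hypothetical, not from macrodata.csv):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical quarterly values standing in for two macrodata.csv columns.
data = pd.DataFrame({
    "m1": [140.0, 141.5, 143.0, 142.2, 145.9],
    "unemp": [5.8, 5.1, 5.3, 5.6, 5.2],
})

# log, then row-to-row difference; the first row has no predecessor and is dropped.
trans_data = np.log(data).diff().dropna()

fig, ax = plt.subplots()
ax.scatter(trans_data["m1"], trans_data["unemp"])
ax.set_title("changes in log %s vs. log %s" % ("m1", "unemp"))
print(trans_data)
```

Because diff() compares each row with the one before it, the result always has one row fewer than the input.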

A scatter plot matrix of a set of variables, useful for spotting patterns.

In [39]: pd.scatter_matrix(trans_data,diagonal='kde',color = 'k',alpha=0.3)

scatter_matrix.png
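The top-level pd.scatter_matrix is a deprecated alias; the function lives in pandas.plotting, and newer pandas versions only offer it there. A sketch using that import on random data; the data, the seed, and diagonal='hist' (chosen so the example does not need SciPy, unlike the 'kde' used above) are my assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

rng = np.random.RandomState(0)
# Random stand-in for trans_data, with the same four column names.
trans_data = pd.DataFrame(rng.randn(50, 4),
                          columns=["cpi", "m1", "tbilrate", "unemp"])

# Off-diagonal panels are pairwise scatter plots; the diagonal shows each
# variable's own distribution.
axes = scatter_matrix(trans_data, diagonal="hist", color="k", alpha=0.3)
print(axes.shape)
```

The function returns a square array of Axes, one row and one column per variable, so individual panels can be adjusted afterwards.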

2018.8.20

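Still working through "Python for Data Analysis". It is a great book and I recommend it wholeheartedly! Amazon has a Kindle edition, which is handy for keyword search. Some source details have changed in newer pandas versions, though; the code above is what I debugged and got working.

I actually learned this last week, and it already came in handy at work today. Yeah~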
