import pandas as pd
import numpy as np
from tabulate import tabulate
Selecting data with chained indexing in pandas (1)
df=pd.read_csv('data/sample_data.csv',index_col=0)
df
- Chained indexing, example 1
df[['food','age','color']]['age']
Jane 30
Niko 2
Aaron 12
Penelope 4
Dean 32
Christina 33
Cornelia 69
Name: age, dtype: int64
- Chained indexing, example 2
df.loc[:,['food','color','age']]['age']
Jane 30
Niko 2
Aaron 12
Penelope 4
Dean 32
Christina 33
Cornelia 69
Name: age, dtype: int64
- Chained indexing, example 3
df.loc[:,['food','color','age']].loc[:,'age']
Jane 30
Niko 2
Aaron 12
Penelope 4
Dean 32
Christina 33
Cornelia 69
Name: age, dtype: int64
- Chained indexing, example 4
a=['food','color','age']
b=['age']
dd=df[a][b]
dd
Note: the example above returns a DataFrame, not a Series, because b is a list rather than a single label.
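The difference between the two return types can be checked directly. The sketch below uses a small stand-in frame rather than data/sample_data.csv; its values and the variable names as_series and as_frame are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for data/sample_data.csv (illustrative values only)
df = pd.DataFrame(
    {
        "color": ["blue", "green", "red", "white"],
        "food": ["Steak", "Lamb", "Mango", "Apple"],
        "age": [30, 2, 12, 4],
    },
    index=["Jane", "Niko", "Aaron", "Penelope"],
)

cols = ["food", "color", "age"]

as_series = df[cols]["age"]    # a single label selects a Series
as_frame = df[cols][["age"]]   # a one-element list selects a DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)
```

A list key always keeps the result two-dimensional, even when the list contains only one column name.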
- Chained indexing, example 5
a=['Niko','Dean'],['food','color','age']
b=['age','color']
dd=df.loc[a][b]
dd
- Chained indexing with iloc, example 6
df.iloc[2:5].iloc[:,-3:]
- Chained indexing, example 7
df[df['age']>16]['score']
Jane 4.6
Dean 1.8
Christina 9.5
Cornelia 2.2
Name: score, dtype: float64
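Note: none of the examples above is idiomatic, and each one adds unnecessary complexity.
The idiomatic forms are shown below.
- Idiomatic usage 1
Select a single column: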
df['age']
Jane 30
Niko 2
Aaron 12
Penelope 4
Dean 32
Christina 33
Cornelia 69
Name: age, dtype: int64
- Idiomatic usage 2
df.loc[['Niko','Cornelia'],['height','color']]
# df.loc[['Niko', 'Cornelia'], ['state', 'height', 'color']][['height', 'color']]  - non-idiomatic, avoid where possible
- Idiomatic usage 3
df.iloc[2:5,-3:] # rows first, then columns
- Idiomatic usage 4
df.loc['Niko':'Dean',['age','food']]
- Idiomatic usage 5
df.loc[df['age']>16,'score']
Jane 4.6
Dean 1.8
Christina 9.5
Cornelia 2.2
Name: score, dtype: float64
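A quick way to confirm that the idiomatic forms select the same data as their chained counterparts is to compare the results with equals. This is a minimal sketch on a stand-in frame; the values and the variable names are assumptions for illustration, not the contents of data/sample_data.csv.

```python
import pandas as pd

# Hypothetical stand-in for data/sample_data.csv (illustrative values only)
df = pd.DataFrame(
    {
        "color": ["blue", "green", "red", "white", "gray"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese"],
        "age": [30, 2, 12, 4, 32],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean"],
)

# Column selection: chained vs. direct
chained_col = df[["food", "age", "color"]]["age"]
direct_col = df["age"]

# Positional selection: two iloc calls vs. one
chained_pos = df.iloc[2:5].iloc[:, -3:]
direct_pos = df.iloc[2:5, -3:]

# Boolean filtering: chained vs. a single loc call
chained_mask = df[df["age"] > 16]["score"]
direct_mask = df.loc[df["age"] > 16, "score"]

print(chained_col.equals(direct_col))
print(chained_pos.equals(direct_pos))
print(chained_mask.equals(direct_mask))
```

For reading data the two styles agree; the single-call form is simply shorter and avoids the intermediate object.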
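Why chained indexing is a bad idea:
Reason 1:
For example: df.loc[['Aaron', 'Dean', 'Christina']][['age', 'food']]
This performs two operations:
1. df.loc[['Aaron', 'Dean', 'Christina']]
2. [['age', 'food']] applied to the result of step 1
so two indexing calls are executed,
whereas
df.loc[['Aaron', 'Dean', 'Christina'], ['age', 'food']]
executes only one.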
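The two steps can be written out explicitly to make the intermediate object visible. The sketch below assumes a small stand-in frame (illustrative values only); step1, step2, and single are hypothetical names introduced here.

```python
import pandas as pd

# Hypothetical stand-in for data/sample_data.csv (illustrative values only)
df = pd.DataFrame(
    {
        "color": ["blue", "green", "red", "gray", "black"],
        "food": ["Steak", "Lamb", "Mango", "Cheese", "Melon"],
        "age": [30, 2, 12, 32, 33],
        "score": [4.6, 8.3, 9.0, 1.8, 9.5],
    },
    index=["Jane", "Niko", "Aaron", "Dean", "Christina"],
)

rows = ["Aaron", "Dean", "Christina"]
cols = ["age", "food"]

# Chained form, split into its two indexing calls
step1 = df.loc[rows]   # intermediate frame holding every column
step2 = step1[cols]    # second call trims it to the wanted columns

# Single call: rows and columns are selected together
single = df.loc[rows, cols]

print(step1.shape)           # the intermediate is wider than the final result
print(single.shape)
print(step2.equals(single))
```

The intermediate frame carries all four columns even though only two are needed, which is the extra work the single loc call avoids.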
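Reason 2:
The first indexing step can return a copy, so an assignment made through the chain is lost, as in the following example: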
df[df['age'] > 10]['score'] = 99
E:\software\anaconda3\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
"""Entry point for launching an IPython kernel.
df
The assignment has no effect. What actually happened is equivalent to:
df_tm=df[df['age']>10]
df_tm['score']=99
E:\software\anaconda3\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
df_tm
df
**The value was actually assigned to a copy. The correct approach is as follows:**
df.loc[df['age']>10,'score']=99
df
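The contrast between the two assignments can be reproduced without the CSV file. This is a minimal sketch under two assumptions: the frame is a stand-in with illustrative values, and pandas runs with its default chained-assignment setting (warn rather than raise). The warning is silenced here only to keep the demo output clean.

```python
import warnings

import pandas as pd

# Hypothetical stand-in for data/sample_data.csv (illustrative values only)
df = pd.DataFrame(
    {
        "age": [30, 2, 12, 4, 32],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean"],
)
original_scores = df["score"].copy()

with warnings.catch_warnings():
    warnings.simplefilter("ignore")   # hide the chained-assignment warning
    df[df["age"] > 10]["score"] = 99  # writes to a temporary copy

# The original frame is untouched by the chained assignment
unchanged = df["score"].equals(original_scores)
print(unchanged)

# A single loc call addresses the original frame directly
df.loc[df["age"] > 10, "score"] = 99
print(df["score"].tolist())
```

Only the rows with age above 10 are updated by the loc assignment; the chained version leaves every score as it was.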