Chained indexing in pandas: selecting data (1)

Author: 筝韵徽 | Published: 2019-01-14 17:31 | Read 83 times
    import pandas as pd
    import numpy as np
    from tabulate import tabulate
    

    df=pd.read_csv('data/sample_data.csv',index_col=0)
    
    df
    
    • Selecting data with chained indexing, example 1
    df[['food','age','color']]['age']
    
    Jane         30
    Niko          2
    Aaron        12
    Penelope      4
    Dean         32
    Christina    33
    Cornelia     69
    Name: age, dtype: int64
    

    • Selecting data with chained indexing, example 2

    df.loc[:,['food','color','age']]['age']
    
    Jane         30
    Niko          2
    Aaron        12
    Penelope      4
    Dean         32
    Christina    33
    Cornelia     69
    Name: age, dtype: int64
    
    • Selecting data with chained indexing, example 3
    df.loc[:,['food','color','age']].loc[:,'age']
    
    Jane         30
    Niko          2
    Aaron        12
    Penelope      4
    Dean         32
    Christina    33
    Cornelia     69
    Name: age, dtype: int64
    
    • Selecting data with chained indexing, example 4
    a=['food','color','age']
    b=['age']
    dd=df[a][b]
    dd
    
    Note: the example above returns a DataFrame, not a Series, because b is a list rather than a single label.
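
The difference between a scalar key and a one-element list key can be checked directly. This is a minimal sketch that assumes a small stand-in for data/sample_data.csv: the names and ages follow the outputs shown above, while the food and color values are made up for illustration.

```python
import pandas as pd

# Stand-in for data/sample_data.csv. Names and ages follow the article's
# outputs; the food and color values are invented placeholders.
df = pd.DataFrame(
    {
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "age": [30, 2, 12, 4, 32, 33, 69],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

cols = ["food", "color", "age"]

as_series = df[cols]["age"]    # scalar label  -> Series
as_frame = df[cols][["age"]]   # list of labels -> DataFrame with one column

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
print(as_frame.shape)            # (7, 1)
```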

    • Selecting data with chained indexing, example 5
    a=['Niko','Dean'],['food','color','age']
    b=['age','color']
    dd=df.loc[a][b]
    dd
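
In example 5 the variable a is a tuple of two lists, so df.loc[a] is read as df.loc[rows, columns]. The sketch below checks that the chained form gives the same result as a single .loc call. It assumes a stand-in DataFrame whose food and color values are invented; only the names and ages come from the outputs above.

```python
import pandas as pd

# Stand-in for data/sample_data.csv. Names and ages follow the article's
# outputs; the food and color values are invented placeholders.
df = pd.DataFrame(
    {
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "age": [30, 2, 12, 4, 32, 33, 69],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

a = ["Niko", "Dean"], ["food", "color", "age"]  # a tuple: (row labels, column labels)
b = ["age", "color"]

chained = df.loc[a][b]                               # two indexing steps
direct = df.loc[["Niko", "Dean"], ["age", "color"]]  # one indexing step

print(isinstance(a, tuple))    # True
print(chained.equals(direct))  # True
```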
    
    • Selecting data with chained indexing using iloc, example 6
    df.iloc[2:5].iloc[:,-3:]
    
    • Selecting data with chained indexing, example 7
    df[df['age']>16]['score']
    
    Jane         4.6
    Dean         1.8
    Christina    9.5
    Cornelia     2.2
    Name: score, dtype: float64
    

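    Note: none of the examples above is idiomatic, and each adds unnecessary complexity.
    The idiomatic forms are shown below.

    • Idiomatic usage 1
      Select a single column, as follows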
    df['age']
    
    Jane         30
    Niko          2
    Aaron        12
    Penelope      4
    Dean         32
    Christina    33
    Cornelia     69
    Name: age, dtype: int64
    
    • Idiomatic usage 2
    df.loc[['Niko','Cornelia'],['height','color']]
    # df.loc[['Niko', 'Cornelia'], ['state', 'height', 'color']][['height', 'color']]  - non-idiomatic, avoid where possible
    
    • Idiomatic usage 3
    df.iloc[2:5,-3:] # rows first, then columns
    
    • Idiomatic usage 4
    df.loc['Niko':'Dean',['age','food']]
    
    • Idiomatic usage 5
    df.loc[df['age']>16,'score']
    
    Jane         4.6
    Dean         1.8
    Christina    9.5
    Cornelia     2.2
    Name: score, dtype: float64
    

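    Why chained indexing is a bad idea:
    Reason 1:
    For example: df.loc[['Aaron', 'Dean', 'Christina']][['age', 'food']]
    This performs two operations:
    1. df.loc[['Aaron', 'Dean', 'Christina']]
    2. [['age', 'food']] applied to the result of step 1
    So indexing runs twice, and step 1 builds an intermediate DataFrame.

    df.loc[['Aaron', 'Dean', 'Christina'], ['age', 'food']]
    This runs only once.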
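
The two-step versus one-step behaviour can be made visible by materialising the intermediate result. This is a rough sketch on a stand-in DataFrame: names, ages, and the scores for Jane, Dean, Christina, and Cornelia follow the outputs above, and every other value is invented. The timings it prints depend on the machine and the pandas version, so no numbers are quoted here.

```python
import timeit

import pandas as pd

# Stand-in for data/sample_data.csv. Food, color, and the scores for
# Niko, Aaron, and Penelope are invented placeholders.
df = pd.DataFrame(
    {
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "age": [30, 2, 12, 4, 32, 33, 69],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

rows = ["Aaron", "Dean", "Christina"]
cols = ["age", "food"]

step1 = df.loc[rows]         # operation 1: intermediate frame with every column
chained = step1[cols]        # operation 2: column selection on the intermediate
direct = df.loc[rows, cols]  # rows and columns selected in a single operation

print(step1.shape)             # (3, 4)
print(chained.equals(direct))  # True

# Informal timing of both forms; results vary between environments.
t_chained = timeit.timeit(lambda: df.loc[rows][cols], number=1000)
t_direct = timeit.timeit(lambda: df.loc[rows, cols], number=1000)
print(f"chained: {t_chained:.4f}s  direct: {t_direct:.4f}s")
```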
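    Reason 2:
    The intermediate result can be a copy, so an assignment made through it never reaches the original, as in the following example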

    df[df['age'] > 10]['score'] = 99
    
    E:\software\anaconda3\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning: 
    A value is trying to be set on a copy of a slice from a DataFrame.
    Try using .loc[row_indexer,col_indexer] = value instead
    
    See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
      """Entry point for launching an IPython kernel.
    
    df
    
    The assignment has no effect on df. What actually happened is the following:

    df_tm=df[df['age']>10]
    df_tm['score']=99
    
    E:\software\anaconda3\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning: 
    A value is trying to be set on a copy of a slice from a DataFrame.
    Try using .loc[row_indexer,col_indexer] = value instead
    
    See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
    
    df_tm
    
    df
    
    **The value was actually assigned to the copy, so df is unchanged. The correct way is as follows:**

    df.loc[df['age']>10,'score']=99
    df
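
The two assignments can be compared side by side in a self-contained script. This is a minimal sketch on a stand-in DataFrame: names, ages, and the scores for Jane, Dean, Christina, and Cornelia follow the outputs above, and the remaining scores are invented. The warning is silenced only to keep the demo output short.

```python
import warnings

import pandas as pd

# Stand-in for data/sample_data.csv. The scores for Niko, Aaron, and
# Penelope are invented placeholders.
df = pd.DataFrame(
    {
        "age": [30, 2, 12, 4, 32, 33, 69],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)
before = df["score"].copy()

# Chained assignment: the boolean filter returns a new object, and the
# value 99 is written to that temporary object rather than to df.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the chained-assignment warning
    df[df["age"] > 10]["score"] = 99

unchanged = df["score"].equals(before)
print(unchanged)  # True

# Single .loc call: row mask and column label together, applied to df itself.
df.loc[df["age"] > 10, "score"] = 99
print(df["score"].tolist())  # [99.0, 8.3, 99.0, 3.3, 99.0, 99.0, 99.0]
```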
    

          Article link: https://www.haomeiwen.com/subject/puejdqtx.html