Data selection in a pandas DataFrame (part 4)

Author: 筝韵徽 | Published 2019-01-07 10:24 | 92 views

    import pandas as pd
    import numpy as np

    # Use the first CSV column (the names) as the row index
    df = pd.read_csv('data/sample_data.csv', index_col=0)
    df
    
              state  color    food  age  height  score
    Jane         NY   blue   Steak   30     165    4.6
    Niko         TX  green    Lamb    2      70    8.3
    Aaron        FL    red   Mango   12     120    9.0
    Penelope     AL  white   Apple    4      80    3.3
    Dean         AK   gray  Cheese   32     180    1.8
    Christina    TX  black   Melon   33     172    9.5
    Cornelia     TX    red   Beans   69     150    2.2
    

Comparison

    df[2:6]
    

The example above selects the rows at integer positions 2 through 5; the stop position 6 is excluded.

    df['Jane':'Aaron']
    

The example above selects the rows labeled Jane through Aaron, with both endpoints included.

    df.iloc[2:6]
    
    df.loc['Jane':'Aaron']
    

Clearly, the explicit `iloc` and `loc` forms make the intent easier to read than the bare-bracket slices.
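
To check that the bare-bracket slices and the explicit indexers return the same rows, here is a minimal sketch. It rebuilds the sample table inline, which is an assumption made because `data/sample_data.csv` is not included here, so this `df` is only a stand-in for the one loaded from the file.

```python
import pandas as pd

# Stand-in for data/sample_data.csv, rebuilt from the table printed above
df = pd.DataFrame(
    {
        "state": ["NY", "TX", "FL", "AL", "AK", "TX", "TX"],
        "color": ["blue", "green", "red", "white", "gray", "black", "red"],
        "food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
        "age": [30, 2, 12, 4, 32, 33, 69],
        "height": [165, 70, 120, 80, 180, 172, 150],
        "score": [4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
    },
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
)

by_position = df.iloc[2:6]         # positions 2, 3, 4, 5 (6 is excluded)
by_label = df.loc["Jane":"Aaron"]  # labels Jane through Aaron, Aaron included

print(list(by_position.index))  # ['Aaron', 'Penelope', 'Dean', 'Christina']
print(list(by_label.index))     # ['Jane', 'Niko', 'Aaron']

# The bare-bracket slices select the same rows as the explicit indexers
print(df[2:6].equals(by_position))          # True
print(df["Jane":"Aaron"].equals(by_label))  # True
```

The results are identical; the explicit forms simply state in the code whether positions or labels are meant.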

The same comparison applies to a Series:

    f=df['food']
    
    f
    
    Jane          Steak
    Niko           Lamb
    Aaron         Mango
    Penelope      Apple
    Dean         Cheese
    Christina     Melon
    Cornelia      Beans
    Name: food, dtype: object
    
    f[2:4]
    
    Aaron       Mango
    Penelope    Apple
    Name: food, dtype: object
    
    f.iloc[2:4]
    
    Aaron       Mango
    Penelope    Apple
    Name: food, dtype: object
    
    f['Niko':'Dean']
    
    Niko          Lamb
    Aaron        Mango
    Penelope     Apple
    Dean        Cheese
    Name: food, dtype: object
    
    f.loc['Niko':'Dean']
    
    Niko          Lamb
    Aaron        Mango
    Penelope     Apple
    Dean        Cheese
    Name: food, dtype: object
    
    f['Dean']
    
    'Cheese'
    
    f.loc['Dean']
    
    'Cheese'
    
    f[['Dean','Jane','Niko']]
    
    Dean    Cheese
    Jane     Steak
    Niko      Lamb
    Name: food, dtype: object
    
    f.loc[['Dean','Jane','Niko']]
    
    Dean    Cheese
    Jane     Steak
    Niko      Lamb
    Name: food, dtype: object
    

This shows that pandas is flexible. To keep things easy to remember and use, though, it is better to use `iloc` and `loc` consistently.
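
A small sketch of that rule on a Series, using a stand-in built from the `food` column printed above (the inline data is an assumption, since the CSV file is not included here). It checks that each bare-bracket form shown earlier matches its explicit `iloc` or `loc` counterpart.

```python
import pandas as pd

# Stand-in for df['food'], rebuilt from the output printed above
f = pd.Series(
    ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"],
    index=["Jane", "Niko", "Aaron", "Penelope", "Dean", "Christina", "Cornelia"],
    name="food",
)

# Position-based selection: iloc states the intent
assert f[2:4].equals(f.iloc[2:4])

# Label-based selection: loc states the intent
assert f["Niko":"Dean"].equals(f.loc["Niko":"Dean"])
assert f["Dean"] == f.loc["Dean"] == "Cheese"
assert f[["Dean", "Jane", "Niko"]].equals(f.loc[["Dean", "Jane", "Niko"]])

print(list(f.iloc[2:4]))           # ['Mango', 'Apple']
print(list(f.loc["Niko":"Dean"]))  # ['Lamb', 'Mango', 'Apple', 'Cheese']
```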

When a CSV file is read without `index_col`, pandas generates a default integer row index:

    df1=pd.read_csv('data/sample_data.csv')
    
    df1
    
    df1.index
    
    RangeIndex(start=0, stop=7, step=1)
    
    df1.loc[[1,3,5],['food','state']]
    
    df1.iloc[[1,3,5],[3,1]]
    

With `loc` and `iloc`, it is clear at a glance whether a selection uses labels or integer positions, so there is nothing to work out.
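
The two selections above can be compared directly. This is a minimal sketch that reads a hypothetical CSV text through `io.StringIO` in place of `data/sample_data.csv`. The rows come from the table printed earlier, while the empty first header field is an assumption about the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for data/sample_data.csv; the empty first
# header field is an assumption about the real file
csv_text = """,state,color,food,age,height,score
Jane,NY,blue,Steak,30,165,4.6
Niko,TX,green,Lamb,2,70,8.3
Aaron,FL,red,Mango,12,120,9.0
Penelope,AL,white,Apple,4,80,3.3
Dean,AK,gray,Cheese,32,180,1.8
Christina,TX,black,Melon,33,172,9.5
Cornelia,TX,red,Beans,69,150,2.2
"""

# No index_col, so pandas generates a default RangeIndex
df1 = pd.read_csv(io.StringIO(csv_text))
print(df1.index)  # RangeIndex(start=0, stop=7, step=1)

# loc takes row labels and column names
by_label = df1.loc[[1, 3, 5], ["food", "state"]]
# iloc takes row positions and column positions (3 is food, 1 is state)
by_position = df1.iloc[[1, 3, 5], [3, 1]]

print(by_label.equals(by_position))  # True
print(by_label["food"].tolist())     # ['Lamb', 'Apple', 'Melon']
```

With the default index the row labels happen to equal the row positions, which is why both calls accept `[1, 3, 5]` for the rows; only the column arguments differ.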

    df1.iloc[:3]
    
    df1.loc[:3]
    

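Note the difference between the last two examples. When slicing, `iloc` excludes the stop position, giving the half-open interval [start, stop), whereas `loc` includes the stop label, giving the closed interval [start, stop].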
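
The difference is easy to confirm by counting rows. This is a minimal sketch on a small stand-in frame with a default `RangeIndex`; the single `food` column is taken from the sample data and the frame itself is an assumption, not the file used above.

```python
import pandas as pd

# Stand-in frame with the default RangeIndex 0..6
df1 = pd.DataFrame(
    {"food": ["Steak", "Lamb", "Mango", "Apple", "Cheese", "Melon", "Beans"]}
)

head_by_position = df1.iloc[:3]  # positions 0, 1, 2 (stop excluded)
head_by_label = df1.loc[:3]      # labels 0, 1, 2, 3 (stop included)

print(list(head_by_position.index))  # [0, 1, 2]
print(list(head_by_label.index))     # [0, 1, 2, 3]
print(len(head_by_position), len(head_by_label))  # 3 4
```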
