Pandas Data Analysis: Data Selection (Indexing/Selection)


Author: Mc杰夫 | Published 2022-05-16 16:13

    (2022.05.16 Mon)
    Elements of a pandas Series are selected through its index.

    >> import numpy as np
    >> import pandas as pd
    >> obj = pd.Series(np.arange(4.), index=['a', 'b', 'c', 'd'])
    >> obj
    a    0.0
    b    1.0
    c    2.0
    d    3.0
    dtype: float64
    

    A Series can be indexed in two ways: by index label and by integer position.

    >> obj[3]  # integer treated as a position; deprecated in recent pandas, obj.iloc[3] is the explicit form
    3.0
    >> obj['a':'c']  # a label slice includes its end point
    a    0.0
    b    1.0
    c    2.0
    dtype: float64
    >> obj[['b', 'a', 'd']]  # note that a list is passed here
    b    1.0
    a    0.0
    d    3.0
    dtype: float64
    >> obj[[3, 1, 2]]  # a list of positions; obj.iloc[[3, 1, 2]] is the explicit form
    d    3.0
    b    1.0
    c    2.0
    dtype: float64
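The two ways of indexing a Series can also be written explicitly with `.loc` (labels) and `.iloc` (positions), which avoids any ambiguity about how an integer key is interpreted. The following is a minimal sketch that reuses the `obj` Series from above; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4.), index=['a', 'b', 'c', 'd'])

# .loc always interprets keys as index labels.
by_label = obj.loc['d']
# .iloc always interprets keys as integer positions.
by_position = obj.iloc[3]

# A label slice includes its end point; a positional slice excludes it.
label_slice = obj.loc['a':'c']     # rows a, b, c
position_slice = obj.iloc[0:2]     # rows a, b

# A list of positions returns the rows in the order requested.
picked = obj.iloc[[3, 1, 2]]       # rows d, b, c

print(by_label, by_position)
print(label_slice)
print(position_slice)
print(picked)
```

Using `.loc` and `.iloc` makes the intent visible in the code, at the cost of a few extra characters compared with plain `[]`.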
    

    A pandas DataFrame can be indexed by column name and by row position.

    >> data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                           index=['Ohio', 'Colorado', 'Utah', 'New York'],
                           columns=['one', 'two', 'three', 'four'])
    >> data
              one  two  three  four
    Ohio        0    1      2     3
    Colorado    4    5      6     7
    Utah        8    9     10    11
    New York   12   13     14    15
    

    To select one or several columns of a DataFrame, use the column name or a list of column names.

    >> data['four']
    Ohio         3
    Colorado     7
    Utah        11
    New York    15
    Name: four, dtype: int64
    >> data[['four', 'one']]
              four  one
    Ohio         3    0
    Colorado     7    4
    Utah        11    8
    New York    15   12
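The difference between passing a single column name and passing a list is visible in the return type. The sketch below rebuilds the same `data` frame so that it runs on its own; the variable names `single`, `as_frame`, and `reordered` are illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

single = data['four']               # a string key returns a Series
as_frame = data[['four']]           # a one-element list returns a DataFrame
reordered = data[['four', 'one']]   # columns come back in the order listed

print(type(single).__name__)
print(type(as_frame).__name__)
print(list(reordered.columns))
```

Wrapping a single name in a list is a common way to keep a DataFrame when later code expects two-dimensional input.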
    

    To select rows, use a positional slice or a boolean condition.

    >> data[:2]
              one  two  three  four
    Ohio        0    1      2     3
    Colorado    4    5      6     7
    >> data['three']>4
    Ohio        False
    Colorado     True
    Utah         True
    New York     True
    Name: three, dtype: bool
    >> data[data['three']>4]
              one  two  three  four
    Colorado    4    5      6     7
    Utah        8    9     10    11
    New York   12   13     14    15
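Boolean conditions like the one above can be combined. A minimal sketch, assuming the same `data` frame: each comparison needs its own parentheses, and the element-wise operators `&`, `|`, and `~` are used rather than Python's `and`, `or`, and `not`.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# Both conditions must hold: three > 4 AND one < 12.
both = data[(data['three'] > 4) & (data['one'] < 12)]

# Either condition may hold: one == 0 OR four == 15.
either = data[(data['one'] == 0) | (data['four'] == 15)]

# Negate a condition with ~.
negated = data[~(data['three'] > 4)]

print(both)
print(either)
print(negated)
```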
    

    You can also index with loc and iloc: loc selects by label, and iloc selects by integer position. Note the last example, which filters by a condition.

    >> data.loc['Colorado', ['two', 'three']]
    two      5
    three    6
    Name: Colorado, dtype: int64
    >> data.iloc[[1,2], [3, 0, 1]]
              four  one  two
    Colorado     7    4    5
    Utah        11    8    9
    >> data.iloc[2]
    one       8
    two       9
    three    10
    four     11
    Name: Utah, dtype: int64
    >> data.loc[:'Utah', 'two']
    Ohio        1
    Colorado    5
    Utah        9
    Name: two, dtype: int64
    >> data.iloc[:, :3][data.three > 5]  # a boolean mask applied to the iloc result
              one  two  three
    Colorado    4    5      6
    Utah        8    9     10
    New York   12   13     14
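The condition-based selection can also be expressed in a single call. The sketch below, again assuming the same `data` frame, shows one possible form with `loc` (boolean mask for rows, labels for columns) and one with `iloc`; for `iloc` the mask is converted to a NumPy array first, since `iloc` works with positions rather than a labeled boolean Series.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

mask = data['three'] > 5

# Chained form from the example above: take columns first, then filter rows.
chained = data.iloc[:, :3][mask]

# Single loc call: boolean mask selects rows, label list selects columns.
with_loc = data.loc[mask, ['one', 'two', 'three']]

# Single iloc call: boolean array selects rows, positional slice selects columns.
with_iloc = data.iloc[mask.to_numpy(), :3]

# Label slice on rows plus a single column label returns a Series.
up_to_utah = data.loc[:'Utah', 'two']

print(with_loc)
print(up_to_utah)
```

All three forms return the same rows and columns here; the single-call forms are generally the safer choice when the result will be assigned to.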
    

    Reference

    1 Python for Data Analysis, Wes McKinney
