Introduction to pandas MultiIndex

Author: python测试开发 | Published 2020-02-26 16:29

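    A pandas DataFrame usually has an index: a column whose values give each row a label. It works much like the primary key of a database table. pandas also supports a MultiIndex, in which the row index is a composite key made up of several columns.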

    Creating an unindexed DataFrame from a CSV file

    >>> import pandas, io
    >>> data = io.StringIO('''Fruit,Color,Count,Price
    ... Apple,Red,3,$1.29
    ... Apple,Green,9,$0.99
    ... Pear,Red,25,$2.59
    ... Pear,Green,26,$2.79
    ... Lime,Green,99,$0.39
    ... ''')
    >>> df_unindexed = pandas.read_csv(data)
    >>> df_unindexed
       Fruit  Color  Count  Price
    0  Apple    Red      3  $1.29
    1  Apple  Green      9  $0.99
    2   Pear    Red     25  $2.59
    3   Pear  Green     26  $2.79
    4   Lime  Green     99  $0.39
    >>> # Make Fruit and Color the composite (MultiIndex) row index
    >>> df = df_unindexed.set_index(['Fruit', 'Color'])
    >>> df
                 Count  Price
    Fruit Color
    Apple Red        3  $1.29
          Green      9  $0.99
    Pear  Red       25  $2.59
          Green     26  $2.79
    Lime  Green     99  $0.39
    >>> # Select every row whose outer level (Fruit) is 'Apple'
    >>> df.xs('Apple')
           Count  Price
    Color
    Red        3  $1.29
    Green      9  $0.99
    >>> # Select on the inner level (Color); the selected level is dropped from the result
    >>> df.xs('Red', level='Color')
           Count  Price
    Fruit
    Apple      3  $1.29
    Pear      25  $2.59
    >>> # loc with an outer-level label gives the same rows as df.xs('Apple')
    >>> df.loc['Apple', :]
           Count  Price
    Color
    Red        3  $1.29
    Green      9  $0.99
    >>> # A full key tuple selects a single row, returned as a Series
    >>> df.loc[('Apple', 'Red'), :]
    Count        3
    Price    $1.29
    Name: (Apple, Red), dtype: object
    >>>
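The session above uses both xs and loc to pick rows out of the MultiIndex. As a minimal sketch of how the two differ when selecting on the inner level, the script below rebuilds the same table and compares the index of each result. The variable names are illustrative, and the call to sort_index() is an added precaution, not something the original session does.

```python
import io

import pandas as pd

csv_text = """Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
"""

# Same table as above, indexed by the composite key (Fruit, Color).
# Sorting the index is an assumption made here to keep lookups predictable.
fruit_df = (
    pd.read_csv(io.StringIO(csv_text))
    .set_index(["Fruit", "Color"])
    .sort_index()
)

# xs drops the level it selects on, so only "Fruit" remains in the index.
red_xs = fruit_df.xs("Red", level="Color")

# drop_level=False asks xs to keep the selected level.
red_kept = fruit_df.xs("Red", level="Color", drop_level=False)

# loc takes one selector per level; slice(None) means "every Fruit".
# Both index levels are kept in the result.
red_loc = fruit_df.loc[(slice(None), "Red"), :]

print(list(red_xs.index.names))    # ['Fruit']
print(list(red_kept.index.names))  # ['Fruit', 'Color']
print(list(red_loc.index.names))   # ['Fruit', 'Color']
```

Which form to use is mostly a matter of whether the selected level should survive in the result: xs is shorter when it should not, while loc keeps the full key.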
    
    

    https://www.somebits.com/~nelson/pandas-multiindex-slice-demo.html

    pandas.DataFrame.xs

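    This method takes a key argument to select data at a particular level of a MultiIndex. It also works with a single-level index, where it selects rows by index label, much like loc.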
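As a minimal sketch of the single-level case, the script below looks up the same row with xs and with loc and checks that the results match. The small table and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical single-level example: the animal name is the row index.
animals = pd.DataFrame(
    {"num_legs": [4, 2], "num_wings": [0, 2]},
    index=pd.Index(["cat", "penguin"], name="animal"),
)

# On a single-level index, xs returns the matching row as a Series,
# just as loc does.
row_xs = animals.xs("cat")
row_loc = animals.loc["cat"]

print(row_xs.equals(row_loc))  # True
```

One difference noted in the pandas documentation is that xs cannot be used to assign values, whereas loc can.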

    >>> import pandas as pd
    >>> d = {'num_legs': [4, 4, 2, 2],
    ...      'num_wings': [0, 0, 2, 2],
    ...      'class': ['mammal', 'mammal', 'mammal', 'bird'],
    ...      'animal': ['cat', 'dog', 'bat', 'penguin'],
    ...      'locomotion': ['walks', 'walks', 'flies', 'walks']}
    >>> df = pd.DataFrame(data=d)
    >>> df
       num_legs  num_wings   class   animal locomotion
    0         4          0  mammal      cat      walks
    1         4          0  mammal      dog      walks
    2         2          2  mammal      bat      flies
    3         2          2    bird  penguin      walks
    >>> df = df.set_index(['class', 'animal', 'locomotion'])
    >>> df
                               num_legs  num_wings
    class  animal  locomotion
    mammal cat     walks              4          0
           dog     walks              4          0
           bat     flies              2          2
    bird   penguin walks              2          2
    >>> # Select on the outermost level (class)
    >>> df.xs('mammal')
                       num_legs  num_wings
    animal locomotion
    cat    walks              4          0
    dog    walks              4          0
    bat    flies              2          2
    >>> # A tuple key selects on the first two levels (class, animal)
    >>> df.xs(('mammal', 'dog'))
    sys:1: PerformanceWarning: indexing past lexsort depth may impact performance.
                num_legs  num_wings
    locomotion
    walks              4          0
    >>> # Select on the second level (animal), given by position
    >>> df.xs('cat', level=1)
                       num_legs  num_wings
    class  locomotion
    mammal walks              4          0
    >>> # Levels can be given by position or by name, mixed in one list
    >>> df.xs(('bird', 'walks'), level=[0, 'locomotion'])
             num_legs  num_wings
    animal
    penguin         2          2
    >>> # Select a column instead of rows by passing axis=1
    >>> df.xs('num_wings', axis=1)
    class   animal   locomotion
    mammal  cat      walks         0
            dog      walks         0
            bat      flies         2
    bird    penguin  walks         2
    Name: num_wings, dtype: int64
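The PerformanceWarning in the session above is emitted because the index is not in lexicographic order, so pandas cannot look up the two-level key efficiently. A minimal sketch of one way to avoid it, assuming that reordering the rows is acceptable for your data, is to sort the index before selecting. Whether the warning appears at all may depend on the pandas version.

```python
import warnings

import pandas as pd

d = {
    "num_legs": [4, 4, 2, 2],
    "num_wings": [0, 0, 2, 2],
    "class": ["mammal", "mammal", "mammal", "bird"],
    "animal": ["cat", "dog", "bat", "penguin"],
    "locomotion": ["walks", "walks", "flies", "walks"],
}
df = pd.DataFrame(d).set_index(["class", "animal", "locomotion"])

# Sorting puts the index in lexicographic order across all levels.
df_sorted = df.sort_index()

with warnings.catch_warnings():
    # Turn PerformanceWarning into an error so it would surface if raised.
    warnings.simplefilter("error", pd.errors.PerformanceWarning)
    dog = df_sorted.xs(("mammal", "dog"))

print(df_sorted.index.is_monotonic_increasing)  # True
print(dog)
```

Sorting changes the row order (bird now comes before mammal), so this only suits cases where the original order carries no meaning.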
    
