美文网首页
6种从Pandas 数据帧(DataFrame)中获取列名的方法

6种从Pandas 数据帧(DataFrame)中获取列名的方法

作者: python测试开发 | 来源:发表于2020-02-19 11:43 被阅读0次

    从CSV文件导入数据

    >>> import pandas as pd
    >>> df = pd.read_csv('UN98.csv', index_col=0)
    >>> df.head()
                    region   tfr  contraception  educationMale  ...  economicActivityMale  economicActivityFemale  illiteracyMale  illiteracyFemale
    Afghanistan       Asia  6.90            NaN            NaN  ...                  87.5                     7.2          52.800             85.00
    Albania         Europe  2.60            NaN            NaN  ...                   NaN                     NaN             NaN               NaN
    Algeria         Africa  3.81           52.0           11.1  ...                  76.4                     7.8          26.100             51.00
    American.Samoa    Asia   NaN            NaN            NaN  ...                  58.8                    42.4           0.264              0.36
    Andorra         Europe   NaN            NaN            NaN  ...                   NaN                     NaN             NaN               NaN
    
    [5 rows x 13 columns]
    

    注意index_col=0,表示用第一列作为index, 同时第一列不会出现在数据里面。'UN98.csv'可以在扣扣群630011153 144081101找到。

    获取列名的方法

    • df.columns
    >>> df.columns
    Index(['region', 'tfr', 'contraception', 'educationMale', 'educationFemale',
           'lifeMale', 'lifeFemale', 'infantMortality', 'GDPperCapita',
           'economicActivityMale', 'economicActivityFemale', 'illiteracyMale',
           'illiteracyFemale'],
          dtype='object')
    >>> 'tfr' in df.columns
    True
    
    • keys()
    >>> df.keys()
    Index(['region', 'tfr', 'contraception', 'educationMale', 'educationFemale',
           'lifeMale', 'lifeFemale', 'infantMortality', 'GDPperCapita',
           'economicActivityMale', 'economicActivityFemale', 'illiteracyMale',
           'illiteracyFemale'],
          dtype='object')
    
    • 通过迭代获取列名
    >>> for col_name in df.columns: 
    ...      print(col_name)
    ... 
    region
    tfr
    contraception
    educationMale
    educationFemale
    lifeMale
    lifeFemale
    infantMortality
    GDPperCapita
    economicActivityMale
    economicActivityFemale
    illiteracyMale
    illiteracyFemale
    
    
    • 通过获取列名为列表
    >>> list(df.columns)
    ['region', 'tfr', 'contraception', 'educationMale', 'educationFemale', 'lifeMale', 'lifeFemale', 'infantMortality', 'GDPperCapita', 'economicActivityMale', 'economicActivityFemale', 'illiteracyMale', 'illiteracyFemale']
    
    
    • tolist()转换列名为列表
    >>> df.columns.values.tolist()
    ['region', 'tfr', 'contraception', 'educationMale', 'educationFemale', 'lifeMale', 'lifeFemale', 'infantMortality', 'GDPperCapita', 'economicActivityMale', 'economicActivityFemale', 'illiteracyMale', 'illiteracyFemale']
    
    • sorted()可以获取字母排序的列名
    >>> sorted(df)
    ['GDPperCapita', 'contraception', 'economicActivityFemale', 'economicActivityMale', 'educationFemale', 'educationMale', 'illiteracyFemale', 'illiteracyMale', 'infantMortality', 'lifeFemale', 'lifeMale', 'region', 'tfr']
    

    参考资料

    根据列名获取列值

    >>> df['tfr'].values
    array([6.9 , 2.6 , 3.81,  nan,  nan, 6.69,  nan, 2.62, 1.7 , 1.89, 1.42,
           2.3 , 1.95, 2.97, 3.14, 1.73, 1.4 , 1.62, 3.66, 5.83, 5.89, 4.36,
           1.4 , 4.45, 2.17, 2.7 , 1.45, 6.57, 6.28, 4.5 , 5.3 , 1.61, 3.56,
           4.95, 5.51, 2.44, 1.8 , 2.69, 5.51, 5.87, 3.5 , 2.95, 1.6 , 1.55,
           2.31, 1.4 , 6.24, 1.82, 5.39,  nan, 2.8 , 4.32, 3.1 , 3.4 , 3.09,
           5.51, 5.34, 1.3 , 7.  , 2.76, 1.83, 1.63,  nan, 2.85, 5.4 , 5.2 ,
           8.  , 1.9 , 1.3 , 5.28, 1.38,  nan, 2.1 , 3.04, 4.9 , 6.61, 5.42,
           2.32, 4.6 , 4.3 , 1.32, 1.4 , 2.19, 3.07, 2.63, 4.77, 5.25, 1.8 ,
           2.75, 1.19, 5.1 , 2.44, 1.48, 5.13, 2.3 , 4.85, 3.8 , 2.1 , 1.65,
           2.77, 3.21, 6.69, 1.4 , 2.75, 4.86, 6.33, 5.92, 1.45, 1.5 , 1.76,
           1.6 , 1.9 , 5.65, 6.69, 3.24, 6.8 , 6.6 , 2.1 , 4.49, 2.  , 5.03,
           2.28, 2.75, 5.6 , 1.8 ,  nan, 3.27, 3.1 , 6.06, 3.3 , 4.9 , 4.95,
           1.55, 2.1 , 2.53, 2.02, 3.85, 7.1 , 5.97, 5.11, 1.88, 7.2 , 5.02,
           3.  , 2.63, 4.65, 4.17, 2.98, 3.62, 1.65, 1.48, 2.1 , 3.77, 2.1 ,
           1.4 , 1.35, 6.  , 2.63, 3.82, 3.8 ,  nan,  nan, 5.9 , 5.62, 2.59,
           6.06, 1.79, 1.5 , 1.3 , 4.98, 7.  , 3.81, 1.22, 2.1 , 3.86, 4.61,
           2.39, 4.46, 1.8 , 1.46, 4.  , 3.93, 5.48, 1.74, 6.08, 4.02, 2.1 ,
           2.92, 2.5 , 3.58,  nan, 7.1 , 1.38, 3.46, 1.72, 1.96, 2.25, 3.48,
           4.36, 2.98, 2.97, 3.03, 3.98, 7.6 , 1.8 , 5.49, 4.68])
    >>> list(df['tfr'].values)
    [6.9, 2.6, 3.81, nan, nan, 6.69, nan, 2.62, 1.7, 1.89, 1.42, 2.3, 1.95, 2.97, 3.14, 1.73, 1.4, 1.62, 3.66, 5.83, 5.89, 4.36, 1.4, 4.45, 2.17, 2.7, 1.45, 6.57, 6.28, 4.5, 5.3, 1.61, 3.56, 4.95, 5.51, 2.44, 1.8, 2.69, 5.51, 5.87, 3.5, 2.95, 1.6, 1.55, 2.31, 1.4, 6.24, 1.82, 5.39, nan, 2.8, 4.32, 3.1, 3.4, 3.09, 5.51, 5.34, 1.3, 7.0, 2.76, 1.83, 1.63, nan, 2.85, 5.4, 5.2, 8.0, 1.9, 1.3, 5.28, 1.38, nan, 2.1, 3.04, 4.9, 6.61, 5.42, 2.32, 4.6, 4.3, 1.32, 1.4, 2.19, 3.07, 2.63, 4.77, 5.25, 1.8, 2.75, 1.19, 5.1, 2.44, 1.48, 5.13, 2.3, 4.85, 3.8, 2.1, 1.65, 2.77, 3.21, 6.69, 1.4, 2.75, 4.86, 6.33, 5.92, 1.45, 1.5, 1.76, 1.6, 1.9, 5.65, 6.69, 3.24, 6.8, 6.6, 2.1, 4.49, 2.0, 5.03, 2.28, 2.75, 5.6, 1.8, nan, 3.27, 3.1, 6.06, 3.3, 4.9, 4.95, 1.55, 2.1, 2.53, 2.02, 3.85, 7.1, 5.97, 5.11, 1.88, 7.2, 5.02, 3.0, 2.63, 4.65, 4.17, 2.98, 3.62, 1.65, 1.48, 2.1, 3.77, 2.1, 1.4, 1.35, 6.0, 2.63, 3.82, 3.8, nan, nan, 5.9, 5.62, 2.59, 6.06, 1.79, 1.5, 1.3, 4.98, 7.0, 3.81, 1.22, 2.1, 3.86, 4.61, 2.39, 4.46, 1.8, 1.46, 4.0, 3.93, 5.48, 1.74, 6.08, 4.02, 2.1, 2.92, 2.5, 3.58, nan, 7.1, 1.38, 3.46, 1.72, 1.96, 2.25, 3.48, 4.36, 2.98, 2.97, 3.03, 3.98, 7.6, 1.8, 5.49, 4.68]
    
    

    重命名列名

    >>> df.rename(columns={'tfr': 'TFR'})
                    region   TFR  contraception  educationMale  ...  economicActivityMale  economicActivityFemale  illiteracyMale  illiteracyFemale
    Afghanistan       Asia  6.90            NaN            NaN  ...                  87.5                     7.2          52.800            85.000
    Albania         Europe  2.60            NaN            NaN  ...                   NaN                     NaN             NaN               NaN
    Algeria         Africa  3.81           52.0           11.1  ...                  76.4                     7.8          26.100            51.000
    American.Samoa    Asia   NaN            NaN            NaN  ...                  58.8                    42.4           0.264             0.360
    Andorra         Europe   NaN            NaN            NaN  ...                   NaN                     NaN             NaN               NaN
    ...                ...   ...            ...            ...  ...                   ...                     ...             ...               ...
    Western.Sahara  Africa  3.98            NaN            NaN  ...                   NaN                     NaN             NaN               NaN
    Yemen             Asia  7.60            7.0            NaN  ...                  80.6                     1.9          32.406            69.552
    Yugoslavia      Europe  1.80            NaN            NaN  ...                   NaN                     NaN           1.782             9.072
    Zambia          Africa  5.49           25.0            7.9  ...                   NaN                     NaN          14.400            28.700
    Zimbabwe        Africa  4.68           48.0            NaN  ...                  77.7                    46.7           9.600            20.100
    
    [207 rows x 13 columns]
    
    

    相关文章

      网友评论

          本文标题:6种从Pandas 数据帧(DataFrame)中获取列名的方法

          本文链接:https://www.haomeiwen.com/subject/ykaofhtx.html