Pandas.DataFrame: Inserting Columns and Rows

Author: 16926b49840e | Published: 2016-08-17 15:05 | Views: 64099

    This walkthrough uses a sample CSV file to show how to insert columns and rows into a DataFrame.
    File name: example.csv
    Contents:

    date spring summer autumn winter
    2000 12.233881 16.907301 15.692383 14.085962
    2001 12.847481 16.750469 14.514066 13.503746
    2002 13.558175 17.203393 15.699948 13.233652
    2003 12.654725 16.894915 15.661465 12.843479
    2004 13.253730 17.046967 15.209054 14.364791
    2005 13.444305 16.745982 16.622188 11.610823
    2006 13.505696 16.833579 15.497928 12.199344
    2007 13.488526 16.667733 15.817014 13.743822
    2008 13.151532 16.486507 15.729573 12.932336
    2009 13.457715 16.639238 18.260180 12.653159
    2010 13.194548 16.728689 15.426353 13.883358
    2011 14.347794 16.689421 14.176580 12.366542
    2012 13.605087 17.130568 14.717968 13.292552
    2013 13.027908 17.386193 16.203455 13.186121
    2014 12.746682 16.544287 14.736768 12.870651
    2015 13.465904 16.506123 12.442437 11.018138
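The listing above is shown with spaces between the fields. If the file on disk were really stored that way (an assumption; the original does not say), `pd.read_csv` would need a whitespace separator. A minimal sketch using the first three rows and an in-memory buffer in place of the file:

```python
import io

import pandas as pd

# First three rows of the sample, copied from the listing above.
raw = """date spring summer autumn winter
2000 12.233881 16.907301 15.692383 14.085962
2001 12.847481 16.750469 14.514066 13.503746
2002 13.558175 17.203393 15.699948 13.233652
"""

# sep=r"\s+" splits fields on runs of whitespace instead of commas.
table = pd.read_csv(io.StringIO(raw), sep=r"\s+")
print(table.columns.tolist())
print(table.shape)
```

For a comma-separated file, the plain `pd.read_csv('example.csv')` call used in this post is enough.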

    Inserting columns

    First split the data by column, then insert the removed columns back into the original DataFrame.

    In [1]:
    import numpy as np
    import pandas as pd
     
    table = pd.read_csv('example.csv')
    table
    
    Out[1]:
    date    spring  summer  autumn  winter
    0   2000    12.233881   16.907301   15.692383   14.085962
    1   2001    12.847481   16.750469   14.514066   13.503746
    2   2002    13.558175   17.203393   15.699948   13.233652
    3   2003    12.654725   16.894915   15.661465   12.843479
    4   2004    13.253730   17.046967   15.209054   14.364791
    5   2005    13.444305   16.745982   16.622188   11.610823
    6   2006    13.505696   16.833579   15.497928   12.199344
    7   2007    13.488526   16.667733   15.817014   13.743822
    8   2008    13.151532   16.486507   15.729573   12.932336
    9   2009    13.457715   16.639238   18.260180   12.653159
    10  2010    13.194548   16.728689   15.426353   13.883358
    11  2011    14.347794   16.689421   14.176580   12.366542
    12  2012    13.605087   17.130568   14.717968   13.292552
    13  2013    13.027908   17.386193   16.203455   13.186121
    14  2014    12.746682   16.544287   14.736768   12.870651
    15  2015    13.465904   16.506123   12.442437   11.018138
    
    In [2]:
    date = table.pop('date')
    date
    
    Out[2]:
    0     2000
    1     2001
    2     2002
    3     2003
    4     2004
    5     2005
    6     2006
    7     2007
    8     2008
    9     2009
    10    2010
    11    2011
    12    2012
    13    2013
    14    2014
    15    2015
    Name: date, dtype: int64
    
    In [3]:
    summer = table.pop('summer')
    summer
    
    Out[3]:
    0     16.907301
    1     16.750469
    2     17.203393
    3     16.894915
    4     17.046967
    5     16.745982
    6     16.833579
    7     16.667733
    8     16.486507
    9     16.639238
    10    16.728689
    11    16.689421
    12    17.130568
    13    17.386193
    14    16.544287
    15    16.506123
    Name: summer, dtype: float64
    
    In [4]:
    winter = table.pop('winter')
    winter
    
    Out[4]:
    0     14.085962
    1     13.503746
    2     13.233652
    3     12.843479
    4     14.364791
    5     11.610823
    6     12.199344
    7     13.743822
    8     12.932336
    9     12.653159
    10    13.883358
    11    12.366542
    12    13.292552
    13    13.186121
    14    12.870651
    15    11.018138
    Name: winter, dtype: float64
    
    In [5]:
    table
    
    Out[5]:
    spring  autumn
    0   12.233881   15.692383
    1   12.847481   14.514066
    2   13.558175   15.699948
    3   12.654725   15.661465
    4   13.253730   15.209054
    5   13.444305   16.622188
    6   13.505696   15.497928
    7   13.488526   15.817014
    8   13.151532   15.729573
    9   13.457715   18.260180
    10  13.194548   15.426353
    11  14.347794   14.176580
    12  13.605087   14.717968
    13  13.027908   16.203455
    14  12.746682   14.736768
    15  13.465904   12.442437
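The steps above rely on `pop` changing `table` in place while handing back the removed column as a Series. A small sketch of that behaviour on a three-row subset of the sample (the subset and the `remaining` variable are illustrative only), with `drop` shown as a non-mutating alternative:

```python
import pandas as pd

table = pd.DataFrame({
    "date": [2000, 2001, 2002],
    "spring": [12.233881, 12.847481, 13.558175],
    "summer": [16.907301, 16.750469, 17.203393],
})

# pop removes the column from table and returns it as a Series.
summer = table.pop("summer")
print(type(summer).__name__, summer.name)
print(table.columns.tolist())

# drop returns a new DataFrame and leaves table untouched.
remaining = table.drop(columns=["spring"])
print(remaining.columns.tolist())
print(table.columns.tolist())
```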
    

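    With the split done, the columns now go back in. A column added at the far right can be created directly by assigning to its label; the other columns are inserted with the .insert() method.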

    In [6]:
    table.insert(0,'date',date)
    table
    
    Out[6]:
    date    spring  autumn
    0   2000    12.233881   15.692383
    1   2001    12.847481   14.514066
    2   2002    13.558175   15.699948
    3   2003    12.654725   15.661465
    4   2004    13.253730   15.209054
    5   2005    13.444305   16.622188
    6   2006    13.505696   15.497928
    7   2007    13.488526   15.817014
    8   2008    13.151532   15.729573
    9   2009    13.457715   18.260180
    10  2010    13.194548   15.426353
    11  2011    14.347794   14.176580
    12  2012    13.605087   14.717968
    13  2013    13.027908   16.203455
    14  2014    12.746682   14.736768
    15  2015    13.465904   12.442437
    
    In [7]:
    table.insert(2,'summer',summer)
    table
    
    Out[7]:
    date    spring  summer  autumn
    0   2000    12.233881   16.907301   15.692383
    1   2001    12.847481   16.750469   14.514066
    2   2002    13.558175   17.203393   15.699948
    3   2003    12.654725   16.894915   15.661465
    4   2004    13.253730   17.046967   15.209054
    5   2005    13.444305   16.745982   16.622188
    6   2006    13.505696   16.833579   15.497928
    7   2007    13.488526   16.667733   15.817014
    8   2008    13.151532   16.486507   15.729573
    9   2009    13.457715   16.639238   18.260180
    10  2010    13.194548   16.728689   15.426353
    11  2011    14.347794   16.689421   14.176580
    12  2012    13.605087   17.130568   14.717968
    13  2013    13.027908   17.386193   16.203455
    14  2014    12.746682   16.544287   14.736768
    15  2015    13.465904   16.506123   12.442437
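The position passed to `.insert()` does not have to be hard-coded. As a sketch on a two-row subset of the sample, the position can be looked up from an existing column name with `columns.get_loc`; the example also shows that inserting a column name that already exists raises `ValueError` by default. The `duplicate_rejected` flag is only there for illustration.

```python
import pandas as pd

table = pd.DataFrame({
    "date": [2000, 2001],
    "spring": [12.233881, 12.847481],
    "autumn": [15.692383, 14.514066],
})
summer = pd.Series([16.907301, 16.750469], name="summer")

# Look up where "autumn" sits and insert "summer" just before it.
pos = table.columns.get_loc("autumn")
table.insert(pos, "summer", summer)
print(table.columns.tolist())

# A second insert under the same name is rejected by default.
try:
    table.insert(0, "summer", summer)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(duplicate_rejected)
```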
    
    In [8]:
    table['winter'] = winter
    table
    
    Out[8]:
    date    spring  summer  autumn  winter
    0   2000    12.233881   16.907301   15.692383   14.085962
    1   2001    12.847481   16.750469   14.514066   13.503746
    2   2002    13.558175   17.203393   15.699948   13.233652
    3   2003    12.654725   16.894915   15.661465   12.843479
    4   2004    13.253730   17.046967   15.209054   14.364791
    5   2005    13.444305   16.745982   16.622188   11.610823
    6   2006    13.505696   16.833579   15.497928   12.199344
    7   2007    13.488526   16.667733   15.817014   13.743822
    8   2008    13.151532   16.486507   15.729573   12.932336
    9   2009    13.457715   16.639238   18.260180   12.653159
    10  2010    13.194548   16.728689   15.426353   13.883358
    11  2011    14.347794   16.689421   14.176580   12.366542
    12  2012    13.605087   17.130568   14.717968   13.292552
    13  2013    13.027908   17.386193   16.203455   13.186121
    14  2014    12.746682   16.544287   14.736768   12.870651
    15  2015    13.465904   16.506123   12.442437   11.018138
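One detail worth keeping in mind with `table['winter'] = winter`: a Series is matched to the rows by index label, not by position. It works here because `winter` was popped from the same table and shares its index. A sketch with a deliberately reordered index (the column names `winter_aligned` and `winter_positional` are made up for the example):

```python
import pandas as pd

table = pd.DataFrame({"spring": [12.233881, 12.847481, 13.558175]})

# Same values as the sample's first three winters, but with a reversed index.
winter = pd.Series([14.085962, 13.503746, 13.233652], index=[2, 1, 0])

# Assigning a Series aligns on index labels: row 0 receives the value labelled 0.
table["winter_aligned"] = winter

# Assigning a plain array ignores labels and fills in positional order.
table["winter_positional"] = winter.to_numpy()

print(table)
```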
    

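    Inserting rows

    So far I have not found a function or method that inserts a row directly, so the approach used here is to split first and then concatenate.

    Create a one-row DataFrame to insert between rows 2 and 3 of table. Split table into an upper and a lower part, then join the pieces with the append method. Note ignore_index=True in the arguments: without it, the index of the combined DataFrame is not renumbered. (DataFrame.append was deprecated in pandas 1.4 and removed in 2.0; on newer versions use pd.concat instead.)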

    In [9]:
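    # Column names must match table's exactly; date is given as an int to keep that column integer.
    insertRow = pd.DataFrame([[0,0.,0.,0.,0.]],columns=['date','spring','summer','autumn','winter'])
    above = table.loc[:2]   # .loc slices by label and includes the end label: rows 0, 1, 2
    below = table.loc[3:]
    # DataFrame.append requires pandas < 2.0
    newdata = above.append(insertRow,ignore_index=True).append(below,ignore_index=True)
    newdata
    
    Out[9]:
    date    spring  summer  autumn  winter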
    0   2000    12.233881   16.907301   15.692383   14.085962
    1   2001    12.847481   16.750469   14.514066   13.503746
    2   2002    13.558175   17.203393   15.699948   13.233652
    3   0   0.000000    0.000000    0.000000    0.000000
    4   2003    12.654725   16.894915   15.661465   12.843479
    5   2004    13.253730   17.046967   15.209054   14.364791
    6   2005    13.444305   16.745982   16.622188   11.610823
    7   2006    13.505696   16.833579   15.497928   12.199344
    8   2007    13.488526   16.667733   15.817014   13.743822
    9   2008    13.151532   16.486507   15.729573   12.932336
    10  2009    13.457715   16.639238   18.260180   12.653159
    11  2010    13.194548   16.728689   15.426353   13.883358
    12  2011    14.347794   16.689421   14.176580   12.366542
    13  2012    13.605087   17.130568   14.717968   13.292552
    14  2013    13.027908   17.386193   16.203455   13.186121
    15  2014    12.746682   16.544287   14.736768   12.870651
    16  2015    13.465904   16.506123   12.442437   11.018138
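To see what `ignore_index=True` changes, here is a sketch that joins the same kind of pieces with and without it. It uses `pd.concat` so that it also runs on pandas 2.x, and a single `date` column to keep the output short; the variable names `kept` and `renumbered` are illustrative.

```python
import pandas as pd

above = pd.DataFrame({"date": [2000, 2001, 2002]})
row = pd.DataFrame({"date": [0]})
below = pd.DataFrame({"date": [2003, 2004]}, index=[3, 4])

# Without ignore_index each piece keeps its own labels, so label 0 appears twice.
kept = pd.concat([above, row, below])
print(kept.index.tolist())        # [0, 1, 2, 0, 3, 4]

# With ignore_index=True the result is renumbered from 0.
renumbered = pd.concat([above, row, below], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4, 5]
```

Duplicate labels are legal in pandas, but `kept.loc[0]` would then return two rows, which is rarely what is wanted after an insert.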
    

    pd.concat() can also be used to join the pieces; again, note ignore_index=True.

    In [10]:
    # insertRow is the one-row DataFrame created in In [9]
    newdata2=pd.concat([above,insertRow,below],ignore_index=True)
    newdata2
    
    Out[10]:
    date    spring  summer  autumn  winter
    0   2000    12.233881   16.907301   15.692383   14.085962
    1   2001    12.847481   16.750469   14.514066   13.503746
    2   2002    13.558175   17.203393   15.699948   13.233652
    3   0   0.000000    0.000000    0.000000    0.000000
    4   2003    12.654725   16.894915   15.661465   12.843479
    5   2004    13.253730   17.046967   15.209054   14.364791
    6   2005    13.444305   16.745982   16.622188   11.610823
    7   2006    13.505696   16.833579   15.497928   12.199344
    8   2007    13.488526   16.667733   15.817014   13.743822
    9   2008    13.151532   16.486507   15.729573   12.932336
    10  2009    13.457715   16.639238   18.260180   12.653159
    11  2010    13.194548   16.728689   15.426353   13.883358
    12  2011    14.347794   16.689421   14.176580   12.366542
    13  2012    13.605087   17.130568   14.717968   13.292552
    14  2013    13.027908   17.386193   16.203455   13.186121
    15  2014    12.746682   16.544287   14.736768   12.870651
    16  2015    13.465904   16.506123   12.442437   11.018138
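The split-and-concatenate steps can be wrapped in a small helper. The sketch below is one possible way to do it, not a pandas API: `insert_row` is a hypothetical name, and it slices with `.iloc` (integer positions, end excluded) rather than `.loc` so that it does not depend on the index labels.

```python
import pandas as pd


def insert_row(df, pos, row):
    """Return a new DataFrame with `row` (a dict) placed at integer position `pos`."""
    # Build a one-row frame with the same column order as df.
    new = pd.DataFrame([row], columns=df.columns)
    # iloc[:pos] is everything before the insertion point, iloc[pos:] everything after.
    return pd.concat([df.iloc[:pos], new, df.iloc[pos:]], ignore_index=True)


table = pd.DataFrame({
    "date": [2000, 2001, 2002, 2003],
    "spring": [12.233881, 12.847481, 13.558175, 12.654725],
})

result = insert_row(table, 3, {"date": 0, "spring": 0.0})
print(result)
```

The original `table` is left unchanged; the helper returns a renumbered copy with the new row at position 3.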
    
