Inserting columns and rows into a DataFrame, using a sample CSV file
File name: example.csv
Contents:
date | spring | summer | autumn | winter |
---|---|---|---|---|
2000 | 12.233881 | 16.907301 | 15.692383 | 14.085962 |
2001 | 12.847481 | 16.750469 | 14.514066 | 13.503746 |
2002 | 13.558175 | 17.203393 | 15.699948 | 13.233652 |
2003 | 12.654725 | 16.894915 | 15.661465 | 12.843479 |
2004 | 13.253730 | 17.046967 | 15.209054 | 14.364791 |
2005 | 13.444305 | 16.745982 | 16.622188 | 11.610823 |
2006 | 13.505696 | 16.833579 | 15.497928 | 12.199344 |
2007 | 13.488526 | 16.667733 | 15.817014 | 13.743822 |
2008 | 13.151532 | 16.486507 | 15.729573 | 12.932336 |
2009 | 13.457715 | 16.639238 | 18.260180 | 12.653159 |
2010 | 13.194548 | 16.728689 | 15.426353 | 13.883358 |
2011 | 14.347794 | 16.689421 | 14.176580 | 12.366542 |
2012 | 13.605087 | 17.130568 | 14.717968 | 13.292552 |
2013 | 13.027908 | 17.386193 | 16.203455 | 13.186121 |
2014 | 12.746682 | 16.544287 | 14.736768 | 12.870651 |
2015 | 13.465904 | 16.506123 | 12.442437 | 11.018138 |
Inserting columns
First split the data by column, then insert the removed columns back into the original DataFrame.
In [1]:
import numpy as np
import pandas as pd
table = pd.read_csv('example.csv')
table
Out[1]:
date spring summer autumn winter
0 2000 12.233881 16.907301 15.692383 14.085962
1 2001 12.847481 16.750469 14.514066 13.503746
2 2002 13.558175 17.203393 15.699948 13.233652
3 2003 12.654725 16.894915 15.661465 12.843479
4 2004 13.253730 17.046967 15.209054 14.364791
5 2005 13.444305 16.745982 16.622188 11.610823
6 2006 13.505696 16.833579 15.497928 12.199344
7 2007 13.488526 16.667733 15.817014 13.743822
8 2008 13.151532 16.486507 15.729573 12.932336
9 2009 13.457715 16.639238 18.260180 12.653159
10 2010 13.194548 16.728689 15.426353 13.883358
11 2011 14.347794 16.689421 14.176580 12.366542
12 2012 13.605087 17.130568 14.717968 13.292552
13 2013 13.027908 17.386193 16.203455 13.186121
14 2014 12.746682 16.544287 14.736768 12.870651
15 2015 13.465904 16.506123 12.442437 11.018138
In [2]:
date = table.pop('date')
date
Out[2]:
0 2000
1 2001
2 2002
3 2003
4 2004
5 2005
6 2006
7 2007
8 2008
9 2009
10 2010
11 2011
12 2012
13 2013
14 2014
15 2015
Name: date, dtype: int64
In [3]:
summer = table.pop('summer')
summer
Out[3]:
0 16.907301
1 16.750469
2 17.203393
3 16.894915
4 17.046967
5 16.745982
6 16.833579
7 16.667733
8 16.486507
9 16.639238
10 16.728689
11 16.689421
12 17.130568
13 17.386193
14 16.544287
15 16.506123
Name: summer, dtype: float64
In [4]:
winter = table.pop('winter')
winter
Out[4]:
0 14.085962
1 13.503746
2 13.233652
3 12.843479
4 14.364791
5 11.610823
6 12.199344
7 13.743822
8 12.932336
9 12.653159
10 13.883358
11 12.366542
12 13.292552
13 13.186121
14 12.870651
15 11.018138
Name: winter, dtype: float64
In [5]:
table
Out[5]:
spring autumn
0 12.233881 15.692383
1 12.847481 14.514066
2 13.558175 15.699948
3 12.654725 15.661465
4 13.253730 15.209054
5 13.444305 16.622188
6 13.505696 15.497928
7 13.488526 15.817014
8 13.151532 15.729573
9 13.457715 18.260180
10 13.194548 15.426353
11 14.347794 14.176580
12 13.605087 14.717968
13 13.027908 16.203455
14 12.746682 14.736768
15 13.465904 12.442437
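The cells above rely on the fact that .pop() both returns the column and removes it from the DataFrame in place. A minimal sketch of that behavior follows; the two-row frame and the names df, popped and remaining are made up for illustration and are not part of example.csv.

```python
import pandas as pd

# Hypothetical two-row frame, used only for illustration.
df = pd.DataFrame({
    "date": [2000, 2001],
    "spring": [12.2, 12.8],
    "summer": [16.9, 16.8],
})

popped = df.pop("summer")     # returns the column as a Series ...
remaining = list(df.columns)  # ... and df itself no longer contains it

print(type(popped).__name__, popped.name)
print(remaining)
```

Because pop() mutates the frame, work on a copy (df.copy()) if the original columns must be kept.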
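With the split done, the columns can now be put back. A column added at the far right is created directly by label assignment; every other column is inserted with the .insert() method.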
In [6]:
table.insert(0,'date',date)
table
Out[6]:
date spring autumn
0 2000 12.233881 15.692383
1 2001 12.847481 14.514066
2 2002 13.558175 15.699948
3 2003 12.654725 15.661465
4 2004 13.253730 15.209054
5 2005 13.444305 16.622188
6 2006 13.505696 15.497928
7 2007 13.488526 15.817014
8 2008 13.151532 15.729573
9 2009 13.457715 18.260180
10 2010 13.194548 15.426353
11 2011 14.347794 14.176580
12 2012 13.605087 14.717968
13 2013 13.027908 16.203455
14 2014 12.746682 14.736768
15 2015 13.465904 12.442437
In [7]:
table.insert(2,'summer',summer)
table
Out[7]:
date spring summer autumn
0 2000 12.233881 16.907301 15.692383
1 2001 12.847481 16.750469 14.514066
2 2002 13.558175 17.203393 15.699948
3 2003 12.654725 16.894915 15.661465
4 2004 13.253730 17.046967 15.209054
5 2005 13.444305 16.745982 16.622188
6 2006 13.505696 16.833579 15.497928
7 2007 13.488526 16.667733 15.817014
8 2008 13.151532 16.486507 15.729573
9 2009 13.457715 16.639238 18.260180
10 2010 13.194548 16.728689 15.426353
11 2011 14.347794 16.689421 14.176580
12 2012 13.605087 17.130568 14.717968
13 2013 13.027908 17.386193 16.203455
14 2014 12.746682 16.544287 14.736768
15 2015 13.465904 16.506123 12.442437
In [8]:
table['winter'] = winter
table
Out[8]:
date spring summer autumn winter
0 2000 12.233881 16.907301 15.692383 14.085962
1 2001 12.847481 16.750469 14.514066 13.503746
2 2002 13.558175 17.203393 15.699948 13.233652
3 2003 12.654725 16.894915 15.661465 12.843479
4 2004 13.253730 17.046967 15.209054 14.364791
5 2005 13.444305 16.745982 16.622188 11.610823
6 2006 13.505696 16.833579 15.497928 12.199344
7 2007 13.488526 16.667733 15.817014 13.743822
8 2008 13.151532 16.486507 15.729573 12.932336
9 2009 13.457715 16.639238 18.260180 12.653159
10 2010 13.194548 16.728689 15.426353 13.883358
11 2011 14.347794 16.689421 14.176580 12.366542
12 2012 13.605087 17.130568 14.717968 13.292552
13 2013 13.027908 17.386193 16.203455 13.186121
14 2014 12.746682 16.544287 14.736768 12.870651
15 2015 13.465904 16.506123 12.442437 11.018138
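The three re-insertion steps can be condensed into a small sketch. It assumes a made-up two-row frame rather than example.csv, and the names df, column_order and duplicate_rejected are illustrative only. It also shows that, by default, .insert() refuses a column label that already exists.

```python
import pandas as pd

# Hypothetical frame standing in for the table after the pop() calls.
df = pd.DataFrame({"spring": [12.2, 12.8], "autumn": [15.7, 14.5]})

# insert(loc, column, value): loc is the 0-based position of the new column.
df.insert(0, "date", [2000, 2001])
df.insert(2, "summer", [16.9, 16.8])

# Assigning to a new label appends the column on the right.
df["winter"] = [14.1, 13.5]
column_order = list(df.columns)

# Inserting a label that already exists raises ValueError by default.
try:
    df.insert(1, "date", [0, 0])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(column_order)
print(duplicate_rejected)
```

Note that .insert() modifies the DataFrame in place and returns None, so its result should not be assigned back to the variable.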
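Inserting rows
So far I have not found a function or method that inserts a row directly, so the approach used here is to split first and then concatenate.
Create a one-row DataFrame to insert between rows 2 and 3 of table, split table into an upper and a lower part, and join the pieces with the append method. Note the ignore_index=True argument: unless it is set to True, the index of the resulting DataFrame is not renumbered. DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this cell only runs on older versions.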
In [9]:
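# One-row DataFrame to insert; the column names must match those of table.
insertRow = pd.DataFrame([[0.,0.,0.,0.,0.]],columns=['date','spring','summer','autumn','winter'])
# .loc slices by label and includes the end label, so above holds rows 0 to 2.
above = table.loc[:2]
below = table.loc[3:]
newdata = above.append(insertRow,ignore_index=True).append(below,ignore_index=True)
newdata
Out[9]:
date spring summer autumn winter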
0 2000 12.233881 16.907301 15.692383 14.085962
1 2001 12.847481 16.750469 14.514066 13.503746
2 2002 13.558175 17.203393 15.699948 13.233652
3 0 0.000000 0.000000 0.000000 0.000000
4 2003 12.654725 16.894915 15.661465 12.843479
5 2004 13.253730 17.046967 15.209054 14.364791
6 2005 13.444305 16.745982 16.622188 11.610823
7 2006 13.505696 16.833579 15.497928 12.199344
8 2007 13.488526 16.667733 15.817014 13.743822
9 2008 13.151532 16.486507 15.729573 12.932336
10 2009 13.457715 16.639238 18.260180 12.653159
11 2010 13.194548 16.728689 15.426353 13.883358
12 2011 14.347794 16.689421 14.176580 12.366542
13 2012 13.605087 17.130568 14.717968 13.292552
14 2013 13.027908 17.386193 16.203455 13.186121
15 2014 12.746682 16.544287 14.736768 12.870651
16 2015 13.465904 16.506123 12.442437 11.018138
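The split step depends on how the slice is interpreted. As a rough illustration (the five-row frame and the variable names here are hypothetical), .loc slices by label and includes the end label, while .iloc slices by position and excludes the end position:

```python
import pandas as pd

# Hypothetical frame with the default 0..4 integer index.
df = pd.DataFrame({"date": [2000, 2001, 2002, 2003, 2004]})

above_loc = df.loc[:2]    # label-based, end label included: rows 0, 1, 2
above_iloc = df.iloc[:3]  # position-based, end excluded: rows 0, 1, 2
below_loc = df.loc[3:]    # rows labelled 3 and onward

print(len(above_loc), len(above_iloc), len(below_loc))
```

With the default integer index the two forms select the same rows. If the index has been reordered or is not made of consecutive integers, .iloc is probably the safer choice for splitting at a given position.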
pd.concat() can also be used for the concatenation; again, note ignore_index=True.
In [10]:
newdata2 = pd.concat([above, insertRow, below], ignore_index=True)
newdata2
Out[10]:
date spring summer autumn winter
0 2000 12.233881 16.907301 15.692383 14.085962
1 2001 12.847481 16.750469 14.514066 13.503746
2 2002 13.558175 17.203393 15.699948 13.233652
3 0 0.000000 0.000000 0.000000 0.000000
4 2003 12.654725 16.894915 15.661465 12.843479
5 2004 13.253730 17.046967 15.209054 14.364791
6 2005 13.444305 16.745982 16.622188 11.610823
7 2006 13.505696 16.833579 15.497928 12.199344
8 2007 13.488526 16.667733 15.817014 13.743822
9 2008 13.151532 16.486507 15.729573 12.932336
10 2009 13.457715 16.639238 18.260180 12.653159
11 2010 13.194548 16.728689 15.426353 13.883358
12 2011 14.347794 16.689421 14.176580 12.366542
13 2012 13.605087 17.130568 14.717968 13.292552
14 2013 13.027908 17.386193 16.203455 13.186121
15 2014 12.746682 16.544287 14.736768 12.870651
16 2015 13.465904 16.506123 12.442437 11.018138
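The split-and-concatenate idea can be wrapped in a small helper. This is only a sketch: insert_row is a hypothetical function, not a pandas API, and the four-row frame is made up for illustration. It uses pd.concat, which is available in current pandas versions, and also shows what the index looks like when ignore_index is left at its default.

```python
import pandas as pd


def insert_row(df, position, row):
    """Return a new DataFrame with `row` (a dict) placed before `position` (0-based).

    Hypothetical helper for illustration; not part of pandas.
    """
    new_row = pd.DataFrame([row], columns=df.columns)
    above = df.iloc[:position]   # rows before the insertion point
    below = df.iloc[position:]   # rows from the insertion point onward
    return pd.concat([above, new_row, below], ignore_index=True)


df = pd.DataFrame({
    "date": [2000, 2001, 2002, 2003],
    "spring": [12.2, 12.8, 13.6, 12.7],
})

result = insert_row(df, 3, {"date": 0, "spring": 0.0})

# Without ignore_index=True the original labels are kept, so 0 appears twice.
unreset = pd.concat([df.iloc[:3], pd.DataFrame([{"date": 0, "spring": 0.0}]), df.iloc[3:]])

print(result)
print(unreset.index.tolist())
```

The helper returns a new DataFrame and leaves the input unchanged, which mirrors how pd.concat itself behaves.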