Author: HaigLee
https://www.jianshu.com/u/67ec21fb270d
Originally published by @HaigLee. Reproduction without permission is prohibited.
- Import the library:
import pandas as pd
- Read an xlsx file:
people = pd.read_excel("./data/People.xlsx")
- Print the table's dimensions (rows, columns):
people.shape
(19972, 6)
- Get the table's column names:
people.columns
Index(['ID', 'Type', 'Title', 'FirstName', 'MiddleName', 'LastName'], dtype='object')
- Show the first three rows of data (the header row is not counted):
people.head(3)
![](https://img.haomeiwen.com/i9039829/787bf8664170086a.png)
- Show the last 5 rows of data:
people.tail(5)
![](https://img.haomeiwen.com/i9039829/8b68b8030ce55a3b.png)
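
For readers who do not have `People.xlsx` at hand, the same inspection calls can be tried on a small in-memory table. This is only a minimal sketch: the `people_demo` frame and all of its rows are invented for illustration and merely reuse the column names shown above.

```python
import pandas as pd

# Hypothetical stand-in for People.xlsx: same column names, invented rows.
people_demo = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4],
        "Type": ["Employee", "Employee", "Customer", "Customer"],
        "Title": [None, "Mr.", "Ms.", None],
        "FirstName": ["Ann", "Bob", "Cara", "Dan"],
        "MiddleName": ["J", None, "L", None],
        "LastName": ["Lee", "Ng", "Diaz", "Roy"],
    }
)

n_rows, n_cols = people_demo.shape        # shape is a (rows, columns) tuple
column_names = list(people_demo.columns)  # column labels as a plain list
first_three = people_demo.head(3)         # first 3 data rows
last_two = people_demo.tail(2)            # last 2 data rows, original row labels kept

print(n_rows, n_cols)
print(column_names)
print(first_three)
print(last_two)
```

Note that `shape` is an attribute rather than a method, so it is written without parentheses.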
- To show how column names are read, add a row of arbitrary data at the top of the xlsx file, then read it:
people_copy = pd.read_excel("./data/People_copy.xlsx")
people_copy.head()
![](https://img.haomeiwen.com/i9039829/471a1636c47d097b.png)
- Print the column names. By default, the row at index 0 is used as the column names:
people_copy.columns
Index(['djsalf', 'dfalmklmm;mlm', 'Unnamed: 2', 'jknknkl', ' njknkjnk',
'mklmmk'],
dtype='object')
- Specify that the column names are in the row at index 1:
people_copy = pd.read_excel("./data/People_copy.xlsx", header=1)
people_copy.head()
![](https://img.haomeiwen.com/i9039829/6e73dcb4e906c654.png)
- Print the column names:
people_copy.columns
Index(['ID', 'Type', 'Title', 'FirstName', 'MiddleName', 'LastName'], dtype='object')
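
The effect of `header` can be reproduced without the original workbook. The sketch below is an illustration under assumptions: the rows, the file name `people_copy_demo.xlsx`, and the temporary directory are all made up, and an Excel engine such as openpyxl is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Invented cells: one junk row, then the real header row, then two data rows.
raw = pd.DataFrame(
    [
        ["junk a", "junk b", "junk c"],
        ["ID", "FirstName", "LastName"],
        [1, "Ann", "Lee"],
        [2, "Bob", "Ng"],
    ]
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "people_copy_demo.xlsx")  # hypothetical file name
    # Write the cells exactly as they are: no extra header row, no index column.
    raw.to_excel(path, header=False, index=False)

    default_read = pd.read_excel(path)          # header=0: junk row becomes the column names
    fixed_read = pd.read_excel(path, header=1)  # second sheet row becomes the column names

print(list(default_read.columns))
print(list(fixed_read.columns))
print(fixed_read)
```

Rows above the chosen header row are dropped, so `fixed_read` contains only the two data rows.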
- With the header row deleted from the data, set header to None when reading:
people_copy_2 = pd.read_excel("./data/People_copy_2.xlsx", header=None)
people_copy_2.head()
![](https://img.haomeiwen.com/i9039829/8326684a38f942b0.png)
- The default column labels are integers starting from 0:
people_copy_2.columns
Int64Index([0, 1, 2, 3, 4, 5], dtype='int64')
- The column names can be set manually:
people_copy_2.columns = ['ID', 'Type', 'Title', 'FirstName', 'MiddleName', 'LastName']
people_copy_2.head()
![](https://img.haomeiwen.com/i9039829/73ac773e9c3a469f.png)
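
A small round trip shows why `header=None` matters for a sheet without a header row, and that the names can also be supplied at read time through the `names` parameter of `read_excel`. This is a sketch with invented rows and a hypothetical file name, again assuming an Excel engine such as openpyxl is available.

```python
import os
import tempfile

import pandas as pd

# Invented data rows; the sheet is written without any header row.
rows = pd.DataFrame([[1, "Ann", "Lee"], [2, "Bob", "Ng"]])
column_names = ["ID", "FirstName", "LastName"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "no_header_demo.xlsx")  # hypothetical file name
    rows.to_excel(path, header=False, index=False)

    consumed = pd.read_excel(path)               # first data row is used up as the header
    headless = pd.read_excel(path, header=None)  # every row is kept as data
    named = pd.read_excel(path, header=None, names=column_names)

default_labels = list(headless.columns)  # integer labels starting at 0
headless.columns = column_names          # assign names after reading

print(len(consumed), len(headless))
print(default_labels)
print(list(named.columns))
```

Both approaches end with the same column names; passing `names` simply avoids the separate assignment step.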
- Replace the default index by setting ID as the index column:
people_copy_2.set_index("ID", inplace=True)
people_copy_2.head()
![](https://img.haomeiwen.com/i9039829/dca12a951a036570.png)
- Print the DataFrame's column names:
people_copy_2.columns
Index(['Type', 'Title', 'FirstName', 'MiddleName', 'LastName'], dtype='object')
Note: ID no longer appears in columns, which shows that a DataFrame treats index and columns as separate things.
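
The separation between index and columns can be checked on a tiny frame. The data below is invented for illustration. The sketch uses the non-inplace form of `set_index`, which returns a new DataFrame and leaves the original untouched, and `reset_index` to move the index back into the columns.

```python
import pandas as pd

# Invented rows that reuse a few of the column names from the article.
people_demo = pd.DataFrame(
    {
        "ID": [1, 2, 3],
        "FirstName": ["Ann", "Bob", "Cara"],
        "LastName": ["Lee", "Ng", "Diaz"],
    }
)

indexed = people_demo.set_index("ID")  # new frame; people_demo itself is unchanged
restored = indexed.reset_index()       # ID moves back from the index into the columns

print(list(indexed.columns))       # ID is not listed here
print(indexed.index.name)          # the index now carries the name ID
print(indexed.loc[2, "FirstName"]) # rows are looked up by ID value
print(list(restored.columns))
```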
- Write the DataFrame to an xlsx file:
people_copy_2.to_excel("./data/People_copy_2_output.xlsx")
Note: when reading a file that was written with its index, a new default index is created automatically unless index_col is passed.
- Specify the index column directly when reading the file:
df = pd.read_excel('./data/testOutputIndex.xlsx', index_col='id')
df.head()
![](https://img.haomeiwen.com/i9039829/5399d9de0ec5c9f1.png)
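
The difference between reading with and without `index_col` can be seen in a write-then-read round trip. This is a minimal sketch with invented data and a hypothetical file name, assuming an Excel engine such as openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

# Invented frame whose index is the ID column.
indexed = pd.DataFrame(
    {"ID": [1, 2, 3], "FirstName": ["Ann", "Bob", "Cara"]}
).set_index("ID")

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "index_demo.xlsx")  # hypothetical file name
    indexed.to_excel(path)  # the index is written as the first column by default

    auto_index = pd.read_excel(path)                # ID comes back as an ordinary column
    id_index = pd.read_excel(path, index_col="ID")  # ID comes back as the index

print(list(auto_index.columns), list(auto_index.index))
print(list(id_index.columns), id_index.index.name)
```

Without `index_col`, the saved index is just another column and a fresh 0-based index is added in front of it.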
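- Writing a DataFrame to a file also writes its index, so the file matches the DataFrame. Missing values need na_rep set explicitly, otherwise they are written as empty strings:
df.to_excel('./data/testOutputIndex2.xlsx', na_rep='NaN')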
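
To see what `na_rep` actually puts into the sheet, the written cell can be inspected directly. The sketch below is an illustration under assumptions: the data and file name are invented, and openpyxl is assumed to be installed, both as the pandas Excel engine and for reading the raw cell.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Invented frame with one missing score.
scores = pd.DataFrame({"name": ["Ann", "Bob"], "score": [90.5, None]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "na_rep_demo.xlsx")  # hypothetical file name
    scores.to_excel(path, index=False, na_rep="NaN")

    # Row 1 is the header, so Bob's missing score sits in cell B3.
    written_cell = load_workbook(path).active["B3"].value

    # With default settings, read_excel treats the text "NaN" as a missing value again.
    round_trip = pd.read_excel(path)

print(repr(written_cell))
print(round_trip["score"].isna().tolist())
```

The cell holds the text given in `na_rep`, and a default `read_excel` call turns that text back into a missing value.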