pandas Tutorial Series, Part 2: Reading Files

Author: HaigLee | Published 2019-11-30 17:51

Author: HaigLee
https://www.jianshu.com/u/67ec21fb270d
Originally published by @HaigLee. Reproduction without permission is prohibited.

Import the library:

import pandas as pd

Read an xlsx file:

people = pd.read_excel("./data/People.xlsx")

Print the table's dimensions as (rows, columns):

people.shape

(19972, 6)

Look up the table's column names:

people.columns

Index(['ID', 'Type', 'Title', 'FirstName', 'MiddleName', 'LastName'], dtype='object')
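
The same two attributes can be tried without the Excel file. The following is a minimal sketch on a small hand-built DataFrame; the rows are made-up stand-ins for People.xlsx, not the author's data.

```python
import pandas as pd

# Hypothetical stand-in for the People.xlsx data; the values are made up.
people = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Type": ["EM", "EM", "SC", "SC"],
    "FirstName": ["Ken", "Terri", "Rob", "Gail"],
})

n_rows, n_cols = people.shape         # shape is a (rows, columns) tuple
column_names = list(people.columns)   # convert the Index to a plain list

print(n_rows, n_cols)
print(column_names)
```

Note that shape and columns are attributes, not methods, so they are accessed without parentheses.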

Show the first three rows of data (the header row is not counted):

people.head(3)

Show the last five rows of data:

people.tail(5)
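
As a quick illustration of how head and tail select rows, here is a small sketch on a hypothetical ten-row table (the column names and values are assumptions made for the example):

```python
import pandas as pd

# Hypothetical ten-row table; only the row counts matter here.
df = pd.DataFrame({"ID": range(1, 11), "Value": range(10, 110, 10)})

first_three = df.head(3)   # rows with ID 1 to 3
last_five = df.tail(5)     # rows with ID 6 to 10
default_head = df.head()   # with no argument, head returns 5 rows

print(first_three)
print(last_five)
```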

How column names are read: in this copy of the xlsx file, a row of arbitrary data has been added above the header row:

people_copy = pd.read_excel("./data/People_copy.xlsx")
people_copy.head()

Print the column names; by default, the row at index 0 is used as the column names:

people_copy.columns

Index(['djsalf', 'dfalmklmm;mlm', 'Unnamed: 2', 'jknknkl', ' njknkjnk',
'mklmmk'],
dtype='object')

Specify that the column names are in the row at index 1:

people_copy = pd.read_excel("./data/People_copy.xlsx",header=1)
people_copy.head()

Print the column names:

people_copy.columns

Index(['ID', 'Type', 'Title', 'FirstName', 'MiddleName', 'LastName'], dtype='object')
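
The effect of header can be reproduced without an Excel file. The sketch below uses pd.read_csv on in-memory text as a stand-in for pd.read_excel, since both accept a 0-indexed header argument with the same meaning; the file contents are made up for the example.

```python
import io
import pandas as pd

# Hypothetical file contents: a junk first row, then the real header row.
raw = (
    "junk1,junk2,junk3\n"
    "ID,Type,FirstName\n"
    "1,EM,Ken\n"
    "2,EM,Terri\n"
)

# Default header=0: the junk row becomes the column names.
wrong = pd.read_csv(io.StringIO(raw))
# header=1: the second row supplies the names; the row above it is skipped.
right = pd.read_csv(io.StringIO(raw), header=1)

print(list(wrong.columns))
print(list(right.columns))
```

With the default, the real header row ends up as the first data row, which is why the column names look wrong in the first read.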

If the header row has been removed from the data, set header to None when reading:

people_copy_2 = pd.read_excel("./data/People_copy_2.xlsx",header=None)
people_copy_2.head()

The default column labels are integers starting from 0:

people_copy_2.columns

Int64Index([0, 1, 2, 3, 4, 5], dtype='int64')

The column names can also be assigned manually:

people_copy_2.columns = ['ID', 'Type', 'Title', 'FirstName', 'MiddleName', 'LastName']
people_copy_2.head()

Replace the default index by setting ID as the index column:

people_copy_2.set_index("ID",inplace=True)
people_copy_2.head()

Print the DataFrame's column names:

people_copy_2.columns

Index(['Type', 'Title', 'FirstName', 'MiddleName', 'LastName'], dtype='object')

Note: ID no longer appears in columns, which shows that a DataFrame treats index and columns as two separate things.
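
The whole sequence (read without a header, name the columns, move ID into the index) can be sketched on in-memory data. Here pd.read_csv stands in for pd.read_excel, and the two rows are hypothetical:

```python
import io
import pandas as pd

raw = "1,EM,Ken\n2,EM,Terri\n"  # hypothetical data with no header row

df = pd.read_csv(io.StringIO(raw), header=None)
default_labels = list(df.columns)   # integer labels 0, 1, 2

df.columns = ["ID", "Type", "FirstName"]
df.set_index("ID", inplace=True)    # ID moves from the columns to the index

print(default_labels)
print(list(df.columns))
print(df.index.name)
```

After set_index, ID is reachable through df.index rather than df.columns, which matches the observation in the note.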

Write the DataFrame out to an xlsx file:

people_copy_2.to_excel("./data/People_copy_2_output.xlsx")

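Note: when reading back a file that was written with its index, pandas creates a new default index unless index_col is given, and the saved index comes back as an ordinary column.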

Specify the index column directly when reading the file:

df = pd.read_excel('./data/testOutputIndex.xlsx',index_col='id')
df.head()
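
To see what index_col changes, the sketch below writes a small DataFrame with a named index and reads it back both ways. It uses to_csv and read_csv on an in-memory string instead of Excel files so that no Excel engine is needed; index_col is understood the same way by read_excel. The data is hypothetical.

```python
import io
import pandas as pd

# Hypothetical table whose index is named ID.
df = pd.DataFrame(
    {"Type": ["EM", "SC"], "FirstName": ["Ken", "Rob"]},
    index=pd.Index([101, 102], name="ID"),
)

text = df.to_csv()  # the index is written as the first column, named ID

# Without index_col: ID is an ordinary column and a new 0-based index is made.
no_index_col = pd.read_csv(io.StringIO(text))
# With index_col: ID is restored as the index.
with_index_col = pd.read_csv(io.StringIO(text), index_col="ID")

print(no_index_col)
print(with_index_col)
```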

When writing, the DataFrame is written to the file with its index intact. Missing values, however, are written as empty strings unless na_rep is set explicitly:

df.to_excel('./data/testOutputIndex2.xlsx',na_rep='NaN')
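
The effect of na_rep is easiest to see in text output. As a rough sketch, the example below uses to_csv, which takes the same na_rep argument as to_excel with the same default of an empty string; the DataFrame is made up for the example.

```python
import pandas as pd

# Hypothetical table with one missing value.
df = pd.DataFrame({"ID": [1, 2], "MiddleName": ["J", None]})

default_text = df.to_csv(index=False)               # missing value -> empty field
marked_text = df.to_csv(index=False, na_rep="NaN")  # missing value -> "NaN"

print(default_text)
print(marked_text)
```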

Link to this article: https://www.haomeiwen.com/subject/qinowctx.html
