Importing and Exporting Data with pandas in Python

Author: Brendansmisle | Published: 2020-03-18 15:23
1. Import the pandas module
>>> import pandas as pd
2. Import data from a CSV file
>>> titanic = pd.read_csv(r'C:\Users\Administrator\Desktop\titanic.csv')

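pandas supports many file formats and data sources (csv, excel, sql, json, parquet, and others). Each one has a reader function with the prefix read_* that loads the data into a pandas DataFrame.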
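
As a minimal sketch of the read_* family, the snippet below loads the same two records once from CSV text and once from JSON text. The in-memory strings and their values are made up for illustration, so the example runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data, held in memory instead of in a file.
csv_text = "PassengerId,Survived,Fare\n1,0,7.25\n2,1,71.2833\n"
json_text = (
    '[{"PassengerId": 1, "Survived": 0, "Fare": 7.25},'
    ' {"PassengerId": 2, "Survived": 1, "Fare": 71.2833}]'
)

# Each reader returns a DataFrame; only the source format differs.
from_csv = pd.read_csv(io.StringIO(csv_text))
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv)
print(from_json)
```

With a real file, the io.StringIO(...) argument would simply be replaced by the file path.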

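3. View the imported data. When a long DataFrame like this one is displayed, only the first 5 and the last 5 rows are shown by default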
>>> titanic
     PassengerId  Survived  Pclass  ...     Fare Cabin  Embarked
0              1         0       3  ...   7.2500   NaN         S
1              2         1       1  ...  71.2833   C85         C
2              3         1       3  ...   7.9250   NaN         S
3              4         1       1  ...  53.1000  C123         S
4              5         0       3  ...   8.0500   NaN         S
..           ...       ...     ...  ...      ...   ...       ...
886          887         0       2  ...  13.0000   NaN         S
887          888         1       1  ...  30.0000   B42         S
888          889         0       3  ...  23.4500   NaN         S
889          890         1       1  ...  30.0000  C148         C
890          891         0       3  ...   7.7500   NaN         Q
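
The truncated display above is governed by pandas display options. The sketch below reproduces it on a made-up 100-row frame; the values 60 and 10 are set explicitly and are assumed to match the usual pandas defaults for display.max_rows and display.min_rows.

```python
import pandas as pd

# Hypothetical 100-row frame, long enough to be truncated when displayed.
df = pd.DataFrame({"n": range(100), "square": [i * i for i in range(100)]})

# display.max_rows: frames with more rows than this are truncated.
# display.min_rows: how many rows a truncated frame shows (head plus tail).
with pd.option_context("display.max_rows", 60, "display.min_rows", 10):
    truncated = repr(df)

print(truncated)
```

Raising display.max_rows above the number of rows in the frame makes pandas print every row instead.
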
4. View the first 8 rows of the DataFrame. If no row count is given, head() returns the first 5 rows
>>> titanic.head(8)
   PassengerId  Survived  Pclass  ...     Fare Cabin  Embarked
0            1         0       3  ...   7.2500   NaN         S
1            2         1       1  ...  71.2833   C85         C
2            3         1       3  ...   7.9250   NaN         S
3            4         1       1  ...  53.1000  C123         S
4            5         0       3  ...   8.0500   NaN         S
5            6         0       3  ...   8.4583   NaN         Q
6            7         0       1  ...  51.8625   E46         S
7            8         0       3  ...  21.0750   NaN         S

[8 rows x 12 columns]

To view rows from the end instead, use tail(): titanic.tail(10) returns the last 10 rows of the DataFrame.
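
A small sketch of head() and tail() side by side, using a made-up 20-row frame in place of the Titanic data:

```python
import pandas as pd

# Hypothetical frame with 20 rows, indexed 0..19.
df = pd.DataFrame({"value": range(100, 120)})

first_default = df.head()   # no argument: first 5 rows
first_eight = df.head(8)    # first 8 rows
last_ten = df.tail(10)      # last 10 rows

print(len(first_default), len(first_eight), len(last_ten))
print(last_ten.index.tolist())
```

Both methods return a new DataFrame that keeps the original row labels, so the last 10 rows here carry the labels 10 to 19.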

5. Check the data type of each column with the dtypes attribute
>>> titanic.dtypes
PassengerId      int64
Survived         int64
Pclass           int64
Name            object
Sex             object
Age            float64
SibSp            int64
Parch            int64
Ticket          object
Fare           float64
Cabin           object
Embarked        object
dtype: object

The data types in this DataFrame are integers (int64), floating-point numbers (float64), and strings (object).
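
The sketch below shows how these types come about on a made-up three-row frame, and how select_dtypes can pick columns by type. The column names are borrowed from the Titanic data, but the values are illustrative.

```python
import pandas as pd

# Hypothetical rows in the style of the Titanic data.
df = pd.DataFrame(
    {
        "Pclass": [3, 1, 3],                   # whole numbers: int64
        "Age": [22.0, 38.0, None],             # missing value stored as NaN: float64
        "Sex": ["male", "female", "female"],   # text column
    }
)

print(df.dtypes)

# select_dtypes filters columns by type; "number" covers int64 and float64.
numeric = df.select_dtypes(include="number")
print(numeric.columns.tolist())
```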

6. Save the data to an Excel file
>>> titanic.to_excel(r'C:\Users\Administrator\Desktop\titanic.xlsx', sheet_name='passengers', index=False)

If sheet_name is not given, the default name Sheet1 is used. Setting index=False keeps the row index labels out of the spreadsheet.
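
To illustrate what index=False changes, here is a minimal sketch that writes a made-up frame twice and reads it back. It uses to_csv with in-memory buffers as a stand-in for to_excel, so that it runs without an Excel engine installed; the index parameter has the same meaning in both methods.

```python
import io

import pandas as pd

# Hypothetical two-row frame.
df = pd.DataFrame({"PassengerId": [1, 2], "Survived": [0, 1]})

# Default (index=True): the row labels are written as an extra, unnamed column.
with_index = io.StringIO()
df.to_csv(with_index)

# index=False: only the data columns are written.
without_index = io.StringIO()
df.to_csv(without_index, index=False)

# Reading both back shows the difference in the column list.
cols_with = pd.read_csv(io.StringIO(with_index.getvalue())).columns.tolist()
cols_without = pd.read_csv(io.StringIO(without_index.getvalue())).columns.tolist()
print(cols_with)
print(cols_without)
```

When the index is saved, the reader finds a column without a header and names it Unnamed: 0.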

7. Import data from an Excel file
>>> titanic = pd.read_excel(r'C:\Users\Administrator\Desktop\titanic.xlsx')

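If the workbook contains several sheets, use the parameter sheet_name='xxxx' to choose which one to read; without it, the first sheet is read.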
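
A sketch of reading from a workbook with more than one sheet. It assumes the optional openpyxl package is installed (the code checks for it first) and builds the workbook in memory; the crew sheet and all values are hypothetical.

```python
import importlib.util
import io

import pandas as pd


def write_two_sheets() -> bytes:
    """Write a workbook with two sheets into memory and return its bytes."""
    passengers = pd.DataFrame({"PassengerId": [1, 2], "Survived": [0, 1]})
    crew = pd.DataFrame({"CrewId": [10, 11, 12]})  # hypothetical second sheet
    buffer = io.BytesIO()
    with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
        passengers.to_excel(writer, sheet_name="passengers", index=False)
        crew.to_excel(writer, sheet_name="crew", index=False)
    return buffer.getvalue()


# openpyxl is an optional dependency, so check for it before running.
HAS_OPENPYXL = importlib.util.find_spec("openpyxl") is not None

if HAS_OPENPYXL:
    workbook = write_two_sheets()
    # Without sheet_name, the first sheet is read.
    first = pd.read_excel(io.BytesIO(workbook), engine="openpyxl")
    # A name selects one sheet; None returns a dict of all sheets.
    crew = pd.read_excel(io.BytesIO(workbook), sheet_name="crew", engine="openpyxl")
    all_sheets = pd.read_excel(io.BytesIO(workbook), sheet_name=None, engine="openpyxl")
    print(first.columns.tolist(), crew.columns.tolist(), list(all_sheets))
else:
    print("openpyxl is not installed; skipping the Excel example")
```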

8. View detailed information about the DataFrame
>>> titanic.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 13 columns):
Unnamed: 0     891 non-null int64
PassengerId    891 non-null int64
Survived       891 non-null int64
Pclass         891 non-null int64
Name           891 non-null object
Sex            891 non-null object
Age            714 non-null float64
SibSp          891 non-null int64
Parch          891 non-null int64
Ticket         891 non-null object
Fare           891 non-null float64
Cabin          204 non-null object
Embarked       889 non-null object
dtypes: float64(2), int64(6), object(5)
memory usage: 90.6+ KB
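
In the report above, the non-null counts reveal missing data: Age has 714 values out of 891 entries, so 177 are missing, and Cabin has only 204. The Unnamed: 0 column is the name pandas gives to a column without a header, which typically appears when a file was saved together with its row index. The sketch below reproduces the same kind of report on a made-up frame and cross-checks the counts with count() and isna().

```python
import io

import pandas as pd

# Hypothetical frame with missing values in two columns.
df = pd.DataFrame(
    {
        "Age": [22.0, None, 26.0, 35.0],
        "Cabin": [None, "C85", None, "C123"],
        "Fare": [7.25, 71.2833, 7.925, 53.1],
    }
)

# info() prints to stdout by default; buf= redirects the report to a buffer.
report = io.StringIO()
df.info(buf=report)
print(report.getvalue())

# count() gives the non-null count per column; isna().sum() gives the gaps.
non_null = df.count()
missing = df.isna().sum()
print(non_null.to_dict())
print(missing.to_dict())
```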
