Reading data
Reading a CSV file
import pandas as pd
df = pd.read_csv('x.csv')
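As a minimal, self-contained sketch of the same call, the following parses CSV text from an in-memory buffer instead of a file on disk. The column names and values are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for 'x.csv'.
csv_text = "id,score\n1,0.5\n2,0.8\n"

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)          # (2, 2)
print(list(df.columns))  # ['id', 'score']
```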
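Reading from a zip archive
import zipfile
with zipfile.ZipFile('x.csv.zip', 'r') as z:
    # Open the CSV member inside the archive and parse it while the archive is still open
    with z.open('x.csv') as f:
        df = pd.read_csv(f, header=0)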
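When the archive holds exactly one file, read_csv can usually open the zip directly and infer the compression from the .zip extension. A sketch under that assumption, using a temporary archive with made-up contents:

```python
import os
import tempfile
import zipfile
import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Build a throwaway zip with a single CSV member (illustrative data).
    zip_path = os.path.join(tmp_dir, "x.csv.zip")
    with zipfile.ZipFile(zip_path, "w") as z:
        z.writestr("x.csv", "id,score\n1,0.5\n2,0.8\n")

    # compression="infer" is the default; it is spelled out here for clarity.
    df_zip = pd.read_csv(zip_path, compression="infer")

print(df_zip.shape)  # (2, 2)
```

If the archive contains several files, the explicit zipfile approach is the safer choice, since it names the member to read.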
Saving to CSV
# index=False drops the row-index column; header=False drops the header row
out_df.to_csv('predict_result.csv', encoding='utf-8', index=False, header=False)
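To see what the two flags change, the sketch below calls to_csv without a path, which returns the CSV text as a string. The small out_df here is a made-up stand-in for real prediction results.

```python
import pandas as pd

# Hypothetical data standing in for the real out_df.
out_df = pd.DataFrame({"id": [1, 2], "label": ["a", "b"]})

# With no path argument, to_csv returns the CSV text instead of writing a file.
with_both = out_df.to_csv()
bare = out_df.to_csv(index=False, header=False)

print(with_both.splitlines())  # [',id,label', '0,1,a', '1,2,b']
print(bare.splitlines())       # ['1,a', '2,b']
```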
Inspecting data
Checking data types
df2.dtypes
Out[30]:
A           float64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object
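A frame with a similar mix of dtypes can be built as follows. This is a sketch with made-up values; the exact datetime and string dtypes reported for columns B and F may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Illustrative values only; scalars are broadcast to the length of the index.
df2 = pd.DataFrame({
    "A": 1.0,
    "B": pd.Timestamp("2013-01-02"),
    "C": pd.Series(1, index=range(4), dtype="float32"),
    "D": np.array([3] * 4, dtype="int32"),
    "E": pd.Categorical(["test", "train", "test", "train"]),
    "F": "foo",
})

print(df2.dtypes)
```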
Viewing the head and tail
df.head(1)  # first row
df.tail(3)  # last 3 rows
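A short sketch on a ten-row frame (made-up data) shows which rows each call returns. Both methods default to 5 rows when no argument is given.

```python
import pandas as pd

df = pd.DataFrame({"n": range(10)})

print(df.head(1)["n"].tolist())  # [0]
print(df.tail(3)["n"].tolist())  # [7, 8, 9]
print(len(df.head()))            # 5
```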
Checking whether the DataFrame is empty
df.empty  # True if the DataFrame has no elements (some axis has length 0)
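One point that is easy to miss: a frame that contains only NaN values still has rows, so it does not count as empty. A minimal sketch with made-up frames:

```python
import numpy as np
import pandas as pd

no_rows = pd.DataFrame(columns=["a"])
only_nan = pd.DataFrame({"a": [np.nan]})

print(no_rows.empty)            # True
print(only_nan.empty)           # False: the NaN still occupies a row
print(only_nan.dropna().empty)  # True once the NaN row is dropped
```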
Viewing the index, columns, and data
df.columns returns the header (column labels)
df.index
df.columns
df.values
df.count()  # number of non-null values in each column
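The sketch below runs these accessors on a small made-up frame. Note that count is a method and skips missing values, so it can differ from the number of rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", "z"]})

print(list(df.index))        # [0, 1, 2]
print(list(df.columns))      # ['a', 'b']
print(df.values.shape)       # (3, 2)
print(df.count().to_dict())  # {'a': 2, 'b': 3}
```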
Showing quick summary statistics
df.describe()
Out[19]:
              A         B         C         D
count  6.000000  6.000000  6.000000  6.000000
mean   0.073711 -0.431125 -0.687758 -0.233103
std    0.843157  0.922818  0.779887  0.973118
min   -0.861849 -2.104569 -1.509059 -1.135632
25%   -0.611510 -0.600794 -1.368714 -1.076610
50%    0.022070 -0.228039 -0.767252 -0.386188
75%    0.658444  0.041933 -0.034326  0.461706
max    1.212112  0.567020  0.276232  1.071804
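The result of describe is itself a DataFrame, so individual statistics can be looked up by row label. A minimal sketch on a made-up column:

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0]})
stats = df.describe()

print(list(stats.index))
# ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']
print(stats.loc["mean", "A"])  # 2.5
print(stats.loc["50%", "A"])   # 2.5 (the median)
```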
Finding missing values
https://blog.csdn.net/u012387178/article/details/52571725
# Select the rows that contain at least one missing value (each row is returned once)
print(df_base_dpd[df_base_dpd.isnull().any(axis=1)])
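A row-wise mask built with any(axis=1) is one way to select each row with missing data exactly once, and isnull().sum() gives a per-column count. The sketch below uses a made-up df_base_dpd:

```python
import numpy as np
import pandas as pd

# Hypothetical data; row 1 has two missing values, row 0 has one.
df_base_dpd = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, 6.0],
})

# Boolean mask: True for rows where any column is missing.
rows_with_nan = df_base_dpd[df_base_dpd.isnull().any(axis=1)]
print(rows_with_nan.index.tolist())  # [0, 1]

# Number of missing values per column.
print(df_base_dpd.isnull().sum().to_dict())  # {'a': 1, 'b': 2}
```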