pivot_table可以实现Excel数据透视表的功能,但是经过pivot_table函数处理后的dataframe的列是多层索引,需要转化成单层索引,具体步骤如下:
1、使用pivot_table函数,建立数据透视表
df_1为数据源,包含5列,amount列为数值列,用于聚合;
创建数据透视表,index=['group_series', 'buy_no'],columns=['xmonth'],NA值用0填充,结果如下:
data:image/s3,"s3://crabby-images/d28e8/d28e855a411ca928cb84350d6febf7f53200cf02" alt=""
df_1_out=pd.pivot_table(df_1, index=['group_series', 'buy_no'], columns=['xmonth'], values=['amount'], aggfunc='sum', fill_value=0)
df_1_out
data:image/s3,"s3://crabby-images/2619d/2619dc74e4927d7df932acaf7a3e12b0ab23e408" alt=""
2、列索引处理
# 查看列索引
df_1_out.columns
data:image/s3,"s3://crabby-images/6f8c0/6f8c007b144d79c191ef934bb4a9cade005bcb40" alt=""
# 查看一级列索引
df_1_out.columns.levels[0]
# 查看二级列索引
df_1_out.columns.levels[1]
# 过滤一级列索引
df_1_out.columns=df_1_out.columns.droplevel(level=0)
data:image/s3,"s3://crabby-images/343b2/343b2b7e5abecb9adc5303fcf7d54cafdf0e4e8e" alt=""
data:image/s3,"s3://crabby-images/c1b66/c1b66faf13ac68d2623d6392012b750c388a9fb6" alt=""
data:image/s3,"s3://crabby-images/f575c/f575cd57891e34a00de4f4024095e83059cf2778" alt=""
3、行索引处理
# 查看行索引
df_1_out.index
data:image/s3,"s3://crabby-images/e5086/e5086aeab52f86fafa023bd4e23b60cdd91ef242" alt=""
# 重置行索引
df_1_out.reset_index(inplace=True)
df_1_out.index
data:image/s3,"s3://crabby-images/0f8b9/0f8b93eefd8aead1a58d27e42e48a4db8f5e20ee" alt=""
最终数据结果如下:
data:image/s3,"s3://crabby-images/6269f/6269fef506fcdfa4750dcf73ef655003f34d4d6a" alt=""
pivot_table()简介:
DataFrame.pivot_table(values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')
values:要聚合的列或列的列表
index:数据透视的index,从原始数据的列中筛选
columns:数据透视表的columns,从原始数据的列中筛选
aggfunc:用于聚合的函数,默认为numpy,mean,支持numpy计算方法
fill_value:填充NA值,默认不填充
margins:添加行列的总计,默认不显示
dropna:如果整行都为NA值,则进行丢弃,默认丢弃
margins_name:在margins参数为true时,用来修改margins的名称
网友评论