Walmart Sales Prediction (updating)

Author: Brian_mingzhi | Published: 2020-03-26 23:51

    Reference Kaggle notebook:
    keras

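    I. Problem

    1. Project: Walmart sales forecasting

    Forecast Walmart's unit sales for the next 28 days.

    2. Evaluation metric: RMSSE (Root Mean Squared Scaled Error)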

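    $$\mathrm{RMSSE} = \sqrt{\frac{1}{h}\,\frac{\sum_{t=n+1}^{n+h}\left(Y_t-\hat{Y}_t\right)^2}{\frac{1}{n-1}\sum_{t=2}^{n}\left(Y_t-Y_{t-1}\right)^2}}$$

    Here $n$ is the training sample size (40341), $h$ is the forecast horizon of 28 days, $Y_t$ is the actual sales value, and $\hat{Y}_t$ is the predicted sales value.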
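The RMSSE metric can be sketched for a single series as follows. This is a minimal illustration under the standard definition (forecast mean squared error divided by the mean squared one-step naive error on the training data), not the competition's official scoring code. The function name `rmsse` and the toy numbers are made up for the example, and it does not cover how scores from many series are weighted or combined.

```python
import numpy as np


def rmsse(train, actual, forecast):
    """Root Mean Squared Scaled Error for one series (illustrative helper).

    train    : the n historical observations Y_1..Y_n
    actual   : the h true values Y_{n+1}..Y_{n+h}
    forecast : the h predicted values for the same days
    """
    train = np.asarray(train, dtype=float)
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)

    # Numerator: mean squared error over the h-day forecast horizon
    mse = np.mean((actual - forecast) ** 2)
    # Denominator: mean squared day-to-day change in the training data,
    # i.e. (1 / (n - 1)) * sum over t=2..n of (Y_t - Y_{t-1})^2
    scale = np.mean(np.diff(train) ** 2)
    return np.sqrt(mse / scale)


# Toy data, made up for illustration only
history = [0, 2, 1, 3, 2, 4]
truth = [3, 5]
pred = [3, 4]
score = rmsse(history, truth, pred)
print(score)
```

A score below 1 means the forecast errors are smaller than the typical day-to-day change seen in the training data. A perfect forecast scores 0.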

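    3. Data description

    The data covers 3,049 products in 3 categories and 7 departments, sold in 10 stores across 3 states.

    sales_train.csv: the main training set. It holds the daily unit sales for each of the 1,941 days from 2011-01-29 to 2016-05-22 (excluding the 28 days up to 2016-06-19), together with each item's ID, department, category, store, and state.
    sell_prices.csv: the average weekly price of each item in each store.
    calendar.csv: the day of the week, month, and year of each date, and whether each state allows purchases with food stamps (a benefit for low-income households).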
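To make the relationship between the three files concrete, here is a small sketch of how they could be combined into one long table. The frames below are tiny made-up stand-ins. The column names (`d_1`, `d`, `wm_yr_wk`, `snap_CA`, `sell_price`, and so on) are assumptions about the file layout and should be checked against the real CSV headers before reuse.

```python
import pandas as pd

# Tiny made-up frames that mimic the assumed column layout of the three files
sales = pd.DataFrame({
    "id": ["FOODS_1_001_CA_1_validation"],
    "item_id": ["FOODS_1_001"],
    "store_id": ["CA_1"],
    "state_id": ["CA"],
    "d_1": [3],   # units sold on day 1
    "d_2": [0],   # units sold on day 2
})
calendar = pd.DataFrame({
    "d": ["d_1", "d_2"],
    "date": ["2011-01-29", "2011-01-30"],
    "wm_yr_wk": [11101, 11101],   # week identifier (made-up value)
    "snap_CA": [0, 1],            # food stamp flag (made-up values)
})
prices = pd.DataFrame({
    "store_id": ["CA_1"],
    "item_id": ["FOODS_1_001"],
    "wm_yr_wk": [11101],
    "sell_price": [2.0],
})

# Wide -> long: one row per (item, day) instead of one column per day
id_cols = ["id", "item_id", "store_id", "state_id"]
long_sales = sales.melt(id_vars=id_cols, var_name="d", value_name="sales")

# Attach the calendar attributes of each day, then the weekly price
merged = long_sales.merge(calendar, on="d", how="left")
merged = merged.merge(prices, on=["store_id", "item_id", "wm_yr_wk"], how="left")
print(merged)
```

The long format is what most tabular models, including gradient-boosted trees, expect: each row is one item on one day, with the date features and price as ordinary columns.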

    II. Walkthrough

    1. Load the data

    # Import libraries
    import pandas as pd
    import seaborn as sns
    import lightgbm as lgb
    import numpy as np
    
    # Load the data
    calendar = pd.read_csv('calendar.csv')
    sample_submission = pd.read_csv('sample_submission.csv')
    sales_train_validation = pd.read_csv('sales_train_validation.csv')
    sell_prices = pd.read_csv('sell_prices.csv')
    
    # Reduce memory usage: downcast each numeric column to the smallest
    # dtype whose range still contains the column's min and max
    def reduce_mem_usage(df, verbose=True):
        numerics = ["int16", "int32", "int64", "float16", "float32", "float64"]
        start_mem = df.memory_usage().sum() / 1024 ** 2
        for col in df.columns:
            col_type = df[col].dtypes
            if col_type in numerics:
                c_min = df[col].min()
                c_max = df[col].max()
                if str(col_type)[:3] == "int":
                    if c_min > np.iinfo(np.int8).min and c_max < np.iinfo(np.int8).max:
                        df[col] = df[col].astype(np.int8)
                    elif c_min > np.iinfo(np.int16).min and c_max < np.iinfo(np.int16).max:
                        df[col] = df[col].astype(np.int16)
                    elif c_min > np.iinfo(np.int32).min and c_max < np.iinfo(np.int32).max:
                        df[col] = df[col].astype(np.int32)
                    elif c_min > np.iinfo(np.int64).min and c_max < np.iinfo(np.int64).max:
                        df[col] = df[col].astype(np.int64)
                else:
                    if (
                        c_min > np.finfo(np.float16).min
                        and c_max < np.finfo(np.float16).max
                    ):
                        df[col] = df[col].astype(np.float16)
                    elif (
                        c_min > np.finfo(np.float32).min
                        and c_max < np.finfo(np.float32).max
                    ):
                        df[col] = df[col].astype(np.float32)
                    else:
                        df[col] = df[col].astype(np.float64)
        end_mem = df.memory_usage().sum() / 1024 ** 2
        if verbose:
            print(
                "Mem. usage decreased to {:5.2f} Mb ({:.1f}% reduction)".format(
                    end_mem, 100 * (start_mem - end_mem) / start_mem
                )
            )
        return df
    
    # Shrink each DataFrame, printing the memory used by sell_prices before and after
    print("Memory usage before reduction:", sell_prices.memory_usage().sum() / (1024 ** 2), "MB")
    calendar = reduce_mem_usage(calendar)
    sample_submission = reduce_mem_usage(sample_submission)
    sales_train_validation = reduce_mem_usage(sales_train_validation)
    sell_prices = reduce_mem_usage(sell_prices)
    print("Memory usage after reduction:", sell_prices.memory_usage().sum() / (1024 ** 2), "MB")
    
    sales_train_validation.head()
    calendar.head()
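One caveat about `reduce_mem_usage`: it may cast float columns to float16, which keeps only about three significant decimal digits, so values such as prices can be rounded. As a hedged alternative (not the approach used in this post), the sketch below relies on `pd.to_numeric` with its `downcast` option, which stops at float32. The helper name `downcast_numeric` and the demo frame are made up for illustration.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df):
    """Return a copy of df with numeric columns downcast (illustrative helper)."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            # Picks the smallest signed integer dtype that holds all values
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            # downcast="float" never goes below float32
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


# Made-up demo data: 100 week ids and 100 prices in steps of 0.25
demo = pd.DataFrame({
    "wm_yr_wk": np.arange(11101, 11201, dtype=np.int64),
    "sell_price": np.arange(100) * 0.25 + 0.5,
})
small = downcast_numeric(demo)
print(small.dtypes)
print(demo.memory_usage().sum(), "->", small.memory_usage().sum())
```

The trade-off is that float32 saves less memory than float16, so this variant fits best when exact prices matter more than the last bit of memory savings.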
    