How do you do one-hot / dummy encoding?

Author: 小幸运Q | Published: 2020-06-29 13:14

    import pandas as pd  
    df = pd.DataFrame([  
                ['green', 'M', 10.1, 'class1'],   
                ['red', 'L', 13.5, 'class2'],   
                ['blue', 'XL', 15.3, 'class1']])  
      
    df.columns = ['color', 'size', 'prize', 'class label']
    
    >>> print(df)
       color size  prize class label
    0  green    M   10.1      class1
    1    red    L   13.5      class2
    2   blue   XL   15.3      class1
    
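    # 'size' is ordinal, so map it to integers that preserve the order M < L < XL.
    size_mapping = {
               'XL': 3,
               'L': 2,
               'M': 1}
    df['size'] = df['size'].map(size_mapping)

    # 'class label' is nominal, so any distinct integers will do.
    # Note: set iteration order is not guaranteed, so which label
    # receives 0 and which receives 1 may differ between runs.
    class_mapping = {label:idx for idx,label in enumerate(set(df['class label']))}
    df['class label'] = df['class label'].map(class_mapping)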
    
    >>> print(df)
       color  size  prize  class label
    0  green     1   10.1            1
    1    red     2   13.5            0
    2   blue     3   15.3            1
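
Because the class mapping above is built from a set, the integer assigned to each label can change from run to run. A minimal sketch of a reproducible variant follows, assuming it is acceptable to number the labels in sorted order. The names `encoded`, `inverse_mapping` and `decoded` are illustrative and not part of the original post. With sorting, class1 maps to 0 and class2 to 1, which may differ from the output printed above.

```python
import pandas as pd

df = pd.DataFrame({'class label': ['class1', 'class2', 'class1']})

# Sort the unique labels so the label -> integer assignment is reproducible.
class_mapping = {label: idx
                 for idx, label in enumerate(sorted(df['class label'].unique()))}
encoded = df['class label'].map(class_mapping)

# Invert the mapping to recover the original strings from the integers.
inverse_mapping = {idx: label for label, idx in class_mapping.items()}
decoded = encoded.map(inverse_mapping)

print(class_mapping)
print(decoded.tolist())
```

Keeping the inverse mapping around makes it easy to turn predicted integers back into readable class names later.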
    

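The remaining string column, color, can then be one-hot encoded with get_dummies, which converts only the string columns and leaves the numeric columns untouched: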

    >>> pd.get_dummies(df, dtype=int)  # dtype=int keeps 0/1 output; pandas 2.x defaults to True/False
       size  prize  class label  color_blue  color_green  color_red
    0     1   10.1            1           0            1          0
    1     2   13.5            0           0            0          1
    2     3   15.3            1           1            0          0
    

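Dummy encoding represents one category as all zeros, so k categories need only k-1 columns; one-hot encoding gives every category its own column and never produces an all-zero row. Note that pd.get_dummies returns the one-hot form by default; pass drop_first=True to get dummy encoding.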
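
The difference between the two encodings can be seen side by side. This is a small sketch using only the color column from the example above; the variable names `colors`, `one_hot` and `dummy` are made up for illustration.

```python
import pandas as pd

colors = pd.DataFrame({'color': ['green', 'red', 'blue']})

# One-hot: one column per category, exactly one 1 in every row.
one_hot = pd.get_dummies(colors, dtype=int)

# Dummy: the first category (here 'blue', in sorted order) is dropped
# and is represented implicitly by a row of all zeros.
dummy = pd.get_dummies(colors, drop_first=True, dtype=int)

print(one_hot)
print(dummy)
```

Dropping one column removes the exact linear dependence among the indicator columns, which matters mainly for models such as unregularized linear regression.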

Article link: https://www.haomeiwen.com/subject/jwslnctx.html