One-Hot Encoding

Author: ramblelily | Published: 2017-12-27 16:57

    References:
    What is One Hot Encoding? Why And When do you have to use it?
    preprocessing categorical features

    1. Comparing a data set before and after One-Hot Encoding shows clearly what the encoding does.
    Before One-Hot Encoding
    After One-Hot Encoding

    The process can be summarized in one sentence:

    This estimator transforms each categorical feature with m possible values into m binary features, with only one active.
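To make the quoted sentence concrete, here is a minimal sketch in plain Python. The function name `one_hot_encode` and the sample brand list are assumptions made for illustration; they are not taken from the referenced articles or from any library. Each of the m distinct values becomes its own binary column, and exactly one column is active per row.

```python
def one_hot_encode(values):
    """Map each categorical value to m binary features (m = number of distinct values)."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}

    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one active feature per row
        encoded.append(row)
    return categories, encoded


# Hypothetical sample data, chosen only for illustration.
brands = ["VW", "Acura", "Honda", "VW"]
categories, encoded = one_hot_encode(brands)

print(categories)
for brand, row in zip(brands, encoded):
    print(brand, row)
```

With three distinct brands, every row has three columns. In practice a library encoder would usually be preferred over a hand-written helper like this one.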

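    2. Why One-Hot Encoding is needed

    When categories are vectorized, they are encoded as numbers. The categories have no inherent numerical relationship, but the resulting numbers implicitly impose one on them, as the following passage describes: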

    Let me explain: What this form of organization presupposes is VW > Acura > Honda based on the categorical values. Say supposing your model internally calculates average, then accordingly we get, 1+3 = 4/2 =2. This implies that: Average of VW and Honda is Acura. This is definitely a recipe for disaster. This model’s prediction would have a lot of errors.

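    One-Hot Encoding effectively turns the category information into binary features: a feature is 1 if the sample belongs to the corresponding category and 0 otherwise. This avoids introducing a spurious numerical relationship when encoding categories.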
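The averaging problem from the quoted passage can be reproduced with a few lines of Python. The integer codes below (Honda=1, Acura=2, VW=3) are an assumption inferred from the ordering and arithmetic in the quote, and the one-hot column order is arbitrary. This is only a sketch of the argument, not a model.

```python
# Assumed integer codes, inferred from the quoted example (VW > Acura > Honda).
label_codes = {"Honda": 1, "Acura": 2, "VW": 3}

# One-hot vectors for the same three categories (column order: Honda, Acura, VW).
one_hot = {
    "Honda": [1, 0, 0],
    "Acura": [0, 1, 0],
    "VW": [0, 0, 1],
}

# With integer codes, the mean of VW and Honda lands exactly on Acura's code.
label_mean = (label_codes["VW"] + label_codes["Honda"]) / 2
print(label_mean == label_codes["Acura"])

# With one-hot vectors, the mean is a half-Honda, half-VW mix with no Acura component.
one_hot_mean = [(a + b) / 2 for a, b in zip(one_hot["VW"], one_hot["Honda"])]
print(one_hot_mean)
```

The integer encoding makes "the average of VW and Honda" coincide with Acura, while the one-hot average keeps the Acura column at zero, so no ordering between brands is implied.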
