One-Hot Encoding with scikit-

Author: lvmao | Published 2018-06-05 17:07

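The origins and benefits of one-hot encoding, and the reasons for using it in machine learning, are explained in many places online. Here we look at how to do one-hot encoding with scikit-learn.
Method 1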

  from sklearn.preprocessing import LabelEncoder
  from sklearn.preprocessing import OneHotEncoder

First use LabelEncoder() to convert data into integer labels, then pass those labels to OneHotEncoder().

  data = ['北京', '上海', '广州', '成都', '杭州', '深圳']
  # Map each city name to an integer label
  label_encoder = LabelEncoder()
  label_encoded = label_encoder.fit_transform(data)
  print(label_encoded)
  # OneHotEncoder expects a 2-D array, so reshape the labels into a single column
  one_hot_encoder = OneHotEncoder()
  one_hot_encoded = one_hot_encoder.fit_transform(label_encoded.reshape(-1, 1)).toarray()
  print(one_hot_encoded)

Output:

[image.png: screenshot of the printed integer labels and one-hot matrix]
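
As a side note to Method 1, the sketch below assumes scikit-learn 0.20 or later, where OneHotEncoder accepts string categories directly, so the LabelEncoder step can be skipped. The variable name column is introduced here only for illustration and is not part of the original post.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

data = ['北京', '上海', '广州', '成都', '杭州', '深圳']

# OneHotEncoder expects a 2-D array of shape (n_samples, n_features),
# so the list is reshaped into a single column (assumed name: column).
column = np.array(data).reshape(-1, 1)

one_hot_encoder = OneHotEncoder()
# fit_transform returns a sparse matrix by default; toarray() makes it dense.
one_hot_encoded = one_hot_encoder.fit_transform(column).toarray()

print(one_hot_encoder.categories_)  # learned categories, one array per feature
print(one_hot_encoded)
```

On older scikit-learn versions the two-step approach shown above is still the one to use.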

Method 2:
scikit-learn also provides a one-step approach: LabelBinarizer.

  from sklearn.preprocessing import LabelBinarizer

  data = ['北京', '上海', '广州', '成都', '杭州', '深圳']
  label_binarizer = LabelBinarizer()
  one_hot_encoded = label_binarizer.fit_transform(data)
  print(one_hot_encoded)

Output:

[image.png: screenshot of the printed one-hot matrix]
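
To make the column order of the matrix easier to read, here is a small sketch that inspects the fitted classes_ attribute of LabelBinarizer. It repeats the setup from Method 2 so it can run on its own; the loop variable names are illustrative only.

```python
from sklearn.preprocessing import LabelBinarizer

data = ['北京', '上海', '广州', '成都', '杭州', '深圳']
label_binarizer = LabelBinarizer()
one_hot_encoded = label_binarizer.fit_transform(data)

# classes_ holds the distinct labels in sorted order;
# column j of the output corresponds to classes_[j].
print(label_binarizer.classes_)

for row, city in zip(one_hot_encoded, data):
    column_index = int(row.argmax())  # position of the single 1 in this row
    print(city, '->', column_index, label_binarizer.classes_[column_index])
```

One thing to keep in mind: when the data contains only two distinct labels, LabelBinarizer returns a single column rather than two, so the result is not a one-hot matrix in that case.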

We can also recover the original text category from a one-hot encoded vector:

  # Index with [[0]] rather than [0] so the row stays 2-D, as inverse_transform expects
  beijing = one_hot_encoded[[0]]
  print(label_binarizer.inverse_transform(beijing))
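
The same round trip is possible with Method 1. The sketch below is one way to do it, not the only one: it takes the position of the 1 in each row with np.argmax and passes the resulting integers back through LabelEncoder.inverse_transform. The names recovered_labels and recovered are introduced here for illustration.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

data = ['北京', '上海', '广州', '成都', '杭州', '深圳']

# Same two-step encoding as in Method 1
label_encoder = LabelEncoder()
label_encoded = label_encoder.fit_transform(data)
one_hot_encoder = OneHotEncoder()
one_hot_encoded = one_hot_encoder.fit_transform(label_encoded.reshape(-1, 1)).toarray()

# LabelEncoder produces labels 0..n_classes-1, so the column index of the 1
# in each row is exactly the integer label for that row.
recovered_labels = np.argmax(one_hot_encoded, axis=1)

# Map the integer labels back to the original strings.
recovered = label_encoder.inverse_transform(recovered_labels)
print(recovered)
```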

Article link: https://www.haomeiwen.com/subject/ajkmsftx.html