Data preparation

Author: 瑶瑶_2930 | Published: 2020-07-01 15:54

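A very good guide to using the UCI datasets
Download links for code that tidies up three UCI datasets
Download links for 148 curated UCI datasets with code
Worth checking when needed, to see whether anything suitable is there.
Datasets downloaded from UCI all have to be converted into another format by hand, which gets a bit messy. It is worth looking on Kaggle for a dataset of the same name that has already been cleaned up.
.csv files (comma-separated values format)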

Useful functions

type() called on any Python object describes the type of the object
dataframe[4:7] pulls out the rows at positions 4, 5, 6 in a Pandas dataframe (the end of the slice is excluded)
dataframe[['mycol1', 'mycol2']] pulls out the two requested columns into a new Pandas dataframe
dataframe['mycol'] returns a Pandas series -- not a dataframe!
dataframe.describe() returns summary statistics (count, mean, std, min, quartiles, max) for each numeric dataframe column
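The calls listed above can be tried on a small frame. This is only a sketch: the data is made up for illustration, and only the column names mycol1, mycol2 and mycol come from the notes.

```python
import pandas as pd

# Hypothetical toy data; only the column names are taken from the notes.
dataframe = pd.DataFrame({
    "mycol1": range(10),
    "mycol2": [x * 1.5 for x in range(10)],
    "mycol": list("abcdefghij"),
})

rows = dataframe[4:7]                        # positional slice: rows 4, 5, 6
two_cols = dataframe[["mycol1", "mycol2"]]   # list of names -> DataFrame
one_col = dataframe["mycol"]                 # single name -> Series
summary = dataframe.describe()               # numeric columns only by default

print(type(rows), type(two_cols), type(one_col))
print(summary)
```

Note that the string column mycol does not appear in the describe() output, because by default only numeric columns are summarized when the frame has mixed types.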

Analyze the following code
categorical_feature_names = list(set(feature_names) - set(numeric_feature_names) - set([LABEL]))
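feature_names is a list.
So the expression removes numeric_feature_names from feature_names, then removes the label, and converts the result back into a list. Because sets are unordered, the order of the resulting list is not guaranteed.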

set()

Definition: an unordered collection of unique elements. The elements must be immutable (hashable), although the set itself can be modified.
Written with curly braces: {xx, xx, xx}
Python set operations, e.g. difference: A - B
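To make the set difference concrete, here is a small sketch of the categorical_feature_names expression analyzed earlier. The feature names and the value of LABEL are hypothetical, chosen only for illustration. Since sets do not keep order, the sketch also shows a list comprehension as one possible way to keep the original column order.

```python
# Hypothetical names; the real lists depend on the dataset being used.
LABEL = "price"
feature_names = ["price", "horsepower", "make", "fuel-type", "curb-weight"]
numeric_feature_names = ["horsepower", "curb-weight"]

# The expression from the notes: set difference, then back to a list.
categorical_feature_names = list(
    set(feature_names) - set(numeric_feature_names) - set([LABEL])
)

# Set order is arbitrary, so sort if a stable order is needed.
categorical_sorted = sorted(categorical_feature_names)

# Alternative that preserves the order of feature_names.
excluded = set(numeric_feature_names) | {LABEL}
categorical_ordered = [name for name in feature_names if name not in excluded]

print(categorical_sorted)
print(categorical_ordered)
```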

Use Pandas to inspect the data and manually curate a list of numeric_feature_names and categorical_feature_names


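Separate the string features from the numeric ones.
Although the values in the CSV file are numbers, they seem to be read in as strings of digits, so they need to be converted.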

for feature_name in numeric_feature_names + [LABEL]:
  car_data[feature_name] = pd.to_numeric(car_data[feature_name], errors='coerce')
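A minimal sketch of what this loop does, assuming a tiny made-up car_data frame in which missing values are written as "?" (the column names, the values and the "?" marker are assumptions for illustration, not taken from the actual dataset). With errors='coerce', anything that cannot be parsed as a number becomes NaN instead of raising an error.

```python
import pandas as pd

# Hypothetical stand-ins for the real dataset and feature lists.
LABEL = "price"
numeric_feature_names = ["horsepower"]
car_data = pd.DataFrame({
    "horsepower": ["111", "154", "?"],
    "price": ["13495", "?", "16500"],
    "make": ["audi", "bmw", "volvo"],
})

# Same loop as in the notes: convert the numeric features and the label.
for feature_name in numeric_feature_names + [LABEL]:
    car_data[feature_name] = pd.to_numeric(car_data[feature_name], errors="coerce")

print(car_data.dtypes)        # converted columns are now numeric
print(car_data.isna().sum())  # unparseable entries were turned into NaN
```

The NaN values introduced this way still have to be handled afterwards, for example by dropping or filling those rows.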

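This is also a good learning resource:
Iris flower classification

There are some functions, and some parts of the architecture, that I do not fully understand yet. I hope to study them properly later.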

Normalization


Add categorical data and numerical data

See my GitHub.

Link to this article: https://www.haomeiwen.com/subject/dbsfqktx.html
