Machine Learning Notes, Week 01

Author: 我的名字叫清阳 | Published 2016-01-23 12:15

I am taking Machine Learning this semester. The course has a free version on Udacity, which also provides detailed notes. The notes here are an extremely condensed version, intended for my own use.

Week 01 tasks

  • Lectures: Decision Trees, Regression and Classification, and Neural Networks.
  • Reading: Chapters 1, 3, and 4 of Mitchell

SL1 Decision Trees

Classification and regression
  • Classification is simply the process of taking some kind of input,
    x, and mapping it to some discrete label.
  • Regression is mapping from some input space to some real number.
Quiz 1: Supervised learning classification terms
  • Instances: the inputs
  • Concept: a function that maps inputs to labels
  • Target concept: the actual concept we are trying to find (the answer)
  • Hypothesis class: the set of all concepts we are willing to consider
  • Sample: the training set
  • Candidate: a concept that might be the target concept
  • Testing set
    • The testing set should never be the same as the training set, so that we can measure generalization.
Decision trees
  • start with the root node
  • edges represent different choices
  • a leaf is the final output
Quiz 2: Representation; the 20 Questions algorithm; Quiz 3: Best Attribute; Decision Tree Expressiveness: AND, OR, XOR
  • AND and OR are commutative, so swapping the positions of A and B in the trees above gives an equivalent tree
Decision Tree Expressiveness: OR and XOR generalization
  • Linear tree: the number of nodes equals the number of attributes (e.g., n-way OR, "any")
  • Exponential tree: the number of nodes grows exponentially with the number of attributes (e.g., n-way XOR, parity)
Decision Tree Expressiveness, Quizzes 4 and 5
  • the number of possible trees can be huge: with n binary attributes there are 2^(2^n) distinct Boolean functions
ID3 algorithm
  • Information gain: Gain(S, A) = Entropy(S) − Σ_v (|S_v| / |S|) Entropy(S_v); ID3 picks the attribute with the highest gain
  • What is entropy? A measure of randomness: Entropy(S) = −Σ_v P(v) log₂ P(v)
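The two quantities above can be sketched in plain Python (the helper names are mine, not from the course):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S) = -sum_v P(v) * log2(P(v)) over the label values v."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attr, labels):
    """Gain(S, A) = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)."""
    n = len(labels)
    remainder = 0.0
    for v in {ex[attr] for ex in examples}:
        subset = [y for ex, y in zip(examples, labels) if ex[attr] == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Tiny example: attribute "a" perfectly predicts the label, "b" is useless.
examples = [{"a": 0, "b": 0}, {"a": 0, "b": 1}, {"a": 1, "b": 0}, {"a": 1, "b": 1}]
labels = [0, 0, 1, 1]
assert abs(information_gain(examples, "a", labels) - 1.0) < 1e-9
assert abs(information_gain(examples, "b", labels)) < 1e-9
```

ID3 would split on "a" here, the attribute with the highest gain.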
ID3 Bias
  • The inductive bias of ID3 is a preference bias; it prefers:
    • good splits at top
    • correct over incorrect
    • shorter trees
Quiz 6: Decision Trees Other Considerations
  • For continuous-valued attributes, we can group values into ranges (e.g., ask "is 20 ≤ age < 30?")
  • It does not make sense to repeat a discrete-valued attribute along a path, but a continuous attribute can be repeated if a different question is asked
Decision Trees Other Considerations
  • Stop when all examples are correctly classified: this may never happen if the data is noisy
  • Stop when we run out of attributes: with continuous attributes, we never run out
  • Avoid overfitting: prune the tree
Wrap up

SL2: Regression and Classification

Recap: Supervised learning: learn from pairs of inputs and outputs, then, given new inputs, predict the outputs. This is mapping input to output. If the output is discrete, it's classification; if the output is continuous, it's regression.

Quiz 1:
  • Originally, "regression" meant regressing to the mean
Regression and function approximation
  • regression now means finding the function that represents the relationship between variables
Regression, the best line
  • the green line (in the lecture figure) is the best-fit straight line, but is it the best line overall?
Quiz 2: How to find the best line, and the answer
  • the best constant function, in the squared-error sense, is the mean of y
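A quick numerical check of this claim (the data values are illustrative): among constant predictors, the mean minimizes the sum of squared errors.

```python
def sse(ys, c):
    """Sum of squared errors when predicting the constant c for every y."""
    return sum((y - c) ** 2 for y in ys)

ys = [2.0, 4.0, 9.0]
mean = sum(ys) / len(ys)  # 5.0

# Nudging the constant away from the mean in either direction only hurts.
assert all(sse(ys, mean) <= sse(ys, mean + d) for d in (-2, -1, -0.5, 0.5, 1, 2))
```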
Order of Polynomial
  • in this example, the training error for an order-8 polynomial is zero: it passes through every data point
Order of Polynomial: Error function
  • Problem: overfitting
Quiz 3: find best function

Polynomial Regression

  • polynomial regression can be written with matrices and vectors: Xw ≈ y, where each row of X holds the powers of one x value
  • the coefficients are directly computable via the normal equations, w = (XᵀX)⁻¹ Xᵀ y
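A minimal sketch of this computation, solving the normal equations w = (XᵀX)⁻¹ Xᵀ y with hand-rolled Gaussian elimination so no libraries are needed (the function name is mine):

```python
def polyfit(xs, ys, degree):
    """Fit y ≈ w0 + w1*x + ... + w_d*x^d by least squares."""
    # Vandermonde design matrix X: one row [1, x, x^2, ...] per point.
    X = [[x ** j for j in range(degree + 1)] for x in xs]
    d = degree + 1
    # Normal equations: A w = b with A = X^T X, b = X^T y.
    A = [[sum(X[i][r] * X[i][c] for i in range(len(xs))) for c in range(d)]
         for r in range(d)]
    b = [sum(X[i][r] * ys[i] for i in range(len(xs))) for r in range(d)]
    # Gaussian elimination with partial pivoting.
    for col in range(d):
        piv = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back-substitution.
    w = [0.0] * d
    for r in reversed(range(d)):
        w[r] = (b[r] - sum(A[r][c] * w[c] for c in range(r + 1, d))) / A[r][r]
    return w

# Recovers y = 1 + 2x + 3x^2 from five noiseless points.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1 + 2 * x + 3 * x * x for x in xs]
w = polyfit(xs, ys, 2)
assert all(abs(wi - ti) < 1e-6 for wi, ti in zip(w, [1.0, 2.0, 3.0]))
```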

Errors

Sources of error
  • Sensor Error: The actual reading was 10, but a moth landed on the sensor so it read 0 instead.
  • Malicious Error: An intelligent malicious agent got in between the measurement and the receiver of the data and edited the data to say what they wanted it to say, rather than what it actually was.
  • Transcription Error: A machine copied a number from one place to another and it flattened all of the E notation floats to a bare integer. Or a program cast a UTF16 hieroglyphic to a Unicode pile of poo.
  • Unmodeled influences: Suppose we are predicting house prices from square footage and the number of bathrooms. A house sells for a very low price because of an unmodeled influence: mold in the attic and walls. That unmodeled influence causes the learner to fail to predict the low price.

Cross Validation

  • The goal is to generalize to the world, not to fit a particular training or testing data set perfectly.
  • We need the training and testing data to be IID: independent and identically distributed (the fundamental assumption).
  • We can split the data into folds, train while leaving one fold out, and use that fold for testing; then we average the error over all the folds. Pick the model with the lowest cross-validation error.
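A sketch of the procedure, with simple strided folds and two illustrative candidate models (all names here are mine, not the course's): on linear data, cross-validation should prefer the line over the constant.

```python
def kfold_cv_mse(xs, ys, k, fit, predict):
    """Average held-out mean squared error over k strided folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]   # held-out fold
        train = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        mse = sum((predict(model, xs[i]) - ys[i]) ** 2 for i in test) / len(test)
        fold_errors.append(mse)
    return sum(fold_errors) / k

# Candidate 1: a constant model (the training mean).
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def predict_mean(model, x):
    return model

# Candidate 2: a line fit by least squares.
def fit_line(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return (my - slope * mx, slope)

def predict_line(model, x):
    b, m = model
    return b + m * x

xs = [float(i) for i in range(10)]
ys = [2 * x + 1 for x in xs]
# The linear model has (near-)zero CV error on linear data; the constant
# model does not, so cross-validation picks the line.
assert kfold_cv_mse(xs, ys, 5, fit_line, predict_line) < 1e-9
assert kfold_cv_mse(xs, ys, 5, fit_mean, predict_mean) > 1.0
```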
Fitting curve; Other input spaces; Recap: Regression

SL3 Neural Networks

Neural Networks
  • A neuron receives inputs; if the combined inputs reach the firing threshold, the neuron fires
Perceptron
  • X1,X2,... are inputs
  • w1,w2,... are weights
  • the sum of all the weighted inputs is the activation; if the activation passes a threshold θ, the output is y=1; if not, y=0
Quiz 1: Find the output given inputs and weights; How powerful is a perceptron unit?
  • the weights determine the line that splits the plane: a perceptron computes a linear separator (a halfplane)
Quiz 2: Neural network can represent AND
  • When X1=0 and X2=0, y=0
  • When X1=0 and X2=1, y=0
  • When X1=1 and X2=0, y=0
  • When X1=1 and X2=1, y=1; so y represents AND
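The truth table above can be reproduced with, for example, weights w1 = w2 = 1 and threshold θ = 1.5 (one choice among many; the lecture may use different values):

```python
def perceptron(inputs, weights, theta):
    """Fire (output 1) iff the weighted sum of the inputs reaches theta."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= theta else 0

# Reproduces the AND truth table above.
for x1 in (0, 1):
    for x2 in (0, 1):
        assert perceptron((x1, x2), (1, 1), 1.5) == (x1 and x2)
```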
Quiz 3: Neural networks can represent OR; Quiz 4: NOT; Quiz 5: XOR
  • XOR = OR − 2·AND: compute x1 + x2 − 2·AND(x1, x2) and threshold at 1, i.e., feed AND in as a third input with weight −2
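One way to read this: XOR is not linearly separable, but it becomes a single perceptron once AND(x1, x2) is supplied as a third input with weight −2, giving a two-layer network. A sketch with an illustrative choice of weights:

```python
def perceptron(inputs, weights, theta):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

def xor(x1, x2):
    a = perceptron((x1, x2), (1, 1), 1.5)           # hidden AND unit
    return perceptron((x1, x2, a), (1, 1, -2), 1)   # x1 + x2 - 2*AND >= 1

assert [xor(0, 0), xor(0, 1), xor(1, 0), xor(1, 1)] == [0, 1, 1, 0]
```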

Perceptron Training

  • the perceptron rule updates each weight by Δwᵢ = η (y − ŷ) xᵢ, where η is the learning rate, (y − ŷ) is the difference between the target and the output, and xᵢ is the input
  • if the data is linearly separable, the perceptron rule will find a separating set of weights in finitely many iterations (but it's hard to know how many iterations are needed)
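A sketch of the rule, trained on the linearly separable OR function (the learning rate, epoch cap, and bias-as-extra-input trick are my illustrative choices, not the course's exact setup):

```python
def train_perceptron(data, eta=0.1, epochs=100):
    """data: list of (inputs, target). Returns weights; the last is the bias."""
    n_inputs = len(data[0][0])
    w = [0.0] * (n_inputs + 1)
    for _ in range(epochs):
        converged = True
        for x, y in data:
            xb = list(x) + [1]  # append the always-on bias input
            y_hat = 1 if sum(wi * xi for wi, xi in zip(w, xb)) >= 0 else 0
            if y_hat != y:
                # Delta rule: nudge each weight by eta * (y - y_hat) * x_i.
                w = [wi + eta * (y - y_hat) * xi for wi, xi in zip(w, xb)]
                converged = False
        if converged:  # every example classified correctly: done
            break
    return w

or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train_perceptron(or_data)
# The learned weights classify all four OR cases correctly.
assert all((1 if sum(wi * xi for wi, xi in zip(w, list(x) + [1])) >= 0 else 0) == y
           for x, y in or_data)
```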
Gradient Descent; Quiz 6: Comparison of Learning Rules; Sigmoid; Neural Network; Optimizing Weights; Restriction Bias

Preference Bias

  • Preference bias tells you something about the algorithm that you are using to learn.
  • Prefer simpler explanations
  • do not multiply entities unnecessarily (Occam's razor): add complexity only when it is needed to fit the data

Summary


This material was supposed to be finished during Jan 11 – 17, 2016, but interview preparation pushed it back a week, so I had to cover two weeks' content in one. Now off to this week's material... No more procrastinating next time; falling behind makes the pressure spike.

2016-01-21: reached Cross Validation in SL2.
2016-01-22: continued with SL3; first draft completed.
