Classification- Decision Tree(1)


Author: 钱晓缺 | Published 2019-04-07 23:32

1.  Decision Tree:

A decision tree is similar to a flowchart-like tree structure. Each internal node is a test on one attribute, and each branch is one possible outcome of that test. The topmost node of the tree is the root node.
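The flowchart analogy can be made concrete with a tiny hand-built tree (a sketch; the weather attributes and labels are invented for illustration, not from the post):

```python
# Nested dicts mirror the structure described above: each internal node
# tests one attribute, each branch is one outcome of that test, and each
# leaf is a class label.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        "rain": "yes",
    },
}

def classify(node, sample):
    """Walk from the root node, following the branch matching each test."""
    while isinstance(node, dict):
        node = node["branches"][sample[node["attribute"]]]
    return node

print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # → yes
```

Classification is just a walk from the root to a leaf, applying one attribute test per node.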

2. Entropy

The more uncertain a question is, the more information we need to answer it.

Bits are used to measure the amount of information: the entropy of a distribution is H(X) = -Σᵢ pᵢ log₂ pᵢ.

More uncertainty means higher entropy.
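A minimal entropy function makes the "more uncertainty, more entropy" rule concrete (the coin-flip labels are illustrative):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy in bits: sum over classes of -p * log2(p)."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

# A fair coin is maximally uncertain for two outcomes: exactly 1 bit.
print(entropy(["heads", "tails"]))           # → 1.0
# A certain outcome needs no information at all: 0 bits.
print(entropy(["heads", "heads", "heads"]))  # → 0.0
```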

3. ID3 (decision tree induction algorithm)

Information gain: Gain(A) = Info(D) - Info_A(D)

Info(D) is the entropy of the original data set, and Info_A(D) is the expected (weighted) entropy of the subsets produced when attribute A is used to split the data.

The attribute X with the maximum Gain(X) among all attributes becomes the next node used to split the data. Repeat this process until all the samples in each group belong to the same class.
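The gain formula can be sketched directly (a toy example with made-up weather data; attribute 0 predicts the label perfectly, attribute 1 carries no information):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Info(D): sum over classes of -p * log2(p), in bits."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Gain(A) = Info(D) - Info_A(D): entropy removed by splitting on A."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Info_A(D): subset entropies weighted by subset size.
    info_a = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - info_a

rows = [("sunny", "weak"), ("sunny", "strong"), ("rain", "weak"), ("rain", "strong")]
labels = ["no", "no", "yes", "yes"]
# ID3's selection step: pick the attribute with maximum gain.
best = max(range(2), key=lambda a: info_gain(rows, labels, a))
print(best)  # → 0
```

ID3 applies this selection recursively inside each subset until every group is pure.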

4. How to deal with continuous values

Convert them into discrete values.
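One common way to discretize is to map each continuous value to a bin between chosen cut points (the temperature thresholds here are illustrative):

```python
def discretize(value, cut_points):
    """Map a continuous value to a bin index given sorted cut points."""
    return sum(value > t for t in cut_points)

# Temperatures split at 15 and 25 degrees → three discrete bins.
bins = ["cold", "mild", "hot"]
print([bins[discretize(t, (15, 25))] for t in (8, 20, 31)])  # → ['cold', 'mild', 'hot']
```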

5. How to avoid over-fitting (the tree is too deep)

Tree pruning:

1) Pre-pruning (prune while the tree is being built, by stopping growth early)

2) Post-pruning (prune after the tree is fully built)
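The simplest form of post-pruning can be sketched on a nested-dict tree: collapse any subtree whose leaves all predict the same class (real post-pruning typically uses validation error; the "wind"/"humidity" tree below is made up):

```python
def leaf_labels(node):
    """Collect every leaf label under a node (leaves are plain strings)."""
    if not isinstance(node, dict):
        return [node]
    return [lab for c in node["branches"].values() for lab in leaf_labels(c)]

def prune(node):
    """Post-pruning sketch: after building, replace any subtree whose
    leaves all agree with a single leaf carrying that label."""
    if not isinstance(node, dict):
        return node
    node["branches"] = {v: prune(c) for v, c in node["branches"].items()}
    labels = set(leaf_labels(node))
    return labels.pop() if len(labels) == 1 else node

# The "humidity" test is useless — both outcomes predict "no" — so pruning
# removes it, and the whole tree then collapses to a single leaf.
tree = {"attribute": "wind",
        "branches": {"strong": "no",
                     "weak": {"attribute": "humidity",
                              "branches": {"high": "no", "normal": "no"}}}}
print(prune(tree))  # → no
```

Pre-pruning would instead refuse to create such a split in the first place, e.g. by capping depth or requiring a minimum information gain.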


Original link: https://www.haomeiwen.com/subject/hlisiqtx.html