![](https://img.haomeiwen.com/i13669885/09f674f67a7b0fb0.png)
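Measurement criterion: entropy
Entropy: a measure of the uncertainty of a random variable (intuitively, how disordered a system is internally).
Formula: H(X) = -Σ p_i · log(p_i), i = 1, 2, …, n, where p_i is the probability that X takes its i-th value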
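
As a minimal sketch of the formula above, the entropy of a list of class labels can be computed by estimating each p_i from label frequencies. The function name `entropy` and the choice of base 2 for the logarithm are assumptions for illustration; the formula itself does not fix a base.

```python
import math
from collections import Counter

def entropy(labels, base=2):
    """Entropy H(X) = -sum(p_i * log(p_i)) of a sequence of labels.

    p_i is estimated as the relative frequency of each distinct label.
    The log base (2 here, giving bits) is an assumption.
    """
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log(c / total, base) for c in counts.values())

# A pure node has no uncertainty; an evenly mixed node has the most.
pure = entropy(["yes", "yes", "yes", "yes"])
mixed = entropy(["yes", "yes", "no", "no"])
```

The lower the entropy of a node, the less mixed its labels are, which is why a split that reduces entropy is preferred.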
![](https://img.haomeiwen.com/i13669885/432a14ea26314cd5.png)
![](https://img.haomeiwen.com/i13669885/57b816fd33620407.png)
![](https://img.haomeiwen.com/i13669885/b3738650c1604ef1.png)
![](https://img.haomeiwen.com/i13669885/f636c1d13b057302.png)
![](https://img.haomeiwen.com/i13669885/4febdcf685b2e57a.png)
![](https://img.haomeiwen.com/i13669885/559901c82c4fbccd.png)
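Pruning: preventing overfitting
- Pre-pruning: while the decision tree is being built, each node is evaluated before it is split; if splitting the current node does not improve the tree's generalization ability, the split is abandoned and the node is marked as a leaf. Drawback: the tree may underfit.
- Post-pruning: a complete decision tree is first grown from the training set, then the non-leaf nodes are examined bottom-up; if replacing a node's subtree with a leaf improves the tree's generalization ability, the subtree is replaced with that leaf. Drawback: training takes longer.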
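
The post-pruning procedure can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: generalization ability is approximated by accuracy on a held-out validation set (often called reduced-error pruning), and the nested-dict tree layout, the `majority` field, and the function names are all hypothetical.

```python
def predict(node, x):
    """Route sample x down the tree until a leaf is reached."""
    while "label" not in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]

def correct(node, samples):
    """Count the (x, y) validation samples the subtree classifies correctly."""
    return sum(predict(node, x) == y for x, y in samples)

def post_prune(node, samples):
    """Bottom-up pruning; returns a new, possibly smaller subtree."""
    if "label" in node:
        return node
    f, t = node["feature"], node["threshold"]
    left = [(x, y) for x, y in samples if x[f] <= t]
    right = [(x, y) for x, y in samples if x[f] > t]
    # Prune the children first, so the check below sees the pruned subtree.
    node = dict(node,
                left=post_prune(node["left"], left),
                right=post_prune(node["right"], right))
    # Candidate leaf predicts the majority training label stored at this node.
    leaf = {"label": node["majority"]}
    # Replace the subtree only if the leaf is strictly better on validation data.
    if correct(leaf, samples) > correct(node, samples):
        return leaf
    return node

# Hypothetical fully grown tree: the right subtree has an overfit split.
tree = {
    "feature": 0, "threshold": 5, "majority": "a",
    "left": {"label": "a"},
    "right": {
        "feature": 1, "threshold": 2, "majority": "b",
        "left": {"label": "a"},
        "right": {"label": "b"},
    },
}
validation = [((1, 0), "a"), ((7, 1), "b"), ((8, 3), "b"), ((9, 0), "b")]
pruned = post_prune(tree, validation)
```

Pre-pruning applies the same kind of comparison before a split is made during tree construction, which is cheaper but can stop too early. Post-pruning has to grow the whole tree and then evaluate every internal node, which is where the extra training time comes from.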
Ensemble algorithms
![](https://img.haomeiwen.com/i13669885/bfd7980959f35206.png)
![](https://img.haomeiwen.com/i13669885/4fd092141ae352d2.png)
![](https://img.haomeiwen.com/i13669885/7ae29f7c4f4a9f03.png)
![](https://img.haomeiwen.com/i13669885/1bd831b8095ef393.png)
![](https://img.haomeiwen.com/i13669885/19d47b4a9cd1d396.png)
![](https://img.haomeiwen.com/i13669885/cd343f5c8ee1eb70.png)