The parameter min_sum_hessian_in_leaf in LightGBM


Author: 欠我的都给我吐出来 | Published 2018-06-11 14:35

    This parameter is analogous to min_child_weight in XGBoost.

    min_child_weight

    First, a look at the min_child_weight parameter in XGBoost. The official documentation explains it as follows:
    minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node with the sum of instance weight less than min_child_weight, then the building process will give up further partitioning. In linear regression mode, this simply corresponds to minimum number of instances needed to be in each node. The larger, the more conservative the algorithm will be.
    In other words, the parameter puts a floor on the sum of instance weights (hessian values) in a leaf: if a split would produce a leaf whose weight sum is below min_child_weight, the split is abandoned. This is an effective guard against overfitting to a handful of atypical samples.
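    A minimal sketch of how the parameter is set in practice (the toy dataset and the value 10 are illustrative assumptions, not from the original post):

```python
import numpy as np
import xgboost as xgb

# Hypothetical toy regression data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = X[:, 0] + 0.1 * rng.normal(size=1000)

dtrain = xgb.DMatrix(X, label=y)
params = {
    "objective": "reg:squarederror",
    "max_depth": 6,
    # For squared error each sample's hessian is 1, so this acts as
    # "don't keep splitting once a node would hold fewer than 10 samples".
    "min_child_weight": 10,
}
booster = xgb.train(params, dtrain, num_boost_round=50)
```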

    For a regression, the loss of each point in a node is $L_i = \frac{1}{2}(y_i - \hat{y}_i)^2$.

    The second derivative of this expression with respect to $\hat{y}_i$ is 1. So when you sum the second derivative over all points in the node, you get the number of points in the node. Here, min_child_weight means something like "stop trying to split once your sample size in a node goes below a given threshold".
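    Written out, the derivation the quoted answer relies on:

```latex
L_i = \tfrac{1}{2}(y_i - \hat{y}_i)^2,
\qquad
\frac{\partial L_i}{\partial \hat{y}_i} = \hat{y}_i - y_i,
\qquad
\frac{\partial^2 L_i}{\partial \hat{y}_i^2} = 1
\quad\Longrightarrow\quad
\sum_{i \in \mathrm{leaf}} \frac{\partial^2 L_i}{\partial \hat{y}_i^2} = n_{\mathrm{leaf}}
```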

    For a binary logistic regression, the hessian for each point in a node is going to contain terms like $\sigma(\hat{y}_i)\,(1 - \sigma(\hat{y}_i))$

    where $\sigma$ is the sigmoid function. Say you're at a pure node (e.g., all of the training examples in the node are 1's). Then all of the $\hat{y}_i$'s will probably be large positive numbers, so all of the $\sigma(\hat{y}_i)$'s will be near 1, so all of the hessian terms will be near 0. Similar logic holds if all of the training examples in the node are 0. Here, min_child_weight means something like "stop trying to split once you reach a certain degree of purity in a node and your model can fit it".
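    A quick numeric check of that claim (an illustrative sketch; the raw scores are arbitrary): the per-sample hessian $\sigma(\hat{y}_i)(1 - \sigma(\hat{y}_i))$ collapses toward 0 as the node approaches purity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Per-sample hessian of the logistic loss at raw score y_hat:
# sigma(y_hat) * (1 - sigma(y_hat)).
for y_hat in [0.0, 2.0, 5.0, 10.0]:
    p = sigmoid(y_hat)
    print(f"y_hat={y_hat:5.1f}  hessian={p * (1 - p):.6f}")
# y_hat=0.0  -> 0.250000 (maximally uncertain node)
# y_hat=10.0 -> 0.000045 (nearly pure node: almost no hessian weight left)
```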

    The Hessian's a sane thing to use for regularization and limiting tree depth. For regression, it's easy to see how you might overfit if you're always splitting down to nodes with, say, just 1 observation. Similarly, for classification, it's easy to see how you might overfit if you insist on splitting until each node is pure.
    To summarize: the sum of hessians is a sensible quantity to regularize on and to limit tree depth with. For regression it stops the tree from splitting all the way down to single-observation nodes; for classification it stops splitting before every node becomes perfectly pure. Both extremes would overfit.

    min_sum_hessian_in_leaf

    The LightGBM parameter min_sum_hessian_in_leaf serves the same purpose. Its default value is 1e-3. The larger the value, the better the model generalizes; the smaller it is, the purer the leaves become and the more easily the model overfits.
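    A minimal sketch of setting it through LightGBM's Python API (the toy data and the value 10.0 are illustrative assumptions):

```python
import numpy as np
import lightgbm as lgb

# Hypothetical toy binary-classification data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X[:, 0] > 0).astype(int)

train_set = lgb.Dataset(X, label=y)
params = {
    "objective": "binary",
    # Default is 1e-3; raising it vetoes splits that would create
    # leaves whose hessian sum (a size/purity proxy) is too small.
    "min_sum_hessian_in_leaf": 10.0,
}
model = lgb.train(params, train_set, num_boost_round=50)
```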

    Finally, a parameter comparison between XGBoost and LightGBM:


    [Image: XGBoost vs. LightGBM parameter comparison table; see the first reference below]

    References

    Xgboost与Lightgbm参数对比 (a comparison of XGBoost and LightGBM parameters)
    Explanation of min_child_weight in xgboost algorithm
