PolyLoss

Author: Valar_Morghulis | Published 2022-06-15 09:32

    PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions

    ICLR2022

    https://arxiv.org/abs/2204.12511

    https://openreview.net/forum?id=gSdSJoenupI

    arxiv: 26 April, 2022

    openreview: 29 Sept 2021

    Authors: Zhaoqi Leng, Mingxing Tan, Chenxi Liu, Ekin Dogus Cubuk, Xiaojie Shi, Shuyang Cheng, Dragomir Anguelov

    Abstract: Cross-entropy loss and focal loss are the most common choices when training deep neural networks for classification problems. Generally speaking, however, a good loss function can take on much more flexible forms, and should be tailored for different tasks and datasets. Motivated by how functions can be approximated via Taylor expansion, we propose a simple framework, named PolyLoss, to view and design loss functions as a linear combination of polynomial functions. Our PolyLoss allows the importance of different polynomial bases to be easily adjusted depending on the targeting tasks and datasets, while naturally subsuming the aforementioned cross-entropy loss and focal loss as special cases. Extensive experimental results show that the optimal choice within the PolyLoss is indeed dependent on the task and dataset. Simply by introducing one extra hyperparameter and adding one line of code, our Poly-1 formulation outperforms the cross-entropy loss and focal loss on 2D image classification, instance segmentation, object detection, and 3D object detection tasks, sometimes by a large margin.
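The Taylor-expansion view mentioned in the abstract can be written out explicitly. The following is a sketch rather than a quotation from this post: the symbols $P_t$ (predicted probability of the target class), $\alpha_j$ (polynomial coefficients), $\gamma$ (focal-loss exponent) and $\epsilon_1$ (the single extra hyperparameter) are notation assumed here for illustration.

```latex
% Cross-entropy expanded in powers of (1 - P_t), valid for 0 < P_t <= 1:
L_{\mathrm{CE}} = -\log P_t = \sum_{j=1}^{\infty} \frac{1}{j}\,(1 - P_t)^{j}

% Focal loss shifts every power by \gamma:
L_{\mathrm{FL}} = -(1 - P_t)^{\gamma} \log P_t = \sum_{j=1}^{\infty} \frac{1}{j}\,(1 - P_t)^{j+\gamma}

% General PolyLoss: free coefficients on the same polynomial bases:
L_{\mathrm{Poly}} = \sum_{j=1}^{\infty} \alpha_j\,(1 - P_t)^{j}

% Poly-1: perturb only the leading coefficient of cross-entropy:
L_{\mathrm{Poly\text{-}1}} = -\log P_t + \epsilon_1\,(1 - P_t)
```

Setting $\alpha_j = 1/j$ recovers cross-entropy, and $\epsilon_1 = 0$ reduces Poly-1 to cross-entropy, which is the sense in which both losses are special cases.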

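    Final review decision (meta-review): Noting that cross-entropy loss and focal loss are widely used to train deep learning models, yet these losses lack mathematical understanding and exploration, the authors propose a simple framework, PolyLoss, which expresses a loss function as a linear combination of polynomial functions.

    Within this framework, cross-entropy loss and focal loss are special cases of PolyLoss, and the importance of the different polynomial bases can be easily adjusted for the target task and dataset. The final version of PolyLoss, the Poly-1 formulation, is very simple: it needs only one line of code and one extra hyperparameter. Even so, it outperforms cross-entropy loss and focal loss on 2D image classification, instance segmentation, object detection, and 3D object detection, sometimes by a large margin.

    The paper starts from the new perspective of polynomial expansion. The method is novel, simple to implement, and practical. The authors discussed the work thoroughly with the reviewers, and most of the issues raised were resolved well. After the rebuttal and discussion, the reviewers raised their scores and all agreed to accept. The AC checked the paper and all related information and found sufficient grounds for acceptance.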
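To make the "one line of code and one extra hyperparameter" remark concrete, here is a minimal NumPy sketch of a Poly-1 style loss for softmax classification. It is an illustration under stated assumptions, not the authors' reference implementation: the function name `poly1_cross_entropy`, the argument name `epsilon`, and its default value are hypothetical choices made for this example.

```python
import numpy as np


def poly1_cross_entropy(logits, labels, epsilon=1.0):
    """Per-example Poly-1 style loss: cross-entropy + epsilon * (1 - P_t).

    logits: array of shape (batch, num_classes)
    labels: integer class indices of shape (batch,)
    epsilon: the single extra hyperparameter (hypothetical default)
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)

    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    # Log-probability and probability of the target class, P_t.
    log_pt = log_probs[np.arange(logits.shape[0]), labels]
    pt = np.exp(log_pt)

    ce = -log_pt  # ordinary cross-entropy
    # The "one extra line": add the first polynomial term, weighted by epsilon.
    return ce + epsilon * (1.0 - pt)


if __name__ == "__main__":
    # Two equal logits give P_t = 0.5, so the loss is ln(2) + 0.5 for epsilon = 1.
    print(poly1_cross_entropy([[0.0, 0.0]], [0], epsilon=1.0))
```

With `epsilon=0.0` the function returns plain cross-entropy, so the extra term can be tuned per task and dataset without changing the rest of a training loop.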
