Daily Update #80

Author: 深度学习模型优化 | Published 2019-07-04 06:00

    Taking it easy today: a look at some good ideas from the Kaggle TGS Salt Identification Challenge.

    The code is here: https://github.com/SeuTao/Kaggle_TGS2018_4th_solution

    Solution development:

    1. Single-model design:

    1. Input: 101×101 tiles randomly padded to 128×128, with random left-right flips;
    2. Encoder: resnet34, se-resnext50, resnext101_ibna, se-resnet101, se-resnet152, se-resnet154;
    3. Decoder: scSE, hypercolumn (not used in the networks with resnext101_ibna and se-resnext101 backbones), IBN block, dropout;
    4. Deep supervision structure with Lovász softmax loss (a great idea from Heng);
    5. We designed 6 single models for the final submission.
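    The input step above (101×101 randomly padded to 128×128 plus a random left-right flip) can be sketched as follows. This is an illustrative NumPy version, not code from the repo; the function name, the zero-padding mode, and the flip probability are assumptions.

```python
import numpy as np

def augment(img, rng=None, out_size=128):
    """Randomly place a 101x101 tile inside a 128x128 canvas and apply a
    random left-right flip. Zero padding is an assumption here; the
    original solution may use a different padding mode (e.g. reflect)."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    # random top/left offsets so the tile lands at a random position
    top = rng.integers(0, out_size - h + 1)
    left = rng.integers(0, out_size - w + 1)
    canvas = np.zeros((out_size, out_size), dtype=img.dtype)
    canvas[top:top + h, left:left + w] = img
    if rng.random() < 0.5:          # random horizontal (LR) flip
        canvas = canvas[:, ::-1]
    return canvas

tile = np.ones((101, 101), dtype=np.float32)
out = augment(tile)
```

    Random placement (rather than centering) gives the network some translation robustness for free.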

    2. Model training:

    • SGD: momentum 0.9, weight decay 0.0002, lr decayed from 0.01 to
      0.001 (adjusted each epoch);
    • LR schedule: cosine annealing with snapshot ensembling (shared by
      Peter), 50 epochs/cycle, 7 cycles/fold, 10 folds;
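    The schedule above (cosine annealing restarted every 50 epochs, with a model snapshot saved at the end of each cycle) can be sketched as a simple function. The function name and the exact lr endpoints here mirror the numbers in the bullet but are otherwise illustrative.

```python
import math

def snapshot_lr(epoch, lr_max=0.01, lr_min=0.001, cycle_len=50):
    """Cosine-annealed learning rate that restarts every `cycle_len`
    epochs; a snapshot checkpoint is saved at the end of each cycle
    and later used for ensembling."""
    t = epoch % cycle_len  # position within the current cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / cycle_len))
```

    The lr is highest right after each restart and decays smoothly to the floor, so every cycle ends in a distinct local minimum whose snapshot is worth keeping.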

    3. Model ensembling: +0.001 in public LB / +0.001 in private LB

    • voting across all cycles
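    Voting across cycles amounts to a pixel-wise majority vote over the binary masks predicted by all snapshots. A minimal sketch, assuming hard (0/1) masks stacked along the first axis; the function name is illustrative, not from the repo.

```python
import numpy as np

def majority_vote(masks):
    """Pixel-wise majority vote over binary masks of shape
    (n_models, H, W): a pixel is foreground when more than half
    of the models predict foreground."""
    masks = np.asarray(masks)
    return (masks.sum(axis=0) * 2 > masks.shape[0]).astype(np.uint8)

votes = np.array([[[1, 0]], [[1, 1]], [[0, 0]]])  # 3 models, 1x2 mask
merged = majority_vote(votes)
```

    Averaging the raw probabilities before thresholding is a common alternative when the snapshot logits are available.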

    4. Post-processing: +0.010 in public LB / +0.001 in private LB

    Based on the 2D and 3D jigsaw results (amazing ideas and great work from @CHAN), we applied around 10 handcrafted rules that gave a 0.010 to 0.011 public LB boost and a 0.001 private LB boost.

    5. Data distillation (pseudo labeling): +0.002 in public LB / +0.002 in private LB

    We started this part around the middle of the competition. As Heng pointed out, pseudo labeling is pretty tricky and carries a real risk of overfitting. I was not sure whether it would boost the private LB until the results were published. Our results are posted at https://github.com/SeuTao/Kaggle_TGS2018_4th_solution; the implementation details will be updated.
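    A common way to make pseudo labeling less risky is to keep only test images whose predictions are confident almost everywhere. The sketch below illustrates that filtering idea; the thresholds and function name are my assumptions, not the values the team used.

```python
import numpy as np

def select_pseudo_labels(probs, lo=0.1, hi=0.9, min_confident=0.95):
    """From sigmoid outputs `probs` of shape (N, H, W) on unlabelled
    images, keep an image as a pseudo-labelled sample only when at
    least `min_confident` of its pixels are confidently background
    (<= lo) or foreground (>= hi). Returns hard masks and the keep mask."""
    confident = (probs <= lo) | (probs >= hi)        # per-pixel certainty
    frac = confident.reshape(len(probs), -1).mean(axis=1)
    keep = frac >= min_confident                     # image-level filter
    labels = (probs >= 0.5).astype(np.uint8)         # hard pseudo masks
    return labels[keep], keep

probs = np.array([[[0.01, 0.99]], [[0.5, 0.5]]])    # 2 images, 1x2 pixels
labels, keep = select_pseudo_labels(probs)
```

    The selected masks are then mixed into the training set; retraining on them is where the overfitting risk mentioned above comes from.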

    6. Ideas we did not try:

    • Mean teacher: we had no time for this experiment. I think mean
      teacher + jigsaw + pseudo labeling is promising.
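    For reference, the core of the mean-teacher idea is that the teacher's weights are an exponential moving average (EMA) of the student's. A minimal sketch with weights as a dict of floats; `alpha` and the names are illustrative.

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher update: move each teacher weight toward the
    corresponding student weight, keeping an exponential moving
    average. Called once per training step."""
    for name, w in student.items():
        teacher[name] = alpha * teacher[name] + (1 - alpha) * w
    return teacher

teacher = {"w": 0.0}
student = {"w": 1.0}
ema_update(teacher, student, alpha=0.9)
```

    In the full method the teacher's predictions on unlabelled data supervise the student via a consistency loss, which is why it pairs naturally with pseudo labeling.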

    7. Ideas that didn't work:

    • OC module: the secret weapon of @alex's team. We couldn't get it to work.

