Daily Update #80

Author: 深度学习模型优化 | Published 2019-07-04 06:00

    Taking it easy today: a look at some good ideas from the Kaggle TGS Salt Identification Challenge.

    The code is here: https://github.com/SeuTao/Kaggle_TGS2018_4th_solution

    Solution development:

    1. Single-model design:

    1. Input: 101×101 tiles randomly padded to 128×128, with random left-right flips;
    2. Encoder: resnet34, se-resnext50, resnext101_ibna, se-resnet101, se-resnet152, se-resnet154;
    3. Decoder: scSE, hypercolumn (not used in the networks with resnext101_ibna and se-resnext101 backbones), IBN block, dropout;
    4. Deep supervision structure with Lovász softmax loss (a great idea from Heng);
    5. We designed 6 single models for the final submission.
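    The input step above (101×101 randomly padded to 128×128 plus a random left-right flip) can be sketched as follows. This is an illustrative NumPy version, not code from the repo; the function name, the zero-padding mode, and the flip probability are assumptions.

```python
import numpy as np

def augment(img, rng=None, out_size=128):
    """Randomly place a 101x101 tile inside a 128x128 canvas and apply a
    random left-right flip. Zero padding is an assumption here; the
    original solution may use a different padding mode (e.g. reflect)."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    # random top/left offsets so the tile lands at a random position
    top = rng.integers(0, out_size - h + 1)
    left = rng.integers(0, out_size - w + 1)
    canvas = np.zeros((out_size, out_size), dtype=img.dtype)
    canvas[top:top + h, left:left + w] = img
    if rng.random() < 0.5:          # random horizontal (LR) flip
        canvas = canvas[:, ::-1]
    return canvas

tile = np.ones((101, 101), dtype=np.float32)
out = augment(tile)
```

    Random placement (rather than centering) gives the network some translation robustness for free.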

    2. Model training:

    • SGD: momentum 0.9, weight decay 0.0002, lr decayed from 0.01 to
      0.001 (adjusted each epoch);
    • LR schedule: cosine annealing with snapshot ensembling (shared by
      Peter), 50 epochs/cycle, 7 cycles/fold, 10 folds;
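    The schedule above (cosine annealing restarted every 50 epochs, with a model snapshot saved at the end of each cycle) can be sketched as a simple function. The function name and the exact lr endpoints here mirror the numbers in the bullet but are otherwise illustrative.

```python
import math

def snapshot_lr(epoch, lr_max=0.01, lr_min=0.001, cycle_len=50):
    """Cosine-annealed learning rate that restarts every `cycle_len`
    epochs; a snapshot checkpoint is saved at the end of each cycle
    and later used for ensembling."""
    t = epoch % cycle_len  # position within the current cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / cycle_len))
```

    The lr is highest right after each restart and decays smoothly to the floor, so every cycle ends in a distinct local minimum whose snapshot is worth keeping.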

    3. Model ensembling: +0.001 in public LB / +0.001 in private LB

    • voting across all cycles
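    Voting across cycles amounts to a pixel-wise majority vote over the binary masks predicted by all snapshots. A minimal sketch, assuming hard (0/1) masks stacked along the first axis; the function name is illustrative, not from the repo.

```python
import numpy as np

def majority_vote(masks):
    """Pixel-wise majority vote over binary masks of shape
    (n_models, H, W): a pixel is foreground when more than half
    of the models predict foreground."""
    masks = np.asarray(masks)
    return (masks.sum(axis=0) * 2 > masks.shape[0]).astype(np.uint8)

votes = np.array([[[1, 0]], [[1, 1]], [[0, 0]]])  # 3 models, 1x2 mask
merged = majority_vote(votes)
```

    Averaging the raw probabilities before thresholding is a common alternative when the snapshot logits are available.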

    4. Post-processing: +0.010 in public LB / +0.001 in private LB

    Based on the 2D and 3D jigsaw results (amazing ideas and great work from @CHAN), we applied around 10 handcrafted rules that gave a 0.010 to 0.011 public LB boost and a 0.001 private LB boost.

    5. Data distillation (pseudo labeling): +0.002 in public LB / +0.002 in private LB

    We started this part around the middle of the competition. As Heng pointed out, pseudo labeling is pretty tricky and carries a real risk of overfitting. I was not sure whether it would boost the private LB until the results were published. Our results are posted at https://github.com/SeuTao/Kaggle_TGS2018_4th_solution; the implementation details will be updated.
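    A common way to make pseudo labeling less risky is to keep only test images whose predictions are confident almost everywhere. The sketch below illustrates that filtering idea; the thresholds and function name are my assumptions, not the values the team used.

```python
import numpy as np

def select_pseudo_labels(probs, lo=0.1, hi=0.9, min_confident=0.95):
    """From sigmoid outputs `probs` of shape (N, H, W) on unlabelled
    images, keep an image as a pseudo-labelled sample only when at
    least `min_confident` of its pixels are confidently background
    (<= lo) or foreground (>= hi). Returns hard masks and the keep mask."""
    confident = (probs <= lo) | (probs >= hi)        # per-pixel certainty
    frac = confident.reshape(len(probs), -1).mean(axis=1)
    keep = frac >= min_confident                     # image-level filter
    labels = (probs >= 0.5).astype(np.uint8)         # hard pseudo masks
    return labels[keep], keep

probs = np.array([[[0.01, 0.99]], [[0.5, 0.5]]])    # 2 images, 1x2 pixels
labels, keep = select_pseudo_labels(probs)
```

    The selected masks are then mixed into the training set; retraining on them is where the overfitting risk mentioned above comes from.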

    6. Ideas we did not try:

    • Mean teacher: we had no time for this experiment. I think mean
      teacher + jigsaw + pseudo labeling is promising.
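    For reference, the core of the mean-teacher idea is that the teacher's weights are an exponential moving average (EMA) of the student's. A minimal sketch with weights as a dict of floats; `alpha` and the names are illustrative.

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher update: move each teacher weight toward the
    corresponding student weight, keeping an exponential moving
    average. Called once per training step."""
    for name, w in student.items():
        teacher[name] = alpha * teacher[name] + (1 - alpha) * w
    return teacher

teacher = {"w": 0.0}
student = {"w": 1.0}
ema_update(teacher, student, alpha=0.9)
```

    In the full method the teacher's predictions on unlabelled data supervise the student via a consistency loss, which is why it pairs naturally with pseudo labeling.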

    7. Ideas that didn't work:

    • OC module: the secret weapon of @alex's team. We couldn't get it to work.

