Experiments on gpu1:
base loss only (no param loss, no L2 regularization loss): 77.42
1) param loss + L2 regularization loss: 76.60
2) 77.21; with weight 0.0001: 77.43.
3) param loss + regularization + two different tf.layers.dense parameters (this seems to overfit)
4) bpr loss + base loss: peaks at 79.07, but drops right afterwards.
5) bpr loss + base loss + L2 regularization loss + param loss
6) bpr loss + base loss + L2 regularization loss + param loss + vse loss
I found that when the loss is not multiplied by 0.001, convergence is rather slow. E3 only reached 73% AUC at epoch 15.
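
For reference, a rough sketch of how the loss terms in the runs above could be combined with per-term weights. This is not the actual training code: `bpr_loss`, `combine_losses`, the term names and the example values are all hypothetical, and the notes do not say which term the 0.001 factor or the 0.0001 weight is applied to. The BPR term uses the standard pairwise form, -log(sigmoid(pos - neg)).

```python
import math


def bpr_loss(pos_scores, neg_scores):
    """Mean pairwise BPR loss: -log(sigmoid(pos - neg)) over all pairs."""
    total = 0.0
    for pos, neg in zip(pos_scores, neg_scores):
        diff = pos - neg
        # -log(sigmoid(diff)) == softplus(-diff), written in a numerically stable form
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(pos_scores)


def combine_losses(terms, weights=None):
    """Weighted sum of named loss terms; a term without a weight defaults to 1.0."""
    weights = weights or {}
    return sum(weights.get(name, 1.0) * value for name, value in terms.items())


# Hypothetical values, only to show the shape of run 5)
terms = {
    "base": 0.70,                           # assumed base loss value
    "bpr": bpr_loss([1.2, 0.3], [0.4, 0.9]),
    "l2": 12.5,                             # assumed sum of squared weights
    "param": 0.05,                          # assumed param loss value
}
# Assumption: the small weight is placed on the L2 term
total = combine_losses(terms, weights={"l2": 1e-4})
```

Keeping the weights in one dict makes it easy to switch a term off (weight 0.0) or rescale it between runs without touching the rest of the loss.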