
Understanding the UnsupervisedMT Architecture

Author: a711df54486d | Published on 2018-12-18 22:15

Hi, Mr Kim. Recently I re-read the two papers
"Unsupervised Machine Translation Using Monolingual Corpora Only"
and
"Phrase-Based & Neural Unsupervised Machine Translation".
I have some ideas, which may not be right, and some questions.

The improvement from the first paper to the second comes from:
i) adding language-model training before and during the MT training process. Since the LM is built from the shared encoder and decoder, better encoder and decoder parameters help make the translation output more fluent (a rough sketch of this denoising pre-training step follows).
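To make point i) concrete, here is a minimal sketch of the kind of denoising pre-training step I have in mind: the shared encoder/decoder is trained to reconstruct a monolingual sentence from a corrupted version of it. All class and function names here are placeholders for illustration, not the repo's actual API.

```python
import torch
import torch.nn as nn

V, E, H, PAD = 1000, 32, 64, 0   # toy vocab size, embedding dim, hidden dim, pad index

class TinySeq2Seq(nn.Module):
    """Shared encoder/decoder; the same parameters are later reused for translation."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(V, E, padding_idx=PAD)
        self.enc = nn.LSTM(E, H, batch_first=True)
        self.dec = nn.LSTM(E, H, batch_first=True)
        self.out = nn.Linear(H, V)

    def forward(self, noisy, target_in):
        _, state = self.enc(self.emb(noisy))          # encode the corrupted sentence
        h, _ = self.dec(self.emb(target_in), state)   # decode conditioned on it
        return self.out(h)                            # logits over the vocabulary

def word_dropout(x, p=0.1):
    # stand-in for the paper's richer noise model (drop / blank / shuffle)
    keep = (torch.rand(x.shape) > p) | (x == PAD)
    return x * keep.long()                            # dropped tokens become PAD

model = TinySeq2Seq()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)

clean = torch.randint(1, V, (8, 12))                  # a fake monolingual batch
logits = model(word_dropout(clean), clean[:, :-1])    # teacher forcing on the clean text
loss = loss_fn(logits.reshape(-1, V), clean[:, 1:].reshape(-1))
loss.backward()
opt.step()
print(float(loss))
```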


ii) adding on-the-fly back-translation, where the model translates monolingual data with its current parameters and immediately trains on the resulting synthetic pairs (a sketch follows the figure below).


[figure: on-the-fly back-translation; MT will be improved iteratively]
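And a rough sketch of what one on-the-fly back-translation round might look like: the current tgt-to-src model produces a synthetic source on the fly, and the src-to-tgt model is trained on the (synthetic source, real target) pair. The callables here are placeholders for the actual models, not the repo's functions.

```python
from typing import Callable, List

Sentence = List[str]

def back_translation_round(
    mono_tgt: List[Sentence],
    translate_tgt2src: Callable[[Sentence], Sentence],
    train_step_src2tgt: Callable[[Sentence, Sentence], float],
) -> float:
    """Build synthetic pairs with the current tgt->src model, then train src->tgt on them."""
    total = 0.0
    for tgt in mono_tgt:
        synthetic_src = translate_tgt2src(tgt)            # noisy source produced on the fly
        total += train_step_src2tgt(synthetic_src, tgt)   # the real target is the supervision
    return total / max(len(mono_tgt), 1)

# toy usage with dummy callables, just to show the data flow
fake_translate = lambda s: s[::-1]      # stand-in for the tgt->src model
fake_train = lambda src, tgt: 0.0       # stand-in for a supervised update
print(back_translation_round([["ein", "test"]], fake_translate, fake_train))
```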


Questions:
i) In lm.py, the language model uses shared layers (LSTM layers) of the encoder or decoder, which means there is no Transformer-LM implementation. Why? Is there any reference showing that an RNN-LM works better than a Transformer-LM? (A sketch of what I mean by "sharing layers" follows.)
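For reference, this is roughly what I understand "sharing layers" to mean: the LM reuses the encoder's embedding and LSTM parameters and only adds its own output projection, so LM pre-training also pre-trains the encoder. The class and variable names are mine, not the ones in lm.py.

```python
import torch
import torch.nn as nn

V, E, H = 1000, 32, 64

shared_emb = nn.Embedding(V, E)
shared_lstm = nn.LSTM(E, H, batch_first=True)   # this layer doubles as the MT encoder

class SharedLM(nn.Module):
    def __init__(self, emb, lstm):
        super().__init__()
        self.emb, self.lstm = emb, lstm          # shared parameters
        self.proj = nn.Linear(H, V)              # LM-specific output layer

    def forward(self, x):
        h, _ = self.lstm(self.emb(x))
        return self.proj(h)                      # next-word logits

lm = SharedLM(shared_emb, shared_lstm)
x = torch.randint(0, V, (4, 10))
logits = lm(x[:, :-1])                           # predict x[:, 1:]
loss = nn.CrossEntropyLoss()(logits.reshape(-1, V), x[:, 1:].reshape(-1))
print(float(loss))
```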


ii) In transformer.py:


[screenshot: the relevant code in transformer.py]

one_hot=True has not been implemented for the Transformer. Why? I think the Transformer should also be able to use one-hot targets for the loss and training (a sketch of the two loss formulations follows).
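To make the one_hot question concrete, here is a small sketch of the two loss formulations I have in mind; with a plain one-hot target they give the same value (the one-hot form only becomes different once you smooth it). This is illustrative code, not the repo's loss implementation.

```python
import torch
import torch.nn.functional as F

V = 10
logits = torch.randn(4, V)                 # 4 predictions over a vocab of 10
targets = torch.randint(0, V, (4,))        # gold token indices

# (a) the usual index-based cross-entropy
loss_idx = F.cross_entropy(logits, targets)

# (b) the same loss written against an explicit one-hot distribution
one_hot = F.one_hot(targets, V).float()
loss_oh = -(one_hot * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()

print(float(loss_idx), float(loss_oh))     # identical up to floating-point error
```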


iii) In trainer.py:

[screenshot: trainer.py, the LM training code]
There are three ways to train the encoder/decoder LM, and I did not see the reason why we need to train lm_enc_rev. Also, add_noise is not applied here for LM training, which is different from:
[screenshot: the auto-encoding training code, where add_noise is applied]

iv) add_noise is only called in auto-encoder training, not in LM training, as written in main.py. They may work out the same in practice, but I think the authors should point this out in the paper. (A sketch of the noise model as I understand it follows.)
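For completeness, this is my reading of the add_noise idea described in the papers: drop some words and locally shuffle the rest so that no word moves very far. It is a sketch, not a copy of the repo's implementation.

```python
import random
from typing import List

def add_noise(words: List[str], p_drop: float = 0.1, k: int = 3) -> List[str]:
    # word dropout: remove each word with probability p_drop (keep at least one)
    kept = [w for w in words if random.random() > p_drop] or words[:1]
    # local shuffle: jitter each position by a uniform offset in [0, k], then sort,
    # so words only move a few positions away from where they started
    keys = [i + random.uniform(0, k) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=lambda i: keys[i])
    return [kept[i] for i in order]

print(add_noise("the cat sat on the mat".split()))
```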

What are your thoughts on my ideas and questions? If you have any comments, please let me know~
