
Understanding the UnsupervisedMT Architecture

Author: a711df54486d | Published on 2018-12-18 22:15

Hi, Mr Kim. Recently I re-read the two papers
"Unsupervised Machine Translation Using Monolingual Corpora Only"
and
"Phrase-Based & Neural Unsupervised Machine Translation".
I have some ideas, which may not be right, and some questions.

The improvement from the first paper to the second comes from:
i) adding language-model training before and during the MT training process. Since the LM is built from the shared encoder and decoder, better encoder and decoder parameters help make the translation output more fluent (a rough sketch of this denoising pre-training step follows).
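To make point i) concrete, here is a minimal sketch of the kind of denoising pre-training step I have in mind: the shared encoder/decoder is trained to reconstruct a monolingual sentence from a corrupted version of it. All class and function names here are placeholders for illustration, not the repo's actual API.

```python
import torch
import torch.nn as nn

V, E, H, PAD = 1000, 32, 64, 0   # toy vocab size, embedding dim, hidden dim, pad index

class TinySeq2Seq(nn.Module):
    """Shared encoder/decoder; the same parameters are later reused for translation."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(V, E, padding_idx=PAD)
        self.enc = nn.LSTM(E, H, batch_first=True)
        self.dec = nn.LSTM(E, H, batch_first=True)
        self.out = nn.Linear(H, V)

    def forward(self, noisy, target_in):
        _, state = self.enc(self.emb(noisy))          # encode the corrupted sentence
        h, _ = self.dec(self.emb(target_in), state)   # decode conditioned on it
        return self.out(h)                            # logits over the vocabulary

def word_dropout(x, p=0.1):
    # stand-in for the paper's richer noise model (drop / blank / shuffle)
    keep = (torch.rand(x.shape) > p) | (x == PAD)
    return x * keep.long()                            # dropped tokens become PAD

model = TinySeq2Seq()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)

clean = torch.randint(1, V, (8, 12))                  # a fake monolingual batch
logits = model(word_dropout(clean), clean[:, :-1])    # teacher forcing on the clean text
loss = loss_fn(logits.reshape(-1, V), clean[:, 1:].reshape(-1))
loss.backward()
opt.step()
print(float(loss))
```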


ii) adding on-the-fly back-translation, where the model translates monolingual data with its current parameters and immediately trains on the resulting synthetic pairs (a sketch follows the figure below).


[figure: on-the-fly back-translation; MT will be improved iteratively]
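And a rough sketch of what one on-the-fly back-translation round might look like: the current tgt-to-src model produces a synthetic source on the fly, and the src-to-tgt model is trained on the (synthetic source, real target) pair. The callables here are placeholders for the actual models, not the repo's functions.

```python
from typing import Callable, List

Sentence = List[str]

def back_translation_round(
    mono_tgt: List[Sentence],
    translate_tgt2src: Callable[[Sentence], Sentence],
    train_step_src2tgt: Callable[[Sentence, Sentence], float],
) -> float:
    """Build synthetic pairs with the current tgt->src model, then train src->tgt on them."""
    total = 0.0
    for tgt in mono_tgt:
        synthetic_src = translate_tgt2src(tgt)            # noisy source produced on the fly
        total += train_step_src2tgt(synthetic_src, tgt)   # the real target is the supervision
    return total / max(len(mono_tgt), 1)

# toy usage with dummy callables, just to show the data flow
fake_translate = lambda s: s[::-1]      # stand-in for the tgt->src model
fake_train = lambda src, tgt: 0.0       # stand-in for a supervised update
print(back_translation_round([["ein", "test"]], fake_translate, fake_train))
```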


Questions:
i) In lm.py, the language model uses shared layers (LSTM layers) of the encoder or decoder, which means there is no Transformer-LM implementation. Why? Is there any reference showing that an RNN-LM works better than a Transformer-LM? (A sketch of what I mean by "sharing layers" follows.)
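For reference, this is roughly what I understand "sharing layers" to mean: the LM reuses the encoder's embedding and LSTM parameters and only adds its own output projection, so LM pre-training also pre-trains the encoder. The class and variable names are mine, not the ones in lm.py.

```python
import torch
import torch.nn as nn

V, E, H = 1000, 32, 64

shared_emb = nn.Embedding(V, E)
shared_lstm = nn.LSTM(E, H, batch_first=True)   # this layer doubles as the MT encoder

class SharedLM(nn.Module):
    def __init__(self, emb, lstm):
        super().__init__()
        self.emb, self.lstm = emb, lstm          # shared parameters
        self.proj = nn.Linear(H, V)              # LM-specific output layer

    def forward(self, x):
        h, _ = self.lstm(self.emb(x))
        return self.proj(h)                      # next-word logits

lm = SharedLM(shared_emb, shared_lstm)
x = torch.randint(0, V, (4, 10))
logits = lm(x[:, :-1])                           # predict x[:, 1:]
loss = nn.CrossEntropyLoss()(logits.reshape(-1, V), x[:, 1:].reshape(-1))
print(float(loss))
```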


ii) In transformer.py:


[screenshot: the relevant code in transformer.py]

one_hot=True has not been implemented for the Transformer. Why? I think the Transformer should also be able to use one-hot targets for the loss and training (a sketch of the two loss formulations follows).
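To make the one_hot question concrete, here is a small sketch of the two loss formulations I have in mind; with a plain one-hot target they give the same value (the one-hot form only becomes different once you smooth it). This is illustrative code, not the repo's loss implementation.

```python
import torch
import torch.nn.functional as F

V = 10
logits = torch.randn(4, V)                 # 4 predictions over a vocab of 10
targets = torch.randint(0, V, (4,))        # gold token indices

# (a) the usual index-based cross-entropy
loss_idx = F.cross_entropy(logits, targets)

# (b) the same loss written against an explicit one-hot distribution
one_hot = F.one_hot(targets, V).float()
loss_oh = -(one_hot * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()

print(float(loss_idx), float(loss_oh))     # identical up to floating-point error
```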


iii) In trainer.py:

[screenshot: trainer.py, the LM training code]
There are three ways to train the encoder/decoder LM, and I did not see the reason why we need to train lm_enc_rev. Also, add_noise is not applied here for LM training, which is different from:
[screenshot: the auto-encoding training code, where add_noise is applied]

iv) add_noise is only called in auto-encoder training, not in LM training, as written in main.py. They may work out the same in practice, but I think the authors should point this out in the paper. (A sketch of the noise model as I understand it follows.)
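For completeness, this is my reading of the add_noise idea described in the papers: drop some words and locally shuffle the rest so that no word moves very far. It is a sketch, not a copy of the repo's implementation.

```python
import random
from typing import List

def add_noise(words: List[str], p_drop: float = 0.1, k: int = 3) -> List[str]:
    # word dropout: remove each word with probability p_drop (keep at least one)
    kept = [w for w in words if random.random() > p_drop] or words[:1]
    # local shuffle: jitter each position by a uniform offset in [0, k], then sort,
    # so words only move a few positions away from where they started
    keys = [i + random.uniform(0, k) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=lambda i: keys[i])
    return [kept[i] for i in order]

print(add_noise("the cat sat on the mat".split()))
```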

What are your thoughts on my ideas and questions? If you have any comments, please let me know~
