test

Author: 北有深巷 | Published 2018-10-20 14:20 | Read 0 times

Download and prepare the dataset

We'll use a language dataset provided by http://www.manythings.org/anki/. This dataset contains language translation pairs in the format:

May I borrow this book? ¿Puedo tomar prestado este libro?

There are a variety of languages available, but we'll use the English-Spanish dataset. For convenience, we've hosted a copy of this dataset on Google Cloud, but you can also download your own copy. After downloading the dataset, here are the steps we'll take to prepare the data:

  1. Add a start and end token to each sentence.
  2. Clean the sentences by removing special characters.
  3. Create a word index and reverse word index (dictionaries mapping from word → id and id → word).
  4. Pad each sentence to a maximum length.
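
The four steps above can be sketched in plain Python. This is a minimal illustration, not the tutorial's actual implementation: the function names (`unicode_to_ascii`, `preprocess_sentence`, `build_index`, `encode_and_pad`) are hypothetical, the set of punctuation that is kept is an assumption, and the sketch assumes each line of the dataset separates the two sentences with a tab. Note that stripping accents also turns `ñ` into `n`, which may or may not be what you want.

```python
import re
import unicodedata


def unicode_to_ascii(s):
    # Decompose accented characters and drop the combining marks.
    return "".join(
        c for c in unicodedata.normalize("NFD", s)
        if unicodedata.category(c) != "Mn"
    )


def preprocess_sentence(s):
    # Step 2: lowercase, strip accents, and remove special characters.
    s = unicode_to_ascii(s.lower().strip())
    # Surround kept punctuation with spaces so each mark is its own token.
    s = re.sub(r"([?.!,¿])", r" \1 ", s)
    # Replace anything that is not a letter or kept punctuation with a space.
    s = re.sub(r"[^a-z?.!,¿]+", " ", s)
    s = re.sub(r"\s+", " ", s).strip()
    # Step 1: add start and end tokens.
    return "<start> " + s + " <end>"


def build_index(sentences):
    # Step 3: word -> id and id -> word. Id 0 is reserved for padding.
    vocab = sorted({w for s in sentences for w in s.split()})
    word2idx = {w: i + 1 for i, w in enumerate(vocab)}
    idx2word = {i: w for w, i in word2idx.items()}
    return word2idx, idx2word


def encode_and_pad(sentences, word2idx):
    # Step 4: convert words to ids and right-pad with 0 to the longest sentence.
    ids = [[word2idx[w] for w in s.split()] for s in sentences]
    max_len = max(len(seq) for seq in ids)
    return [seq + [0] * (max_len - len(seq)) for seq in ids]


# Assumed input format: English and Spanish separated by a tab.
line = "May I borrow this book?\t¿Puedo tomar prestado este libro?"
en, es = (preprocess_sentence(part) for part in line.split("\t"))
print(en)
print(es)
```

In practice you would build one index per language over the whole dataset, then pad the English and Spanish sides separately.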

