Corpora and Terminology Databases

Author: windfunkey | Published: 2021-11-06 10:06

    Online corpora (China)

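    1. Corpus: http://yulk.org/
    2. BCC Corpus: http://bcc.blcu.edu.cn/
    3. Corpus Online (语料库在线): http://www.cncorpus.org/
    4. Center for Chinese Linguistics, Peking University: http://ccl.pku.edu.cn/corpus.asp
    5. BFSU Corpus Linguistics (Beijing Foreign Studies University): http://www.bfsu-corpus.org/
    6. Balanced Corpus of Modern Chinese (Sinica Corpus): http://www.sinica.edu.tw/SinicaCorpus/
    7. Ancient Chinese Corpus: http://www.sinica.edu.tw/ftms-bin/ftmsw
    8. Tagged Corpus of Early Mandarin Chinese: http://www.sinica.edu.tw/Early_Mandarin/
    9. Treebank database: http://treebank.sinica.edu.tw/
    10. Souwen Jiezi (搜文解字): http://words.sinica.edu.tw/
    11. Electronic Texts of Chinese Classics (汉籍电子文献): http://www.sinica.edu.tw/~tdbproj/handy1/
    12. Communication University of China text corpus retrieval system: http://ling.cuc.edu.cn/RawPub/
    13. Harbin Institute of Technology Information Retrieval Lab, shared corpus resources: http://ir.hit.edu.cn/demo/ltp/Sharing_Plan.htm
    14. Hong Kong Institute of Education, Research Centre on Linguistics and Language Information Sciences and its corpus laboratory: http://www.livac.org/index.php?lang=sc
    15. Chinese Linguistic Data Consortium: http://www.chineseldc.org/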

    Online corpora (international)

    1. BNC (British National Corpus): http://www.natcorp.ox.ac.uk/
    2. BOE (the Bank of English, Collins): http://www.collinslanguage.com/language-resources/dictionary-datasets/
    3. ANC (American National Corpus): http://www.anc.org/
    4. Lancaster Corpus of Mandarin Chinese (LCMC): http://ota.oucs.ox.ac.uk/scripts/download.php?otaid=2474
    5. Sketch Engine multilingual corpora: http://www.sketchengine.co.uk
    6. BASE (British Academic Spoken English Corpus): http://www2.warwick.ac.uk/fac/soc/celte/research/base/
    7. Lextutor: http://www.lextutor.ca/
    8. MyMemory: https://mymemory.translated.net/
    9. TAUS: http://www.tausdata.org/index.php/language-search-engine
    10. TTMEM: https://www.ttmem.com/terminology/download-translation-memory/
    11. TinyTM: http://tinytm.sourceforge.net/
    12. DGT Translation Memory: https://magmatranslation.com/en/free-translation-memory/
    13. European Parliament Proceedings Parallel Corpus 1996-2011: http://statmt.org/europarl/
    14. University of Maryland Parallel Corpus Project: The Bible: http://users.umiacs.umd.edu/~resnik/parallel/bible.html
    15. Aligned Hansards of the 36th Parliament of Canada: https://www.isi.edu/natural-language/download/hansard/
    16. EU Publications Office: https://publications.europa.eu/en/web/general-publications/publications
    17. Wikimedia Downloads: https://dumps.wikimedia.org/backup-index.html
    18. Open Subtitles: https://www.opensubtitles.org/en/search/subs
    19. United Nations Parallel Corpus: https://cms.unov.org/UNCorpus/
    20. European language pairs (WMT13 translation task): http://www.statmt.org/wmt13/translation-task.html#download
    21. Parallel corpus search: http://paralela.clarin-pl.eu/#
    22. UM-Corpus: A Large English-Chinese Parallel Corpus: http://nlp2ct.cis.umac.mo/um-corpus/um-corpus-license.html
    23. CLARIN parallel corpora: https://www.clarin.eu/resource-families/parallel-corpora
    24. The PKU 863 Chinese-English Parallel Corpus: https://www.lancaster.ac.uk/fass/projects/corpus/863parallel/
    25. Chinese-English Parallel Corpus of Dream of the Red Chamber (《红楼梦》): http://corpus.usx.edu.cn/hongloumeng/images/shiyongshuoming.htm
    26. Academia Sinica Tagged Corpus of Early Mandarin Chinese: http://lingcorpus.iis.sinica.edu.tw/early/
    27. BYU corpora: https://corpus.byu.edu/

    Other sub-corpora

    1. Books – A collection of translated literature
    2. DGT – A collection of EU Translation Memories provided by the JRC
    3. DOGC – Documents from the Catalan Government
    4. ECB – European Central Bank corpus
    5. EMEA – European Medicines Agency documents
    6. The EU bookshop corpus
    7. EUconst – The European constitution
    8. EUROPARL v7 – European Parliament Proceedings
    9. giga-fren – French-English Giga-Word Corpus
    10. GNOME – GNOME localization files
    11. Global Voices – News stories in various languages
    12. The Croatian – English WaC corpus
    13. JRC-Acquis – legislative EU texts
    14. KDE4 – KDE4 localization files (v.2)
    15. KDEdoc – the KDE manual corpus
    16. MBS – Belgisch Staatsblad corpus
    17. memat – Xhosa/English parallel data
    18. MontenegrinSubs – Montenegrin movie subtitles
    19. MultiUN – Translated UN documents
    20. News Commentary, v9.0, v9.1
    21. OfisPublik – Breton – French parallel texts
    22. OO – the OpenOffice.org corpus
    23. OpenOffice.org 3 corpus
    24. OpenSubtitles – the opensubtitles.org corpus
    25. OpenSubtitles2011, OpenSubtitles2012, OpenSubtitles2013
    26. OpenSubtitles2016 – snapshot from 2016
    27. OpenSubtitles2018 – new complete version
    28. ParaCrawl corpus
    29. ParCor – A Parallel Pronoun-Coreference Corpus
    30. PHP – the PHP manual corpus
    31. Regeringsförklaringen – a tiny example corpus
    32. SETIMES – A parallel corpus of the Balkan languages
    33. SPC – Stockholm Parallel Corpora
    34. Tatoeba – A DB of translated sentences
    35. TedTalks hr-en
    36. TED Talks 2013
    37. Tanzil – A collection of Quran translations
    38. TEP – The Tehran English-Persian subtitle corpus
    39. Ubuntu – Ubuntu localization files
    40. UN – Translated UN documents
    41. Wikipedia – translated sentences from Wikipedia
    42. WikiSource – (small en-sv sample only)
    43. WMT News Test Sets
    44. The Xhosa – English Navy corpus

    Online terminology databases

    1. China Keywords (中国关键词): http://www.china.org.cn/chinese/china_key_words/
    2. Standardized terminology database for translating discourse with Chinese characteristics: http://210.72.20.108/index/index.jsp
    3. China Core Vocabulary (中国核心词汇): https://www.cnkeywords.net/index
    4. Key Concepts in Chinese Thought and Culture (中国思想文化术语): http://www.chinesethought.cn/TermBase.aspx
    5. United Nations terminology database (UNTERM): https://unterm.un.org/UNTERM/portal/welcome
    6. TermOnline (术语在线): http://termonline.cn/index.htm
    7. National Academy for Educational Research terminology database: http://terms.naer.edu.tw/download/
    8. Blockchain-related terms: http://8btc.com/thread-17286-16-1.html
    9. Chinese-English Dictionary of Ming Government Official Titles: https://escholarship.org/uc/item/2bz3v185
    10. China Standard Terminology (CNKI): http://shuyu.cnki.net/index.aspx
    11. Grand Dictionnaire Terminologique: http://www.granddictionnaire.com/
    12. TERMIUM: http://www.btb.termiumplus.gc.ca/tpv2alpha/alpha-eng.html?lang=eng
    13. Lingosail TermBox (语帆术语宝): http://termbox.lingosail.com/
    14. Microsoft terminology database (Chinese portal): https://www.microsoft.com/zh-cn/language
    15. World Health Organization terminology database: http://www.who.int/substance_abuse/terminology/zh/
    16. Electronic engineering glossary: https://www.maximintegrated.com/cn/glossary/definitions.mvp/terms/all
    17. MDict 100 GB offline dictionary collection (download): https://downloads.freemdict.com/
    18. OneDict (一本词典): http://www.onedict.com/
    19. National standard "Logistics Terminology" (《物流术语》): http://zizhan.mot.gov.cn/zhuantizhuanlan/gonglujiaotong/shoufeigongluzmk/zhengcefagui/201508/t20150814_1863913.html
    20. Winter Olympics terminology lookup site: http://owgt.lingosail.com/
    21. Music terminology lookup: http://dictionary.t-classical.com/
    22. European Union Language and terminology: https://europa.eu/european-union/documents-publications/language-and-terminology_en
    23. IATE (Interactive Terminology for Europe), the EU's terminology database: https://iate.europa.eu/home
    24. Hong Kong legislation English-Chinese glossary: https://www.elegislation.gov.hk/glossary/chi
    25. Magic Search: http://magicsearch.org
    26. Microsoft Language Portal: https://www.microsoft.com/en-us/language
    27. Linguee: https://www.linguee.com/
    28. The Free Dictionary: http://www.thefreedictionary.com/
    29. Glosbe: https://glosbe.com/tmem/
