nltk

Author: EricLee_1900 | Published: 2020-03-16 21:47

    LookupError:

    **********************************************************************

    Resource punkt not found.

    Please use the NLTK Downloader to obtain the resource:

    >>> import nltk

    >>> nltk.download('punkt')

    Attempted to load tokenizers/punkt/english.pickle

    Searched in:

    - '/home/lixujian/nltk_data'

    - '/home/lixujian/.local/share/virtualenvs/yuanpei-PJn4iQdv/nltk_data'

    - '/home/lixujian/.local/share/virtualenvs/yuanpei-PJn4iQdv/share/nltk_data'

    - '/home/lixujian/.local/share/virtualenvs/yuanpei-PJn4iQdv/lib/nltk_data'

    - '/usr/share/nltk_data'

    - '/usr/local/share/nltk_data'

    - '/usr/lib/nltk_data'

    - '/usr/local/lib/nltk_data'

    - u''

    **********************************************************************

    [nltk_data] Error loading punkt:

    [nltk_data] violation of protocol (_ssl.c:618)>
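
The traceback above lists every directory NLTK searched for `tokenizers/punkt/english.pickle`. As a rough diagnostic, the same check can be approximated with the standard library alone. This is a simplified sketch, not NLTK's actual lookup: the helper name `dirs_containing` is made up for illustration, and it only looks for unpacked files, whereas NLTK itself can also read zipped packages.

```python
from pathlib import Path

# Resource the traceback says NLTK attempted to load.
RESOURCE = "tokenizers/punkt/english.pickle"


def dirs_containing(resource, search_dirs):
    """Return the directories in search_dirs that hold the unpacked resource."""
    found = []
    for directory in search_dirs:
        # Skip empty entries such as the u'' at the end of the traceback list.
        if directory and (Path(directory) / resource).is_file():
            found.append(directory)
    return found


# A few of the directories listed under "Searched in" above.
searched = [
    "/home/lixujian/nltk_data",
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "",
]
print(dirs_containing(RESOURCE, searched))
```

An empty list means none of the checked directories holds the unpacked file, which matches the situation that produces the `LookupError`.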

    >>> import nltk

    >>> nltk.download()

    NLTK Downloader

    ---------------------------------------------------------------------------

    d) Download   l) List    u) Update   c) Config   h) Help   q) Quit

    ---------------------------------------------------------------------------

    Downloader> d

    Download which package (l=list; x=cancel)?

    Identifier> l

    Collections:

    [-] all-corpora......... All the corpora

    [-] all-nltk............ All packages available on nltk_data gh-pages

    branch

    [-] all................. All packages

    [-] book................ Everything used in the NLTK Book

    [P] popular............. Popular packages

    [P] tests............... Packages for running tests

    [ ] third-party......... Third-party data packages

    ([*] marks installed packages; [-] marks out-of-date or corrupt packages;

    [P] marks partially installed collections)

    Download which package (l=list; x=cancel)?

    Identifier> all-nltk

    Downloading collection 'all-nltk'

    ...

    |   Unzipping stemmers/porter_test.zip.

    | Downloading package wmt15_eval to /users4/zsun/nltk_data...

    |   Unzipping models/wmt15_eval.zip.

    | Downloading package mwa_ppdb to /users4/zsun/nltk_data...

    |   Unzipping misc/mwa_ppdb.zip.

    |

    Done downloading collection all-nltk

    Success.

    https://www.jianshu.com/p/9c48e8edc7aa
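
The interactive session above can also be driven from a script, because `nltk.download` accepts a package or collection identifier and returns a boolean. The helper below is a hypothetical sketch (the name `fetch_packages` and its `download` parameter are not part of NLTK): it tries several identifiers in turn and records which ones succeeded, so one failed download does not hide the others.

```python
def fetch_packages(names, download=None):
    """Try each package in turn; return {name: True/False} for success."""
    if download is None:
        # Imported lazily so the helper can be exercised without NLTK installed.
        import nltk
        download = nltk.download
    results = {}
    for name in names:
        try:
            results[name] = bool(download(name))
        except (OSError, ValueError):
            # Treat a raised network or downloader error as "not downloaded".
            results[name] = False
    return results
```

With NLTK installed, `fetch_packages(["punkt", "stopwords"])` would attempt both packages from the listing, and `fetch_packages(["all-nltk"])` corresponds to the collection chosen in the session above.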

    1. First attempt

    import nltk

    nltk.download('punkt')

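    2. Second attempt (succeeded)

    Step 1: Downloading directly from the console is very slow. Download with the code below instead, or get the data from Baidu Cloud (百度云).

    On first use the data must be downloaded. Add the code below, which calls nltk.download('punkt'); replace punkt with the name of whichever data package you need.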

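    import ssl

    import nltk

    # Work around the SSL error by disabling HTTPS certificate verification.
    try:
        _create_unverified_https_context = ssl._create_unverified_context
    except AttributeError:
        # Older Pythons have no such function and do not verify certificates by default.
        pass
    else:
        ssl._create_default_https_context = _create_unverified_https_context

    nltk.download('punkt')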

    https://blog.csdn.net/ie_Jeton/article/details/82527216
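
The workaround above switches off certificate verification for every later HTTPS request in the process. One possible refinement, sketched here on the assumption of a Python version where both `ssl._create_default_https_context` and `ssl._create_unverified_context` exist, is to limit the change to the download itself. The context manager name `unverified_https` is made up for illustration.

```python
import contextlib
import ssl


@contextlib.contextmanager
def unverified_https():
    """Temporarily make urllib-style HTTPS requests skip certificate checks."""
    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        # Restore verification even if the download raises.
        ssl._create_default_https_context = original
```

The download would then run as `with unverified_https(): nltk.download('punkt')`, and certificate verification is back in place once the block exits.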

    Step 2:

    Add the directory holding the already-downloaded data packages to NLTK's search path with the following statements:

    from nltk import data

    data.path.append('/home/lixujian/yuanpei/data/nltk_data')
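
Since `data.path` is an ordinary Python list, appending a mistyped directory fails silently and the same `LookupError` comes back later. A small guard can catch that earlier. This is a minimal sketch; `add_data_dir` is a hypothetical helper, not an NLTK function.

```python
import os


def add_data_dir(data_dir, search_path):
    """Append data_dir to search_path if it exists and is not already listed."""
    if not os.path.isdir(data_dir):
        raise FileNotFoundError(f"no such directory: {data_dir}")
    if data_dir not in search_path:
        search_path.append(data_dir)
    return search_path
```

With NLTK installed, it would be called as `add_data_dir('/home/lixujian/yuanpei/data/nltk_data', nltk.data.path)`. Calling `nltk.data.find('tokenizers/punkt')` afterwards raises `LookupError` if the resource is still not visible.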

    Packages:

    [ ] abc................. Australian Broadcasting Commission 2006

    [ ] alpino.............. Alpino Dutch Treebank

    [ ] averaged_perceptron_tagger Averaged Perceptron Tagger

    [ ] averaged_perceptron_tagger_ru Averaged Perceptron Tagger (Russian)

    [ ] basque_grammars..... Grammars for Basque

    [ ] biocreative_ppi..... BioCreAtIvE (Critical Assessment of Information

    Extraction Systems in Biology)

    [ ] bllip_wsj_no_aux.... BLLIP Parser: WSJ Model

    [ ] book_grammars....... Grammars from NLTK Book

    [ ] brown............... Brown Corpus

    [ ] brown_tei........... Brown Corpus (TEI XML Version)

    [ ] cess_cat............ CESS-CAT Treebank

    [ ] cess_esp............ CESS-ESP Treebank

    [ ] chat80.............. Chat-80 Data Files

    [ ] city_database....... City Database

    [ ] cmudict............. The Carnegie Mellon Pronouncing Dictionary (0.6)

    [ ] comparative_sentences Comparative Sentence Dataset

    [ ] comtrans............ ComTrans Corpus Sample

    [ ] conll2000........... CONLL 2000 Chunking Corpus

    [ ] conll2002........... CONLL 2002 Named Entity Recognition Corpus

    [ ] conll2007........... Dependency Treebanks from CoNLL 2007 (Catalan

    and Basque Subset)

    [ ] crubadan............ Crubadan Corpus

    [ ] dependency_treebank. Dependency Parsed Treebank

    [ ] dolch............... Dolch Word List

    [ ] europarl_raw........ Sample European Parliament Proceedings Parallel

    Corpus

    [ ] floresta............ Portuguese Treebank

    [ ] framenet_v15........ FrameNet 1.5

    [ ] framenet_v17........ FrameNet 1.7

    [ ] gazetteers.......... Gazeteer Lists

    [ ] genesis............. Genesis Corpus

    [ ] gutenberg........... Project Gutenberg Selections

    [ ] ieer................ NIST IE-ER DATA SAMPLE

    [ ] inaugural........... C-Span Inaugural Address Corpus

    [ ] indian.............. Indian Language POS-Tagged Corpus

    [ ] jeita............... JEITA Public Morphologically Tagged Corpus (in

    ChaSen format)

    [ ] kimmo............... PC-KIMMO Data Files

    [ ] knbc................ KNB Corpus (Annotated blog corpus)

    [ ] large_grammars...... Large context-free and feature-based grammars

    for parser comparison

    [ ] lin_thesaurus....... Lin's Dependency Thesaurus

    [ ] mac_morpho.......... MAC-MORPHO: Brazilian Portuguese news text with

    part-of-speech tags

    [ ] machado............. Machado de Assis -- Obra Completa

    [ ] masc_tagged......... MASC Tagged Corpus

    [ ] maxent_ne_chunker... ACE Named Entity Chunker (Maximum entropy)

    [ ] maxent_treebank_pos_tagger Treebank Part of Speech Tagger (Maximum entropy)

    [ ] moses_sample........ Moses Sample Models

    [ ] movie_reviews....... Sentiment Polarity Dataset Version 2.0

    [ ] mte_teip5........... MULTEXT-East 1984 annotated corpus 4.0

    [ ] mwa_ppdb............ The monolingual word aligner (Sultan et al.

    2015) subset of the Paraphrase Database.

    [ ] names............... Names Corpus, Version 1.3 (1994-03-29)

    [ ] nombank.1.0......... NomBank Corpus 1.0

    [ ] nonbreaking_prefixes Non-Breaking Prefixes (Moses Decoder)

    [ ] nps_chat............ NPS Chat

    [ ] omw................. Open Multilingual Wordnet

    [ ] opinion_lexicon..... Opinion Lexicon

    [ ] panlex_swadesh...... PanLex Swadesh Corpora

    [ ] paradigms........... Paradigm Corpus

    [ ] pe08................ Cross-Framework and Cross-Domain Parser

    [ ] perluniprops........ perluniprops: Index of Unicode Version 7.0.0

    character properties in Perl

    [ ] pil................. The Patient Information Leaflet (PIL) Corpus

    [ ] pl196x.............. Polish language of the XX century sixties

    [ ] porter_test......... Porter Stemmer Test Files

    [ ] ppattach............ Prepositional Phrase Attachment Corpus

    [ ] problem_reports..... Problem Report Corpus

    [ ] product_reviews_1... Product Reviews (5 Products)

    [ ] product_reviews_2... Product Reviews (9 Products)

    [ ] propbank............ Proposition Bank Corpus 1.0

    [ ] pros_cons........... Pros and Cons

    [ ] ptb................. Penn Treebank

    [ ] punkt............... Punkt Tokenizer Models

    [ ] qc.................. Experimental Data for Question Classification

    [ ] reuters............. The Reuters-21578 benchmark corpus, ApteMod

    version

    [ ] rslp................ RSLP Stemmer (Removedor de Sufixos da Lingua

    Portuguesa)

    [ ] rte................. PASCAL RTE Challenges 1, 2, and 3

    [ ] sample_grammars..... Sample Grammars

    [ ] semcor.............. SemCor 3.0

    [ ] senseval............ SENSEVAL 2 Corpus: Sense Tagged Text

    [ ] sentence_polarity... Sentence Polarity Dataset v1.0

    [ ] sentiwordnet........ SentiWordNet

    [ ] shakespeare......... Shakespeare XML Corpus Sample

    [ ] sinica_treebank..... Sinica Treebank Corpus Sample

    [ ] smultron............ SMULTRON Corpus Sample

    [ ] snowball_data....... Snowball Data

    [ ] spanish_grammars.... Grammars for Spanish

    [ ] state_union......... C-Span State of the Union Address Corpus

    [ ] stopwords........... Stopwords Corpus

    [ ] subjectivity........ Subjectivity Dataset v1.0

    [ ] swadesh............. Swadesh Wordlists

    [ ] switchboard......... Switchboard Corpus Sample

    [ ] tagsets............. Help on Tagsets

    [ ] timit............... TIMIT Corpus Sample

    [ ] toolbox............. Toolbox Sample Files

    [ ] treebank............ Penn Treebank Sample

    [ ] twitter_samples..... Twitter Samples

    [ ] udhr2............... Universal Declaration of Human Rights Corpus

    (Unicode Version)

    [ ] udhr................ Universal Declaration of Human Rights Corpus

    [ ] all-corpora......... All the corpora

    [ ] all-nltk............ All packages available on nltk_data gh-pages

    branch

    [ ] all................. All packages

    [ ] book................ Everything used in the NLTK Book

    [ ] popular............. Popular packages

    [ ] tests............... Packages for running tests

    [ ] third-party......... Third-party data packages

