wordcloud usage notes

Author: hayao650 | Published: 2021-06-16 16:42

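    While using wordcloud I ran into the following error:

    ValueError: We need at least 1 word to plot a word cloud, got 0.
    It showed up when I used wordcloud to process Chinese text, yet text that had first been segmented with jieba could be passed to wordcloud.generate() without any problem. Printing the jieba output on its own showed that it was unicode, so I tried converting my content to unicode before passing it to generate(), and the error went away. This is why the strings in the code below carry the u prefix.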
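
    One plausible explanation, offered as an assumption rather than something this post verifies: on Python 2 a plain str is a byte string, and a regex-based tokenizer does not treat the UTF-8 bytes of Chinese characters as word characters, so zero words are found. The sketch below reproduces that effect on Python 3 by comparing str with bytes. The pattern WORD_PATTERN is an illustrative stand-in, not necessarily the exact expression wordcloud uses.

```python
import re

# Illustrative stand-in for a regex-based word tokenizer; treat the exact
# pattern as an assumption rather than wordcloud's actual internals.
WORD_PATTERN = r"\w[\w']+"

text = "古天乐 郭富城 刘德华 周杰伦"

# Unicode text: \w matches CJK characters, so every name is found.
unicode_words = re.findall(WORD_PATTERN, text)

# UTF-8 encoded bytes: with a bytes pattern, \w only matches ASCII letters,
# digits and underscore, so none of the multi-byte characters match.
byte_words = re.findall(WORD_PATTERN.encode("ascii"), text.encode("utf-8"))

print(unicode_words)
print(byte_words)
```

    An empty word list is exactly the situation the ValueError complains about, which would be consistent with the fix of converting the text to unicode first.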

    Here is the code:

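    # -*- coding: utf-8 -*-
    from os import path

    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud

    # Directory containing this script; input and output files live next to it.
    d = path.dirname(__file__)

    # Unicode strings (the u prefix matters on Python 2).
    words = [
        u'古天乐',
        u'郭富城',
        u'刘德华',
        u'周杰伦',
    ]

    # Mask image; it is loaded here but not passed to WordCloud below.
    # Add mask=mask to the constructor to shape the cloud.
    mask = np.array(Image.open(path.join(d, "xxx.jpg")))

    # Use a font that contains Chinese glyphs.
    font = r'C:\Windows\Fonts\simfang.ttf'
    wordcloud = WordCloud(font_path=font, width=500, height=600, margin=5,
                          background_color="white").generate(" ".join(words))
    wordcloud.to_file(path.join(d, "sb_mask.png"))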
    
    
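    The result is shown in sb_mask.png.

    By default wordcloud generates a rectangular image. You can also supply your own background image so that the cloud takes the shape drawn in it. Note that everything in the image outside the desired shape must be pure white (255, 255, 255).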

    mask = np.array(Image.open(path.join(d, "xxx.jpg")))
    

    Pass the mask to wordcloud, for example WordCloud(mask=mask, ...), and the cloud is drawn in the shape of the mask.
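
    Because the background must be pure white, a JPEG whose background is only nearly white can leave stray regions that still receive words. The sketch below is one way to handle that, under stated assumptions: whiten_background and shaped_cloud are hypothetical helper names, and the threshold of 250 is an arbitrary choice that you would tune for your image.

```python
import numpy as np


def whiten_background(mask, threshold=250):
    """Return a copy of mask with near-white pixels forced to pure white.

    Hypothetical helper: threshold is an assumed cut-off, not a wordcloud
    setting. Works for grayscale (H, W) and colour (H, W, 3 or 4) arrays.
    """
    mask = np.asarray(mask, dtype=np.uint8).copy()
    if mask.ndim == 2:
        mask[mask >= threshold] = 255
    else:
        # A pixel counts as background only if all colour channels are near white.
        near_white = (mask[..., :3] >= threshold).all(axis=-1)
        mask[near_white, :3] = 255
    return mask


def shaped_cloud(text, mask, font_path=None):
    """Render text inside the non-white area of mask (hypothetical helper)."""
    # Imported here so the mask helper above can be used on its own.
    from wordcloud import WordCloud

    cloud = WordCloud(mask=mask, font_path=font_path, background_color="white")
    return cloud.generate(text)


# Tiny 2x2 colour image: pure white, near white, black, and a saturated colour.
demo = np.array(
    [[[255, 255, 255], [252, 251, 253]],
     [[0, 0, 0], [250, 10, 250]]],
    dtype=np.uint8,
)
cleaned = whiten_background(demo)
print(cleaned[0, 1])  # the near-white pixel after cleaning
```

    With a real image you would call shaped_cloud(" ".join(words), whiten_background(mask), font_path=font) and then to_file() on the result, reusing the names from the script above.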

    wordcloud parameters (the options of the wordcloud_cli command-line tool)

    --text
    specify file of words to build the word cloud (default: stdin)
    
    Default: -
    
    --regexp
    override the regular expression defining what constitutes a word
    
    --stopwords
    specify file of stopwords (containing one word per line) to remove from the given text after parsing
    
    --imagefile
    file the completed PNG image should be written to (default: stdout)
    
    Default: -
    
    --fontfile
    path to font file you wish to use (default: DroidSansMono)
    
    --mask
    mask to use for the image form
    
    --colormask
    color mask to use for image coloring
    
    --contour_width
    if greater than 0, draw mask contour (default: 0)
    
    Default: 0
    
    --contour_color
    use given color as mask contour color - accepts any value from PIL.ImageColor.getcolor
    
    Default: "black"
    
    --relative_scaling
    scaling of words by frequency (0 - 1)
    
    Default: 0
    
    --margin
    spacing to leave around words
    
    Default: 2
    
    --width
    define output image width
    
    Default: 400
    
    --height
    define output image height
    
    Default: 200
    
    --color
    use given color as coloring for the image - accepts any value from PIL.ImageColor.getcolor
    
    --background
    use given color as background color for the image - accepts any value from PIL.ImageColor.getcolor
    
    Default: "black"
    
    --no_collocations
    do not add collocations (bigrams) to word cloud (default: add unigrams and bigrams)
    
    Default: True (collocations stay enabled unless this flag is given)
    
    --include_numbers
    include numbers in wordcloud?
    
    Default: False
    
    --min_word_length
    only include words with more than X letters
    
    Default: 0
    
    --prefer_horizontal
    ratio of times to try horizontal fitting as opposed to vertical
    
    Default: 0.9
    
    --scale
    scaling between computation and drawing
    
    Default: 1
    
    --colormap
    matplotlib colormap name
    
    Default: "viridis"
    
    --mode
    use RGB or RGBA for transparent background
    
    Default: "RGB"
    
    --max_words
    maximum number of words
    
    Default: 200
    
    --min_font_size
    smallest font size to use
    
    Default: 4
    
    --max_font_size
    maximum font size for the largest word
    
    --font_step
    step size for the font
    
    Default: 1
    
    --random_state
    random seed
    
    --no_normalize_plurals
    whether to remove trailing ‘s’ from words
    
    Default: True (plural normalization stays enabled unless this flag is given)
    
    --repeat
    whether to repeat words and phrases
    
    Default: False
    
    --version
    show program’s version number and exit
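
    The options above belong to the wordcloud_cli command. As a rough sketch of how they combine, the snippet below assembles a command line from Python. build_cli_command and run_cli are hypothetical helpers, the file names words.txt and cloud.png are placeholders, and only options that take a value are handled (switches such as --no_collocations or --repeat take no value and would need separate treatment).

```python
import subprocess


def build_cli_command(text_file, image_file, **options):
    """Build an argument list for wordcloud_cli (hypothetical helper).

    Each keyword becomes "--name value"; only value-taking options from the
    list above are supported by this sketch.
    """
    command = ["wordcloud_cli", "--text", text_file, "--imagefile", image_file]
    for name, value in options.items():
        command += ["--" + name, str(value)]
    return command


def run_cli(command):
    """Run the assembled command; raises CalledProcessError on failure."""
    subprocess.run(command, check=True)


# Placeholder file names; the option values mirror the Python script earlier.
command = build_cli_command(
    "words.txt",
    "cloud.png",
    background="white",
    width=500,
    height=600,
    margin=5,
)
print(" ".join(command))
```

    Calling run_cli(command) would execute it, provided wordcloud is installed and words.txt exists. For Chinese text you would most likely also pass fontfile with a font that contains Chinese glyphs.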
    

    Article link: https://www.haomeiwen.com/subject/epcfyltx.html