While using wordcloud I ran into the following error:
ValueError: We need at least 1 word to plot a word cloud, got 0.
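I hit this problem when using wordcloud to process Chinese text. Text that had first been segmented with jieba, however, caused no problem when passed to wordcloud.generate(). Printing the jieba output on its own showed that it was unicode, so I tried converting my content to unicode before passing it to generate(), and the error disappeared.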
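The conversion described above can be sketched roughly as follows. The helper name `to_text` and the UTF-8 encoding are assumptions made for illustration; the original text only says that the content was converted to unicode. In Python 3 an ordinary `str` is already unicode, so only byte strings need decoding.

```python
# -*- coding: utf-8 -*-

def to_text(content, encoding="utf-8"):
    """Return content as a unicode text string, decoding byte strings if needed.

    The default encoding (UTF-8) is an assumption; use whatever encoding
    the source data was actually written in.
    """
    if isinstance(content, bytes):
        return content.decode(encoding)
    return content


# A byte string, e.g. as read from a file opened in binary mode
raw = u"古天乐 郭富城".encode("utf-8")

# Text that is safe to hand to WordCloud.generate()
text = to_text(raw)
```

The resulting `text` is what would then be passed to `generate()` in place of the raw bytes.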
The full script is below:
# coding=utf-8
from os import path

import numpy as np
from PIL import Image
from wordcloud import WordCloud

# Directory that contains this script
d = path.dirname(__file__)
# Unicode strings, so that generate() receives unicode text
words = [
    u'古天乐',
    u'郭富城',
    u'刘德华',
    u'周杰伦',
]
# Background image loaded as an array (not passed to WordCloud in this call)
mask = np.array(Image.open(path.join(d, "xxx.jpg")))
# A font that contains Chinese glyphs
font = r'C:\Windows\Fonts\simfang.ttf'
wordcloud = WordCloud(font_path=font, width=500, height=600, margin=5,
                      background_color="white").generate(" ".join(words))
wordcloud.to_file(path.join(d, "sb_mask.png"))
The result is shown in the figure:
sb_mask.png
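By default a rectangular image is generated. You can also supply a background image of your own so that the cloud takes on the shape drawn in that image. Note that everything in the background image outside the desired shape must be pure white (255, 255, 255).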
mask = np.array(Image.open(path.join(d, "xxx.jpg")))
Pass mask to WordCloud (through its mask parameter) and the generated cloud will take the shape of the mask.
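Because the area outside the shape has to be exactly (255, 255, 255), a mask loaded from a JPEG may need cleaning first, since compression can leave near-white pixels behind. The sketch below is one possible way to do that; the helper name `whiten_background` and the threshold of 250 are assumptions for illustration, and it expects an RGB array of shape (height, width, 3).

```python
import numpy as np


def whiten_background(mask, threshold=250):
    """Return a copy of an RGB mask with near-white pixels set to pure white.

    A pixel counts as near-white when all of its channels are >= threshold.
    The threshold value is an illustrative assumption, not a library default.
    """
    cleaned = mask.copy()
    near_white = (cleaned >= threshold).all(axis=-1)
    cleaned[near_white] = 255
    return cleaned


# Tiny 1x3 RGB "image": a near-white pixel, a pure white pixel, a dark shape pixel
demo = np.array([[[252, 251, 253], [255, 255, 255], [10, 20, 30]]], dtype=np.uint8)
cleaned = whiten_background(demo)
```

The cleaned array can then be supplied as `mask=cleaned` when constructing `WordCloud`, in the same way as the `mask` array loaded above.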
wordcloud parameters (options of the wordcloud_cli command-line tool)
--text
specify file of words to build the word cloud (default: stdin)
Default: -
--regexp
override the regular expression defining what constitutes a word
--stopwords
specify file of stopwords (containing one word per line) to remove from the given text after parsing
--imagefile
file the completed PNG image should be written to (default: stdout)
Default: -
--fontfile
path to font file you wish to use (default: DroidSansMono)
--mask
mask to use for the image form
--colormask
color mask to use for image coloring
--contour_width
if greater than 0, draw mask contour (default: 0)
Default: 0
--contour_color
use given color as mask contour color - accepts any value from PIL.ImageColor.getcolor
Default: “black”
--relative_scaling
scaling of words by frequency (0 - 1)
Default: 0
--margin
spacing to leave around words
Default: 2
--width
define output image width
Default: 400
--height
define output image height
Default: 200
--color
use given color as coloring for the image - accepts any value from PIL.ImageColor.getcolor
--background
use given color as background color for the image - accepts any value from PIL.ImageColor.getcolor
Default: “black”
--no_collocations
do not add collocations (bigrams) to word cloud (default: add unigrams and bigrams)
Default: True
--include_numbers
include numbers in wordcloud?
Default: False
--min_word_length
only include words with more than X letters
Default: 0
--prefer_horizontal
ratio of times to try horizontal fitting as opposed to vertical
Default: 0.9
--scale
scaling between computation and drawing
Default: 1
--colormap
matplotlib colormap name
Default: “viridis”
--mode
use RGB or RGBA for transparent background
Default: “RGB”
--max_words
maximum number of words
Default: 200
--min_font_size
smallest font size to use
Default: 4
--max_font_size
maximum font size for the largest word
--font_step
step size for the font
Default: 1
--random_state
random seed
--no_normalize_plurals
whether to remove trailing ‘s’ from words
Default: True
--repeat
whether to repeat words and phrases
Default: False
--version
show program’s version number and exit
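To show how the options listed above fit together, here is a small sketch that assembles a `wordcloud_cli` argument list in Python. The helper `build_cli_args` and the file names are hypothetical; only the flag names come from the list above. Options given as `True` are emitted as bare switches, which suits flags such as `--repeat`.

```python
def build_cli_args(text_file, image_file, **options):
    """Build an argument list for wordcloud_cli from keyword options.

    Hypothetical helper: keyword names are used verbatim as flag names,
    so they must match the options documented above.
    """
    args = ["wordcloud_cli", "--text", text_file, "--imagefile", image_file]
    for name, value in sorted(options.items()):
        flag = "--" + name
        if value is True:
            # Boolean switches take no value
            args.append(flag)
        elif value is not False and value is not None:
            args.extend([flag, str(value)])
    return args


# Placeholder file names; sizes mirror the Python script earlier in the post
args = build_cli_args("words.txt", "cloud.png",
                      width=500, height=600, margin=5,
                      background="white", repeat=True)
```

The list could then be run with `subprocess.run(args, check=True)`. For Chinese text a `fontfile` option pointing at a font with Chinese glyphs would also be needed, just as `font_path` is set in the Python script.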