Counting word frequencies in a Word document with Python

Author: 潇湘demi | Published: 2020-05-06 18:36

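# -*- coding: utf-8 -*-
# Count how often each word appears in a Word (.docx) document.
# Requires the python-docx package, which provides the docx module.

import re

import docx

file_path = 'dear.docx'

word_counts = {}  # maps each lowercase word to its number of occurrences

doc = docx.Document(file_path)

for paragraph in doc.paragraphs:
    s1 = paragraph.text
    # Replace common punctuation with spaces, then lowercase the text.
    s2 = re.sub(r'[,."?!]', " ", s1).lower()
    for word in s2.split():
        # Start each new word at 0, then count this occurrence.
        word_counts.setdefault(word, 0)
        word_counts[word] += 1

print(word_counts)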
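The same counting can also be sketched with `collections.Counter` from the standard library, which removes the manual `setdefault` bookkeeping and makes it easy to list the most frequent words. This is only an illustrative variant, not the author's script: the function names `count_words` and `count_words_in_docx` and the sample sentences are made up for this example, and the punctuation set is assumed to be the same small one used in the script.

```python
import re
from collections import Counter


def count_words(texts):
    """Count lowercase word frequencies across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Same cleanup as the script: punctuation becomes a space, then lowercase.
        cleaned = re.sub(r'[,."?!]', " ", text).lower()
        counts.update(cleaned.split())
    return counts


def count_words_in_docx(file_path):
    """Hypothetical helper: count the words in the paragraphs of a .docx file."""
    import docx  # python-docx; imported here so count_words works without it

    doc = docx.Document(file_path)
    return count_words(p.text for p in doc.paragraphs)


if __name__ == "__main__":
    # Made-up sample text standing in for the paragraphs of a document.
    sample = ["Dear John, how are you?", "Dear Mary, are you well!"]
    for word, n in count_words(sample).most_common(3):
        print(word, n)
```

For a real file, `count_words_in_docx('dear.docx')` would return the counts, and `.most_common(10)` on the result would give the ten most frequent words. Note that only paragraph text is read here; text inside tables is not part of `doc.paragraphs`.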


Article link: https://www.haomeiwen.com/subject/eifgghtx.html