Python: Counting How Often Words Appear in a File (2)


Author: SkTj | Published: 2019-07-30 09:23

    import sys
    import re

    # Raw string so that \w reaches the regex engine unchanged
    WORD_RE = re.compile(r'\w+')

    # Maps each word to a list of (line_no, column_no) occurrences
    index = {}
    with open(sys.argv[1], encoding='utf-8') as fp:
        for line_no, line in enumerate(fp, 1):
            for match in WORD_RE.finditer(line):
                word = match.group()
                column_no = match.start() + 1  # 1-based column
                location = (line_no, column_no)
                # Get the list for this word (creating it if missing) and append
                index.setdefault(word, []).append(location)

    # print in alphabetical order
    for word in sorted(index, key=str.upper):
        print(word, index[word])

    # END INDEX
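
The script above records where each word occurs rather than how many times. The frequency promised by the title is simply the length of each location list. The following is a minimal sketch of that step, not part of the original post: `build_index` and `word_frequencies` are hypothetical helper names, and the sample lines stand in for a real file so the example runs without `sys.argv`.

```python
import re
from collections import Counter

WORD_RE = re.compile(r'\w+')


def build_index(lines):
    """Hypothetical helper: map each word to its (line_no, column_no) occurrences."""
    index = {}
    for line_no, line in enumerate(lines, 1):
        for match in WORD_RE.finditer(line):
            location = (line_no, match.start() + 1)  # 1-based column
            index.setdefault(match.group(), []).append(location)
    return index


def word_frequencies(index):
    """Hypothetical helper: derive a count per word from the location lists."""
    return Counter({word: len(locations) for word, locations in index.items()})


# Sample input (an assumption for illustration; any iterable of lines works,
# including an open file object).
sample = ["the quick brown fox", "jumps over the lazy dog"]
index = build_index(sample)
freq = word_frequencies(index)

print(index["the"])         # [(1, 1), (2, 12)]
print(freq.most_common(1))  # [('the', 2)]
```

Because `build_index` accepts any iterable of lines, passing the `fp` object from the original `with open(...)` block should work the same way. Note that the index is case-sensitive, so `The` and `the` are counted separately unless the words are normalized first.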
