Word Frequency Analysis of the 19th CPC National Congress Report

Posted 2021-11-25 justlikecode


1. Preparation

Requirements: Jupyter, Python 3.7, and the jieba library.
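Before running the script, it can help to confirm that the interpreter and the jieba package are in place. The following is a minimal sketch; the helper name `check_environment` is made up for illustration, and it only tests whether the package can be found, not which version is installed.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 7), package='jieba'):
    """Return (python_ok, package_installed) for the current interpreter."""
    python_ok = sys.version_info >= min_version
    package_installed = importlib.util.find_spec(package) is not None
    return python_ok, package_installed


python_ok, has_jieba = check_environment()
print('Python %d.%d, version ok: %s'
      % (sys.version_info[0], sys.version_info[1], python_ok))
print('jieba installed: %s' % has_jieba)
if not has_jieba:
    # jieba is published on PyPI under the same name.
    print('Install it with: pip install jieba')
```

In a Jupyter notebook the same check can be run in a cell before the main code.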

2. Python code

#! python3
# -*- coding: utf-8 -*-
import jieba
from collections import Counter

def get_words(txt):
    seg_list = jieba.cut(txt)   # segment the text into words
    c = Counter()
    for x in seg_list:          # count word frequencies
        # skip single-character tokens and line breaks
        if len(x) > 1 and x != '\r\n':
            c[x] += 1
    print('Most frequent words')
    for (k, v) in c.most_common(20):      # print the 20 most frequent words
        # left-pad short words, then draw one '*' per 3 occurrences
        print('%s%s %s  %d' % ('  '*(5-len(k)), k, '*'*(v//3), v))

if __name__ == '__main__':
    # 19d.txt holds the report text, encoded as UTF-8
    with open('19d.txt', 'r', encoding='utf8') as f:
        txt = f.read()
    get_words(txt)
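To see what the counting and formatting steps do without the report file or jieba, the sketch below applies the same filter and row format to a hand-written token list. The token list and the helper names `count_words` and `format_row` are made up for illustration; in the script above the tokens come from `jieba.cut`.

```python
from collections import Counter


def count_words(tokens):
    """Count tokens longer than one character, as get_words does."""
    counter = Counter()
    for token in tokens:
        if len(token) > 1 and token != '\r\n':
            counter[token] += 1
    return counter


def format_row(word, count, width=5, per_star=3):
    """Build one output row: padding, word, a bar of '*', and the count."""
    padding = '  ' * (width - len(word))   # two spaces per missing character
    bar = '*' * (count // per_star)        # one '*' per `per_star` occurrences
    return '%s%s %s  %d' % (padding, word, bar, count)


# Hypothetical tokens standing in for the output of jieba.cut.
tokens = ['发展'] * 7 + ['人民'] * 4 + ['的'] * 9 + ['现代化'] * 3
counts = count_words(tokens)
rows = [format_row(word, count) for word, count in counts.most_common(20)]
for row in rows:
    print(row)
```

The single-character token is dropped even though it is the most frequent, which is why the `len(x) > 1` test matters: without it, particles would dominate the top of the list.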

3. Output

[Figure: screenshot of the script's output]


References

https://blog.csdn.net/onestab/article/details/78307765

https://nbviewer.jupyter.org/github/windard/Python_Lib/blob/master/code/%E4%BD%BF%E7%94%A8%20wordcloud%20%E7%94%9F%E6%88%90%E8%AF%8D%E4%BA%91.ipynb
