Building a tag cloud with Python
Posted by 小卒子0624
step1: Install jieba and pytagcloud
pip install jieba
apt-get install python-pygame
pip install simplejson
pip install pytagcloud
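Once the packages are installed, a quick import check can confirm that everything is in place before moving on. This is only a minimal sketch: the helper name `check_modules` is made up for illustration and is not part of any of these libraries.

```python
import importlib


def check_modules(names):
    """Try to import each module and report whether it succeeded."""
    result = {}
    for name in names:
        try:
            importlib.import_module(name)
            result[name] = True
        except ImportError:
            result[name] = False
    return result


# Usage sketch for the packages installed in step1:
#   for name, ok in check_modules(
#           ["jieba", "pygame", "simplejson", "pytagcloud"]).items():
#       print(name, "OK" if ok else "MISSING")
```

Any module reported as missing should be reinstalled before continuing with step2.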
step2: Download a Chinese font file, for example simhei.ttf
- Locate the font directory of the pytagcloud package (/usr/local/lib/python2.7/dist-packages/pytagcloud/fonts)
- Copy the font file into it: cp simhei.ttf /usr/local/lib/python2.7/dist-packages/pytagcloud/fonts
- Edit fonts.json (vim fonts.json) and add a SimHei entry, as shown below
[
    {
        "name": "SimHei",
        "ttf": "simhei.ttf",
        "web": "none"
    },
    {
        "name": "Nobile",
        "ttf": "nobile.ttf",
        "web": "http://fonts.googleapis.com/css?family=Nobile"
    },
    {
        "name": "Old Standard TT",
        "ttf": "OldStandard-Regular.ttf",
        "web": "http://fonts.googleapis.com/css?family=Old+Standard+TT"
    },
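The manual edit of fonts.json could also be scripted. The sketch below assumes fonts.json is a JSON list of objects with "name", "ttf" and "web" keys, as in the excerpt; `register_font` is a hypothetical helper, not a pytagcloud function, and the path in the usage comment has to match your own installation.

```python
import json


def register_font(fonts_json_path, name, ttf, web="none"):
    """Add a font entry to a fonts.json list unless the name already exists.

    Returns True if the file was modified, False otherwise.
    """
    with open(fonts_json_path, "r") as fp:
        fonts = json.load(fp)

    # Do nothing if an entry with this name is already registered.
    if any(entry.get("name") == name for entry in fonts):
        return False

    # Put the new entry first, mirroring the hand-edited file.
    fonts.insert(0, {"name": name, "ttf": ttf, "web": web})
    with open(fonts_json_path, "w") as fp:
        json.dump(fonts, fp, indent=4)
    return True


# Usage sketch (adjust the path to your installation):
#   register_font(
#       "/usr/local/lib/python2.7/dist-packages/pytagcloud/fonts/fonts.json",
#       "SimHei", "simhei.ttf")
```

Running it twice is harmless, since the second call finds the existing entry and leaves the file untouched.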
step3: Scrape the text and save it as sent.txt (the file the step4 script reads)
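The post does not show how the text was scraped, so the following is only one possible sketch, not the author's method. It assumes Python 3, a single UTF-8 page, and the standard library only; `TextExtractor`, `html_to_text` and `save_page_text` are hypothetical names, and the URL is something you have to supply yourself.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the text of an HTML string, one text node per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def save_page_text(url, out_path="sent.txt", encoding="utf-8"):
    """Download one page (assumed UTF-8) and write its text to out_path."""
    with urlopen(url) as resp:
        html = resp.read().decode(encoding, errors="replace")
    with open(out_path, "w", encoding=encoding) as fp:
        fp.write(html_to_text(html))
```

For real sites a dedicated scraping library is usually more robust; the point here is only to end up with a plain-text file named sent.txt.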
step4: Generate the tag cloud
# -*- coding: utf-8 -*-
import io
import webbrowser
from operator import itemgetter

import jieba.analyse
from pytagcloud import create_tag_image, make_tags

# Read the scraped text (assumed to be UTF-8 encoded).
with io.open('sent.txt', 'r', encoding='utf-8') as fp:
    content = fp.read()

# Extract the 100 highest-ranked keywords as (word, TF-IDF weight) pairs.
top = jieba.analyse.extract_tags(content, topK=100, withWeight=True)

# The weights are floats, often below 1, so scale them before converting
# to int; a bare int(weight) would truncate most of them to 0.
# The factor 1000 is an arbitrary choice.
tagcloud = {}
for word, weight in top:
    tagcloud[word] = int(weight * 1000)
print(tagcloud)

# Sort by weight, largest first, and map the weights to font sizes.
swd = sorted(tagcloud.items(), key=itemgetter(1), reverse=True)
tags = make_tags(swd, minsize=20, maxsize=60)

# Render with the SimHei font registered in step2, then open the image.
create_tag_image(tags, 'cloud_large.png', background=(0, 0, 0, 255),
                 size=(900, 600), fontname='SimHei')
webbrowser.open('cloud_large.png')
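The script above ranks words by jieba's TF-IDF weight. A different option, named plainly here as a swap and not what the post does, is to size the words by raw frequency, which gives integer counts directly. The sketch below takes any iterable of tokens so it can be tried without jieba; `top_word_counts` and its parameters are hypothetical.

```python
from collections import Counter


def top_word_counts(tokens, top_k=100, min_length=2):
    """Return the top_k (word, count) pairs, most frequent first.

    Tokens shorter than min_length characters after stripping whitespace
    (single characters, most punctuation, blanks) are dropped.
    """
    counts = Counter(
        token.strip() for token in tokens
        if len(token.strip()) >= min_length
    )
    return counts.most_common(top_k)


# Usage sketch (assumes jieba is installed and content holds the text):
#   swd = top_word_counts(jieba.cut(content))
#   tags = make_tags(swd, minsize=20, maxsize=60)
```

Raw counts favour very common words, so a stop-word filter may be needed; TF-IDF already down-weights such words, which is one reason to keep the original approach.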