A complete English word-frequency counting example using a file

Posted 塨槟


1. Read in the string to be analyzed

  

text='''We don't talk anymore
We don't talk anymore
We don't talk anymore
Like we used to do
We don't laugh anymore
What was all of it for?
We don't talk anymore
Like we used to do
I just heard you found the one you've been lookin'
The one you been looking for
I wish i would've konwn that wasn't me
Cause even after all this time i still wonder
Why i can't move on?
Just the way you dance so easliy
Don't wanna know
The kinda dress you're wearin' tonight
If he's holdin' onto you so tight
The way i did before
I overdosed
Should've known your love was game
Now I can't get'cha out of my brain
Ooh it's such a shame
We don't talk anymore
We don't talk anymore
We don't talk anymore
Like we used to do
We don't laugh anymore
What was all of it for?
We don't talk anymore
Like we used to do
I just hope you'r lyin' next to somebody
Know it's hard to love ya like me
Must be a good reason that you're gone
Every now and then
I think you might want me to come show up your door
But I'm just too afraid that i'll be worng
Don't wanna know
If you'ra lookin' into her eyes
If she's holdin' onto you so tight
The way i did before
I overdosed
Should've know your love was a game
Now I can't get'cha out of my brain
Ooh it's such a shame
We don't talk anymore
We don't talk anymore
We don't talk anymore
Like we used to do
We don't laugh anymore
What was all of it for?
We don't talk anymore
Like we used to do
Like we used to do
Don't wanna know
The kinda dress you're wearin' tonight
If he's givin' it to you just right
The way i did before
I overdosed
Should've know your love was a game
Now I can't get'cha out of my brain
Ooh it's such a shame
We don't talk anymore
We don't talk anymore
We don't talk anymore
Like we used to do
We don't laugh anymore
What was all of it for?
We don't talk anymore
Like we used to do
We don't talk anymore
The way did before
We don't talk anymore
Ooh
Woo
Ooh it's such a shame
We don't talk anymore'''
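The code later in this post reads its input from a file named 1.txt, but the post does not show how that file was created. One possible way to save a string to a file and read it back is sketched below. The helper names save_text and load_text, the UTF-8 encoding, and the temporary demo file are assumptions made for illustration, not part of the original post.

```python
import os
import tempfile


def save_text(text, path):
    """Write text to path using UTF-8 (the encoding is an assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def load_text(path):
    """Read the whole file back as a single string."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# A short sample stands in for the full lyrics here.
sample = "We don't talk anymore\nLike we used to do"

# Hypothetical demo file in the system temp directory; the post itself uses '1.txt'.
demo_path = os.path.join(tempfile.gettempdir(), "wordfreq_demo.txt")
save_text(sample, demo_path)
print(load_text(demo_path) == sample)
```

To follow the post exactly, the full lyrics string would be passed to save_text with '1.txt' as the path.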

2. Split the text and extract the words

3. Build a counting dictionary

4. Exclude grammatical (function) words

5. Sort

6. Output the TOP 20

 

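# Read the text to analyze from the file; 'with' closes the file automatically
with open('1.txt', 'r') as fo:
    text = fo.read()  # named 'text' to avoid shadowing the built-in str

text = text.lower()  # convert to lowercase
for ch in ',.?':
    text = text.replace(ch, ' ')  # replace punctuation with spaces

# Split on any whitespace (spaces and newlines) to extract the words;
# split(' ') would leave line breaks inside tokens and produce empty strings
words = text.split()

exc = {'to', 'a', 'of', 'it'}  # high-frequency words that carry little meaning

dic = {}
keys = set(words)  # the set of distinct words that appear
keys = keys - exc  # exclude grammatical words
print(words)

for i in keys:
    dic[i] = words.count(i)  # counting dictionary: word -> number of occurrences
print(dic)

wc = list(dic.items())  # list of (word, count) pairs

wc.sort(key=lambda x: x[1], reverse=True)  # sort by count, descending
print(wc)

for item in wc[:20]:  # output the TOP 20 (slicing is safe with fewer than 20 words)
    print(item)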
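The same split, count, exclude, and sort steps can also be written more compactly with the standard library. The following is a minimal sketch, not the author's method: it swaps the manual dictionary for collections.Counter and uses a regular expression to extract words. The function name top_words, the STOP_WORDS constant, and the pattern that keeps apostrophes inside words (so "don't" stays one token) are assumptions made for illustration.

```python
import re
from collections import Counter

# Same excluded words as in the post; the constant name is hypothetical.
STOP_WORDS = {"to", "a", "of", "it"}


def top_words(text, n=20, stop_words=STOP_WORDS):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Lowercase, then keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Count only the words that are not in the exclusion set.
    counts = Counter(w for w in words if w not in stop_words)
    # most_common sorts by count in descending order.
    return counts.most_common(n)


sample = "We don't talk anymore. We don't laugh anymore. What was all of it for?"
for word, count in top_words(sample, 3):
    print(word, count)
```

Counter.most_common does the sorting and the top-n cut in one call, and counting in a single pass avoids calling words.count once per distinct word, which matters for longer texts.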

