Composite Data Types Practice and an English Word-Frequency Counting Example
Posted by zhoujinpeng
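1. List example: create a list of homework scores from a string, then add, delete, modify, look up, count, and traverse its elements. For example, find the index of the first score of 3, and count how many students scored 1 and how many scored 3.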
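m = list('123223121321312')            # one character per homework score
print('Scores:', m)
m.append('3')                          # add a score at the end
print('After append:', m)
m.pop()                                # delete the last score
print('After pop:', m)
m.insert(2, '2')                       # insert a score at index 2
print('After insert:', m)
m[2] = '1'                             # modify the score at index 2
print('After update:', m)
print('Index of the first score of 3:', m.index('3'))
print('Students with 1 point:', m.count('1'))
print('Students with 3 points:', m.count('3'))
for position, score in enumerate(m):   # traverse the list
    print(position, score)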
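2. Dictionary example: build a dictionary mapping students to their scores, then add, delete, modify, look up, and traverse its entries.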
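a = {'周周': 98, '张四': 93, '李三': 87, '李五': 92, '周六': 96}
print('Student score dictionary:', a)
a['吴沟'] = 78                  # add a new key
print('Add a student')
print(a)
a.pop('孙十一', None)           # '孙十一' is not a key here; the default avoids a KeyError
print('Delete 孙十一')
print(a)
a['吴沟'] = 87                  # assigning to an existing key overwrites its value
print("Update 吴沟's score")
print(a)
print("Look up 周周's score:", a.get('周周'))
for name, score in a.items():   # traverse keys and values together
    print(name, score)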
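3. Traversing lists, tuples, dictionaries, and sets.
Summarize the connections and differences between lists, tuples, dictionaries, and sets.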
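m = list('123484123413216')
n = tuple('161231313535')
i = {'01': 12, '02': 546, '03': 123456, '04': 8524, '05': 1546, '06': 679}
j = {1, 2, 3, 4, 5}
for x in m:                      # list: visited in index order
    print('list item:', x)
for x in n:                      # tuple: visited in index order
    print('tuple item:', x)
for key, value in i.items():     # dictionary: visit key-value pairs
    print('dict item:', key, value)
for x in j:                      # set: no guaranteed order
    print('set item:', x)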
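List: ordered and mutable, written with []; supports adding, deleting, modifying, and looking up elements.
Tuple: ordered but read-only (immutable), written with ().
Dictionary: stores key-value pairs with unique keys, written with {}; mutable, and since Python 3.7 it keeps insertion order, though entries are looked up by key rather than by position.
Set: unordered and mutable with unique elements, written with {} or created with set(); an empty set must be written set(), because {} creates an empty dictionary.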
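The differences summarized above can be checked directly. The following is a minimal sketch; the variable names and values are illustrative only and do not come from the exercise.

```python
# Illustrative values only; none of these names come from the exercise.
scores_list = [3, 1, 2, 3]        # list: ordered, mutable, duplicates allowed
scores_tuple = (3, 1, 2, 3)       # tuple: ordered, immutable
scores_dict = {'01': 3, '02': 1}  # dict: unique keys mapped to values
scores_set = {3, 1, 2, 3}         # set: unique elements, no positional index

scores_list[0] = 5                # in-place modification works for a list
try:
    scores_tuple[0] = 5           # a tuple rejects item assignment
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

scores_dict['01'] = 4             # assigning to an existing key overwrites its value
empty_braces = {}                 # {} is an empty dict, not an empty set
empty_set = set()                 # an empty set must be created with set()

print(scores_list)                # [5, 1, 2, 3]
print(tuple_is_mutable)           # False
print(len(scores_set))            # 3, because the duplicate 3 is dropped
print(type(empty_braces).__name__, type(empty_set).__name__)  # dict set
```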
4. English word-frequency counting example
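- Start from the string to analyze
- Split the string and extract the words
- Normalize case with txt.lower()
- Replace the separator characters '.,:;?!-_' with spaces
- Count the words in a dictionary
- Exclude grammatical words: pronouns, articles, conjunctions
- Sort with list.sort()
- Print the top 10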
news = '''For years, British explorer William Lindesay’s inquiries about a possible extension of the Great Wall in Mongolia turned up nothing, but the researcher recently had a breakthrough. Seeking insight from Professor Baasan Tudevin, a lauded but hard-to-find expert on the region, Lindesay posted an advertisement in a local newspaper. It was a long shot, but the two connected and the Mongolian geographer said he knew of several such structures in the Gobi desert, the Telegraph reports. Lindesay formed an expedition in August and with two Land Cruisers, 44 gallons of water, 12 gallons of extra gasoline and a lead from Google Earth, began poking around about 25 miles from the sensitive Chinese-Mongolian border. Two days into the exploration, his team discovered what is thought to be the first section of the Great Wall to exist outside of China. Lost for nearly 1,000 years, the wall’s 62-mile-long arm is made mostly of shrubs and dirt. Lindesay told the Telegraph much of the wall is about shin-level, but there is also a stretch that reaches up to his shoulders.'''
exc = {'', 'the', 'of', 'a', 'but', 'two', 'about', 'in', 'is'}   # words to exclude
news = news.lower()                  # normalize case
for i in ',.':                       # replace separators with spaces
    news = news.replace(i, ' ')
words = news.split()                 # split on any run of whitespace
dic = {}
keys = set(words) - exc              # set difference; safe even if an excluded word is absent
for i in keys:
    dic[i] = words.count(i)          # count each remaining word
wc = list(dic.items())
wc.sort(key=lambda x: x[1], reverse=True)   # most frequent first
for item in wc[:10]:                 # slicing is safe with fewer than 10 words
    print(item)
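The same steps can also be expressed with collections.Counter from the standard library, which does the counting and the top-N selection in one object. This is a sketch of an alternative, not the approach used in the exercise; the function name top_words, its parameters, and the sample sentence are assumptions made for illustration. The default separators are the ones listed in the steps for this section.

```python
from collections import Counter


def top_words(text, stopwords, n=10, separators='.,:;?!-_'):
    """Return the n most common words in text, skipping stopwords.

    top_words and its parameters are hypothetical names for this sketch.
    """
    text = text.lower()                       # normalize case
    for ch in separators:                     # replace separators with spaces
        text = text.replace(ch, ' ')
    words = [w for w in text.split() if w not in stopwords]
    return Counter(words).most_common(n)      # list of (word, count) pairs


# Made-up sample sentence, not taken from the news text.
sample = 'The wall is long. The wall is old, but the wall stands.'
print(top_words(sample, {'the', 'is', 'but'}, n=2))   # [('wall', 3), ('long', 1)]
```

Counter.most_common keeps words with equal counts in the order they were first seen, so the output is reproducible, unlike iterating over a set of keys.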
5. Text file operations
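with open('/Users/Administrator/Desktop/test.txt', 'r') as fo:   # the file is closed automatically
    news = fo.read()
exc = {'', 'the', 'of', 'a', 'but', 'two', 'about', 'in', 'is'}   # words to exclude
news = news.lower()
for i in ''',.?!"''':                # replace separators with spaces
    news = news.replace(i, ' ')
print(news)
words = news.split()                 # split on any run of whitespace, including newlines
print(words)
d = {}
keys = set(words) - exc              # avoids a KeyError when the file lacks an excluded word
for i in keys:
    d[i] = words.count(i)
wc = list(d.items())
wc.sort(key=lambda x: x[1], reverse=True)   # most frequent first
for item in wc[:10]:                 # slicing is safe with fewer than 10 words
    print(item)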
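For larger files, reading one line at a time avoids holding the whole text in memory. The sketch below is one possible variant under stated assumptions: the function name count_file_words is hypothetical, the file is assumed to be UTF-8 encoded, and a throwaway temporary file stands in for test.txt so the example runs anywhere.

```python
import os
import tempfile
from collections import Counter


def count_file_words(path, stopwords, separators=',.?!"'):
    """Count words in a text file one line at a time.

    count_file_words is a hypothetical helper; UTF-8 encoding is assumed.
    """
    counts = Counter()
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            line = line.lower()
            for ch in separators:             # replace separators with spaces
                line = line.replace(ch, ' ')
            counts.update(w for w in line.split() if w not in stopwords)
    return counts


# Write a small throwaway file so the sketch does not depend on a real path.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('The wall is long.\nIs the wall old?\n')
    demo_path = tmp.name

demo_counts = count_file_words(demo_path, {'the', 'is'})
os.remove(demo_path)                          # clean up the temporary file
print(demo_counts.most_common(1))             # [('wall', 2)]
```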