First Crawler and Tests
Posted by zhangsijie
Completing the match simulation program and testing it
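# Badminton match analysis and tests
# Rules: a game is played to 21 points. From 20-all, the first side to lead
# by 2 points wins the game; at 29-all, the side that scores the next point wins.
from random import random


def printIntro():
    print("This program simulates badminton games between two players, A and B.")
    print("It needs an ability value for A and for B (a decimal between 0 and 1).")


def getInputs():
    # float()/int() replace eval() so that typed input is never executed as code
    a = float(input("Enter the ability value of player A (0-1): "))
    b = float(input("Enter the ability value of player B (0-1): "))
    n = int(input("Number of games to simulate: "))
    return a, b, n


def simNGames(n, probA, probB):
    winsA, winsB = 0, 0
    for i in range(n):
        scoreA, scoreB = simOneGame(probA, probB)
        if scoreA > scoreB:
            winsA += 1
        else:
            winsB += 1
    return winsA, winsB


# Test: too few arguments are passed on purpose, so a TypeError is raised and reported
try:
    simNGames(0.55)
except TypeError:
    print("simNGames Error")


def gameOver(a, b):
    # At 29-all the next point decides the game, so reaching 30 always ends it
    if a == 30 or b == 30:
        return True
    # Otherwise the leader needs at least 21 points and a 2-point lead
    return max(a, b) >= 21 and abs(a - b) >= 2


def simOneGame(probA, probB):
    scoreA, scoreB = 0, 0
    serving = "A"
    while not gameOver(scoreA, scoreB):
        # Only the serving side scores; losing the rally hands over the serve
        if serving == "A":
            if random() < probA:
                scoreA += 1
            else:
                serving = "B"
        else:
            if random() < probB:
                scoreB += 1
            else:
                serving = "A"
    return scoreA, scoreB


# Test: again called with too few arguments to exercise the error path
try:
    simOneGame(0.54)
except TypeError:
    print("simOneGame Error")


def printSummary(winsA, winsB):
    n = winsA + winsB
    print("Analysis started: {} games simulated in total".format(n))
    print("Player A won {} games, {:0.1%} of the total".format(winsA, winsA / n))
    print("Player B won {} games, {:0.1%} of the total".format(winsB, winsB / n))


def main():
    print(48)
    printIntro()
    probA, probB, n = getInputs()
    winsA, winsB = simNGames(n, probA, probB)
    printSummary(winsA, winsB)


main()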
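The end-of-game rule stated in the opening comment is easy to get subtly wrong, and the random simulation hides such mistakes. A minimal sketch of checking the rule on its own with fixed scores follows. It assumes standard 21-point games, and `is_game_over` and `check_rule` are hypothetical stand-alone helpers written for this illustration, not the functions from the listing.

```python
def is_game_over(a, b):
    """Return True when a game with scores a and b is finished.

    Assumption: games are played to 21, a 2-point lead is needed
    from 20-all, and the score is capped at 30.
    """
    if a == 30 or b == 30:
        return True
    return max(a, b) >= 21 and abs(a - b) >= 2


def check_rule():
    """Run fixed-score checks of the rule; return how many cases were checked."""
    cases = [
        ((21, 10), True),   # plain win at 21
        ((20, 20), False),  # 20-all, play continues
        ((21, 20), False),  # only a 1-point lead
        ((22, 20), True),   # 2-point lead after 20-all
        ((29, 29), False),  # 29-all, next point decides
        ((30, 29), True),   # cap reached by A
        ((29, 30), True),   # cap reached by B
        ((15, 3), False),   # game still in progress
    ]
    for (a, b), expected in cases:
        assert is_game_over(a, b) == expected, (a, b)
    return len(cases)


print(check_rule(), "cases passed")
```

Because the checks use fixed scores instead of `random()`, a failure points directly at the rule rather than at an unlucky simulation run.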
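Use the get() function of requests to access the Sogou search home page 20 times, print the returned status and the text content, and compute the length of the page content returned by the text attribute and by the content attribute.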
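import requests


def getHTMLText(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()  # raise an exception for 4xx/5xx responses
        r.encoding = 'utf-8'
        return r.text
    except requests.RequestException:
        # any network or HTTP error yields an empty string
        return ""


url = "http://www.sogou.com/"
print(getHTMLText(url))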
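The task also asks for 20 requests and for the lengths of `text` and `content`. One way this might be sketched is shown below. `fetch_stats` and its `get` parameter are hypothetical names introduced here; `status_code`, `encoding`, `text` and `content` are attributes of the requests response object. The `get` parameter exists only so the helper can be tried with a stand-in function instead of a live connection.

```python
def fetch_stats(url, times=20, get=None):
    """Request url `times` times and collect (status, len(text), len(content)).

    `get` defaults to requests.get; passing another callable with the same
    shape makes it possible to run the helper without network access.
    """
    if get is None:
        import requests  # third-party package, imported only when needed
        get = requests.get

    stats = []
    for _ in range(times):
        r = get(url, timeout=30)
        r.encoding = "utf-8"
        # text is a decoded str (counted in characters),
        # content is the raw body (counted in bytes)
        stats.append((r.status_code, len(r.text), len(r.content)))
    return stats


def print_stats(stats):
    """Print one line per request."""
    for i, (status, text_len, content_len) in enumerate(stats, start=1):
        print("request {:2d}: status={} len(text)={} len(content)={}".format(
            i, status, text_len, content_len))
```

Calling `print_stats(fetch_stats("http://www.sogou.com/"))` would issue the 20 requests. Network errors are not caught in this sketch, so they propagate to the caller. The two lengths usually differ on pages with Chinese text, because one UTF-8 encoded character takes several bytes.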
Scraping the Chinese university ranking website
import requests
from bs4 import BeautifulSoup

allUniv = []


def getHTMLText(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()  # raise an exception for 4xx/5xx responses
        r.encoding = 'utf-8'
        return r.text
    except requests.RequestException:
        return ""


def fillUnivList(soup):
    data = soup.find_all('tr')
    for tr in data:
        ltd = tr.find_all('td')
        if len(ltd) == 0:
            continue  # skip rows without <td> cells, such as header rows
        singleUniv = []
        for td in ltd:
            singleUniv.append(td.string)
        allUniv.append(singleUniv)


def printUnivList(num):
    # chr(12288) is the full-width space, used as the fill character so that
    # columns of Chinese text line up.
    # Column headers: rank, university name, province/city, total score, annual fee
    print("{1:^2}{2:{0}^10}{3:{0}^6}{4:{0}^4}{5:{0}^10}".format(chr(12288), "排名", "学校名称", "省市", "总分", "年费"))
    # Slicing avoids an IndexError when fewer than num rows were collected
    for u in allUniv[:num]:
        print("{1:^4}{2:{0}^10}{3:{0}^5}{4:{0}^8.1f}{5:{0}^11}".format(chr(12288), u[0], u[1], u[2], float(u[3]), u[11]))


def main():
    url = 'http://www.zuihaodaxue.com/zuihaodaxuepaiming2018.html'
    html = getHTMLText(url)
    soup = BeautifulSoup(html, "html.parser")
    fillUnivList(soup)
    printUnivList(10)


main()
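The least obvious part of the listing is the nested format specification that uses `chr(12288)` as the fill character. A small stand-alone sketch of that technique follows. `format_row` is a hypothetical helper, and the rows are made-up placeholder values, not data from the ranking site.

```python
FULLWIDTH_SPACE = chr(12288)  # U+3000, about as wide as one CJK character


def format_row(rank, name, province, score):
    """Center each field; CJK columns are padded with the full-width space.

    Argument 0 is substituted into the format spec of fields 2 and 3,
    where it acts as the fill character in front of the '^' alignment.
    """
    return "{1:^4}{2:{0}^10}{3:{0}^5}{4:^8.1f}".format(
        FULLWIDTH_SPACE, rank, name, province, score
    )


# Made-up placeholder rows, used only to show the alignment
rows = [
    ("1", "甲大学", "北京", 90.0),
    ("2", "乙科技大学", "上海", 85.5),
]

for row in rows:
    print(format_row(*row))
```

Padding with an ordinary ASCII space would leave the columns ragged in most terminals, since a CJK character is usually drawn about twice as wide as an ASCII space. Padding with the full-width space keeps every cell in a column the same visual width.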