Python: Fetching HTML with XPath and Filtering the Results
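Preface: This article was compiled by the editors of 小常识网 (cha138.com). It shows how to fetch HTML in Python with XPath and filter the results, and we hope it serves as a useful reference.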
#!/usr/bin/python
# -*- coding: utf-8 -*-
from lxml import html
import requests
import re
# CEX HTML REQUEST
gameList = []
gameList.append(["Deadlight: Director's Cut", "12", "5035228121522"])
gameList.append(["Devil May Cry Definitive Edition", "12", "5055060930755"])
gameList.append(["Just Cause 3", "18", "5021290069770"])
gameList.append(["République", "20", "813633016542"])
gameList.append(["Until Dawn", "20", "711719874836"])
gameList.append(["Dying Light", "28", "5051892165280"])
gameList.append(["Uncharted 4", "28", "0711719454410"])
gameList.append(["Little Nightmares + Figure", "30", "3391891992473"])
gameList.append(["WipeOut Omega Collection", "30", "711719854463"])
gameList.append(["Rise of Tomb Raider", "32", "5021290074767"])
gameList.append(["Yakuza 0", "35", "5055277027996"])
gameList.append(["Hitman", "35", "5021290075863"])
gameList.append(["Nioh", "38", "711719819066"])
gameList.append(["NieR: Automata", "45", "5021290074484"])
gameList.append(["Bioshock: The Collection", "48", "5026555421898"])
newPricesHTML = ""
newPricesTXT = ""
for i in range(len(gameList)):
    # Fetch the product page for this barcode
    page = requests.get("https://pt.webuy.com/product-detail?id=" + gameList[i][2])
    source = html.fromstring(page.content)
    # text() yields a list of strings; join them into one string
    priceFullText = source.xpath('//td[@id="Asellprice"]/text()')
    priceFullString = "".join(priceFullText)
    # print(page.content)
    # Read the whole-euro amount that follows the euro sign (U+20AC)
    priceMatch = re.search(u"\u20ac\\s*(\\d+)", priceFullString)
    if priceMatch is None:
        print(gameList[i][0] + " : price not found")
        continue
    newPrice = int(priceMatch.group(1))
    if newPrice < int(gameList[i][1]):
        newPricesHTML += gameList[i][0] + ": "
        newPricesHTML += str(newPrice) + ": "
        newPricesHTML += "https://pt.webuy.com/product.php?sku=" + gameList[i][2]
        newPricesHTML += "<br>"
        newPricesTXT += gameList[i][0] + ": "
        newPricesTXT += str(newPrice) + ": "
        newPricesTXT += "https://pt.webuy.com/product.php?sku=" + gameList[i][2]
        newPricesTXT += "\n"
        print(gameList[i][0] + " : YES! from " + gameList[i][1] + " to " + str(newPrice))
    else:
        print(gameList[i][0] + " : NOPE!")
# Write the report only when at least one price dropped
if len(newPricesHTML) != 0:
    with open("/Users/diogoqueiros/Downloads/CEX-Prices.txt", "w") as text_file:
        text_file.write("CEX-PricesDown:\n%s" % newPricesTXT)
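
The XPath lookup and price comparison in the script can be tried offline, without calling the CEX site. The sketch below is only an illustration under several assumptions: the sample markup is made up (the real page layout may differ), the helper names `extract_price` and `is_cheaper` are hypothetical, and it uses the standard library's `xml.etree.ElementTree` rather than `lxml`. ElementTree supports only a subset of XPath and needs well-formed markup, so it stands in for `lxml` here only to keep the example free of third-party dependencies.

```python
# -*- coding: utf-8 -*-
import re
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the real CEX page layout may differ.
SAMPLE = u"<table><tr><td id='Asellprice'>\u20ac12.00</td></tr></table>"

# Whole-euro amount that follows the euro sign (U+20AC).
EURO_PRICE = re.compile(u"\u20ac\\s*(\\d+)")


def extract_price(markup):
    """Return the whole-euro price in the Asellprice cell, or None."""
    root = ET.fromstring(markup)
    # ElementTree's limited XPath: any descendant <td> with this id
    cell = root.find(".//td[@id='Asellprice']")
    if cell is None or cell.text is None:
        return None
    match = EURO_PRICE.search(cell.text)
    return int(match.group(1)) if match else None


def is_cheaper(markup, threshold):
    """True when a price was found and it is below the threshold."""
    price = extract_price(markup)
    return price is not None and price < threshold


if __name__ == "__main__":
    print(extract_price(SAMPLE))
    print(is_cheaper(SAMPLE, 20))
```

Returning `None` when the cell or the euro amount is missing lets the caller skip a product instead of failing on `int()`, which is the same guard the script needs when a page does not contain the expected cell.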