Scraping Qiushibaike users' locations and detailed coordinates

Posted by zhentaofrezt



Code:

import csv
import json

import requests
from lxml import etree

# Output CSV: one row per user whose address could be geocoded.
fp = open('E:/map.csv', 'wt', newline='', encoding='utf-8')
writer = csv.writer(fp)
writer.writerow(('address', 'longitude', 'latitude'))
headers = {'User-Agent': 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0)'}


def get_user_url(url):
    """Collect user profile links from one listing page and visit each."""
    url_part = 'http://www.qiushibaike.com'
    res = requests.get(url, headers=headers)
    selector = etree.HTML(res.text)  # the parser function is etree.HTML, upper case
    url_infos = selector.xpath('//div[@class="article block untagged mb15"]')
    for url_info in url_infos:
        user_part_urls = url_info.xpath('div[1]/a[1]/@href')
        # Skip posts that do not yield exactly one profile link.
        if len(user_part_urls) == 1:
            get_user_address(url_part + user_part_urls[0])


def get_user_address(url):
    """Read the address field from a user profile page and geocode it."""
    res = requests.get(url, headers=headers)
    selector = etree.HTML(res.text)
    # Use the same absolute XPath for the check and the extraction.
    address = selector.xpath('//div[2]/div[3]/div[2]/ul/li[4]/text()')
    if address:
        # Keep only the part before the '·' separator.
        get_geo(address[0].split('·')[0])


def get_geo(address):
    """Geocode an address with the AMap API and append the result to the CSV."""
    par = {'address': address, 'key': 'cb649a25c1f81c1451adbeca73623251'}
    api = 'http://restapi.amap.com/v3/geocode/geo'
    res = requests.get(api, params=par)
    json_data = json.loads(res.text)  # json.loads parses a string; json.load expects a file object
    try:
        geo = json_data['geocodes'][0]['location']
        longitude = geo.split(',')[0]
        latitude = geo.split(',')[1]
        writer.writerow((address, longitude, latitude))
    except (IndexError, KeyError):
        # No geocoding result for this address.
        pass


if __name__ == '__main__':
    urls = ['http://www.qiushibaike.com/text/page/{}/'.format(i) for i in range(1, 36)]
    for url in urls:
        get_user_url(url)
    fp.close()  # flush any buffered rows to disk
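
The geocoding step can be checked on its own, without any network access. The following is a minimal sketch, assuming the response has the shape the code above indexes into (a `geocodes` list whose first item has a `location` string of the form `"longitude,latitude"`). The helper name `parse_location` is hypothetical, and the sample payloads are made up for illustration; they are not real API output.

```python
import json


def parse_location(payload):
    """Return (longitude, latitude) strings, or None if no result is present.

    Assumes the response shape indexed in the code above:
    {"geocodes": [{"location": "longitude,latitude"}, ...]}.
    """
    geocodes = payload.get("geocodes") or []
    if not geocodes:
        return None
    location = geocodes[0].get("location")
    if not isinstance(location, str) or "," not in location:
        return None
    longitude, latitude = location.split(",", 1)
    return longitude, latitude


# Made-up sample payloads for illustration only.
sample_hit = json.loads('{"geocodes": [{"location": "116.40,39.90"}]}')
sample_miss = json.loads('{"geocodes": []}')

print(parse_location(sample_hit))
print(parse_location(sample_miss))
```

Returning `None` instead of raising makes the "no result" case explicit, so the caller can decide whether to log it or skip it.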
Problem: the generated CSV file contains nothing except the header row written at the start, so the scrape failed.
Solution: unknown so far.
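
One way to narrow this down is to make skipped records visible, since the code skips a record silently whenever an XPath matches nothing or the geocoder returns no result. The following is a minimal sketch with hypothetical names (`write_rows`, `geocode`, `fake_results`) and an in-memory buffer standing in for `E:/map.csv`. It only illustrates counting written and skipped records and does not claim to identify the actual cause.

```python
import csv
import io


def write_rows(addresses, geocode, out):
    """Write one CSV row per address that geocode() resolves.

    geocode is any callable returning (longitude, latitude) or None;
    it is a stand-in for the real lookup. Returns (written, skipped).
    """
    writer = csv.writer(out)
    writer.writerow(("address", "longitude", "latitude"))
    written, skipped = 0, []
    for address in addresses:
        result = geocode(address)
        if result is None:
            skipped.append(address)  # keep the failure visible
            continue
        writer.writerow((address, *result))
        written += 1
    return written, skipped


# Fake lookup table used instead of a network call.
fake_results = {"北京": ("116.40", "39.90")}

buffer = io.StringIO(newline="")
written, skipped = write_rows(["北京", "unknown place"], fake_results.get, buffer)
print(written, skipped)
```

If `written` stays at 0 while `skipped` fills up, the geocoding step is the place to look. If both stay empty, the earlier page-parsing step probably produced no addresses at all.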