Python: Scraping Population Migration Data (Using Tencent Qianxi as an Example)

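Notes:

1. The migration volumes are values that Tencent has adjusted, so their accuracy cannot be verified.

2. When this code was run, Tencent Qianxi had no IP blocking or browser detection in place, so the code below is only guaranteed to work around the time of publication.

3. What the code does: for a specified date, it scrapes the migration volumes (both inbound and outbound) for about forty cities.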
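Note 2 means the bare urlopen call used in this post may stop working if the service starts checking clients. The following is a minimal sketch of building the same URL with an explicit request header. The helper name build_request and the User-Agent value are illustrative assumptions, and whether such a header is sufficient for the current service has not been verified.

```python
import urllib.request

BASE_URL = "https://lbs.gtimg.com/maplbs/qianxi/"


def build_request(date, city_code, direction):
    """Build (but do not send) a request for one date, city code and direction.

    direction is "0" for inbound or "1" for outbound, as in the script in this post.
    """
    url = BASE_URL + date + "/" + city_code + direction + "6.js"
    # Assumption: a generic browser-like User-Agent; the value is illustrative only.
    headers = {"User-Agent": "Mozilla/5.0"}
    return urllib.request.Request(url, headers=headers)


req = build_request("20171016", "360100", "0")
print(req.full_url)
# To actually send it (requires network access):
# data = urllib.request.urlopen(req, timeout=10).read().decode("utf8")
```

Passing a timeout to urlopen keeps a stalled connection from hanging the whole loop over cities.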

import re
import urllib.request
import xlwt
import xlrd

date = "20171016"
# city.xls: column 0 holds city names, column 1 holds city codes; row 0 is a header row
citySheet = xlrd.open_workbook("E:/city.xls").sheet_by_index(0)
cityList = citySheet.col_values(0)      # ['city', '南昌', '景德镇', '萍乡', ...
cityCodeList = citySheet.col_values(1)  # ['cityCode', '360100', '360200', ...
direction = ["0", "1"]  # "0" inbound: result -> city; "1" outbound: city -> result
header = ["from", "to", "number", "car", "train", "plane"]
ptRow = re.compile(r'(\[".*?\])')  # matches one row, e.g. ["重庆",198867,0.000,0.300,0.700]
for cityIndex in range(1, len(cityCodeList)):  # start at 1 to skip the header row
    for dInd in range(2):
        url = "https://lbs.gtimg.com/maplbs/qianxi/" + date + "/" + cityCodeList[cityIndex] + direction[dInd] + "6.js"
        # One workbook per city and direction
        workbook = xlwt.Workbook()
        sheet = workbook.add_sheet("result")
        for i in range(len(header)):
            sheet.write(0, i, header[i])
        try:
            data = urllib.request.urlopen(url).read().decode("utf8")  # JSONP_LOADER&&JSONP_LOADER([["重庆",198867,0.000,0.300,0.700],["上海",174152,0.160,0.390,0.450],[...
            dataList = re.findall(ptRow, data)  # ['["重庆",198867,0.000,0.300,0.700]', '["上海",174152,0.160,0.390,0.450]', ...
            for i in range(len(dataList)):
                colList = dataList[i].split(",")  # colList[4] is '0.700]'
                otherCity = colList[0].replace("[", "").replace('"', "")  # strip the bracket and quotes
                if direction[dInd] == "0":
                    sheet.write(i + 1, 0, otherCity)            # from
                    sheet.write(i + 1, 1, cityList[cityIndex])  # to
                else:
                    sheet.write(i + 1, 0, cityList[cityIndex])  # from
                    sheet.write(i + 1, 1, otherCity)            # to
                sheet.write(i + 1, 2, colList[1])  # number
                sheet.write(i + 1, 3, colList[2])  # car
                sheet.write(i + 1, 4, colList[3])  # train
                sheet.write(i + 1, 5, colList[4].replace("]", ""))  # plane
        except Exception as e:
            print(e)
        workbook.save("E:/qianxi/" + str(cityList[cityIndex]) + direction[dInd] + date + ".xls")
print("Done!")
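The script above pulls rows out of the response with a regular expression and then strips brackets and quotes by hand. As an alternative sketch, the JSONP wrapper can be removed and the payload decoded with the standard json module, which yields numbers as numbers rather than strings. The helper name parse_jsonp is hypothetical. The sample string is the truncated response shown in the code comment, closed off with `])` on the assumption that the real response ends that way.

```python
import json


def parse_jsonp(text):
    """Decode the JSON array wrapped in a JSONP-style call such as NAME([...])."""
    start = text.index("(") + 1  # first character after the opening parenthesis
    end = text.rindex(")")       # position of the last closing parenthesis
    return json.loads(text[start:end])


# Assumption: the full response is the wrapper shown in the post, terminated by "])".
sample = 'JSONP_LOADER&&JSONP_LOADER([["重庆",198867,0.000,0.300,0.700],["上海",174152,0.160,0.390,0.450]])'
rows = parse_jsonp(sample)
for city, number, car, train, plane in rows:
    print(city, number, car, train, plane)
```

Each row unpacks directly into the city, number, car, train and plane columns used in the spreadsheet header, so the replace calls are no longer needed.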

Results:

[Two result screenshots; images not preserved]
