爬虫实战12306购票抓包分析以及任务分解

Posted ZSYL

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了爬虫实战12306购票抓包分析以及任务分解相关的知识,希望对你有一定的参考价值。

12306购票抓包分析以及任务分解

前言

中国人离不开火车票,学会了就能买票回家!


内容

  • 12306购票抓包分析以及任务分解
  • 处理验证码并完成登陆
  • 解析车站信息以及车辆信息
  • 预定订单初始化、解析用户信息以及坐席信息
  • 构造时间参数以及下单购票
  • 测试运行以及完整代码

1. 抓包分析

使用谷歌浏览器或fiddler等抓包工具完成登陆以及购票操作,进行抓包,根据 具有业务作用被set-cookie 确定以下内容:

  1. url
  2. query
  3. method
  4. data
  5. Referer
  6. response.text / response.json()

1.1 https://www.12306.cn/index/

作用:获取cookies

  • GET

1.2 https://kyfw.12306.cn/otn/login/conf

作用:获取cookies

  • POST
  • data 无
  • Referer https://kyfw.12306.cn/otn/resources/login.html

1.3 https://kyfw.12306.cn/otn/index12306/getLoginBanner

作用:获取cookies

  • GET

  • Referer https://kyfw.12306.cn/otn/resources/login.html

  • response.json()

    {"validateMessagesShowId":"_validatorMessage","status":true,"httpstatus":200,"data":...
    

1.4 https://kyfw.12306.cn/passport/web/auth/uamtk-static

作用:获取cookies,检查用户是否登录

  • POST

  • data

    {'appid': 'otn'}
    
  • Referer https://kyfw.12306.cn/otn/resources/login.html

  • response.json()

    {"result_message":"用户未登录","result_code":1}
    

1.5 https://kyfw.12306.cn/passport/web/create-qr64

作用:获取cookies,获取登录二维码

  • POST

  • data

    {'appid': 'otn'}
    
  • Referer https://kyfw.12306.cn/otn/resources/login.html

  • response.json()

    {"result_message":"生成二维码成功","result_code":"0","image":"iVBO...","uuid":"..."}
    
    • uuid

1.6 https://kyfw.12306.cn/passport/web/checkqr

作用:获取cookies,查询二维码状态

  • POST

  • data

    {'appid': 'otn', 'uuid': uuid}
    
  • Referer https://kyfw.12306.cn/otn/resources/login.html

  • response.json()

    {"result_message":"二维码状态查询成功","result_code":"0"}
    

1.7 https://kyfw.12306.cn/passport/captcha/captcha-image64?login_site=E&module=login&rand=sjrand&_={}

作用:获取验证码图片

  • GET

  • query

  • Referer https://kyfw.12306.cn/otn/resources/login.html

  • response.json()

    {"result_message":"生成验证码成功","result_code":"0","image":"/9j..."}
    
    • image:base64 图片编码

1.8 https://kyfw.12306.cn/passport/captcha/captcha-check?answer={}&rand=sjrand&login_site=E&_={}

作用:验证图片验证码

  • GET

  • query

    • answer 图片验证码坐标
    • str(time.time() * 1000)[:-5] 13位时间戳
  • Referer https://kyfw.12306.cn/otn/resources/login.html

  • response.json()

    {'result_message': '验证码校验成功', 'result_code': '4'}
    

1.9 https://kyfw.12306.cn/passport/web/login

作用:登录

  • POST

  • data

    {'username': username,
    'password': password,
    'appid': 'otn',
    'answer': answer} # 图片验证码坐标 1.8小结
    
  • Referer https://kyfw.12306.cn/otn/resources/login.html

  • response.json()

    {"result_message":"登录成功","result_code":0,"uamtk":"mZflRtMjLW9f7tIDMvjiZR0qp5uuO3zJQBHeCHvMrxssd1210"}
    

1.10 https://kyfw.12306.cn/otn/login/userLogin

作用:302跳转

  • GET
  • Referer https://kyfw.12306.cn/otn/resources/login.html

1.11 https://kyfw.12306.cn/otn/passport?redirect=/otn/login/userLogin

作用:进入登陆页,获取cookie或行为验证

  • GET
  • Referer https://kyfw.12306.cn/otn/resources/login.html

1.12 https://kyfw.12306.cn/passport/web/auth/uamtk

作用:获取newapptk

  • POST

  • data

    {'appid': 'otn'}
    
  • Referer https://kyfw.12306.cn/otn/passport?redirect=/otn/login/userLogin

  • response.json()

    {'result_message': '验证通过', 'result_code': 0, 'apptk': None, 'newapptk': '9pHw2HZlkEEoUm_Nc4DJxl1QUk6NdAZqz4DXs3B2mHAga1210'}
    
    • newapptk

1.13 https://kyfw.12306.cn/otn/uamauthclient

作用:登录后验证

  • POST

  • data

    {'tk': newapptk}
    
  • Referer https://kyfw.12306.cn/otn/passport?redirect=/otn/login/userLogin

  • response.json()

    {'apptk': '9pHw2HZlkEEoUm_Nc4DJxl1QUk6NdAZqz4DXs3B2mHAga1210', 'result_code': 0, 'result_message': '验证通过', 'username': '赵艳秋'}
    

1.14 https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.9076

作用:解析获取车站/城市信息字典

  • GET
  • query
    • station_version=1.9076 # 随着国家的发展,新增车站/城市,会新增版本号

1.15 https://kyfw.12306.cn/otn/leftTicket/log?leftTicketDTO.train_date=%s&leftTicketDTO.from_station=%s&leftTicketDTO.to_station=%s&purpose_codes=ADULT

作用:查询车量信息前置操作

  • GET

  • query

    • train_data 出行时间 固定格式 2018-12-03
    • from_station_code 出行车站/城市编码
    • to_station_code 到达车站/城市编码
  • Referer https://kyfw.12306.cn/otn/leftTicket/init

  • response.json()

    {"validateMessagesShowId":"_validatorMessage","status":true,"httpstatus":200,"messages":[],"validateMessages":{}}
    

1.16 https://kyfw.12306.cn/otn/leftTicket/query?leftTicketDTO.train_date=%s&leftTicketDTO.from_station=%s&leftTicketDTO.to_station=%s&purpose_codes=ADULT

作用:查询车量信息

  • GET

  • query

    • train_data 出行时间 固定格式 2018-12-03
    • from_station_code 出行车站/城市编码 from 车站/城市信息字典[from_station]
    • to_station_code 到达车站/城市编码 from 车站/城市信息字典[to_station]
  • Referer https://kyfw.12306.cn/otn/leftTicket/init

  • response.json()

    • 此处需要解析车辆信息
    • secretStr
    • leftTicket
    • train_location

1.17 https://kyfw.12306.cn/otn/login/checkUser

作用:检查用户是否保持登录状态

  • POST

  • data

    {'_json_att': ''}
    
  • Referer https://kyfw.12306.cn/otn/leftTicket/init

  • response.json()

    {'validateMessagesShowId': '_validatorMessage', 'status': True, 'httpstatus': 200, 'data': {'flag': True}, 'messages': [], 'validateMessages': {}}
    

1.18 https://kyfw.12306.cn/otn/leftTicket/submitOrderRequest

作用:点击预定车票

  • POST

  • data

    {
    'secretStr': secretStr,
    'train_date': train_date,
    'back_train_date': train_date,
    'tour_flag': 'dc',  # dc 单程 wf 往返
    'purpose_codes': 'ADULT',  # 成人
    'query_from_station_name': from_station, # 车站/城市信息字典[from_station]
    'query_to_station_name': to_station, # 车站/城市信息字典[to_station]
    'undefined': ''
    }
    
  • Referer https://kyfw.12306.cn/otn/leftTicket/init

  • response.json()

    {"validateMessagesShowId":"_validatorMessage","status":true,"httpstatus":200,"data":"N","messages":[],"validateMessages":{}}
    

1.19 https://kyfw.12306.cn/otn/confirmPassenger/initDc

作用:订单初始化 获取REPEAT_SUBMIT_TOKEN 和 key_check_isChange

  • POST

  • data

    {'_json_att': ''}
    
  • Referer https://kyfw.12306.cn/otn/leftTicket/init

  • response.text

    • repeat_submit_token 正则获取
    • key_check_isChange 正则获取

1.20 https://kyfw.12306.cn/otn/confirmPassenger/getPassengerDTOs

作用:获取用户信息

  • POST

  • data

    {'_json_att': '',
    'REPEAT_SUBMIT_TOKEN': repeat_submit_token} # 1.1.19获取
    
  • Referer https://kyfw.12306.cn/otn/confirmPassenger/initDc

  • response.json()

    • 需要解析并构造乘客信息

    • 需要解析坐席编码 seat_type

    • passengerTicketStr

      passengerTicketStr = '%s,0,1,%s,%s,%s,%s,N' % (
                  seat_type, passenger_info_dict['passenger_name'],
                  passenger_info_dict['passenger_id_type_code'],
                  passenger_info_dict['passenger_id_no'],
                  passenger_info_dict['passenger_mobile_no'])
      
    • oldPassengerStr

      oldPassengerStr = '%s,%s,%s,1_' % (
                  passenger_info_dict['passenger_name'],
                  passenger_info_dict['passenger_id_type_code'],
                  passenger_info_dict['passenger_id_no'])
      

1.21 https://kyfw.12306.cn/otn/confirmPassenger/checkOrderInfo

作用:检查选票人信息

  • POST

  • data

    {
    'cancel_flag': '2',  # 未知
    'bed_level_order_num': '000000000000000000000000000000',  # 未知
    'passengerTicketStr': passengerTicketStr.encode('utf-8'),  # O,0,1,靳文强,1,142303199512240614,18335456020,N
    'oldPassengerStr': oldPassengerStr.encode('utf-8'),  # 靳文强,1,142303199512240614,1_
    'tour_flag': 'dc',  # 单程
    'randCode': '',
    'whatsSelect': '1',
    '_json_att': '',
    'REPEAT_SUBMIT_TOKEN': repeat_submit_token # 1.1.19获取
    }
    
  • Referer https://kyfw.12306.cn/otn/confirmPassenger/initDc

  • response.json()

    {"validateMessagesShowId":"_validatorMessage","status":true,"httpstatus":200,"data":{"ifShowPassCode":"N","canChooseBeds":"N","canChooseSeats":"N","choose_Seats":"MOP9","isCanChooseMid":"N","ifShowPassCodeTime":"1","submitStatus":true,"smokeStr":""},"messages":[],"validateMessages":{}}
    

1.22 https://kyfw.12306.cn/otn/confirmPassenger/getQueueCount

作用:提交订单,并获取持票未付款的人数,和车票的真实余数

  • POST

  • data

    {
    'train_date': parseDate(train_date),  # Fri Nov 24 2017 00:00:00 GMT+0800 (中国标准时间)
    'train_no': train_info_dict['train_no'],  # 6c0000G31205
    'stationTrainCode': train_info_dict['stationTrainCode'],  # G312
    'seatType': seat_type,  # 席别
    'fromStationTelecode': train_info_dict['from_station'],  # one_train[6]
    'toStationTelecode': train_info_dict['to_station'],  # ? one_train[7]
    'leftTicket': train_info_dict['leftTicket'],  # one_train[12]
    'purpose_codes': '00',
    'train_location': train_info_dict['train_location'],  # one_train[15]
    '_json_att': '',
    'REPEAT_SUBMIT_TOKEN': repeat_submit_token # 1.1.19获取
    }
    
  • Referer https://kyfw.12306.cn/otn/confirmPassenger/initDc

  • response.json()

    {"validateMessagesShowId":"_validatorMessage","status":true,"httpstatus":200,"data":{"count":"0","ticket":"20,100","op_2":"false","countT":"0","op_1":"false"},"messages":[],"validateMessages":{}}
    
    • count 持票未付款的人数
    • ticket 车票的真实余数

1.23 https://kyfw.12306.cn/otn/confirmPassenger/confirmSingleForQueue

作用:确认订单,进行扣票

  • POST

  • data

    {
    'passengerTicketStr': passengerTicketStr.encode('utf-8'),
    'oldPassengerStr': oldPassengerStr.encode('utf-8'),
    'randCode': '',
    'purpose_codes': '00',
    'key_check_isChange': key_check_isChange,
    'leftTicketStr': leftTicket,
    'train_location': train_location,  # one_train[15]
    'choose_seats': '',  # 选择坐席 ABCDEF 上中下铺 默认为空不选
    'seatDetailType': '000',
    'whatsSelect': '1',
    'roomType': '00',
    'dwAll': 'N',  # ?
    '_json_att': '',
    'REPEAT_SUBMIT_TOKEN': repeat_submit_token # 1.1.19获取
    }
    
  • Referer https://kyfw.12306.cn/otn/confirmPassenger/initDc

  • response.json()

    {'validateMessagesShowId': '_validatorMessage', 'status': True, 'httpstatus': 200, 'data': {'submitStatus': True}, 'messages': [], 'validateMessages': {}}
    

1.24 https://kyfw.12306.cn/otn/confirmPassenger/queryOrderWaitTime?random=%s&tourFlag=dc&_json_att=&REPEAT_SUBMIT_TOKEN=%s

作用:排队等待 返回waittime(下次发送重复请求所需要的等待时间) 重复发送请求直到获取 requestID 和 orderID

  • GET

  • query

    • str(int(time.time() * 1000))
    • repeat_submit_token # 1.1.19获取
  • Referer https://kyfw.12306.cn/otn/confirmPassenger/initDc

  • response.json()

    {"validateMessagesShowId":"_validatorMessage","status":true,"httpstatus":200,"data":{"queryOrderWaitTimeStatus":true,"count":0,"waitTime":4,"requestId":6478994818521140385,"waitCount":1,"tourFlag":"dc","orderId":null},"messages":[],"validateMessages":{}}
    

1.25 https://kyfw.12306.cn/otn/confirmPassenger/resultOrderForDcQueue

作用:获取订单结果,如果提示“网络传输过程中数据丢失,请查看未完成订单,继续支付!”则购票成功!

  • POST

  • data

    {
    'orderSequence_no': orderID, # 1.1.24获取
    '_json_att': '',
    'REPEAT_SUBMIT_TOKEN': repeat_submit_token # 1.1.24获取
    }
    
  • Referer https://kyfw.12306.cn/otn/confirmPassenger/initDc

  • response.json()

    {"validateMessagesShowId":"_validatorMessage","status":true,"httpstatus":200,"data":{"errMsg":"网络传输过程中数据丢失,请查看未完成订单,继续支付!","submitStatus":false},"messages":[],"validateMessages":{}}
    

2. 项目任务分解


小结

  1. 了解 12306登陆购票抓包过程

以上是关于爬虫实战12306购票抓包分析以及任务分解的主要内容,如果未能解决你的问题,请参考以下文章

快过年了,Python实现12306查票以及自动购票....

网络爬虫之12306-验证码验证

12306购票测试运行以及完整代码

12306购票构造时间参数以及下单购票

12306购票处理验证码并完成登陆

12306抢票软件连接地址