Part 6: Python Web Scraping in Practice - Job Postings on a Recruitment Website

Posted 2021-12-21 Roc-xb

1. Page Structure Analysis

Inspecting the page shows that a keyword search returns the matching job postings as a paginated list, with 30 postings per page.
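The pagination can be sketched as a small URL builder. This is only a minimal sketch: it assumes the query parameters that appear in the search URL used by this article's script, and the reading of `jl` as a region code, `kw` as the keyword, and `p` as a 1-based page number is an assumption. `build_search_url` and `max_records` are hypothetical helpers, not part of any library.

```python
from urllib.parse import urlencode

BASE_URL = "https://sou.zhaopin.com/"
PAGE_SIZE = 30  # postings per result page, as observed on the site


def build_search_url(keyword, page, region="801"):
    """Return the search URL for one result page (hypothetical helper)."""
    # Assumption: jl = region code, kw = keyword, p = 1-based page number
    query = urlencode({"jl": region, "kw": keyword, "p": page})
    return f"{BASE_URL}?{query}"


def max_records(pages):
    """Upper bound on the postings collected from `pages` result pages."""
    return pages * PAGE_SIZE


if __name__ == "__main__":
    for page in range(1, 4):
        print(build_search_url("java", page))
    print(max_records(33))
```

Letting `urlencode` assemble the query string also takes care of escaping keywords that contain spaces or non-ASCII characters.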

2. Goal

The goal is to take every result returned for a search keyword and save it to a CSV file.
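As an illustration of the target output, the sketch below writes rows to CSV with a header line using the standard `csv` module. The column names are the field keys that this article's script extracts. `write_rows` is a hypothetical helper and the sample row is made up, so treat this as one possible layout rather than the script's own code.

```python
import csv
import io

# Column names match the field keys extracted by the scraper in this article
FIELDS = ["name", "companyName", "cityDistrict", "education", "salary60",
          "workingExp", "property", "companySize", "workType", "positionURL"]


def write_rows(rows, fileobj, write_header=True):
    """Write a list of dicts to an open text file as CSV (hypothetical helper)."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Made-up sample row, for illustration only
    sample = [{field: "" for field in FIELDS}]
    sample[0].update(name="Java Developer", companyName="Example Co., Ltd.")
    buffer = io.StringIO()
    write_rows(sample, buffer)
    print(buffer.getvalue())
```

Because `csv.DictWriter` quotes any value that contains a comma, a company name such as "Example Co., Ltd." stays in a single column.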

3. Writing the Code

#!/usr/bin/python
# -*- coding: UTF-8 -*-
"""
@author: Roc-xb
"""

import csv
import re
import time

import requests


def parse(page):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36',
        'Cookie': 'cookie string copied from a logged-in session'
    }
    print("Fetching page {}".format(page).center(100, "*"))
    html = requests.get('https://sou.zhaopin.com/?jl=801&kw=java&p={}'.format(page), headers=headers).text
    # Each pattern pulls one field out of the JSON-style data embedded in the page
    items = {
        # Job title
        'name': re.findall(r'"matchInfo":.*?"name":"(.*?)"', html),
        # Company name
        'companyName': re.findall(r'"companyName":"(.*?)"', html),
        # City and district
        'cityDistrict': re.findall(r'"cityDistrict":"(.*?)"', html),
        # Education requirement
        'education': re.findall(r'"education":"(.*?)"', html),
        # Salary
        'salary60': re.findall(r'"salary60":"(.*?)"', html),
        # Experience requirement
        'workingExp': re.findall(r'"workingExp":"(.*?)"', html),
        # Company type
        'property': re.findall(r'"property":"(.*?)"', html),
        # Company size
        'companySize': re.findall(r'"companySize":"(.*?)"', html),
        # Employment type
        'workType': re.findall(r'"workType":"(.*?)"', html),
        # Link to the detail page
        'positionURL': re.findall(r'"positionURL":"(.*?)"', html),
    }
    # Pause for 5 seconds before the next request
    time.sleep(5)
    write_csv_file(items)


# Append one page of results to the CSV file
def write_csv_file(items):
    # newline='' leaves line endings to the csv module
    with open('zlzp.csv', 'a+', encoding='utf-8-sig', newline='') as f:
        writer = csv.writer(f)
        # zip stops at the shortest list, so a page with fewer than 30 postings does not raise IndexError
        for row in zip(*items.values()):
            print(row)
            writer.writerow(row)


if __name__ == '__main__':
    for page in range(1, 34):
        parse(page)
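To see how this style of regex extraction behaves without hitting the site, it can be tried on a small string. The sketch below is only an illustration: `SAMPLE_HTML` is a made-up fragment that imitates the key/value layout the patterns expect, and `extract` and `to_rows` are hypothetical helpers.

```python
import re

# Made-up fragment imitating the key/value data embedded in a result page (not real data)
SAMPLE_HTML = (
    '{"companyName":"Example Tech","education":"Bachelor","salary60":"10k-15k"},'
    '{"companyName":"Demo Soft","education":"College","salary60":"8k-12k"}'
)


def extract(field, html):
    """Return every string value stored under `field` (hypothetical helper)."""
    # Non-greedy (.*?) stops at the first closing quote after the key
    return re.findall(r'"%s":"(.*?)"' % re.escape(field), html)


def to_rows(html, fields):
    """Combine the per-field lists into one tuple per posting."""
    columns = [extract(field, html) for field in fields]
    # zip pairs the i-th match of every field and stops at the shortest list
    return list(zip(*columns))


if __name__ == "__main__":
    for row in to_rows(SAMPLE_HTML, ["companyName", "education", "salary60"]):
        print(row)
```

This approach relies on every posting carrying every field in the same order. If one posting lacks a field, the lists fall out of alignment, so parsing the embedded data with `json` would likely be more robust where the page allows it.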

4. Program Output
