Python in Practice: CSDN Automatic Login and Commenting
Posted by 白玉梁
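First, open the CSDN login page at https://passport.csdn.net/login?code=account. This walkthrough uses account-and-password login as the example.
Open the URL with selenium:
from selenium import webdriver
# executable_path is the Selenium 3 style argument; recent Selenium 4 releases expect a Service object instead
browser = webdriver.Firefox(executable_path=r"C:\geckodriver.exe")
browser.get("https://passport.csdn.net/login?code=account")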
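Since the login is automated, there is no need to type the username and password or click the login button by hand; the program can do all of it. What we have to do is locate the account and password input boxes, fill them in, then find the login button and click it.
Press F12 to inspect the elements:
Once the controls are located, write the program: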
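from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Firefox(executable_path=r"C:\geckodriver.exe")
browser.get("https://passport.csdn.net/login?code=account")
# Locate the account box, the password box, and the button inside the submit area
account = browser.find_element(By.ID, 'all')
pwd = browser.find_element(By.ID, 'password-number')
btn = browser.find_element(By.CLASS_NAME, 'form-submit').find_element(By.TAG_NAME, 'button')
# Fill in the credentials and click the login button
account.send_keys("150xxxxxxxx")
pwd.send_keys("*********")
btn.click()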
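Note that the account and password must be your own. Run the program and watch the browser; normally the login succeeds right away.
In the abnormal case, a verification box pops up:
When that happens, the slide has to be simulated through selenium. Before that, press F12 again and locate the slider element:
Key points for moving the slider:
- get the slider (same method as above);
- get the distance it has to travel;
- generate a movement track;
- accelerate first and then decelerate while sliding (to imitate a human);
The track-generation function below comes from the internet: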
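def get_tracks(distance):
    # List of per-step moves
    tracks = []
    # Distance covered so far
    current = 0
    # Switch from acceleration to deceleration after 4/5 of the distance
    mid = distance * 4 / 5
    # Time interval of one step
    t = 0.2
    # Initial velocity
    v = 200
    while current < distance:
        if current < mid:
            # Positive acceleration of 5
            a = 5
        else:
            # Negative acceleration of -3
            a = -3
        # Velocity at the start of this step
        v0 = v
        # Velocity at the end of this step
        v = v0 + a * t
        # Distance moved in this step
        move = v0 * t + 1 / 2 * a * t * t
        # Update the distance covered
        current += move
        # Append the rounded step to the track
        tracks.append(round(move))
    return tracks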
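One thing to note about get_tracks is that the loop only stops once current has reached or passed distance, so the steps can add up to more than the requested distance (with an initial velocity of 200 and t = 0.2, each step is about 40 pixels). The following is a minimal sketch of a variant whose integer steps sum to exactly the distance. The name get_tracks_clamped and its keyword parameters are hypothetical and not part of the original code, and whether an exact total helps against any particular captcha is untested.

```python
def get_tracks_clamped(distance, v=200.0, t=0.2, accel=5.0, decel=-3.0):
    """Like get_tracks, but the integer steps sum exactly to `distance`."""
    distance = int(distance)
    tracks = []
    moved = 0  # integer pixels emitted so far
    mid = distance * 4 / 5  # switch to deceleration after 4/5 of the way
    while moved < distance:
        a = accel if moved < mid else decel
        v0 = v
        v = v0 + a * t
        step = round(v0 * t + 0.5 * a * t * t)
        # Never overshoot the target, and always advance by at least 1 pixel
        step = max(1, min(step, distance - moved))
        tracks.append(step)
        moved += step
    return tracks
```

For example, sum(get_tracks_clamped(260)) is 260, so the last step is shortened rather than carrying the slider past the end of the track.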
To simulate dragging the slider we need ActionChains, which simulates human input actions:
try:
    # driver is the WebDriver instance (named browser in the snippets above)
    wait = WebDriverWait(driver, 10)
    # Get the area the slider moves in
    slide = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'div#nc_1__scale_text > span.nc-lang-cnt')))
    # Get the slider
    slide_button = wait.until(EC.element_to_be_clickable((By.ID, 'nc_1_n1z')))
    # Distance the slider has to travel
    width = slide.size['width'] - slide_button.size['width']
    # Generate the track
    tracks = get_tracks(width)
    # Drag the slider; a fresh ActionChains per step avoids replaying earlier
    # actions on Selenium versions where perform() does not clear the chain
    ActionChains(driver).click_and_hold(slide_button).perform()
    for distance in tracks:
        ActionChains(driver).move_by_offset(xoffset=distance, yoffset=0).perform()
        time.sleep(0.5)
    ActionChains(driver).release().perform()
except Exception as e:
    print(str(e))
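WebDriverWait waits until a given element has loaded before going on. Unlike time.sleep, which can only wait a fixed amount of time, it avoids the two failure modes of a fixed delay: a delay that is too short when the page loads slowly, and one that is needlessly long when the page loads quickly.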
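The difference can be illustrated without a browser. The sketch below is a simplified stand-in for what an explicit wait does, not Selenium's actual implementation: it polls a condition until the condition returns something truthy or a timeout expires. The names wait_until and element_loaded and their parameters are hypothetical.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Return as soon as the condition holds; no fixed delay is wasted
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for "the element has loaded": becomes true on the third check
attempts = {"count": 0}


def element_loaded():
    attempts["count"] += 1
    return "ready" if attempts["count"] >= 3 else None


print(wait_until(element_loaded, timeout=2.0, poll=0.01))
```

The call returns after three checks instead of sleeping for the full two seconds, which is the advantage an explicit wait has over a fixed time.sleep.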
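In theory this is enough. In practice, most sites have anti-crawler measures that detect whether webdriver is in use; once it is detected, verification fails no matter how human the simulation looks, and an error is shown:
The workaround that circulated online was therefore to evade detection so that the check cannot see webdriver. Several methods can be found: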
1:
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
    "source": """Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"""
})
2:
option.add_argument('--disable-blink-features=AutomationControlled')
3:
option.add_argument("start-maximized")
option.add_argument("--disable-blink-features=AutomationControlled")
option.add_experimental_option("excludeSwitches", ["enable-automation"])
option.add_experimental_option("useAutomationExtension", False)
4:
# How to generate stealth.min.js (requires nodejs): npx extract-stealth-evasions
with open('../stealth.min.js') as f:
    js = f.read()
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
    "source": js
})
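Methods 1 to 3 use option and driver without showing where they come from. Below is a minimal sketch of how they might fit together, assuming Chrome with Selenium 4. The function name build_driver is hypothetical, selenium is imported inside the function only so that the definition loads without a browser stack installed, and nothing here is claimed to get past CSDN's check.

```python
def build_driver():
    """Create a Chrome driver with the settings from methods 1 to 3 applied."""
    # Imported here so that defining the function does not require selenium
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    option = Options()
    option.add_argument("start-maximized")
    option.add_argument("--disable-blink-features=AutomationControlled")
    option.add_experimental_option("excludeSwitches", ["enable-automation"])
    option.add_experimental_option("useAutomationExtension", False)

    driver = webdriver.Chrome(options=option)
    # Registered before any navigation so the script runs in each new document
    driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
        "source": "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
    })
    return driver
```

After driver = build_driver() and a driver.get(...) call, running driver.execute_script("return navigator.webdriver") shows what page scripts see when they read that property, which is a quick way to check whether the settings took effect.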
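Unfortunately, after two days of searching and testing, none of them solved the problem for me. The only effect was that the error no longer appeared, while the slider box kept popping up.
This problem remains open. If anyone knows a newer way around it, please share.
Moving on. Assuming the previous step logged in successfully, the script then goes to the blog post page, fills in a comment, and submits it: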
# Open the post; driver.get waits for the page load before returning
driver.get('https://baiyuliang.blog.csdn.net/article/details/120473414')
textarea = driver.find_element(By.ID, 'comment_content')
# Fill in the comment text with send_keys
textarea.send_keys("777")
# Submit the form that contains the text box
comment_form = driver.find_element(By.ID, 'commentform')
comment_form.submit()
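The publish-comment button cannot be located directly, because it is not visible until a comment has been typed, so here I take the form element that contains the input box and submit that instead.
The complete Python code: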
import time

from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.chrome.options import Options

option = Options()
# Path to the browser executable
option.binary_location = r'D:\Program Files\Chrome\chrome.exe'
driver = webdriver.Chrome(executable_path=r"C:\chromedriver.exe", options=option)
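def get_tracks(distance):
    # List of per-step moves
    tracks = []
    # Distance covered so far
    current = 0
    # Switch from acceleration to deceleration after 4/5 of the distance
    mid = distance * 4 / 5
    # Time interval of one step
    t = 0.2
    # Initial velocity
    v = 200
    while current < distance:
        if current < mid:
            # Positive acceleration of 5
            a = 5
        else:
            # Negative acceleration of -3
            a = -3
        # Velocity at the start of this step
        v0 = v
        # Velocity at the end of this step
        v = v0 + a * t
        # Distance moved in this step
        move = v0 * t + 1 / 2 * a * t * t
        # Update the distance covered
        current += move
        # Append the rounded step to the track
        tracks.append(round(move))
    return tracks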
def move_to_gap(slider, tracks):
    # Simulate dragging the slider along the track
    ActionChains(driver).click_and_hold(slider).perform()
    for i in tracks:
        # A fresh ActionChains per step, so earlier actions are not replayed
        ActionChains(driver).move_by_offset(xoffset=i, yoffset=0).perform()
        time.sleep(0.1)
    ActionChains(driver).release().perform()
driver.get("https://passport.csdn.net/login?code=account")
account = driver.find_element(By.ID, 'all')
pwd = driver.find_element(By.ID, 'password-number')
btn = driver.find_element(By.CLASS_NAME, 'form-submit').find_element(By.TAG_NAME, 'button')
account.send_keys("your phone number")
pwd.send_keys("your password")
btn.click()
try:
    wait = WebDriverWait(driver, 10)
    # Get the area the slider moves in
    slide = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'div#nc_1__scale_text > span.nc-lang-cnt')))
    # Get the slider
    slide_button = wait.until(EC.element_to_be_clickable((By.ID, 'nc_1_n1z')))
    # Distance the slider has to travel
    width = slide.size['width'] - slide_button.size['width']
    # Generate the track
    tracks = get_tracks(width)
    # Drag the slider; a fresh ActionChains per step avoids replaying earlier
    # actions on Selenium versions where perform() does not clear the chain
    ActionChains(driver).click_and_hold(slide_button).perform()
    for distance in tracks:
        ActionChains(driver).move_by_offset(xoffset=distance, yoffset=0).perform()
        time.sleep(0.5)
    ActionChains(driver).release().perform()
except Exception as e:
    print(str(e))
time.sleep(3)  # wait 3 s
# Open the post; driver.get waits for the page load before returning
driver.get('https://baiyuliang.blog.csdn.net/article/details/120473414')
textarea = driver.find_element(By.ID, 'comment_content')
textarea.send_keys("777")
comment_form = driver.find_element(By.ID, 'commentform')
comment_form.submit()
time.sleep(3)  # wait 3 s
# Print the text of every comment now listed under the post
comment_list_box = driver.find_element(By.CSS_SELECTOR, 'div.comment-list-box')
comment_list = comment_list_box.find_element(By.CLASS_NAME, 'comment-list')
comment_line_box = comment_list.find_elements(By.CLASS_NAME, 'comment-line-box')
for comment in comment_line_box:
    span_text = comment.find_element(By.CLASS_NAME, 'new-comment').text
    print(span_text)