selenium-webdriver(python) (十四) -- webdriver原理(转载虫师自动化)
Posted
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了selenium-webdriver(python) (十四) -- webdriver原理(转载虫师自动化)相关的知识,希望对你有一定的参考价值。
selenium-webdriver(python) (十四) -- webdriver原理
2013-08-22 12:55 by 虫师, 13926 阅读, 12 评论, 收藏, 编辑
之前看乙醇视频中提到,selenium 的ruby 实现有一个小后门,在代码中加上$DEBUG=1 ,再运行脚本的过程中,就可以看到客户端请求的信息与服务器端返回的数据;觉得这个功能很强大,可以帮助理解webdriver的运行原理。
后来查了半天,python并没有提供这样一个方便的后门,不过我们可以通过代理的方式获得这些交互信息;
一、需要安装java 虚拟机与selenium-server-standalone ,参考 《selenium + python自动化测试环境搭建》第7、8操作:
二、通过下面命令启动服务:
C:\\selenium>java -jar selenium-server-standalone-2.33.0.jar
在命令结尾加 >d:\\log.txt 可以将命令信息存入文件,但信息很少。
然后运行下面的自动化脚本:
#coding = utf-8
import time
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
driver = webdriver.Remote(desired_capabilities=DesiredCapabilities.CHROME)
driver.get("http://www.youdao.com")
driver.find_element_by_name("q").send_keys("hello")
driver.find_element_by_name("q").send_keys("key.ENTER")
driver.close()
webdriver原理:
1. WebDriver 启动目标浏览器,并绑定到指定端口。该启动的浏览器实例,做为web driver的remote server。
2. Client 端通过CommandExcuter 发送HTTPRequest 给remote server 的侦听端口(通信协议: the webriver wire protocol)
3. Remote server 需要依赖原生的浏览器组件(如:IEDriver.dll,chromedriver.exe),来转化转化浏览器的native调用。
查看命令提示符下的运行日志:
咋一看很乱,慢慢分析一下就发现很有意思!结合上面的脚本分析
---------------------------------------------------------------------------------------
启动代理进入监听状态
C:\\selenium>java -jar selenium-server-standalone-2.33.0.jar
八月 22, 2013 10:19:48 上午 org.openqa.grid.selenium.GridLauncher main
INFO: Launching a standalone server
10:19:48.734 INFO - Java: Oracle Corporation 23.21-b01
10:19:48.734 INFO - OS: Windows XP 5.1 x86
10:19:48.734 INFO - v2.33.0, with Core v2.33.0. Built from revision 4e90c97
10:19:48.843 INFO - RemoteWebDriver instances should connect to: http://127.0.0.
1:4444/wd/hub
10:19:48.843 INFO - Version Jetty/5.1.x
10:19:48.843 INFO - Started HttpContext[/selenium-server/driver,/selenium-server
/driver]
10:19:48.843 INFO - Started HttpContext[/selenium-server,/selenium-server]
10:19:48.843 INFO - Started HttpContext[/,/]
10:19:48.890 INFO - Started [email protected]176343
e
10:19:48.890 INFO - Started HttpContext[/wd,/wd]
10:19:48.906 INFO - Started SocketListener on 0.0.0.0:4444
10:19:48.906 INFO - Started [email protected]
--------------------------------------------------------------------------------------
创建新session
10:20:38.593 INFO - Executing: [new session: {platform=ANY, javascriptEnabled=tr
ue, browserName=chrome, version=}] at URL: /session)
10:20:38.593 INFO - Creating a new session for Capabilities [{platform=ANY, java
scriptEnabled=true, browserName=chrome, version=}]
webdrivr通过GET方式发送请求
[0.921][INFO]: received Webriver request: GET /status
向webdrver返回响应,返回码200表示成功
[0.921][INFO]: sending Webriver response: 200 {
"sessionId": "",
"status": 0,
"value": {
"build": {
"version": "alpha"
},
"os": {
"arch": "x86",
"name": "Windows NT",
"version": "5.1 SP3"
}
}
}
webdriver 再次以POST方式发送请求,并启动浏览器相关信息
[0.984][INFO]: received Webriver request: POST /session {
"desiredCapabilities": {
"browserName": "chrome",
"javascriptEnabled": true,
"platform": "ANY",
"version": ""
}
}
[0.984][INFO]: Launching chrome: "C:\\ocuments and Settings\\Administrator\\Local S
ettings\\Application ata\\Google\\Chrome\\Application\\chrome.exe" --remote-debugging
-port=4223 --no-first-run --enable-logging --logging-level=1 --user-data-dir="C:
\\OCUME~1\\AMINI~1\\LOCALS~1\\Temp\\scoped_dir1808_7550" --load-extension="C:\\OCUME~1
\\AMINI~