PhantomJS does not return an error when the URL is wrong (Python)


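I am using Selenium in Python, specifically with PhantomJS. The problem is that when I pass a wrong URL, no error is raised, whereas the Firefox driver does raise an error in the same situation.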

from selenium import webdriver
from selenium.common.exceptions import TimeoutException, NoSuchElementException
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.PhantomJS()
driver.get("drr,.gh")

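I would like to know how my program can recognize a wrong URL.

This is my first post here, so I apologize if I have made any mistakes, and I also apologize for my English.

Thanks

Answer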

When we pass a malformed URL as the argument to the get() method through GeckoDriver or ChromeDriver, we do see a proper exception:

selenium.common.exceptions.WebDriverException: Message: Malformed URL: drr,.gh is not a valid URL.

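But when a malformed URL is passed as the argument to get() through PhantomJSDriver, the logs suggest that there is no validation that the URL is properly formatted, and PhantomJSDriver keeps trying to browse to a URL it never reaches, as shown below: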

[INFO  - 2017-12-14T08:13:37.981Z] GhostDriver - Main - running on port 2585
[INFO  - 2017-12-14T08:13:39.482Z] Session [b0debdb0-e0a6-11e7-ad4b-79a57b4a1a11] - page.settings - {"XSSAuditingEnabled":false,"javascriptCanCloseWindows":true,"javascriptCanOpenWindows":true,"javascriptEnabled":true,"loadImages":true,"localToRemoteUrlAccessEnabled":false,"userAgent":"Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.1.1 Safari/538.1","webSecurityEnabled":true}
[INFO  - 2017-12-14T08:13:39.482Z] Session [b0debdb0-e0a6-11e7-ad4b-79a57b4a1a11] - page.customHeaders:  - {}
[INFO  - 2017-12-14T08:13:39.482Z] Session [b0debdb0-e0a6-11e7-ad4b-79a57b4a1a11] - Session.negotiatedCapabilities - {"browserName":"phantomjs","version":"2.1.1","driverName":"ghostdriver","driverVersion":"1.2.0","platform":"windows-8-32bit","javascriptEnabled":true,"takesScreenshot":true,"handlesAlerts":false,"databaseEnabled":false,"locationContextEnabled":false,"applicationCacheEnabled":false,"browserConnectionEnabled":false,"cssSelectorsEnabled":true,"webStorageEnabled":false,"rotatable":false,"acceptSslCerts":false,"nativeEvents":true,"proxy":{"proxyType":"direct"}}
[INFO  - 2017-12-14T08:13:39.482Z] SessionManagerReqHand - _postNewSessionCommand - New Session Created: b0debdb0-e0a6-11e7-ad4b-79a57b4a1a11

Solution:

As a workaround, we can treat a malformed URL passed to get() as an unreachable destination and set a page-load timeout with set_page_load_timeout(seconds), as follows:

from selenium import webdriver

driver = webdriver.PhantomJS(executable_path=r'C:\Utility\phantomjs-2.1.1-windows\bin\phantomjs.exe')
driver.set_page_load_timeout(2)
driver.get("drr,.gh")

If the page_load_timeout is triggered, you will see the following log message:

[ERROR - 2017-12-14T08:25:43.994Z] RouterReqHand - _handle.error - {"name":"Missing Command Parameter","message":"{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"71","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:2650","User-Agent":"Python http auth"},"httpVersion":"1.1","method":"POST","post":"{\"pageLoad\": 2000, \"sessionId\": \"60b5bc60-e0a8-11e7-bb6f-8df56dd28746\"}","url":"/timeouts","urlParsed":{"anchor":"","query":"","file":"timeouts","directory":"/","path":"/timeouts","relative":"/timeouts","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/timeouts","queryKey":{},"chunks":["timeouts"]},"urlOriginal":"/session/60b5bc60-e0a8-11e7-bb6f-8df56dd28746/timeouts"}","line":546,"sourceURL":"phantomjs://code/session_request_handler.js","stack":"_postTimeout@phantomjs://code/session_request_handler.js:546:73
_handle@phantomjs://code/session_request_handler.js:148:25
_reroute@phantomjs://code/request_handler.js:61:20
_handle@phantomjs://code/router_request_handler.js:78:46"}

  phantomjs://platform/console++.js:263 in error

The set_page_load_timeout() method is discussed in more detail elsewhere.
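Since PhantomJSDriver does not appear to validate the URL itself, another option, which is not part of the answer above, is to check the URL in Python before it ever reaches the driver. The following is only a minimal sketch using the standard library's urllib.parse. The helper names is_wellformed_url and checked_get are hypothetical, and the rule "http or https scheme plus a non-empty host" is an assumption about what should count as a valid URL here, so adjust it to your needs.

```python
from urllib.parse import urlparse


def is_wellformed_url(url, allowed_schemes=("http", "https")):
    """Return True if url has an allowed scheme and a non-empty host.

    Hypothetical helper: this is a syntactic check only and does not
    tell you whether the host exists or is reachable.
    """
    try:
        parts = urlparse(url)
    except ValueError:
        # urlparse raises ValueError for some inputs, e.g. a broken IPv6 host.
        return False
    return parts.scheme in allowed_schemes and bool(parts.hostname)


def checked_get(driver, url):
    """Call driver.get(url) only if the URL passes the syntactic check.

    Hypothetical wrapper: driver is any object with a get(url) method,
    such as a Selenium WebDriver instance.
    """
    if not is_wellformed_url(url):
        raise ValueError(f"Malformed URL: {url!r}")
    driver.get(url)
```

With this in place, checked_get(driver, "drr,.gh") raises ValueError before PhantomJS is asked to load anything. A URL that is well formed but unreachable still passes the check, so the page-load timeout remains useful for that case.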
