Using PhantomJS to embed all images of a webpage produces warnings but works
【Posted】: 2014-12-23 20:40:02
【Question】: I am trying to convert a web page into a single file by embedding all of its images (and, once I get past this point, other external resources). Here is how I run PhantomJS:
./phantomjs --web-security=false ./embed_images.js http://localhost/index.html > output.txt
Here is embed_images.js:
var page = require('webpage').create(),
    system = require('system'),
    address;

if (system.args.length === 1) {
    console.log('Usage: embed_images.js <some URL>');
    phantom.exit(1);
} else {
    // Forward console output from the page context to stdout.
    page.onConsoleMessage = function(msg) {
        console.log(msg);
    };
    address = system.args[1];
    page.open(address, function(status) {
        page.evaluate(function() {
            // Redraw an image onto a canvas and replace its src with a data URL.
            function embedImg(org) {
                var img = new Image();
                img.src = org.src;
                img.onload = function() {
                    var canvas = document.createElement("canvas");
                    canvas.width = this.width;
                    canvas.height = this.height;
                    var ctx = canvas.getContext("2d");
                    ctx.drawImage(this, 0, 0);
                    var dataURL = canvas.toDataURL("image/png");
                    org.src = dataURL;
                    console.log(dataURL);
                };
            }
            var imgs = document.getElementsByTagName("img");
            for (var index = 0; index < imgs.length; index++) {
                embedImg(imgs[index]);
            }
        });
        phantom.exit();
    });
}
When I run the command above, it produces a file like this:
Unsafe javascript attempt to access frame with URL from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
There are multiple instances of the messages above. To find out what was going wrong, I ran the following code in Chromium's console:
function embedImg(org) {
    var img = new Image();
    img.src = org.src;
    img.onload = function() {
        var canvas = document.createElement("canvas");
        canvas.width = this.width;
        canvas.height = this.height;
        var ctx = canvas.getContext("2d");
        ctx.drawImage(this, 0, 0);
        var dataURL = canvas.toDataURL("image/png");
        org.src = dataURL;
        console.log(dataURL);
    };
}
var imgs = document.getElementsByTagName("img");
for (var index = 0; index < imgs.length; index++) {
    embedImg(imgs[index]);
}
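And it works just fine (my web page does not reference any cross-domain images)! It embeds all the images into the HTML page. Does anyone know what the problem might be?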
Here are the contents of my index.html file:
<!DOCTYPE html >
<html>
<head>
<meta charset="utf-8" />
</head>
<body>
<img src="1.png" >
</body>
</html>
And the actual output (output.txt):
Unsafe JavaScript attempt to access frame with URL from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Unsafe JavaScript attempt to access frame with URL about:blank from frame with URL file://./embed_images.js. Domains, protocols and ports must match.
Strangely, there are many error messages even though my page contains only one image!
I am using phantomjs-1.9.8-linux-x86_64.
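【Comments】:
Possibly related to this: ***.com/q/26424765
The error does come from the toDataURL call, as noted in the post you mentioned. But I can't be sure the two problems are the same, since those posts all talk about SVG, while mine has only a PNG image.
What happens if you wrap everything inside the page.open callback in setTimeout(function(){/*HERE*/}, 2000);?
You are right. It all comes down to the asynchronous behavior of the Image onload callback. If you post it, I will gladly mark your suggestion as the answer. Thanks.
I'll look into it. Funny thing is, I had never seen this before, and I only ran into it today while trying to find a solution for another question.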
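As a rough sketch of an alternative to the fixed 2000 ms delay discussed in the comments above, the exit could be tied to the onload callbacks actually finishing. `createCompletionTracker` is a hypothetical helper, not part of PhantomJS. The commented usage assumes PhantomJS's `window.callPhantom` / `page.onCallback` pair and has not been verified against 1.9.8.

```javascript
// Hypothetical helper (not a PhantomJS API): returns a function that must be
// called once per finished image; `done` fires exactly once, after `total`
// calls, or immediately when there is nothing to wait for.
function createCompletionTracker(total, done) {
    var remaining = total;
    var finished = false;

    function finish() {
        if (!finished) {
            finished = true;
            done();
        }
    }

    if (remaining === 0) {
        finish();
    }

    return function reportOne() {
        remaining -= 1;
        if (remaining <= 0) {
            finish();
        }
    };
}

// Assumed usage inside page.evaluate (the tracker has to be defined inside
// the evaluated function, because that function is serialized into the page):
//
//   var reportOne = createCompletionTracker(imgs.length, function () {
//       window.callPhantom('images-done');
//   });
//   // in each img.onload (and img.onerror): reportOne();
//
// and on the PhantomJS side:
//
//   page.onCallback = function (msg) {
//       if (msg === 'images-done') {
//           setTimeout(function () { phantom.exit(); }, 0);
//       }
//   };
```

Counting completions avoids guessing a delay, at the cost of also having to report failed loads (onerror), otherwise the script would never exit.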
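【Answer 1】:
These notices are printed when phantom.exit is called. They do not cause any trouble, but they are a nuisance when you need clean PhantomJS output. In your case, you can suppress the notices by making the phantom.exit call "asynchronous", like this: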
setTimeout(function() {
    phantom.exit();
}, 0);
I think this happens because a large string is being passed from the page context while PhantomJS is trying to exit.
I created a GitHub issue for this.
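To make the deferred exit easy to reuse, it could be wrapped in a small function. This is only a sketch: `deferExit` is a hypothetical name, and the exit function is passed in as a parameter so the helper can be tried outside PhantomJS with a stub.

```javascript
// Hypothetical helper: schedules exitFn(code) on the next timer tick instead
// of calling it synchronously from the current callback.
function deferExit(exitFn, code) {
    setTimeout(function () {
        exitFn(code);
    }, 0);
}

// Assumed usage in a PhantomJS script, replacing a direct phantom.exit() call:
//
//   page.open(address, function (status) {
//       page.evaluate(function () { /* ...embed images... */ });
//       deferExit(function (code) { phantom.exit(code); }, 0);
//   });
```

A wrapper function is passed rather than `phantom.exit` itself, so the sketch does not depend on whether PhantomJS allows that method to be called detached from the `phantom` object.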
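【Discussion】:
Where do you put this? At the end of your script?
@Optimus Wherever you would normally call phantom.exit(), you wrap that call in setTimeout(). That is at the end of the script, but only if you think in terms of execution order rather than the actual lines of code.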
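If the script cannot be changed, the notices could also be filtered out of the captured output afterwards. The following is a minimal sketch, not something proposed in the thread: `stripUnsafeNotices` is a hypothetical helper that assumes every notice starts with the text seen in output.txt above.

```javascript
// Hypothetical post-processing helper: drops every line that starts with the
// "Unsafe JavaScript attempt to access frame" notice (case-insensitive) and
// keeps all other lines, such as the logged data URLs, unchanged.
function stripUnsafeNotices(text) {
    var notice = /^Unsafe JavaScript attempt to access frame/i;
    return text.split('\n').filter(function (line) {
        return !notice.test(line);
    }).join('\n');
}
```

This only cleans the output; it does nothing about the early exit that leaves the onload callbacks unfinished.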