Selenium - Extra screenshots generated when scrolling in equivalent intervals to capture an entire web page

【Posted】: 2019-07-07 16:38:15 【Question】:

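Goal: Use Java and JavascriptExecutor with Selenium WebDriver to scroll in equal intervals and take screenshots of an entire page.

Problem: The method I implemented below works; however, I end up with 2-3 extra screenshots at the end of the web page, and I would like to avoid these redundant screenshots.

The scrolling method is as follows: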

public static void pageScrollable() throws InterruptedException, IOException {

    JavascriptExecutor jse = (JavascriptExecutor) driver;

    //Find page height
    pageHeight = ((Number) jse.executeScript("return document.body.scrollHeight")).intValue();
    //Find current browser dimensions and isolate its height
    Dimension d = driver.manage().window().getSize();
    int browserSize = d.getHeight();

    int currentHeight = 0;
    System.out.println("Current scroll at: " + currentHeight);
    System.out.println("Page height is: " + pageHeight + "\n");

    //Scrolling logic
    while(pageHeight>=currentHeight) {
        jse.executeScript("window.scrollBy(0,"+currentHeight+")", "");
        screenShot();
        currentHeight+=browserSize;
        System.out.println("Current scroll now at: " + currentHeight);
    }
}

The screenshot method is as follows:

public static void screenShot() throws IOException, InterruptedException {
    File screenShot = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
    FileUtils.copyFile(screenShot, new File("C:\\FilePath\\Screen " + count + ".png"));
    count++;
    System.out.println("Current screenshot count: " + count);
}

The following variables are defined as static:

static int pageHeight = 0;
static int count = 0;
static WebDriver driver;

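I understand that Selenium currently has no built-in way to capture a screenshot of an entire web page. Any help fixing my logic above would be appreciated.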

【Answer 1】:

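Your scrolling logic is what produces the extra images. The Window.scrollBy method scrolls by a number of pixels relative to the current position, not to an absolute position. Because you pass the cumulative currentHeight, each call scrolls further than the previous one, so the page reaches the bottom while the loop counter is still below pageHeight, and the remaining iterations capture the bottom of the page again. You need to scroll by browserSize on every iteration.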
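To make the effect concrete, here is a minimal sketch that simulates both loops without a browser. It assumes a hypothetical page height of 5000 px and a step of 1000 px that equals the viewport height, and it models scrollBy as a relative move clamped to the last scrollable offset. The class and method names are illustrative only. The fixed-step variant uses a strict `>` comparison so that a page whose height is an exact multiple of the step is not captured twice at the bottom.

```java
import java.util.ArrayList;
import java.util.List;

public class ScrollSimulation {

    /** Models window.scrollBy: a relative move, clamped to the last scrollable offset. */
    static int scrollBy(int position, int delta, int maxScroll) {
        return Math.min(position + delta, maxScroll);
    }

    /** Scroll positions captured when the cumulative counter is passed to scrollBy. */
    static List<Integer> cumulativeLoop(int pageHeight, int step) {
        List<Integer> captured = new ArrayList<>();
        int maxScroll = Math.max(0, pageHeight - step); // assumes step == viewport height
        int position = 0;
        int currentHeight = 0;
        while (pageHeight >= currentHeight) {
            position = scrollBy(position, currentHeight, maxScroll);
            captured.add(position); // screenshot taken at this offset
            currentHeight += step;
        }
        return captured;
    }

    /** Scroll positions captured when each iteration scrolls by one fixed step. */
    static List<Integer> fixedStepLoop(int pageHeight, int step) {
        List<Integer> captured = new ArrayList<>();
        int maxScroll = Math.max(0, pageHeight - step); // assumes step == viewport height
        int position = 0;
        int currentHeight = 0;
        while (pageHeight > currentHeight) {
            captured.add(position); // screenshot first, then scroll
            currentHeight += step;
            position = scrollBy(position, step, maxScroll);
        }
        return captured;
    }

    public static void main(String[] args) {
        // prints "cumulative: [0, 1000, 3000, 4000, 4000, 4000]"
        System.out.println("cumulative: " + cumulativeLoop(5000, 1000));
        // prints "fixed step: [0, 1000, 2000, 3000, 4000]"
        System.out.println("fixed step: " + fixedStepLoop(5000, 1000));
    }
}
```

In this simulation the cumulative loop skips the offset 2000 and then captures the bottom of the page three times, which matches the 2-3 extra screenshots described in the question.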

You should probably also use the following to determine the viewport size rather than the browser window size:

int browserSize = ((Number) jse.executeScript("return window.innerHeight")).intValue();
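The difference matters because the browser window height includes toolbars and borders, so it is larger than the area a screenshot actually shows. The sketch below is a browser-free illustration with made-up numbers (a 5000 px page, a 900 px viewport, a 1000 px window); `uncoveredRows` is a hypothetical helper that counts the page rows that never appear in any capture for a given scroll step.

```java
public class ViewportCoverage {

    /**
     * Counts page rows that are never visible in any capture when scrolling
     * in increments of `step` with a viewport that is `viewportHeight` tall.
     */
    static int uncoveredRows(int pageHeight, int viewportHeight, int step) {
        boolean[] seen = new boolean[pageHeight];
        // The browser cannot scroll past this offset.
        int maxScroll = Math.max(0, pageHeight - viewportHeight);
        for (int target = 0; target < pageHeight; target += step) {
            int top = Math.min(target, maxScroll);
            int bottom = Math.min(top + viewportHeight, pageHeight);
            for (int row = top; row < bottom; row++) {
                seen[row] = true; // this row is visible in the capture
            }
        }
        int missing = 0;
        for (boolean rowSeen : seen) {
            if (!rowSeen) {
                missing++;
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stepping by the window height (1000) leaves gaps: prints 500
        System.out.println(uncoveredRows(5000, 900, 1000));
        // Stepping by the viewport height (900) covers everything: prints 0
        System.out.println(uncoveredRows(5000, 900, 900));
    }
}
```

With these assumed sizes, stepping by the window height loses a 100 px strip after every capture, while stepping by the viewport height covers the whole page.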

I have added a complete example of how to get a single-page screenshot with Chrome:

package demo;

import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.util.LinkedList;
import java.util.List;
import javax.imageio.ImageIO;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebDriverException;
import org.openqa.selenium.chrome.ChromeDriver;

public class WebDriverDemo {
  static int pageHeight = 0;
  static int count = 0;
  static WebDriver driver;
  static List<BufferedImage> images = new LinkedList<>();

  public static void main(String[] args) {
    driver = new ChromeDriver();

    driver.get("http://automationpractice.com/");
    try {
      pageScrollable();
    } catch (WebDriverException | InterruptedException | IOException e) {
      e.printStackTrace();
    }

    driver.quit();
  }

  public static void pageScrollable() throws InterruptedException, IOException {

    JavascriptExecutor jse = (JavascriptExecutor) driver;

    // Find page height
    pageHeight = ((Number) jse.executeScript("return document.body.scrollHeight")).intValue();
    // Use the viewport height (window.innerHeight), not the browser window height
    int browserSize = ((Number) jse.executeScript("return window.innerHeight")).intValue();
    System.out.println("Page height is: " + pageHeight + "\n");
    System.out.println("Browser height is: " + browserSize + "\n");

    int currentHeight = 0;
    System.out.println("Current scroll at: " + currentHeight);
    System.out.println("Page height is: " + pageHeight + "\n");

    // Scrolling logic: capture first, then scroll by one viewport height.
    // The strict comparison avoids a duplicate capture when the page height
    // is an exact multiple of the viewport height.
    while (pageHeight > currentHeight) {
      screenShot();
      currentHeight += browserSize;
      jse.executeScript("window.scrollBy(0," + browserSize + ")", "");
      System.out.println("Current scroll now at: " + currentHeight);
    }

    // Stitch the captured images into one tall image
    BufferedImage result = null;
    Graphics2D g2d = null;
    int heightCurr = 0;
    // Declared outside the loop so the value is still available for the last image
    int imageHeight = 0;
    System.out.println("Image count is " + images.size());
    for (int i = 0; i < images.size(); i++) {
      BufferedImage img = images.get(i);
      if (result == null) {
        System.out.println("Image height is " + img.getHeight()); // differs from browserSize
        imageHeight = pageHeight + images.size() * (img.getHeight() - browserSize);
        result = new BufferedImage(img.getWidth(), imageHeight, img.getType());
        g2d = result.createGraphics();
      }
      if (i == images.size() - 1) {
        // The last capture overlaps the previous one, so align it with the bottom edge
        g2d.drawImage(img, 0, imageHeight - img.getHeight(), null);
      } else {
        g2d.drawImage(img, 0, heightCurr, null);
        heightCurr += img.getHeight();
      }
    }
    g2d.dispose();
    ImageIO.write(result, "png", new File("screenshot.png"));
  }

  public static void screenShot() throws IOException, InterruptedException {
    byte[] scr = ((TakesScreenshot) driver).getScreenshotAs(OutputType.BYTES);
    BufferedImage img = ImageIO.read(new ByteArrayInputStream(scr));
    images.add(img);
  }
}

Unfortunately, this does not produce correct results with Firefox, because of a glitch in its Window.scrollBy implementation.
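The stitching step in the example can also be looked at in isolation. The following is a minimal sketch, not the answer's exact code: it stacks tiles from top to bottom and aligns the last tile with the bottom edge, since the final capture usually overlaps the previous one. The `VerticalStitcher` class, its `solid` helper, and the tile sizes are assumptions chosen so the sketch runs without a browser.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

public class VerticalStitcher {

    /** Stacks tiles top to bottom; the last tile is aligned with the bottom edge. */
    static BufferedImage stitch(List<BufferedImage> tiles, int totalHeight) {
        if (tiles.isEmpty()) {
            throw new IllegalArgumentException("at least one tile is required");
        }
        int width = tiles.get(0).getWidth();
        BufferedImage result = new BufferedImage(width, totalHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = result.createGraphics();
        try {
            int nextTop = 0;
            for (int i = 0; i < tiles.size(); i++) {
                BufferedImage tile = tiles.get(i);
                boolean isLast = (i == tiles.size() - 1);
                // The last tile may overlap the one before it.
                int top = isLast ? totalHeight - tile.getHeight() : nextTop;
                g.drawImage(tile, 0, top, null);
                nextTop += tile.getHeight();
            }
        } finally {
            g.dispose();
        }
        return result;
    }

    /** Creates a single-colour tile, standing in for a real screenshot. */
    static BufferedImage solid(int width, int height, int rgb) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                img.setRGB(x, y, rgb);
            }
        }
        return img;
    }

    public static void main(String[] args) {
        List<BufferedImage> tiles = new ArrayList<>();
        tiles.add(solid(4, 10, 0xFF0000)); // red
        tiles.add(solid(4, 10, 0x00FF00)); // green
        tiles.add(solid(4, 10, 0x0000FF)); // blue
        // Three 10 px tiles on a 25 px page: the last tile overlaps by 5 px.
        BufferedImage page = stitch(tiles, 25);
        System.out.println(page.getWidth() + "x" + page.getHeight()); // prints "4x25"
    }
}
```

In a real run the tiles would come from getScreenshotAs, and the total height would have to account for any difference between screenshot pixels and CSS pixels, which the example above handles through its imageHeight calculation.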
