Selenium - Extra screenshots generated when scrolling in equivalent intervals to capture an entire web page

Posted 2023-03-24

【Asked】2019-07-07 16:38:15 【Question】:

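Goal: Use Java and JavascriptExecutor with Selenium WebDriver to scroll in equal intervals and take screenshots of the entire page.

Problem: The approach I implemented below works, but I end up with 2-3 extra screenshots at the end of the web page. I would like to avoid these redundant screenshots.

The scrolling method is as follows: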

public static void pageScrollable() throws InterruptedException, IOException {

    JavascriptExecutor jse = (JavascriptExecutor) driver;

    //Find page height
    pageHeight = ((Number) jse.executeScript("return document.body.scrollHeight")).intValue();
    //Find current browser dimensions and isolate its height
    Dimension d = driver.manage().window().getSize();
    int browserSize = d.getHeight();

    int currentHeight = 0;
    System.out.println("Current scroll at: " + currentHeight);
    System.out.println("Page height is: " + pageHeight + "\n");

    //Scrolling logic
    while(pageHeight>=currentHeight) {
        jse.executeScript("window.scrollBy(0,"+currentHeight+")", "");
        screenShot();
        currentHeight+=browserSize;
        System.out.println("Current scroll now at: " + currentHeight);
    }
}

The screenshot method is as follows:

public static void screenShot() throws IOException, InterruptedException {
    File screenShot = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
    FileUtils.copyFile(screenShot, new File("C:\\FilePath\\Screen " + count + ".png"));
    count++;
    System.out.println("Current screenshot count: " + count);
}

The following variables are declared static:

static int pageHeight = 0;
static int count = 0;
static WebDriver driver;

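I understand that Selenium currently has no implementation for capturing a screenshot of an entire web page. Any help fixing my logic above would be greatly appreciated.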

【Answer 1】:

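Your scrolling logic is the cause of the extra images. The Window.scrollBy method scrolls by a number of pixels relative to the current position; it does not scroll to an absolute position. Because you pass the running total currentHeight, each step scrolls further than one screen. You need to scroll by browserSize each time instead.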

You should probably also use window.innerHeight to determine the viewport size, rather than the browser window size:

int browserSize = ((Number) jse.executeScript("return window.innerHeight")).intValue();
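To make the difference between the two loops concrete, here is a minimal simulation that needs no browser. It is only a sketch: the class ScrollSimulation and its method names are hypothetical, the page and viewport heights are made-up numbers, and it assumes the browser clamps the scroll position at the page height minus the viewport height.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original post: records the scroll
// position at which each screenshot would be taken. Assumes the browser
// clamps the scroll position at (pageHeight - viewport).
final class ScrollSimulation {

    // Loop from the question: scrollBy receives the running total.
    static List<Integer> cumulativeScrollBy(int pageHeight, int viewport) {
        int maxScroll = Math.max(0, pageHeight - viewport);
        List<Integer> shots = new ArrayList<>();
        int position = 0;
        int currentHeight = 0;
        while (pageHeight >= currentHeight) {
            // window.scrollBy(0, currentHeight): relative to the current position
            position = Math.min(position + currentHeight, maxScroll);
            shots.add(position); // screenShot()
            currentHeight += viewport;
        }
        return shots;
    }

    // Loop from the answer: take the screenshot, then scrollBy one viewport.
    static List<Integer> fixedScrollBy(int pageHeight, int viewport) {
        int maxScroll = Math.max(0, pageHeight - viewport);
        List<Integer> shots = new ArrayList<>();
        int position = 0;
        int currentHeight = 0;
        while (pageHeight >= currentHeight) {
            shots.add(position); // screenShot()
            currentHeight += viewport;
            // window.scrollBy(0, viewport)
            position = Math.min(position + viewport, maxScroll);
        }
        return shots;
    }

    public static void main(String[] args) {
        System.out.println(cumulativeScrollBy(4500, 1000)); // [0, 1000, 3000, 3500, 3500]
        System.out.println(fixedScrollBy(4500, 1000));      // [0, 1000, 2000, 3000, 3500]
    }
}
```

In this simulated run the question's loop skips the region starting at 2000 and captures the bottom of the page twice, which matches the redundant trailing screenshots described in the question.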

I have added a complete example of how to produce a single full-page screenshot with Chrome:

package demo;

import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.util.LinkedList;
import java.util.List;
import javax.imageio.ImageIO;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebDriverException;
import org.openqa.selenium.chrome.ChromeDriver;

public class WebDriverDemo {
  static int pageHeight = 0;
  static int count = 0;
  static WebDriver driver;
  static List<BufferedImage> images = new LinkedList<>();

  public static void main(String[] args) {
    driver = new ChromeDriver();

    driver.get("http://automationpractice.com/");
    try {
      pageScrollable();
    } catch (WebDriverException | InterruptedException | IOException e) {
      e.printStackTrace();
    }

    driver.quit();
  }

  public static void pageScrollable() throws InterruptedException, IOException {

    JavascriptExecutor jse = (JavascriptExecutor) driver;

    // Find page height
    pageHeight = ((Number) jse.executeScript("return document.body.scrollHeight")).intValue();
    // Find the viewport height (not the outer browser window height)
    int browserSize = ((Number) jse.executeScript("return window.innerHeight")).intValue();
    System.out.println("Page height is: " + pageHeight + "\n");
    System.out.println("Browser height is: " + browserSize + "\n");

    int currentHeight = 0;
    System.out.println("Current scroll at: " + currentHeight);
    System.out.println("Page height is: " + pageHeight + "\n");

    // Scrolling logic: capture first, then scroll down by exactly one viewport
    while (pageHeight >= currentHeight) {
      screenShot();
      currentHeight += browserSize;
      jse.executeScript("window.scrollBy(0," + browserSize + ")", "");
      System.out.println("Current scroll now at: " + currentHeight);
    }

    // Stitch the captured images into one tall image
    BufferedImage result = null;
    Graphics2D g2d = null;
    int heightCurr = 0;
    System.out.println("Image count is " + images.size());
    for (int i = 0; i < images.size(); i++) {
      BufferedImage img = images.get(i);
      if (result == null) {
        System.out.println("Image height is " + img.getHeight()); // differs from browserSize
        int imageHeight = pageHeight + images.size() * (img.getHeight() - browserSize);
        result = new BufferedImage(img.getWidth(), imageHeight, img.getType());
        g2d = result.createGraphics();
      }
      if (i == images.size() - 1) {
        // The last capture overlaps the previous one, so align it with the bottom.
        // result.getHeight() is used because the total height is only computed
        // on the first iteration.
        g2d.drawImage(img, 0, result.getHeight() - img.getHeight(), null);
      } else {
        g2d.drawImage(img, 0, heightCurr, null);
        heightCurr += img.getHeight();
      }
    }
    g2d.dispose();
    ImageIO.write(result, "png", new File("screenshot.png"));
  }

  public static void screenShot() throws IOException, InterruptedException {
    // Keep each capture in memory instead of writing it to disk
    byte[] scr = ((TakesScreenshot) driver).getScreenshotAs(OutputType.BYTES);
    BufferedImage img = ImageIO.read(new ByteArrayInputStream(scr));
    images.add(img);
  }
}

Unfortunately, this does not give correct results with Firefox, due to a glitch in its Window.scrollBy implementation.
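An alternative to relative scrolling is to compute the absolute offsets up front and pass each one to window.scrollTo. The sketch below is not from the original answer: the class ScrollPlan and its methods are hypothetical names, and whether this approach avoids the Firefox problem mentioned above has not been verified here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original answer: plans absolute
// scroll offsets so that every part of the page is captured exactly once,
// with the final capture aligned to the bottom of the page.
final class ScrollPlan {

    static List<Integer> offsets(int pageHeight, int viewport) {
        if (viewport <= 0) {
            throw new IllegalArgumentException("viewport must be positive");
        }
        int maxScroll = Math.max(0, pageHeight - viewport);
        List<Integer> result = new ArrayList<>();
        for (int y = 0; y < maxScroll; y += viewport) {
            result.add(y);
        }
        result.add(maxScroll); // last capture ends at the page bottom
        return result;
    }

    // Builds the JavaScript snippet for an absolute scroll.
    static String scrollToScript(int y) {
        return "window.scrollTo(0, " + y + ")";
    }

    public static void main(String[] args) {
        // With Selenium, each offset would be used roughly like this:
        //   jse.executeScript(ScrollPlan.scrollToScript(y));
        //   screenShot();
        for (int y : offsets(4500, 1000)) {
            System.out.println(scrollToScript(y));
        }
    }
}
```

Because the number of offsets is fixed before any scrolling happens, the screenshot count no longer depends on how the loop condition interacts with the scroll position.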
