How can I download an embedded PDF with Playwright (Python)?

Posted 2023-02-23

【Title】: How can I download an embedded PDF with Playwright (Python)? 【Posted】: 2022-01-23 05:43:29 【Question】:

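I am trying to download an embedded PDF file, or get its raw content and store it in a variable, using Playwright with Python. I get the following output from page.content():

I can't find a download button or a print button.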

【Comments on the question】:

Please provide enough code so that others can better understand or reproduce the problem. 【Answer 1】:

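When you visit the page, Playwright/Chrome has to send a network request to retrieve the PDF you are viewing. Simply intercept that request and read the response body as a buffer.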
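One way to sketch that interception, assuming the server labels the PDF with a Content-Type of application/pdf, is Playwright's page.expect_response with a predicate. The helper names is_pdf_response and capture_pdf are made up for illustration, and `page` is assumed to be an already-open playwright.sync_api.Page:

```python
def is_pdf_response(response) -> bool:
    """Return True if the response's Content-Type header names a PDF."""
    # Playwright exposes header names in lower case.
    content_type = response.headers.get("content-type", "")
    return "application/pdf" in content_type.lower()


def capture_pdf(page, page_url: str) -> bytes:
    """Open page_url and return the body of the first PDF response seen.

    `page` is assumed to be a playwright.sync_api.Page; page_url is the
    page that embeds the PDF, not necessarily the PDF's own URL.
    """
    # Start waiting before navigating so the response cannot be missed.
    with page.expect_response(is_pdf_response) as response_info:
        page.goto(page_url)
    # body() returns the raw bytes of the matched response.
    return response_info.value.body()
```

If the server sends a different Content-Type, the predicate could match on response.url instead. The wait times out if no matching response arrives.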

If you are just navigating directly to the PDF, I think something like this would work:

# `page` is an open Playwright page and `url` points directly at the PDF
response = page.goto(url)
print(response.status)
body = response.body()  # raw response bytes

# Write the bytes in binary mode; the with-block closes the file automatically
with open('file.pdf', 'wb') as pdf_file:
    pdf_file.write(body)
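For completeness, here is a minimal sketch of the same idea with the browser setup included. It is not the answerer's exact code: the function names save_pdf and download_pdf are hypothetical, Firefox is chosen only because a commenter below reports that it returned the PDF bytes where Chromium did not, and the %PDF- check assumes the file starts with the usual PDF header.

```python
from pathlib import Path


def save_pdf(body: bytes, out_path: str) -> Path:
    """Write body to out_path after checking for the PDF header bytes."""
    if not body.startswith(b"%PDF-"):
        raise ValueError("response body does not look like a PDF")
    path = Path(out_path)
    path.write_bytes(body)
    return path


def download_pdf(url: str, out_path: str = "file.pdf") -> Path:
    """Navigate straight to a PDF URL and save the response body."""
    # Imported here so save_pdf can be reused without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.firefox.launch()  # assumption: Firefox, per the comment
        try:
            page = browser.new_page()
            response = page.goto(url)
            # goto() can return None, and the status may be an error code.
            if response is None or not response.ok:
                raise RuntimeError(f"could not fetch {url}")
            body = response.body()
        finally:
            browser.close()
    return save_pdf(body, out_path)
```

Calling download_pdf with the PDF's address (for example a placeholder such as "https://example.com/sample.pdf") would write file.pdf to the working directory. Checking the header before writing avoids silently saving an HTML viewer page under a .pdf name.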

【Comments】:

I was using Chromium. I changed it to Firefox, and the output of response.body() is b'%PDF-1.4 (...). Thank you very much!
