How to display whole PDF (not only one page) with PDF.JS?
【Posted】: 2013-05-05 00:46:40
【Question】: I have created this demo:
http://polishwords.com.pl/dev/pdfjs/test.html
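It shows one page. I would like to display all pages: either one below another, or with some buttons to change the page, or even better, with all the standard PDF.JS controls loaded, as in Firefox. How can I do this?
【Comments】:
github.com/mozilla/pdf.js
Get inspiration here: mozilla.github.io/pdf.js/web/viewer.html
@DekDekku kuncajs I spent the whole day reading these sites before asking this question. They did not help.
@tomaszs Why haven't you marked this as answered?
Your question is answered by this solution: ***.com/questions/25162554/…
【Answer 1】: The PDF document object that PDFJS.getDocument resolves to has a numPages property, so you only need to iterate over the pages. Keep in mind, though, that getting a page in pdf.js is asynchronous, so the order is not guaranteed unless you chain the requests. You could do something along these lines: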
var currPage = 1; // Pages are 1-based, not 0-based
var numPages = 0;
var thePDF = null;

// This is where you start
PDFJS.getDocument(url).then(function(pdf) {
    // Keep a global reference to the document (so we can easily access it in our page functions)
    thePDF = pdf;
    // How many pages it has
    numPages = pdf.numPages;
    // Start with the first page
    pdf.getPage( 1 ).then( handlePages );
});

function handlePages(page) {
    // This gives us the page's dimensions at full scale
    var viewport = page.getViewport( 1 );
    // We'll create a canvas for each page to draw it on
    var canvas = document.createElement( "canvas" );
    canvas.style.display = "block";
    var context = canvas.getContext('2d');
    canvas.height = viewport.height;
    canvas.width = viewport.width;
    // Draw it on the canvas
    page.render({canvasContext: context, viewport: viewport});
    // Add it to the web page
    document.body.appendChild( canvas );
    // Move to the next page
    currPage++;
    if ( thePDF !== null && currPage <= numPages ) {
        thePDF.getPage( currPage ).then( handlePages );
    }
}
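The chaining idea can also be written as a loop that extends a single promise chain. The following is only a minimal sketch, not the answer's own code: `forEachPageInOrder`, `fakeGetPage` and the timings are hypothetical names and values, and `fakeGetPage` merely stands in for `pdf.getPage` so that the snippet runs without PDF.js.

```javascript
// Sketch: request and handle pages strictly one after another.
// getPage and handlePage are hypothetical stand-ins for pdf.getPage and a
// rendering callback; neither is taken from the PDF.js API surface here.
function forEachPageInOrder(numPages, getPage, handlePage) {
  let chain = Promise.resolve();
  for (let n = 1; n <= numPages; n++) {
    // Page n is only requested after page n - 1 has been handled.
    chain = chain.then(() => getPage(n)).then((page) => handlePage(page, n));
  }
  return chain;
}

// Demo: a fake page source in which later pages would resolve sooner.
const order = [];
const fakeGetPage = (n) =>
  new Promise((resolve) => setTimeout(() => resolve({ pageNumber: n }), 30 - n * 10));
const done = forEachPageInOrder(3, fakeGetPage, (page) => {
  order.push(page.pageNumber);
});
done.then(() => console.log(order.join(',')));
```

Because each request is issued only after the previous page has been handled, the handling order matches the page order even though the fake source would answer later pages faster.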
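【Comments】:
This doesn't work for me. My canvas is inside a div, and when I run the code above it shows the PDF pages at the end of the page (not in the div).
@Sara You need to learn the DOM. The code above is just an example. It appends the created pages to the document. You need to put them inside your div and style the canvases as your project requires. But all of that is beyond the scope of this question.
Thanks for the quick reply :) I added the div and put the canvases in the right place, but it overwrites them..
@Mr.Hyde I haven't looked at this project in years. Most likely the API has methods to help with this, but you could still use the canvas and listen for mouse events to implement text selection.
Perfect solution.
【Answer 2】: Here is my take. It renders all pages in the correct order and still works asynchronously.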
<style>
  #pdf-viewer {
    width: 100%;
    height: 100%;
    background: rgba(0, 0, 0, 0.1);
    overflow: auto;
  }
  .pdf-page-canvas {
    display: block;
    margin: 5px auto;
    border: 1px solid rgba(0, 0, 0, 0.2);
  }
</style>
<script>
  // Note: a github.com/.../blob/... URL serves an HTML page, not the PDF itself;
  // point this at a URL that returns the raw PDF (with CORS enabled).
  var url = 'https://github.com/mozilla/pdf.js/blob/master/test/pdfs/tracemonkey.pdf';
  var thePdf = null;
  var scale = 1;
  // In PDF.js 2.x builds the global is pdfjsLib rather than PDFJS.
  PDFJS.getDocument(url).promise.then(function(pdf) {
    thePdf = pdf;
    var viewer = document.getElementById('pdf-viewer');
    for (var page = 1; page <= pdf.numPages; page++) {
      // Canvases are created and appended in page order
      var canvas = document.createElement("canvas");
      canvas.className = 'pdf-page-canvas';
      viewer.appendChild(canvas);
      renderPage(page, canvas);
    }
  });
  function renderPage(pageNumber, canvas) {
    thePdf.getPage(pageNumber).then(function(page) {
      var viewport = page.getViewport({ scale: scale });
      canvas.height = viewport.height;
      canvas.width = viewport.width;
      page.render({canvasContext: canvas.getContext('2d'), viewport: viewport});
    });
  }
</script>
<div id='pdf-viewer'></div>
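Why the page order holds here can be illustrated without PDF.js. This is a hedged sketch under stated assumptions: `renderAllPages`, `fakeLoad` and the plain `slot` objects are hypothetical, with the slots playing the role of the canvases appended to #pdf-viewer and `fakeLoad` standing in for fetching and rendering a page.

```javascript
// Sketch: reserve one slot per page up front, then fill each slot whenever
// its page happens to arrive. loadPage is a hypothetical stand-in for
// "get page n and render it".
function renderAllPages(numPages, loadPage) {
  const slots = [];
  const jobs = [];
  for (let n = 1; n <= numPages; n++) {
    const slot = { pageNumber: n, content: null }; // created synchronously, in order
    slots.push(slot);
    // Each job closes over its own slot, so a late or early arrival
    // cannot end up in another page's slot.
    jobs.push(loadPage(n).then((content) => { slot.content = content; }));
  }
  return Promise.all(jobs).then(() => slots);
}

// Demo: page 3 is scheduled to finish first, page 1 last.
const finished = [];
const fakeLoad = (n) =>
  new Promise((resolve) =>
    setTimeout(() => { finished.push(n); resolve('page ' + n); }, (4 - n) * 10));
const result = renderAllPages(3, fakeLoad);
result.then((slots) => console.log(slots.map((s) => s.content)));
```

The position of each slot is fixed before any asynchronous work starts, which is the same reason the canvases in the answer stay in page order regardless of which render finishes first.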
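【Comments】:
Awesome - thank you!
Simple and clean. Set scale = 2 for the web.
How does this solution ensure the correct page order? It seems to me it could still fail on a race condition, since you loop through the pages to render without waiting for the previous page to finish processing? Correct me if I have misread it.
@redfox05 The canvas elements are created and appended in order. The render function then works on the canvas it receives as an argument.
@RetoHöhener, thanks, yes, I had been wondering about that myself, so I came back to see whether you had replied. Once I thought about passing by reference, it clicked. The canvases are created in order, and a reference to each one is passed to the render function, so when that page finishes loading it is drawn onto the original canvas element it came from, and everything renders in order :) I think in my implementation I would add some kind of ID counter to the elements to make this more obvious to the next developer.
【Answer 3】: The pdfjs-dist library contains parts for building a PDF viewer. You can use PDFPageView to render all pages. Based on https://github.com/mozilla/pdf.js/blob/master/examples/components/pageviewer.html: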
var url = "https://cdn.mozilla.net/pdfjs/tracemonkey.pdf";
var container = document.getElementById('container');
// Load document
PDFJS.getDocument(url).then(function (doc) {
  var promise = Promise.resolve();
  for (var i = 0; i < doc.numPages; i++) {
    // One-by-one load pages
    promise = promise.then(function (id) {
      return doc.getPage(id + 1).then(function (pdfPage) {
        // Add div with page view.
        var SCALE = 1.0;
        var pdfPageView = new PDFJS.PDFPageView({
          container: container,
          id: id,
          scale: SCALE,
          defaultViewport: pdfPage.getViewport(SCALE),
          // We can enable text/annotations layers, if needed
          textLayerFactory: new PDFJS.DefaultTextLayerFactory(),
          annotationLayerFactory: new PDFJS.DefaultAnnotationLayerFactory()
        });
        // Associates the actual page with the view, and drawing it
        pdfPageView.setPdfPage(pdfPage);
        return pdfPageView.draw();
      });
    }.bind(null, i));
  }
  return promise;
});

#container > *:not(:first-child) {
  border-top: solid 1px black;
}

<link href="https://npmcdn.com/pdfjs-dist/web/pdf_viewer.css" rel="stylesheet"/>
<script src="https://npmcdn.com/pdfjs-dist/web/compatibility.js"></script>
<script src="https://npmcdn.com/pdfjs-dist/build/pdf.js"></script>
<script src="https://npmcdn.com/pdfjs-dist/web/pdf_viewer.js"></script>
<div id="container" class="pdfViewer singlePageView"></div>
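【Comments】:
Thanks for providing the working code snippet I needed.
"message": "Uncaught ReferenceError: PDFJS is not defined",
【Answer 4】: The accepted answer no longer works (as of 2021) because the API changed var viewport = page.getViewport( 1 ); to var viewport = page.getViewport({ scale: scale });. You can try the complete working HTML below: copy the following into an HTML file and open it: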
<html>
<head>
<script src="https://mozilla.github.io/pdf.js/build/pdf.js"></script>
</head>
<body>
<script>
var url = 'https://raw.githubusercontent.com/mozilla/pdf.js/ba2edeae/web/compressed.tracemonkey-pldi-09.pdf';
// Loaded via <script> tag, create shortcut to access PDF.js exports.
var pdfjsLib = window['pdfjs-dist/build/pdf'];
// The workerSrc property shall be specified.
pdfjsLib.GlobalWorkerOptions.workerSrc = 'https://mozilla.github.io/pdf.js/build/pdf.worker.js';

var currPage = 1; // Pages are 1-based, not 0-based
var numPages = 0;
var thePDF = null;

// This is where you start
pdfjsLib.getDocument(url).promise.then(function(pdf) {
    // Keep a global reference to the document (so we can easily access it in our page functions)
    thePDF = pdf;
    // How many pages it has
    numPages = pdf.numPages;
    // Start with the first page
    pdf.getPage( 1 ).then( handlePages );
});

function handlePages(page) {
    // This gives us the page's dimensions at 1.5x scale
    var viewport = page.getViewport( { scale: 1.5 } );
    // We'll create a canvas for each page to draw it on
    var canvas = document.createElement( "canvas" );
    canvas.style.display = "block";
    var context = canvas.getContext('2d');
    canvas.height = viewport.height;
    canvas.width = viewport.width;
    // Draw it on the canvas
    page.render({canvasContext: context, viewport: viewport});
    // Add it to the web page, followed by a separator line
    document.body.appendChild( canvas );
    var line = document.createElement("hr");
    document.body.appendChild( line );
    // Move to the next page
    currPage++;
    if ( thePDF !== null && currPage <= numPages ) {
        thePDF.getPage( currPage ).then( handlePages );
    }
}
</script>
</body>
</html>
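【Comments】:
Is getViewport the only method that changed?
@DonRhummy Yes.
This is the valid answer as of 2021.
【Answer 5】:
The following is a partial answer for anyone trying to get PDF.js to display a whole PDF in 2019, since the API has changed significantly. That was, of course, the OP's main concern. (Inspiration: sample code.)
Please note the following:
Extra libraries are in use: Lodash (for the range() function) and polyfills (for promises).
Bootstrap is in use.
<div class="row">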
<div class="col-md-10 col-md-offset-1">
<div id="wrapper">
</div>
</div>
</div>
<style>
  body {
    background-color: #808080;
    /* margin: 0; padding: 0; */
  }
</style>
<link href="//cdnjs.cloudflare.com/ajax/libs/pdf.js/2.1.266/pdf_viewer.css" rel="stylesheet"/>
<script src="//cdnjs.cloudflare.com/ajax/libs/pdf.js/2.1.266/pdf.js"></script>
<script src="//cdnjs.cloudflare.com/ajax/libs/pdf.js/2.1.266/pdf_viewer.js"></script>
<script src="//cdn.polyfill.io/v2/polyfill.min.js"></script>
<script src="//cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.15/lodash.js"></script>
<script>
  $(document).ready(function () {
    // startup
  });
  'use strict';
  if (!pdfjsLib.getDocument || !pdfjsViewer.PDFViewer) {
    alert("Please build the pdfjs-dist library using\n" +
      " `gulp dist-install`");
  }
  var url = '//www.pdf995.com/samples/pdf.pdf';
  pdfjsLib.GlobalWorkerOptions.workerSrc =
    '//cdnjs.cloudflare.com/ajax/libs/pdf.js/2.1.266/pdf.worker.js';
  var loadingTask = pdfjsLib.getDocument(url);
  loadingTask.promise.then(function(pdf) {
    // Please be aware this uses the range() function from lodash.
    // _.range excludes its end value, hence numPages + 1 to include the last page.
    var pagePromises = _.range(1, pdf.numPages + 1).map(function(number) {
      return pdf.getPage(number);
    });
    return Promise.all(pagePromises);
  }).then(function(pages) {
      var scale = 1.5;
      pages.forEach(function(page) {
        var viewport = page.getViewport({ scale: scale, });
        // Prepare canvas using PDF page dimensions
        var canvas = document.createElement('canvas');
        canvas.height = viewport.height;
        canvas.width = viewport.width;
        // Render PDF page into canvas context
        var canvasContext = canvas.getContext('2d');
        var renderContext = {
          canvasContext: canvasContext,
          viewport: viewport
        };
        page.render(renderContext).promise.then(function() {
          if (false) {
            return console.log('Page rendered');
          }
        });
        document.getElementById('wrapper').appendChild(canvas);
      });
    },
    function(error) {
      return console.log('Error', error);
    });
</script>
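A note on the page range: lodash's `_.range(start, end)` stops before `end`, so covering pages 1 through `numPages` needs an end value of `numPages + 1`. If pulling in lodash only for this seems heavy, the list of page numbers can be built with plain JavaScript. The helper below is a small sketch; the name `pageNumbers` is hypothetical and not part of PDF.js or lodash.

```javascript
// Sketch: build [1, 2, ..., numPages] without lodash.
// pageNumbers is a hypothetical helper name used only for illustration.
function pageNumbers(numPages) {
  // Array.from calls the mapper with (value, index); pages are 1-based.
  return Array.from({ length: numPages }, (_unused, index) => index + 1);
}

console.log(pageNumbers(3));
```

The result can be mapped over `pdf.getPage` in the same way as the lodash range in the answer.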
【Comments】:
【Answer 6】: If you want to render all pages of a PDF document into separate canvases, strictly one after another, here is one solution:
index.html
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>PDF Sample</title>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="pdf.js"></script>
<script type="text/javascript" src="main.js">
</script>
<link rel="stylesheet" type="text/css" href="main.css">
</head>
<body id="body">
</body>
</html>
main.css
canvas {
    display: block;
}
main.js
$(function() {
    var filePath = "document.pdf";

    // Captures a page number in a closure and returns a getter for it
    function Num(num) {
        var num = num;
        return function () {
            return num;
        };
    }

    function renderPDF(url, canvasContainer, options) {
        var options = options || {
                scale: 1.5
            },
            func,
            pdfDoc,
            def = $.Deferred(),
            promise = $.Deferred().resolve().promise(),
            width,
            height,
            makeRunner = function(func, args) {
                return function() {
                    return func.call(null, args);
                };
            };

        function renderPage(num) {
            var def = $.Deferred(),
                currPageNum = new Num(num);
            pdfDoc.getPage(currPageNum()).then(function(page) {
                var viewport = page.getViewport(options.scale);
                var canvas = document.createElement('canvas');
                var ctx = canvas.getContext('2d');
                var renderContext = {
                    canvasContext: ctx,
                    viewport: viewport
                };
                // The first page's size is reused for every canvas
                if (currPageNum() === 1) {
                    height = viewport.height;
                    width = viewport.width;
                }
                canvas.height = height;
                canvas.width = width;
                canvasContainer.appendChild(canvas);
                page.render(renderContext).then(function() {
                    def.resolve();
                });
            })
            return def.promise();
        }

        function renderPages(data) {
            pdfDoc = data;
            var pagesCount = pdfDoc.numPages;
            // Chain the pages so each one starts after the previous one has rendered
            for (var i = 1; i <= pagesCount; i++) {
                func = renderPage;
                promise = promise.then(makeRunner(func, i));
            }
        }

        PDFJS.disableWorker = true;
        PDFJS.getDocument(url).then(renderPages);
    };

    var body = document.getElementById("body");
    renderPDF(filePath, body);
});
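【Comments】:
Where does this canvasContainer come from? Can you explain? I have dynamic divs, and the canvases should be appended inside them after the page has loaded, when the respective link is clicked. I am using the TouchPDF library.
【Answer 7】: First, note that doing this is really not a good idea, as explained in https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#allthepages
How to do it:
Use the viewer provided by Mozilla: https://mozilla.github.io/pdf.js/web/viewer.html
Change the _getVisiblePages() method of the BaseViewer class in viewer.js to: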
/* load all pages */
_getVisiblePages() {
  let visible = [];
  let currentPage = this._pages[this._currentPageNumber - 1];
  for (let i = 0; i < this.pagesCount; i++) {
    let aPage = this._pages[i];
    visible.push({ id: aPage.id, view: aPage, });
  }
  return { first: currentPage, last: currentPage, views: visible, };
}
【Comments】:
Thanks. You just saved me a lot of work.
【Answer 8】: If you want to render all pages of a PDF document into separate canvases:
<html>
<head>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<script src="pdf.js"></script>
<script src="jquery.js"></script>
</head>
<body>
<h1>PDF.js 'Hello, world!' example</h1>
<div id="canvas_div"></div>
<script>
// If absolute URL from the remote server is provided, configure the CORS
// header on that server.
var url = 'pdff.pdf';
// Loaded via <script> tag, create shortcut to access PDF.js exports.
var pdfjsLib = window['pdfjs-dist/build/pdf'];
// The workerSrc property shall be specified.
pdfjsLib.GlobalWorkerOptions.workerSrc = 'worker.js';
var loadingTask = pdfjsLib.getDocument(url);
loadingTask.promise.then(function(pdf) {
  var __TOTAL_PAGES = pdf.numPages;
  // Start with the first page
  var pageNumber = 1;
  for (let i = 1; i <= __TOTAL_PAGES; i += 1) {
    // One wrapper div and one canvas per page
    var id = 'the-canvas' + i;
    $('#canvas_div').append("<div style='background-color:gray;text-align: center;padding:20px;' ><canvas class='the-canvas' id='" + id + "'></canvas></div>");
    var canvas = document.getElementById(id);
    renderPage(canvas, pdf, pageNumber++, function pageRenderingComplete() {
      if (pageNumber > pdf.numPages) {
        return;
      }
      // Continue rendering of the next page
      renderPage(canvas, pdf, pageNumber++, pageRenderingComplete);
    });
  }
});
function renderPage(canvas, pdf, pageNumber, callback) {
  pdf.getPage(pageNumber).then(function(page) {
    var scale = 1.5;
    var viewport = page.getViewport({scale: scale});
    var pageDisplayWidth = viewport.width;
    var pageDisplayHeight = viewport.height;
    // Prepare canvas using PDF page dimensions
    var context = canvas.getContext('2d');
    canvas.width = pageDisplayWidth;
    canvas.height = pageDisplayHeight;
    // Render PDF page into canvas context
    var renderContext = {
      canvasContext: context,
      viewport: viewport
    };
    page.render(renderContext).promise.then(callback);
  });
}
</script>
</body>
</html>
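【Comments】:
Give some explanation with your answer. Code alone is not useful.
【Answer 9】: The accepted answer works perfectly for a single PDF. In my case there were multiple PDFs, and I wanted to render all of their pages in the same order as the array.
I adjusted the code so that the global variables are encapsulated in an array of objects, as follows: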
var docs = []; // Add this object array
var urls = []; // You would need an array of the URLs to start.

// Loop through each url. You will also need the index for later.
urls.forEach((url, ix) => {
    // Get the document from the url.
    PDFJS.getDocument(url).then(function(pdf) {
        // Make a new doc object and set the properties of the new document
        var doc = {};
        // Keep a reference to the document (so we can easily access it in our page functions)
        doc.thePDF = pdf;
        // How many pages it has
        doc.numPages = pdf.numPages;
        // Pages are 1-based; this must be initialised before handlePages increments it
        doc.currPage = 1;
        // Store the document under its url index. Downloads can finish in any
        // order, so push() could leave docs[ix] pointing at another document.
        docs[ix] = doc;
        // Start with the first page -- pass the index through to handlePages
        pdf.getPage( 1 ).then(page => handlePages(page, ix) );
    });
});

function handlePages(page, ix) {
    // This gives us the page's dimensions at full scale
    var viewport = page.getViewport( { scale: 1 } );
    // We'll create a canvas for each page to draw it on
    var canvas = document.createElement( "canvas" );
    canvas.style.display = "block";
    var context = canvas.getContext('2d');
    canvas.height = viewport.viewBox[3];
    canvas.width = viewport.viewBox[2];
    // Draw it on the canvas
    page.render({canvasContext: context, viewport: viewport});
    // Add it to an element based on the index so each document is added to its own element
    document.getElementById('doc-' + ix).appendChild( canvas );
    // Move to the next page using the correct doc object from the docs array
    docs[ix].currPage++;
    if ( docs[ix].thePDF !== null && docs[ix].currPage <= docs[ix].numPages ) {
        console.log("Rendering page " + docs[ix].currPage + " of document #" + ix);
        docs[ix].thePDF.getPage( docs[ix].currPage ).then(newPage => handlePages(newPage, ix) );
    }
}
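Because the whole operation is asynchronous, without a unique object per document the global variables thePDF, currPage and numPages would be overwritten while subsequent PDFs render, causing random pages to be skipped, whole documents to be skipped, or pages from one document to be appended to the wrong document.
One last point: if this is done offline or without ES6 modules, the PDFJS.getDocument(url).then() call should be changed to pdfjsLib.getDocument(url).promise.then().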
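The same isolation can be had without a shared array, by keeping each document's state in local variables of the function that renders it. This is a sketch of that alternative rather than the answer's approach: `renderDocument`, `fakeLoad` and the file names are hypothetical, and `fakeLoad` only imitates an object with `numPages` and `getPage` so the snippet runs without PDF.js.

```javascript
// Sketch: every call owns its document and its page chain, so two documents
// rendering at the same time cannot overwrite each other's state.
// loadDocument is a hypothetical stand-in for a PDF.js loading call.
function renderDocument(loadDocument, url, handlePage) {
  return loadDocument(url).then((pdf) => {
    let chain = Promise.resolve();
    for (let n = 1; n <= pdf.numPages; n++) {
      chain = chain.then(() => pdf.getPage(n)).then((page) => handlePage(page, url));
    }
    return chain;
  });
}

// Demo with two fake documents of different lengths.
const fakeLoad = (url) =>
  Promise.resolve({
    numPages: url === 'a.pdf' ? 2 : 3,
    getPage: (n) =>
      new Promise((resolve) => setTimeout(() => resolve({ pageNumber: n }), 5)),
  });
const seen = { 'a.pdf': [], 'b.pdf': [] };
const all = Promise.all(
  ['a.pdf', 'b.pdf'].map((url) =>
    renderDocument(fakeLoad, url, (page, source) => {
      seen[source].push(page.pageNumber);
    })
  )
);
all.then(() => console.log(seen));
```

Both documents are processed concurrently, yet each one's pages are handled in order and recorded under its own key.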
【Comments】:
【Answer 10】: Iterate over as many pages as you want.
const url = '/storage/documents/reports/AR-2020-CCBI IND.pdf';
pdfjsLib.GlobalWorkerOptions.workerSrc = '/vendor/pdfjs-dist-2.12.313/package/build/pdf.worker.js';
const loadingTask = pdfjsLib.getDocument({
    url: url,
    verbosity: 0
});
(async () => {
    const pdf = await loadingTask.promise;
    let numPages = await pdf.numPages;
    // Render at most the first 10 pages
    if (numPages > 10) {
        numPages = 10;
    }
    for (let i = 1; i <= numPages; i++) {
        let page = await pdf.getPage(i);
        let scale = 1.5;
        let viewport = page.getViewport({ scale });
        // Support HiDPI screens
        let outputScale = window.devicePixelRatio || 1;
        let canvas = document.createElement('canvas');
        let context = canvas.getContext("2d");
        canvas.width = Math.floor(viewport.width * outputScale);
        canvas.height = Math.floor(viewport.height * outputScale);
        canvas.style.width = Math.floor(viewport.width) + "px";
        canvas.style.height = Math.floor(viewport.height) + "px";
        document.getElementById('canvas-column').appendChild(canvas);
        let transform = outputScale !== 1
            ? [outputScale, 0, 0, outputScale, 0, 0]
            : null;
        let renderContext = {
            canvasContext: context,
            transform,
            viewport
        };
        page.render(renderContext);
    }
})();
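The canvas sizing in this answer separates the backing-store size (in device pixels) from the CSS size (in layout pixels). The arithmetic can be isolated in a pure function, sketched below; `hiDpiCanvasSize` is a hypothetical helper name, and the 612 x 792 viewport is only an example value.

```javascript
// Sketch: compute the backing-store size, CSS size and render transform for a
// canvas on a screen with the given device pixel ratio.
// hiDpiCanvasSize is a hypothetical helper, not part of PDF.js.
function hiDpiCanvasSize(viewportWidth, viewportHeight, outputScale) {
  return {
    // Backing store: one canvas pixel per device pixel
    width: Math.floor(viewportWidth * outputScale),
    height: Math.floor(viewportHeight * outputScale),
    // CSS size: the canvas still occupies the viewport's layout size
    cssWidth: Math.floor(viewportWidth) + 'px',
    cssHeight: Math.floor(viewportHeight) + 'px',
    // Scale the drawing only when the ratio differs from 1
    transform: outputScale !== 1 ? [outputScale, 0, 0, outputScale, 0, 0] : null,
  };
}

console.log(hiDpiCanvasSize(612, 792, 2));
```

In a browser, `outputScale` would come from `window.devicePixelRatio || 1`, as in the answer; the returned values map onto `canvas.width`, `canvas.height`, `canvas.style.width`, `canvas.style.height` and the `transform` entry of the render context.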
【Comments】: