[Repost] wkhtmltopdf: a tool for converting HTML to PDF

Posted 2020-10-10


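Download: http://wkhtmltopdf.org/downloads.html

After installation, append ;D:\wkhtmltopdf\bin (that is, the bin folder of your install directory) to the system environment variable named "Path". Then restart the computer, or at least open a new command prompt so that the change is picked up.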

Testing the installation
In cmd, run: wkhtmltopdf http://www.baidu.com/ D:\website1.pdf

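When converting a local (temporary) HTML file, the CSS has to be reachable from wherever that file lives: either put the stylesheet next to the temporary file, or reference it with a correct relative path. The simplest option is to write the styles directly into the HTML page being converted. If styles are missing from the PDF, the stylesheet path is wrong; check it again.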

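Solving pagination problems
wkhtmltopdf works well, but it is not perfect. What if an HTML page is very long and the page breaks need to fall in specific places? The wkhtmltopdf developers did take this into account:
add page-break-inside:avoid; to the style of the div in question, so that a page break is not placed inside that div.
For example: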

div{ width:800px; min-height:1362px;margin:auto;page-break-inside:avoid;}

Note: do not use thead in HTML tables; with it, the table header appears twice after a page break.

Solving garbled Chinese text

Copy a Windows font, for example c:\WINDOWS\Fonts\simsun.TTF, msyh.TTF, or msyhbd.TTF, into /usr/share/fonts on the Linux system. Remember to change the extension to .TTC.
For example: copy the Windows font file c:\WINDOWS\Fonts\simsun.ttf to /usr/share/fonts/SIMSUN.TTC

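In a PHP environment, run it as a system command:

exec("wkhtmltopdf 'http://www.hywtest.com/xt/bill/MTIzNDU2Nzg5' '/data/users/MTIzNDU2Nzg5/downloadfile//20170607043855515.pdf'");

Calling it from Java (the arguments are passed as an array, because Runtime.exec does not interpret shell quotes):

Runtime.getRuntime().exec(new String[] {"wkhtmltopdf", "http://www.hywtest.com/xt/bill/MTIzNDU2Nzg5", "/data/users/MTIzNDU2Nzg5/downloadfile//20170607043855515.pdf"});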
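
For anything beyond a one-off call, a ProcessBuilder wrapper is usually easier to control than Runtime.exec. The following is only a minimal sketch: the class name WkhtmltopdfRunner, its method names, and the timeout parameter are made up for illustration, and it assumes wkhtmltopdf is on the PATH.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of wkhtmltopdf or of the original article.
final class WkhtmltopdfRunner {

    /** Builds the argument vector. Each element reaches wkhtmltopdf as one argument, so no quoting is needed. */
    static List<String> buildCommand(String source, String outputPdf) {
        List<String> cmd = new ArrayList<>();
        cmd.add("wkhtmltopdf"); // assumed to be on the PATH
        cmd.add(source);
        cmd.add(outputPdf);
        return cmd;
    }

    /** Runs the conversion and returns the process exit code; fails if the timeout is exceeded. */
    static int convert(String source, String outputPdf, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(source, outputPdf))
                .inheritIO() // forward wkhtmltopdf's output so a full pipe cannot stall the child
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("wkhtmltopdf did not finish within " + timeoutSeconds + " s");
        }
        return process.exitValue();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: java WkhtmltopdfRunner.java <url-or-file> <output.pdf>");
            return;
        }
        int exitCode = convert(args[0], args[1], 60);
        System.out.println("wkhtmltopdf exited with code " + exitCode);
    }
}
```

Checking the exit code and enforcing a timeout matters on a server, where a page that never finishes loading would otherwise leave the calling thread waiting indefinitely.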

Appendix: command-line options explained

linux: wkhtmltopdf [OPTIONS]... <input file> [More input files] <output file>
windows: wkhtmltopdf.exe [OPTIONS]... <input file> [More input files] <output file>
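General options
--allow Allow the file or files from the specified folder to be loaded (repeatable)
--book* Set the options one would usually set when printing a book
--collate Collate when printing multiple copies
--cookie Set an additional cookie (repeatable)
--cookie-jar Read and write cookies from and to the supplied cookie jar file
--copies Number of copies to print into the PDF file (default 1)
--cover* Use an HTML document as the cover. It is inserted before the TOC, without headers and footers
--custom-header Set an additional HTTP header (repeatable)
--debug-javascript Show JavaScript debugging output
--default-header* Add a default header, with the page name on the left and the page number on the right, for example: --header-left '[webpage]' --header-right '[page]/[toPage]' --header-line
--disable-external-links* Do not make links to remote web pages
--disable-internal-links* Do not make local links
--disable-javascript Do not allow web pages to run JavaScript
--disable-pdf-compression* Do not use lossless compression on PDF objects
--disable-smart-shrinking* Disable WebKit's intelligent shrinking strategy, which makes the pixel/DPI ratio non-constant
--disallow-local-file-access Do not allow a converted local file to read other local files, unless explicitly allowed with --allow
--dpi Change the DPI explicitly (this has no effect on X11-based systems)
--enable-plugins Enable installed plugins (such as Flash)
--encoding Set the default text encoding
--extended-help Display more extensive help, detailing less common command switches
--forms* Turn HTML form fields into PDF form fields
--grayscale Generate the PDF in grayscale
--help Display help
--htmldoc Output the program's HTML help
--ignore-load-errors Ignore pages that claim to have encountered an error during loading
--lowquality Generate a lower-quality PDF/PS. Useful for shrinking the resulting document
--manpage Output the program's man page
--margin-bottom Set the bottom page margin (default 10mm)
--margin-left Set the left page margin (default 10mm)
--margin-right Set the right page margin (default 10mm)
--margin-top Set the top page margin (default 10mm)
--minimum-font-size Minimum font size (default 5)
--no-background Do not print the background
--orientation Set the orientation to Landscape or Portrait
--page-height Page height (default unit millimeter)
--page-offset* Set the starting page number (default 1)
--page-size Set the paper size: A4, Letter, etc.
--page-width Page width (default unit millimeter)
--password HTTP authentication password
--post Add an additional POST field (repeatable)
--post-file Post an additional file (repeatable)
--print-media-type* Use the print media type instead of screen
--proxy Use a proxy
--quiet Be less verbose
--read-args-from-stdin Read command-line arguments from stdin
--readme Output the program's readme
--redirect-delay Wait some milliseconds for JS redirects (default 200)
--replace* Replace [name] with value in the header and footer (repeatable)
--stop-slow-scripts Stop slow-running JavaScripts
--title Title of the generated PDF file (the title of the first document is used if not specified)
--toc* Insert a table of contents at the beginning of the document
--use-xserver* Use the X server (some plugins and other things might not work without X11)
--user-style-sheet Specify a user style sheet to load with every page
--username HTTP authentication username
--version Output version information and exit
--zoom Use this zoom factor (default 1)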
Header and footer options
--header-center* Centered header text
--header-font-name* Header font name (default Arial)
--header-font-size* Header font size
--header-html* Add an HTML header (followed by its URL)
--header-left* Left-aligned header text
--header-line* Display a line below the header
--header-right* Right-aligned header text
--header-spacing* Spacing between the header and the content (default 0)
--footer-center* Centered footer text
--footer-font-name* Footer font name
--footer-font-size* Footer font size (default 11)
--footer-html* Add an HTML footer (followed by its URL)
--footer-left* Left-aligned footer text
--footer-line* Display a line above the footer
--footer-right* Right-aligned footer text
--footer-spacing* Spacing between the footer and the content
./wkhtmltopdf --footer-right '[page]/[topage]' http://www.baidu.com baidu.pdf
./wkhtmltopdf --header-center '报表' --header-line --margin-top 2cm http://192.168.212.139/oma/ oma.pdf
Table of contents options
--toc-depth* Set the depth of the toc (default 3)
--toc-disable-back-links* Do not link from section header to toc
--toc-disable-links* Do not link from toc to sections
--toc-font-name* Set the font used for the toc (default Arial)
--toc-header-font-name* The font of the toc header (if unset use --toc-font-name)
--toc-header-font-size* The font size of the toc header (default 15)
--toc-header-text* The header text of the toc (default Table Of Contents)
--toc-l1-font-size* Set the font size on level 1 of the toc (default 12)
--toc-l1-indentation* Set indentation on level 1 of the toc (default 0)
--toc-l2-font-size* Set the font size on level 2 of the toc (default 10)
--toc-l2-indentation* Set indentation on level 2 of the toc (default 20)
--toc-l3-font-size* Set the font size on level 3 of the toc (default 8)
--toc-l3-indentation* Set indentation on level 3 of the toc (default 40)
--toc-l4-font-size* Set the font size on level 4 of the toc (default 6)
--toc-l4-indentation* Set indentation on level 4 of the toc (default 60)
--toc-l5-font-size* Set the font size on level 5 of the toc (default 4)
--toc-l5-indentation* Set indentation on level 5 of the toc (default 80)
--toc-l6-font-size* Set the font size on level 6 of the toc (default 2)
--toc-l6-indentation* Set indentation on level 6 of the toc (default 100)
--toc-l7-font-size* Set the font size on level 7 of the toc (default 0)
--toc-l7-indentation* Set indentation on level 7 of the toc (default 120)
--toc-no-dots* Do not use dots in the toc
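Outline options
--dump-outline Dump the outline to a file
--outline Put an outline into the PDF (built from the h1, h2, ... headings of the document)
--outline-depth Set the depth of the outline (default 4)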
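Placeholders in headers and footers
* [page] Replaced by the number of the page currently being printed
* [frompage] Replaced by the number of the first page to be printed
* [topage] Replaced by the number of the last page to be printed
* [webpage] Replaced by the URL of the page being printed
* [section] Replaced by the name of the current section
* [subsection] Replaced by the name of the current subsection
* [date] Replaced by the current date in the system's local format
* [time] Replaced by the current time in the system's local format
./wkhtmltopdf --footer-right '[page]/[topage]' http://www.baidu.com baidu.pdf
./wkhtmltopdf --header-center '报表' --outline --header-line --margin-top 2cm http://www.hao123.com/ hao123.pdf
./wkhtmltopdf --header-left '[webpage]' --footer-center '测试([page]/[toPage])' http://www.baidu.com baidu.pdf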
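
When these commands are assembled in code rather than typed in a shell, the placeholders can be passed through unquoted, since no shell is involved. The sketch below only builds the argument list for a header/footer combination like the ones above; the class HeaderFooterArgs and its parameters are hypothetical names chosen for this illustration, and only the option names come from the list in this appendix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; only the option names are taken from the appendix.
final class HeaderFooterArgs {

    /**
     * Builds a wkhtmltopdf argument list with an optional centered header
     * and an optional right-aligned footer. Pass null to leave one out.
     */
    static List<String> build(String headerCenter, String footerRight,
                              String url, String outputPdf) {
        List<String> cmd = new ArrayList<>();
        cmd.add("wkhtmltopdf");
        if (headerCenter != null) {
            cmd.add("--header-center");
            cmd.add(headerCenter);
            cmd.add("--header-line");   // rule below the header
            cmd.add("--margin-top");    // leave room for the header
            cmd.add("2cm");
        }
        if (footerRight != null) {
            cmd.add("--footer-right");
            cmd.add(footerRight);       // e.g. "[page]/[topage]", expanded by wkhtmltopdf itself
        }
        cmd.add(url);
        cmd.add(outputPdf);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("Report", "[page]/[topage]",
                "http://www.baidu.com", "baidu.pdf");
        // Print the command for inspection; hand the list to ProcessBuilder to run it.
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list can be passed directly to new ProcessBuilder(cmd); the square brackets need no escaping because each list element is delivered to the process as a single argument.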

The wkhtmltopdf package contains two tools: wkhtmltopdf and wkhtmltoimage.

Given a URL, they save the web page as a PDF document or as an image, respectively.

Command:

D:\Tools\wkhtmltopdf>wkhtmltoimage http://www.oschina.net/ oschina.jpg

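Timestamped digital signatures for PDF

A timestamp is a legally valid electronic credential issued by a Time Stamp Authority (TSA). It corresponds uniquely to one piece of electronic data and contains the data's "fingerprint" (its hash), the time of creation, and information about the issuing TSA.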
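
The "fingerprint" is simply a cryptographic hash of the data. As a minimal sketch of that one step, the class below (DocumentFingerprint, a name invented here) hashes a byte array with the JDK's MessageDigest. SHA-256 is an assumption made for this example; the article's own listing that follows uses SHA1.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of the article's signing code.
final class DocumentFingerprint {

    /** Returns the SHA-256 digest of the data as a lowercase hex string. */
    static String sha256Hex(byte[] data) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
        byte[] hash = digest.digest(data);
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            hex.append(String.format("%02x", b & 0xff)); // mask to treat the byte as unsigned
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        // prints ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
        System.out.println(sha256Hex(data));
    }
}
```

Only this hash, not the document itself, needs to be sent to the TSA; changing a single byte of the document yields a different hash, which is what ties the timestamp to exactly one version of the data.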

import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.*;

import java.io.*;
import java.security.MessageDigest;
import java.security.SignatureException;
import java.security.cert.CertificateParsingException;
import java.security.cert.X509Certificate;
import java.util.Calendar;
import java.util.HashMap;

/**
 * Created by zhangzhenhua on 2016/11/1.
 */
public class PDFSigner {

    //tsa

    private SignerKeystore signerKeystore;
    private TSAClient tsaClient;

    private PDFSigner(){}

    /**
     *
     * @param tsa_url   TSA server URL
     * @param tsa_accnt TSA account name
     * @param tsa_passw TSA password
     * @param cert_path path to the certificate (keystore) file
     * @param cert_passw    certificate password
     */
    public PDFSigner(String tsa_url,String tsa_accnt,String tsa_passw,String cert_path,String cert_passw)  {

        tsaClient = new TSAClientBouncyCastle(tsa_url, tsa_accnt, tsa_passw);
        try {
            signerKeystore =  new SignerKeystorePKCS12(new FileInputStream(cert_path), cert_passw);
        } catch (Exception e) {
            e.printStackTrace();
        }

    }


    /**
     * Signs a PDF and embeds a TSA timestamp
     * @param infilePath    path of the unsigned file
     * @param outfilePath   path of the signed file
     * @throws Exception
     */
    public void signPDF(String infilePath,String outfilePath) throws Exception {
        PdfReader reader = new PdfReader(infilePath);
        FileOutputStream fout = new FileOutputStream(outfilePath);
        PdfStamper stp = PdfStamper.createSignature(reader, fout, '\0');
        PdfSignatureAppearance sap = stp.getSignatureAppearance();

        sap.setCrypto(null,  this.signerKeystore.getChain(), null, PdfSignatureAppearance.SELF_SIGNED);

        sap.setVisibleSignature(new Rectangle(100, 100, 300, 200), 1, "Signature");

        PdfSignature dic = new PdfSignature(PdfName.ADOBE_PPKLITE, new PdfName("adbe.pkcs7.detached"));
        dic.setReason(sap.getReason());
        dic.setLocation(sap.getLocation());
        dic.setContact(sap.getContact());
        dic.setDate(new PdfDate(sap.getSignDate()));
        sap.setCryptoDictionary(dic);

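        int contentEstimated = 15000;
        // reserve space for the hex-encoded signature: two hex digits per byte plus the two delimiters
        HashMap<PdfName, Integer> exc = new HashMap<PdfName, Integer>();
        exc.put(PdfName.CONTENTS, Integer.valueOf(contentEstimated * 2 + 2));
        sap.preClose(exc);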

        PdfPKCS7 sgn = new PdfPKCS7(this.signerKeystore.getPrivateKey(),  this.signerKeystore.getChain(), null, "SHA1", null, false);
        InputStream data = sap.getRangeStream();
        MessageDigest messageDigest = MessageDigest.getInstance("SHA1");
        byte buf[] = new byte[8192];
        int n;
        while ((n = data.read(buf)) > 0) {
            messageDigest.update(buf, 0, n);
        }
        byte hash[] = messageDigest.digest();
        Calendar cal = Calendar.getInstance();
        byte[] ocsp = null;
        if ( this.signerKeystore.getChain().length >= 2) {
            String url = PdfPKCS7.getOCSPURL((X509Certificate) this.signerKeystore.getChain()[0]);
            if (url != null && url.length() > 0)
                ocsp = new OcspClientBouncyCastle((X509Certificate) this.signerKeystore.getChain()[0], (X509Certificate) this.signerKeystore.getChain()[1], url).getEncoded();
        }
        byte sh[] = sgn.getAuthenticatedAttributeBytes(hash, cal, ocsp);
        sgn.update(sh, 0, sh.length);

        byte[] encodedSig = sgn.getEncodedPKCS7(hash, cal, this.tsaClient, ocsp);

        // paddedSig holds exactly contentEstimated bytes, so the signature must fit within that
        if (encodedSig.length > contentEstimated)
            throw new Exception("Not enough space");

        byte[] paddedSig = new byte[contentEstimated];
        System.arraycopy(encodedSig, 0, paddedSig, 0, encodedSig.length);

        PdfDictionary dic2 = new PdfDictionary();
        dic2.put(PdfName.CONTENTS, new PdfString(paddedSig).setHexWriting(true));
        sap.close(dic2);
    }


    public static void main(String[] args) {

        //test
        String TSA_URL    = "http://tsa.safelayer.com:8093";
        String TSA_ACCNT  = "";
        String TSA_PASSW  = "";
        String IN_FILE = "E:\\项目\\paperless\\lipsum.pdf";
        String OUT_FILE = "E:\\项目\\paperless\\test_signed.pdf";

        String CERT_PATH  = "E:\\项目\\paperless\\bfnsh.pfx";

        String CERT_PASSW = "123456";
        PDFSigner signer = new PDFSigner(TSA_URL,TSA_ACCNT,TSA_PASSW,CERT_PATH,CERT_PASSW);
        try {
            signer.signPDF(IN_FILE,OUT_FILE);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}

import java.security.PrivateKey;
import java.security.Provider;
import java.security.cert.Certificate;

/**
 * Created by zhangzhenhua on 2016/10/28.
 */
public interface SignerKeystore {

    public PrivateKey getPrivateKey() ;

    public Certificate[] getChain() ;

    public Provider getProvider();

}



/**
 * Created by hmt on 2016/10/28.
 */

import java.io.InputStream;
import java.security.KeyStore;
import java.security.PrivateKey;
import java.security.Provider;
import java.security.Security;
import java.security.cert.Certificate;

/**
 * SignerKeystore implementation using PKCS#12 file (.pfx etc)
 */
public class SignerKeystorePKCS12 implements SignerKeystore {

    private static Provider prov = null;
    private KeyStore ks;
    private String alias;
    private String pwd;

    private PrivateKey key;
    private Certificate[] chain;

    public SignerKeystorePKCS12(InputStream inp, String passw) throws Exception {
        // This should be done once only for the provider...
        if (prov == null) {
            prov = new org.bouncycastle.jce.provider.BouncyCastleProvider();
            Security.addProvider(prov);
        }

        this.ks = KeyStore.getInstance("pkcs12", prov);
        this.pwd = passw;
        this.ks.load(inp, pwd.toCharArray());
        this.alias = (String)ks.aliases().nextElement();
        this.key   = (PrivateKey)ks.getKey(alias, pwd.toCharArray());
        this.chain = ks.getCertificateChain(alias);
    }

    public PrivateKey getPrivateKey() {
        return key;
    }

    public Certificate[] getChain() {
        return chain;
    }

    public Provider getProvider() {
        return ks.getProvider();
    }
}

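Applying a cross-page seal (骑缝章) to a PDF

Cut the seal image into as many vertical slices as the document has pages and put one slice on the edge of each page, so that the slices combine into the complete seal when the pages are lined up.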

import com.itextpdf.text.BadElementException;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Image;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.*;

/**
 * Applies a cross-page seal
 * Created by zhangzhenhua on 2016/11/2.
 */
public class PDFStamperCheckMark {

    /**
     * Slices the image
     * @param imgPath  path of the original image
     * @param n number of slices
     * @return  the slices as an iText Image[]
     * @throws IOException
     * @throws BadElementException
     */
    public static Image[] subImages(String imgPath,int n) throws IOException, BadElementException {
        Image[] nImage = new Image[n];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BufferedImage img = ImageIO.read(new File(imgPath));
        int h = img.getHeight();
        int w = img.getWidth();

        int sw = w/n;
        for(int i=0;i<n;i++){
            BufferedImage subImg;
            if(i==n-1){// the last slice takes whatever width remains
                 subImg = img.getSubimage(i * sw, 0, w-i*sw, h);
            }else {// the first n-1 slices have equal width
                 subImg = img.getSubimage(i * sw, 0, sw, h);
            }

            ImageIO.write(subImg,imgPath.substring(imgPath.lastIndexOf('.')+1),out);
            nImage[i] = Image.getInstance(out.toByteArray());
            out.flush();
            out.reset();
        }
        return nImage;
    }

    /**
     *  Applies the cross-page seal
     *
     * @param infilePath    path of the source PDF
     * @param outFilePath    path of the output PDF
     * @param picPath    path of the seal image
     * @throws IOException
     * @throws DocumentException
     */
    public static void stamperCheckMarkPDF(String infilePath,String outFilePath,String picPath) throws IOException, DocumentException {
        PdfReader reader = new PdfReader(infilePath);// the PDF to be stamped
        PdfStamper stamp = new PdfStamper(reader, new FileOutputStream(outFilePath));// the PDF with the seal applied

        Rectangle pageSize = reader.getPageSize(1);// size of the first page
        float height = pageSize.getHeight();
        float width  = pageSize.getWidth();

        int nums = reader.getNumberOfPages();
        Image[] nImage =  subImages(picPath,nums);// slice the seal image, one slice per page


        for(int n=1;n<=nums;n++){
            PdfContentByte over = stamp.getOverContent(n);// the page on which to draw the slice
            Image img = nImage[n-1];// pick the slice for this page
//            img.setAlignment(1);
//            img.scaleAbsolute(200,200);// controls the image size
            img.setAbsolutePosition(width-img.getWidth(),height/2-img.getHeight()/2);// right edge, vertically centered
            over.addImage(img);
        }
            over.addImage(img);
        }
        stamp.close();
    }



    public static void main(String[] args) throws IOException, DocumentException {
        String infilePath = "E:\\项目\\paperless\\page.pdf";
        String outFilePath = "E:\\项目\\paperless\\page_pic.pdf";
        String picPath = "E:\\项目\\paperless\\公章.png";
        stamperCheckMarkPDF(infilePath,outFilePath,picPath);
    }
}
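
The slicing arithmetic in subImages can be checked on its own, without iText or any image file. This is a small sketch under that assumption; SealSlices and sliceWidths are names invented for the example, and the argument check is an addition that the listing above does not have.

```java
import java.util.Arrays;

// Hypothetical stand-alone check of the slicing arithmetic; not part of the original class.
final class SealSlices {

    /**
     * Splits an image width into one slice per page: the first pages-1 slices
     * get imageWidth / pages pixels each, the last slice gets the remainder.
     */
    static int[] sliceWidths(int imageWidth, int pages) {
        if (pages <= 0 || imageWidth < pages) {
            // With fewer pixels than pages the base width would be 0,
            // and a zero-width sub-image cannot be created.
            throw new IllegalArgumentException(
                    "need at least one pixel per page: width=" + imageWidth + ", pages=" + pages);
        }
        int base = imageWidth / pages;
        int[] widths = new int[pages];
        Arrays.fill(widths, base);
        widths[pages - 1] = imageWidth - base * (pages - 1);
        return widths;
    }

    public static void main(String[] args) {
        // prints [33, 33, 34]
        System.out.println(Arrays.toString(sliceWidths(100, 3)));
    }
}
```

Because the last slice absorbs the rounding remainder, the widths always add up to the full image width, so no column of the seal is lost. For documents with many pages each slice becomes very thin, which is worth keeping in mind when choosing the seal image size.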
