Converting Office files to PDF with docx4j
Posted by 福州-司马懿
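Purpose
docx4j can be used to convert docx, pptx, and xlsx files to PDF. The code in this post handles the docx case through WordprocessingMLPackage.
Adding the dependencies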
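First, add the required dependencies to pom.xml. Note that the two artifacts below carry different version numbers (6.1.2 and 11.2.9). docx4j artifacts are normally used at matching versions, so check this against your own build.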
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j</artifactId>
    <version>6.1.2</version>
</dependency>
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-export-fo</artifactId>
    <version>11.2.9</version>
</dependency>
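Converting to PDF
Either of the two code samples below does the job. They are alternative implementations of the same method.
Code 1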
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.docx4j.Docx4J;
import org.docx4j.convert.out.FOSettings;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

private static boolean docx4jConvert(String src, String dst) {
    try {
        // Load the source document
        WordprocessingMLPackage mlPackage = WordprocessingMLPackage.load(new File(src));
        FOSettings foSettings = Docx4J.createFOSettings();
        foSettings.setWmlPackage(mlPackage);
        // try-with-resources closes the stream even if the conversion fails
        try (OutputStream os = new FileOutputStream(dst)) {
            // Render via XSL-FO and write the result to the output stream
            Docx4J.toFO(foSettings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);
        }
        return true;
    } catch (Exception e) {
        // Covers FileNotFoundException, Docx4JException and everything else
        e.printStackTrace();
        return false;
    }
}
Code 2
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.docx4j.Docx4J;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

private static boolean docx4jConvert(String src, String dst) {
    try {
        // Load the source document
        WordprocessingMLPackage mlPackage = WordprocessingMLPackage.load(new File(src));
        // try-with-resources closes the stream even if the conversion fails
        try (OutputStream os = new FileOutputStream(dst)) {
            // Convenience call that converts the package straight to PDF
            Docx4J.toPDF(mlPackage, os);
        }
        return true;
    } catch (Exception e) {
        // Covers FileNotFoundException, Docx4JException and everything else
        e.printStackTrace();
        return false;
    }
}
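Fixing garbled text
Running the code above completes without errors and writes a PDF file, but every piece of Chinese text in the PDF comes out garbled. The cause is that the converter could not find a matching font.
The Chinese fonts therefore have to be mapped by hand. If a font is not in the system font library, load it with the addPhysicalFonts method.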
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFont;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

/**
 * Sets up the font mapping to fix garbled Chinese text.
 */
private static void setFontMapper(WordprocessingMLPackage mlPackage) throws Exception {
    Mapper fontMapper = new IdentityPlusMapper();
    // Load a local font file. A bare path such as "/fonts/SIMSUN.TTC" is not a valid URL,
    // so build the URL from a java.io.File instead:
    // PhysicalFonts.addPhysicalFonts("SimSun", new File("/fonts/SIMSUN.TTC").toURI().toURL());
    fontMapper.put("隶书", PhysicalFonts.get("LiSu"));
    fontMapper.put("宋体", PhysicalFonts.get("SimSun"));
    fontMapper.put("微软雅黑", PhysicalFonts.get("Microsoft Yahei"));
    fontMapper.put("黑体", PhysicalFonts.get("SimHei"));
    fontMapper.put("楷体", PhysicalFonts.get("KaiTi"));
    fontMapper.put("新宋体", PhysicalFonts.get("NSimSun"));
    fontMapper.put("华文行楷", PhysicalFonts.get("STXingkai"));
    fontMapper.put("华文仿宋", PhysicalFonts.get("STFangsong"));
    fontMapper.put("宋体扩展", PhysicalFonts.get("simsun-extB"));
    fontMapper.put("仿宋", PhysicalFonts.get("FangSong"));
    fontMapper.put("仿宋_GB2312", PhysicalFonts.get("FangSong_GB2312"));
    fontMapper.put("幼圆", PhysicalFonts.get("YouYuan"));
    fontMapper.put("华文宋体", PhysicalFonts.get("STSong"));
    fontMapper.put("华文中宋", PhysicalFonts.get("STZhongsong"));
    // 等线 (DengXian) and 新細明體 (PMingLiU) fall back to SimSun
    fontMapper.put("等线", PhysicalFonts.get("SimSun"));
    fontMapper.put("等线 Light", PhysicalFonts.get("SimSun"));
    fontMapper.put("新細明體", PhysicalFonts.get("SimSun"));
    fontMapper.put("华文琥珀", PhysicalFonts.get("STHupo"));
    fontMapper.put("华文隶书", PhysicalFonts.get("STLiti"));
    fontMapper.put("华文新魏", PhysicalFonts.get("STXinwei"));
    fontMapper.put("华文彩云", PhysicalFonts.get("STCaiyun"));
    fontMapper.put("方正姚体", PhysicalFonts.get("FZYaoti"));
    fontMapper.put("方正舒体", PhysicalFonts.get("FZShuTi"));
    fontMapper.put("华文细黑", PhysicalFonts.get("STXihei"));
    // Fix garbled text for 宋体 (Body) and 宋体 (Headings)
    PhysicalFonts.put("PMingLiU", PhysicalFonts.get("SimSun"));
    PhysicalFonts.put("新細明體", PhysicalFonts.get("SimSun"));
    // 宋体 and 新宋体
    PhysicalFont simsunFont = PhysicalFonts.get("SimSun");
    fontMapper.put("SimSun", simsunFont);
    mlPackage.setFontMapper(fontMapper);
}
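Call setFontMapper(mlPackage) right after loading the package and before the Docx4J.toFO or Docx4J.toPDF call. After that, the Chinese text displays correctly.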
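When a font is not installed system-wide, addPhysicalFonts needs a URL that points at the font file, and a bare path string is not a valid URL. The following is a minimal sketch of one way to locate a font file and turn it into a well-formed file: URL using only the JDK. FontLocator and findFontUrl are hypothetical names introduced for illustration and are not part of docx4j. The directories to search are an assumption you would supply yourself.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of docx4j.
final class FontLocator {

    private FontLocator() {}

    /**
     * Returns a file: URL for the first regular file named fileName found in
     * the given directories (searched in order), or Optional.empty() if no
     * directory contains it.
     */
    static Optional<URL> findFontUrl(List<Path> directories, String fileName) {
        for (Path dir : directories) {
            Path candidate = dir.resolve(fileName);
            if (Files.isRegularFile(candidate)) {
                try {
                    // Path -> URI -> URL yields a URL with the "file" protocol
                    return Optional.of(candidate.toUri().toURL());
                } catch (MalformedURLException e) {
                    // Not expected for file URIs; fall through to the next directory
                }
            }
        }
        return Optional.empty();
    }
}
```

If a URL is found, it could be passed to PhysicalFonts.addPhysicalFonts("SimSun", url) at the top of setFontMapper, before the PhysicalFonts.get calls.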
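Conversion limits
docx4j can only convert docx, pptx, and xlsx files. If you feed it other file types such as txt or doc, you get an error like the following:
ClassNotFoundException
javax.xml.bind.JAXBException: Implementation of JAXB-API has not been found on module path or classpath.
This particular JAXB message is raised when no JAXB implementation is on the classpath, which is common on Java 11 and later because JAXB is no longer bundled with the JDK. If it appears, it is worth checking the dependencies as well as the input file type.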
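One way to keep unsupported files away from the converter is to check the file extension before calling it. This is a minimal sketch under that assumption. OfficeFileTypes and isSupported are hypothetical names and are not part of docx4j. The check looks only at the file name, so it does not verify that the content really is a valid Office document.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of docx4j.
final class OfficeFileTypes {

    // The three extensions this article lists as convertible
    private static final Set<String> SUPPORTED =
            new HashSet<>(Arrays.asList("docx", "pptx", "xlsx"));

    private OfficeFileTypes() {}

    /** Returns true if fileName ends in .docx, .pptx or .xlsx (case-insensitive). */
    static boolean isSupported(String fileName) {
        if (fileName == null) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        // No dot, or a trailing dot, means there is no extension to check
        if (dot < 0 || dot == fileName.length() - 1) {
            return false;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return SUPPORTED.contains(extension);
    }
}
```

A caller could return false from docx4jConvert early when OfficeFileTypes.isSupported(src) is false, instead of letting the load call fail.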