Merging multiple PDFs using iTextSharp in c#.net

Asked: 2011-08-27 02:43:52

Question:

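I am trying to merge multiple PDFs into one.

The code compiles without errors. I first tried to merge the Word documents instead, but that went wrong because I am working with tables.

This is the ASP.NET code-behind: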

if (Button.Equals("PreviewWord"))
{
    String eventTemplate = Server.MapPath("/ERAS/Badges/Template/EventTemp" + EventName + ".doc");

    String SinglePreview = Server.MapPath("/ERAS/Badges/Template/PreviewSingle" + EventName + ".doc");

    String PDFPreview = Server.MapPath("/ERAS/Badges/Template/PDFPreviewSingle" + EventName + ".pdf");

    String previewPDFs = Server.MapPath("/ERAS/Badges/Template/PreviewPDFs" + EventName + ".pdf");

    if (System.IO.File.Exists((String)eventTemplate))
    {
        if (vulGegevensIn == true)
        {
            //This creates a Word document and fills in names etc from database
            CreateWordDocument(vulGegevensIn, eventTemplate, SinglePreview, false);
            //This saves the SinglePreview.doc as a PDF @param place of PDFPreview
            CreatePDF(SinglePreview, PDFPreview);

            //Trying to merge
            String[] previewsSmall = new String[1];
            previewsSmall[0] = PDFPreview;
            PDFMergenITextSharp.MergeFiles(previewPDFs, previewsSmall);
        }

        // merge PDFs here...........................;
        //here
        //no here//
        //...
    }
}

This is the PDFMergenITextSharp class:

using System;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PDFMergenITextSharp
{
    public static void MergeFiles(string destinationFile, string[] sourceFiles)
    {
        try
        {
            int f = 0;
            // we create a reader for a certain document
            PdfReader reader = new PdfReader(sourceFiles[f]);
            // we retrieve the total number of pages
            int n = reader.NumberOfPages;
            //Console.WriteLine("There are " + n + " pages in the original file.");
            // step 1: creation of a document-object
            Document document = new Document(reader.GetPageSizeWithRotation(1));
            // step 2: we create a writer that listens to the document
            PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(destinationFile, FileMode.Create));
            // step 3: we open the document
            document.Open();
            PdfContentByte cb = writer.DirectContent;
            PdfImportedPage page;
            int rotation;
            // step 4: we add content
            while (f < sourceFiles.Length)
            {
                int i = 0;
                while (i < n)
                {
                    i++;
                    document.SetPageSize(reader.GetPageSizeWithRotation(i));
                    document.NewPage();
                    page = writer.GetImportedPage(reader, i);
                    rotation = reader.GetPageRotation(i);
                    if (rotation == 90 || rotation == 270)
                    {
                        cb.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(i).Height);
                    }
                    else
                    {
                        cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
                    }
                    //Console.WriteLine("Processed page " + i);
                }
                f++;
                if (f < sourceFiles.Length)
                {
                    reader = new PdfReader(sourceFiles[f]);
                    // we retrieve the total number of pages
                    n = reader.NumberOfPages;
                    //Console.WriteLine("There are " + n + " pages in the original file.");
                }
            }
            // step 5: we close the document
            document.Close();
        }
        catch (Exception e)
        {
            string strOb = e.Message;
        }
    }

    public static int CountPageNo(string strFileName)
    {
        // we create a reader for a certain document
        PdfReader reader = new PdfReader(strFileName);
        // we retrieve the total number of pages
        return reader.NumberOfPages;
    }
}

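Comments:

Use PdfCopy instead of PdfWriter. There should be several examples and related questions.

@Liquid -- CreatePDF(SinglePreview, PDFPreview); Could you share how you created the PDF from the .doc? Some details on how you did the conversion would be very helpful. I hope you are using iTextSharp for the doc-to-PDF conversion.

Answer 1:

Please also visit and read this article, where I explain in detail How to Merge Multiple PDF Files Into Single PDF Using Itextsharp in C#.

Implementation: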

try
{
    string FPath = "";
    // Loop over the GridView rows to create one report per row on a single click
    for (int j = 0; j < Gridview1.Rows.Count; j++)
    {
        // Return datatable for the current row (the original indexed Rows[0] on every pass,
        // which would export the same report repeatedly)
        DataTable dtDetail = new My_GlobalClass().GetDataTable(Convert.ToInt32(Gridview1.Rows[j]["JobId"]));

        int i = Convert.ToInt32(Gridview1.Rows[j]["JobId"]);
        if (dtDetail.Rows.Count > 0)
        {
            // Create Object of ReportDocument
            ReportDocument cryRpt = new ReportDocument();
            // Store path of .rpt file
            string StrPath = Application.StartupPath + "\\RPT";
            StrPath = StrPath + "\\";
            StrPath = StrPath + "rptCodingvila_Articles_Report.rpt";
            cryRpt.Load(StrPath);
            // Assign Report Datasource
            cryRpt.SetDataSource(dtDetail);
            // Assign Reportsource to Report viewer
            CryViewer.ReportSource = cryRpt;
            CryViewer.Refresh();
            // Store path/name of pdf file one by one
            string StrPathN = Application.StartupPath + "\\Temp" + "\\Codingvila_Articles_Report" + i.ToString() + ".Pdf";
            FPath = FPath == "" ? StrPathN : FPath + "," + StrPathN;
            // Export Report in PDF
            cryRpt.ExportToDisk(CrystalDecisions.Shared.ExportFormatType.PortableDocFormat, StrPathN);
        }
    }
    if (FPath != "")
    {
        // Delete the output file if it already exists
        if (System.IO.File.Exists(Application.StartupPath + "\\Temp" + "\\Codingvila_Articles_Report.pdf"))
            System.IO.File.Delete(Application.StartupPath + "\\Temp" + "\\Codingvila_Articles_Report.pdf");
        // Split and store pdf input file
        string[] files = FPath.Split(',');
        // Merge multiple PDF files
        MargeMultiplePDF(files, Application.StartupPath + "\\Temp" + "\\Codingvila_Articles_Report.pdf");
        // Open the merged PDF output file
        Process.Start(Application.StartupPath + "\\Temp" + "\\Codingvila_Articles_Report.pdf");
        // Check and delete the input files
        foreach (string item in files)
        {
            if (System.IO.File.Exists(item.ToString()))
                System.IO.File.Delete(item.ToString());
        }
    }
}
catch (Exception ex)
{
    XtraMessageBox.Show(ex.Message, "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
}

Create the function that merges the PDFs:

public static void MargeMultiplePDF(string[] PDFfileNames, string OutputFile)
{
    iTextSharp.text.Document PDFdoc = new iTextSharp.text.Document();
    using (System.IO.FileStream MyFileStream = new System.IO.FileStream(OutputFile, System.IO.FileMode.Create))
    {
        iTextSharp.text.pdf.PdfCopy PDFwriter = new iTextSharp.text.pdf.PdfCopy(PDFdoc, MyFileStream);
        if (PDFwriter == null)
        {
            return;
        }
        PDFdoc.Open();
        foreach (string fileName in PDFfileNames)
        {
            iTextSharp.text.pdf.PdfReader PDFreader = new iTextSharp.text.pdf.PdfReader(fileName);
            PDFreader.ConsolidateNamedDestinations();
            // Copy every page of the current input into the output
            for (int i = 1; i <= PDFreader.NumberOfPages; i++)
            {
                iTextSharp.text.pdf.PdfImportedPage page = PDFwriter.GetImportedPage(PDFreader, i);
                PDFwriter.AddPage(page);
            }
            // Copy form fields, if any (CopyAcroForm is not available in every iTextSharp version)
            iTextSharp.text.pdf.PRAcroForm form = PDFreader.AcroForm;
            if (form != null)
            {
                PDFwriter.CopyAcroForm(PDFreader);
            }
            PDFreader.Close();
        }
        PDFwriter.Close();
        PDFdoc.Close();
    }
}

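Comments:

For merging with iText(Sharp) 2.x, 4.x and 5.x, a PdfCopy-based solution is usually better than a PdfWriter-based one.

Answer 2:

I found the answer:

Instead of the second method, add more files to the first array of input files.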

public static void CombineMultiplePDFs(string[] fileNames, string outFile)
{
    // step 1: creation of a document-object
    Document document = new Document();
    //create newFileStream object which will be disposed at the end
    using (FileStream newFileStream = new FileStream(outFile, FileMode.Create))
    {
        // step 2: we create a writer that listens to the document
        PdfCopy writer = new PdfCopy(document, newFileStream);

        // step 3: we open the document
        document.Open();

        foreach (string fileName in fileNames)
        {
            // we create a reader for a certain document
            PdfReader reader = new PdfReader(fileName);
            reader.ConsolidateNamedDestinations();

            // step 4: we add content
            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                PdfImportedPage page = writer.GetImportedPage(reader, i);
                writer.AddPage(page);
            }

            PRAcroForm form = reader.AcroForm;
            if (form != null)
            {
                writer.CopyAcroForm(reader);
            }

            reader.Close();
        }

        // step 5: we close the document and writer
        writer.Close();
        document.Close();
    } //disposes the newFileStream object
}

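Comments:

@liquid - Sorry, could you say which references you used for this? I get "PdfCopy does not contain a definition for CopyAcroForm".

I had to change the line PRAcroForm form = reader.AcroForm; to PrAcroForm form = reader.AcroForm; (lowercase 'r' instead of uppercase, otherwise I got an error; my edit for this part was not accepted, though...)

@misanthrop The edit was not accepted because the class name of that object has an uppercase R: "PRAcroForm". I don't know which iTextSharp version you are using, but I'm still glad this helped you.

We are using iTextSharp.LGPLv2.Core, which is an unofficial port of iTextSharp (v4.1.6), so maybe it has something to do with that... Maybe this comment helps others too =)

Answer 3:

Code for merging PDFs in Itextsharp: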

public static void Merge(List<String> InFiles, String OutFile)
{
    using (FileStream stream = new FileStream(OutFile, FileMode.Create))
    using (Document doc = new Document())
    using (PdfCopy pdf = new PdfCopy(doc, stream))
    {
        doc.Open();

        PdfReader reader = null;
        PdfImportedPage page = null;

        InFiles.ForEach(file =>
        {
            reader = new PdfReader(file);

            // iTextSharp page numbers are 1-based, hence i + 1
            for (int i = 0; i < reader.NumberOfPages; i++)
            {
                page = pdf.GetImportedPage(reader, i + 1);
                pdf.AddPage(page);
            }

            pdf.FreeReader(reader);
            reader.Close();
            // Note: this deletes each source file once it has been merged
            File.Delete(file);
        });
    }
}
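Because the method above deletes every input file as soon as it has been copied, a variant that makes deletion opt-in may be safer to reuse. The following is only a sketch, assuming iTextSharp 5.x; the class name `PdfMergeSketch` and the `deleteSources` parameter are hypothetical names introduced here and are not part of the original answer.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfMergeSketch
{
    // deleteSources is a hypothetical flag added for illustration; it defaults to keeping the inputs.
    public static void Merge(IEnumerable<string> inFiles, string outFile, bool deleteSources = false)
    {
        // Materialize the sequence once so it can be walked twice (merge, then optional delete).
        List<string> sources = new List<string>(inFiles);

        using (FileStream stream = new FileStream(outFile, FileMode.Create))
        {
            Document doc = new Document();
            PdfCopy copy = new PdfCopy(doc, stream);
            doc.Open();
            try
            {
                foreach (string file in sources)
                {
                    PdfReader reader = new PdfReader(file);
                    try
                    {
                        // Page numbers are 1-based.
                        for (int i = 1; i <= reader.NumberOfPages; i++)
                        {
                            copy.AddPage(copy.GetImportedPage(reader, i));
                        }
                        copy.FreeReader(reader);
                    }
                    finally
                    {
                        reader.Close();
                    }
                }
            }
            finally
            {
                // Closing the document finalizes the output PDF.
                doc.Close();
            }
        }

        // Delete inputs only after the merged file has been written, and only on request.
        if (deleteSources)
        {
            foreach (string file in sources)
            {
                File.Delete(file);
            }
        }
    }
}
```

Calling `PdfMergeSketch.Merge(files, "merged.pdf")` leaves the inputs in place; passing `deleteSources: true` removes them after the merged file has been written, so a failure part-way through does not leave a partial set of inputs.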
    

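Comments:

Anyone who copies and pastes your code without reading it carefully may run into trouble: the source files should not be deleted in every merge use case!

Answer 4:

I could not find this solution anywhere else. According to one person, the correct way is to use copyPagesTo(). It does work; I tested it. Your mileage may vary between city and open-road driving. Good luck.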

    public static bool MergePDFs(List<string> lststrInputFiles, string OutputFile, out int iPageCount, out string strError)
    {
        strError = string.Empty;

        PdfWriter pdfWriter = new PdfWriter(OutputFile);
        PdfDocument pdfDocumentOut = new PdfDocument(pdfWriter);

        // Copy all pages of the first input file
        PdfReader pdfReader0 = new PdfReader(lststrInputFiles[0]);
        PdfDocument pdfDocument0 = new PdfDocument(pdfReader0);
        int iFirstPdfPageCount0 = pdfDocument0.GetNumberOfPages();
        pdfDocument0.CopyPagesTo(1, iFirstPdfPageCount0, pdfDocumentOut);
        iPageCount = pdfDocumentOut.GetNumberOfPages();
        // Release the source document once its pages have been copied
        pdfDocument0.Close();

        // Append all pages of the remaining input files
        for (int ii = 1; ii < lststrInputFiles.Count; ii++)
        {
            PdfReader pdfReader1 = new PdfReader(lststrInputFiles[ii]);
            PdfDocument pdfDocument1 = new PdfDocument(pdfReader1);
            int iFirstPdfPageCount1 = pdfDocument1.GetNumberOfPages();
            iPageCount += iFirstPdfPageCount1;
            pdfDocument1.CopyPagesTo(1, iFirstPdfPageCount1, pdfDocumentOut);
            pdfDocument1.Close();
        }

        pdfDocumentOut.Close();

        return true;
    }
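iText 7 also ships a `PdfMerger` helper in the `iText.Kernel.Utils` namespace that wraps the same page copying. The sketch below shows that route under the assumption that the iText 7 kernel package is referenced; note the `iText.Kernel.Pdf` namespace, which differs from the `iTextSharp.text.pdf` namespace used by the iText 5 answers. The class and method names `PdfMergerSketch` and `MergeWithPdfMerger` are hypothetical.

```csharp
using System.Collections.Generic;
using iText.Kernel.Pdf;
using iText.Kernel.Utils;

public static class PdfMergerSketch
{
    // Merges all pages of every input file into outputFile and returns the total page count.
    public static int MergeWithPdfMerger(IEnumerable<string> inputFiles, string outputFile)
    {
        PdfDocument output = new PdfDocument(new PdfWriter(outputFile));
        PdfMerger merger = new PdfMerger(output);
        try
        {
            foreach (string inputFile in inputFiles)
            {
                PdfDocument source = new PdfDocument(new PdfReader(inputFile));
                try
                {
                    // Page numbers are 1-based; copy the whole source document.
                    merger.Merge(source, 1, source.GetNumberOfPages());
                }
                finally
                {
                    source.Close();
                }
            }
            return output.GetNumberOfPages();
        }
        finally
        {
            // Closing the target document writes the merged PDF to disk.
            output.Close();
        }
    }
}
```

The try/finally blocks are a design choice rather than a requirement of the library: they make sure each source document is released even if one input fails to open or copy.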
    

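Comments:

Your solution is for iText 7, whereas the question and the other answers focus on iText 5.

That is correct, and it is why I posted it. Almost every other answer I found was outdated, but I should have pointed out the difference. Every answer I found was for iText 5, so I thought it was a good idea to post one for the current iText 7.

Answer 5:

Merging the byte arrays of multiple PDF files: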

    public static byte[] MergePDFs(List<byte[]> pdfFiles)
    {
        if (pdfFiles.Count > 1)
        {
            PdfReader finalPdf;
            Document pdfContainer;
            PdfWriter pdfCopy;
            MemoryStream msFinalPdf = new MemoryStream();

            pdfContainer = new Document();
            pdfCopy = new PdfSmartCopy(pdfContainer, msFinalPdf);

            pdfContainer.Open();

            for (int k = 0; k < pdfFiles.Count; k++)
            {
                finalPdf = new PdfReader(pdfFiles[k]);
                // Page numbers are 1-based
                for (int i = 1; i < finalPdf.NumberOfPages + 1; i++)
                {
                    ((PdfSmartCopy)pdfCopy).AddPage(pdfCopy.GetImportedPage(finalPdf, i));
                }
                pdfCopy.FreeReader(finalPdf);
                // Close each reader as soon as its pages are copied
                // (the original only closed the last one)
                finalPdf.Close();
            }
            pdfCopy.Close();
            pdfContainer.Close();

            return msFinalPdf.ToArray();
        }
        else if (pdfFiles.Count == 1)
        {
            // Nothing to merge: return the single input unchanged
            return pdfFiles[0];
        }
        return null;
    }

Answer 6:

I found a very good solution on this site: http://weblogs.sqlteam.com/mladenp/archive/2014/01/10/simple-merging-of-pdf-documents-with-itextsharp-5-4-5.aspx

I updated the method as follows:

    public static bool MergePDFs(IEnumerable<string> fileNames, string targetPdf)
    {
        bool merged = true;
        using (FileStream stream = new FileStream(targetPdf, FileMode.Create))
        {
            Document document = new Document();
            PdfCopy pdf = new PdfCopy(document, stream);
            PdfReader reader = null;
            try
            {
                document.Open();
                foreach (string file in fileNames)
                {
                    reader = new PdfReader(file);
                    pdf.AddDocument(reader);
                    reader.Close();
                }
            }
            catch (Exception)
            {
                merged = false;
                if (reader != null)
                {
                    reader.Close();
                }
            }
            finally
            {
                if (document != null)
                {
                    document.Close();
                }
            }
        }
        return merged;
    }
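PdfSmartCopy derives from PdfCopy, so it can be dropped into the same pattern; it tries to reuse identical objects (fonts, images and so on) across the merged inputs, which may reduce the size of the output. The following is a minimal sketch, assuming an iTextSharp 5.x build that provides `AddDocument` as the method above does; `SmartCopySketch` and `MergeSmart` are hypothetical names introduced for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class SmartCopySketch
{
    // Same flow as the PdfCopy version, with PdfSmartCopy swapped in as the writer.
    public static void MergeSmart(IEnumerable<string> fileNames, string targetPdf)
    {
        using (FileStream stream = new FileStream(targetPdf, FileMode.Create))
        {
            Document document = new Document();
            PdfSmartCopy copy = new PdfSmartCopy(document, stream);
            document.Open();
            try
            {
                foreach (string file in fileNames)
                {
                    PdfReader reader = new PdfReader(file);
                    try
                    {
                        // Appends every page of the source document to the output.
                        copy.AddDocument(reader);
                    }
                    finally
                    {
                        reader.Close();
                    }
                }
            }
            finally
            {
                // Closing the document finalizes the merged PDF.
                document.Close();
            }
        }
    }
}
```

Unlike the method above, this sketch lets exceptions propagate instead of returning a bool, so the caller can see why a merge failed. Whether the size saving is worth any extra processing time depends on how much the inputs share.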
    

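Comments:

I prefer this solution because it does not involve the deprecated CopyAcroForm function, which is no longer available in the latest versions of itextsharp.

With a few documents and the latest version of itextsharp from NuGet, this worked perfectly for me :)

As far as I know, there is hardly any reason not to use PdfSmartCopy instead of PdfCopy. For me at least, the savings in PDF size are very significant.

Answer 7:

Using iTextSharp.dll: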

protected void Page_Load(object sender, EventArgs e)
{
    String[] files = @"C:\ENROLLDOCS\A1.pdf,C:\ENROLLDOCS\A2.pdf".Split(',');
    MergeFiles(@"C:\ENROLLDOCS\New1.pdf", files);
}

public void MergeFiles(string destinationFile, string[] sourceFiles)
{
    if (System.IO.File.Exists(destinationFile))
        System.IO.File.Delete(destinationFile);

    string[] sSrcFile;
    sSrcFile = new string[2];

    string[] arr = new string[2];
    for (int i = 0; i <= sourceFiles.Length - 1; i++)
    {
        if (sourceFiles[i] != null)
        {
            if (sourceFiles[i].Trim() != "")
                arr[i] = sourceFiles[i].ToString();
        }
    }

    if (arr != null)
    {
        sSrcFile = new string[2];

        for (int ic = 0; ic <= arr.Length - 1; ic++)
        {
            sSrcFile[ic] = arr[ic].ToString();
        }
    }
    try
    {
        int f = 0;

        PdfReader reader = new PdfReader(sSrcFile[f]);
        int n = reader.NumberOfPages;
        Response.Write("There are " + n + " pages in the original file.");
        Document document = new Document(PageSize.A4);

        PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(destinationFile, FileMode.Create));

        document.Open();
        PdfContentByte cb = writer.DirectContent;
        PdfImportedPage page;

        int rotation;
        while (f < sSrcFile.Length)
        {
            int i = 0;
            while (i < n)
            {
                i++;

                document.SetPageSize(PageSize.A4);
                document.NewPage();
                page = writer.GetImportedPage(reader, i);

                rotation = reader.GetPageRotation(i);
                if (rotation == 90 || rotation == 270)
                {
                    cb.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(i).Height);
                }
                else
                {
                    cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
                }
                Response.Write("\n Processed page " + i);
            }

            f++;
            if (f < sSrcFile.Length)
            {
                reader = new PdfReader(sSrcFile[f]);
                n = reader.NumberOfPages;
                Response.Write("There are " + n + " pages in the original file.");
            }
        }
        Response.Write("Success");
        document.Close();
    }
    catch (Exception e)
    {
        Response.Write(e.Message);
    }
}



Comments:

I ran into a problem when using this: my content gets cut off?
