PDF file is corrupt and cannot be repaired when moving a memory stream to a file stream
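I am using iTextSharp and VB.Net to stamp images onto PDF documents. (Since this is not language-specific, I have also tagged it C#.) I have two applications that use this process.
- The first displays the PDF document online using the bytes from the memory stream. This part works.
- The second uses the same function but saves the PDF to a file instead. This part produces an invalid PDF.
I have seen a few similar questions, but they all create a document from scratch and have a Document object in the code, and their memory streams are corrupt from the start. My code has no Document object, and my original memory stream opens fine.
Here is where I get the error. (I have to copy the buffer from m into a new memory stream because the stamper in the fillPDF function closes the stream by default unless told otherwise.)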
Dim m As MemoryStream = PDFHelper.fillPDF(filename, Nothing, markers, "")
Dim m2 As New MemoryStream(m.GetBuffer, 0, m.GetBuffer.Length)
Dim f As FileStream = New FileStream("C:\temp.pdf", FileMode.Create)
m2.CopyTo(f, m.GetBuffer.Length)
m2.Close()
f.Close()
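Here is one of the places where I use it successfully on the website. This one does not use images, although other similar places that also work do stamp images onto multiple documents and then merge them.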
Dim m As System.IO.MemoryStream = PDFHelper.fillPDF(filename, New Dictionary(Of String, String), New List(Of PDFHelper.PDfImage), "SAMPLE")
Dim data As Byte() = m.GetBuffer
Response.Clear()
' Send the file to the output stream
Response.Buffer = True
' Try to ensure the browser always opens the file and doesn't just prompt to "open/save".
Response.AddHeader("Content-Length", data.Length.ToString())
Response.AddHeader("Content-Disposition", "inline; filename=" + "Sample")
Response.AddHeader("Expires", "0")
Response.AddHeader("Pragma", "cache")
Response.AddHeader("Cache-Control", "private")
' Set the output stream to the correct content type (PDF).
Response.ContentType = "application/pdf"
Response.AddHeader("Accept-Ranges", "bytes")
' Output the file
Response.BinaryWrite(data)
' Flush the Response to send the serialized data to the client browser.
Response.Flush()
Try
    Response.End()
Catch ex As Exception
    Throw ex
End Try
Here is the function from my utility class (PDFHelper.fillPDF):
Public Shared Function fillPDF(fileToFill As String, Optional fieldValues As Dictionary(Of String, String) = Nothing, Optional images As List(Of PDfImage) = Nothing, Optional watermarkText As String = "") As MemoryStream
    Dim m As MemoryStream = New MemoryStream() ' for storing the pdf
    Dim reader As PdfReader = New PdfReader(fileToFill) ' for reading the document
    Dim outStamper As PdfStamper = New PdfStamper(reader, m) ' for filling the document
    If fieldValues IsNot Nothing Then
        For Each kvp As KeyValuePair(Of String, String) In fieldValues
            outStamper.AcroFields.SetField(kvp.Key, kvp.Value)
        Next
    End If
    If images IsNot Nothing AndAlso images.Count > 0 Then ' add all the images
        For Each PDfImage In images
            Dim img As iTextSharp.text.Image = Nothing ' image to stamp
            ' set up the image (different for different cases)
            Select Case PDfImage.ImageType
                ' removed for brevity
            End Select
            Dim overContent As PdfContentByte = outStamper.GetOverContent(PDfImage.PageNumber) ' specify page number for stamping
            overContent.AddImage(img)
        Next
    End If
    ' add the watermark
    If watermarkText <> "" Then
        Dim underContent As iTextSharp.text.pdf.PdfContentByte = Nothing
        Dim watermarkRect As iTextSharp.text.Rectangle = reader.GetPageSizeWithRotation(1)
        ' removed for brevity
    End If
    ' flatten and close out
    outStamper.FormFlattening = True
    outStamper.SetFullCompression()
    outStamper.Close()
    reader.Close()
    Return m
End Function
Answer
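Since your code is already streaming the PDF, a simple way to solve the problem is to make a small change to your fillPDF method and have it return a byte array: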
// other parameters left out for simplicity's sake
public static byte[] fillPDF(string resource) {
    PdfReader reader = new PdfReader(resource);
    using (var ms = new MemoryStream()) {
        using (PdfStamper stamper = new PdfStamper(reader, ms)) {
            // do whatever you need to do
        }
        return ms.ToArray();
    }
}
You can then both stream the byte array to the client in ASP.NET and save it to the file system:
// get the manipulated PDF
byte[] myPdf = fillPDF(inputFile);
// stream via ASP.NET
Response.BinaryWrite(myPdf);
// save to file system
File.WriteAllBytes(outputFile, myPdf);
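If you are generating the PDF from a standard ASP.NET Web Form, don't forget to call Response.End() after writing the PDF; otherwise the byte array will have HTML markup garbage appended to the end.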
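A likely explanation for why returning a byte array helps: MemoryStream.GetBuffer() returns the stream's entire internal buffer, including any unused capacity after the last byte written, whereas MemoryStream.ToArray() returns only the bytes actually written. Saving the result of GetBuffer() can therefore leave padding bytes after the end of the PDF data. The sketch below illustrates the difference using only System.IO; the class and method names (BufferDemo, Capture) are hypothetical, and the four-byte payload merely stands in for real PDF output.

```csharp
using System;
using System.IO;

public static class BufferDemo
{
    // Writes a few bytes, closes the stream (as the stamper in the question
    // is said to do), and returns both views of the data for comparison.
    public static void Capture(out byte[] exact, out byte[] raw)
    {
        var ms = new MemoryStream();
        byte[] payload = { 0x25, 0x50, 0x44, 0x46 }; // the ASCII bytes "%PDF"
        ms.Write(payload, 0, payload.Length);
        ms.Close();

        exact = ms.ToArray();  // only the bytes written; still works after Close()
        raw = ms.GetBuffer();  // whole internal buffer, unused capacity included
    }

    public static void Main()
    {
        byte[] exact, raw;
        Capture(out exact, out raw);
        Console.WriteLine("ToArray length:   " + exact.Length);
        Console.WriteLine("GetBuffer length: " + raw.Length);
    }
}
```

In the question's own terms, this suggests that writing m.ToArray() to the file (for example with File.WriteAllBytes) should avoid the extra bytes that m.GetBuffer introduces.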
Another answer
This copies an existing PDF into a MemoryStream and then saves it to disk. Perhaps you can adapt it to solve your problem?
Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
    Dim strInputFilename As String = "C:\Junk\Junk.pdf"
    Dim strOutputFilename As String = "C:\Junk\Junk2.pdf"
    Dim byt() As Byte
    Using ms As New MemoryStream
        '1. Load PDF into memory stream'
        Using bw As New BinaryWriter(ms)
            Using fsi As New FileStream(strInputFilename, FileMode.Open)
                Using br As New BinaryReader(fsi)
                    Try
                        Do
                            bw.Write(br.ReadByte())
                        Loop
                    Catch ex As EndOfStreamException
                    End Try
                End Using
            End Using
        End Using
        byt = ms.ToArray()
    End Using
    '2. Write memory copy of PDF back to disk'
    My.Computer.FileSystem.WriteAllBytes(strOutputFilename, byt, False)
    Process.Start(strOutputFilename)
End Sub
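The loop above copies one byte at a time and relies on EndOfStreamException to stop. As a possible simplification, Stream.CopyTo (available since .NET Framework 4) performs the same copy in one call. The sketch below is in C# and assumes nothing about the real PDFs: the class and method names (CopyDemo, LoadToArray) are hypothetical, and temporary files stand in for the input and output documents.

```csharp
using System;
using System.IO;

public static class CopyDemo
{
    // Reads a whole file into memory and returns exactly the bytes read.
    public static byte[] LoadToArray(string path)
    {
        using (var input = File.OpenRead(path))
        using (var ms = new MemoryStream())
        {
            input.CopyTo(ms);     // copies until the end of the input stream
            return ms.ToArray();  // only the bytes written, no spare capacity
        }
    }

    public static void Main()
    {
        // Hypothetical temp files standing in for the input and output PDFs.
        string source = Path.GetTempFileName();
        string target = Path.GetTempFileName();
        File.WriteAllBytes(source, new byte[] { 0x25, 0x50, 0x44, 0x46 });

        byte[] copy = LoadToArray(source);
        File.WriteAllBytes(target, copy);

        // The copy should be the same size as the original.
        Console.WriteLine(new FileInfo(target).Length == new FileInfo(source).Length);

        File.Delete(source);
        File.Delete(target);
    }
}
```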