File corrupted when I post it to the servlet using GZIPOutputStream
【Posted】: 2013-09-17 18:29:59
【Question】: I tried to modify @BalusC's excellent tutorial here to send gzip-compressed files. Here is a working Java class:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.util.zip.GZIPOutputStream;
public final class NetworkService {

    // *** EDIT THOSE AS APPROPRIATE
    private static final String FILENAME = "C:/Dropbox/TMP.txt";
    private static final String URL =
            "http://192.168.1.64:8080/DataCollectionServlet/";
    // *** END EDIT
    private static final CharSequence CRLF = "\r\n";
    private static boolean isServerGzip = true; // ***
    private static String charsetForMultipartHeaders = "UTF-8";

    public static void main(String[] args) {
        HttpURLConnection connection = null;
        OutputStream serverOutputStream = null;
        try {
            File file = new File(FILENAME);
            final String boundary = Long
                    .toHexString(System.currentTimeMillis());
            connection = connection(true, boundary);
            serverOutputStream = connection.getOutputStream();
            try {
                flushMultiPartData(file, serverOutputStream, boundary);
            } catch (IOException e) {}
            System.out.println(connection.getResponseCode()); // 200
        } catch (IOException e) {
            // Network unreachable : not connected
            // No route to host : probably on an encrypted network
            // Connection timed out : Server DOWN
        } finally {
            if (connection != null) connection.disconnect();
        }
    }

    private static HttpURLConnection connection(boolean isMultiPart,
            String boundary) throws MalformedURLException, IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(URL)
                .openConnection();
        connection.setDoOutput(true); // triggers POST
        connection.setUseCaches(false); // *** no difference
        connection.setRequestProperty("Connection", "Keep-Alive");
        connection.setRequestProperty("User-Agent",
                "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) "
                        + "Gecko/20100401"); // *** tried others no difference
        connection.setChunkedStreamingMode(1024); // *** no difference
        if (isMultiPart) {
            if (boundary == null || "".equals(boundary.trim()))
                throw new IllegalArgumentException("Boundary can't be "
                        + ((boundary == null) ? "null" : "empty"));
            connection.setRequestProperty("Content-Type",
                    "multipart/form-data; boundary=" + boundary);
        }
        return connection;
    }

    // =========================================================================
    // Multipart
    // =========================================================================
    private static void flushMultiPartData(File file,
            OutputStream serverOutputStream, String boundary)
            throws IOException {
        PrintWriter writer = null;
        try {
            // true = autoFlush, important!
            writer = new PrintWriter(new OutputStreamWriter(serverOutputStream,
                    charsetForMultipartHeaders), true);
            appendBinary(file, boundary, writer, serverOutputStream);
            // End of multipart/form-data.
            writer.append("--" + boundary + "--").append(CRLF);
        } finally {
            if (writer != null) writer.close();
        }
    }

    private static void appendBinary(File file, String boundary,
            PrintWriter writer, OutputStream output)
            throws FileNotFoundException, IOException {
        // Send binary file.
        writer.append("--" + boundary).append(CRLF);
        writer.append(
                "Content-Disposition: form-data; name=\"binaryFile\"; filename=\""
                        + file.getName() + "\"").append(CRLF);
        writer.append(
                "Content-Type: " // ***
                        + ((isServerGzip) ? "application/gzip" : URLConnection
                                .guessContentTypeFromName(file.getName())))
                .append(CRLF);
        writer.append("Content-Transfer-Encoding: binary").append(CRLF);
        writer.append(CRLF).flush();
        InputStream input = null;
        OutputStream output2 = output;
        if (isServerGzip) {
            output2 = new GZIPOutputStream(output);
        }
        try {
            input = new FileInputStream(file);
            byte[] buffer = new byte[1024]; // *** tweaked, no difference
            for (int length = 0; (length = input.read(buffer)) > 0;) {
                output2.write(buffer, 0, length);
            }
            output2.flush(); // Important! Output cannot be closed. Close of
                             // writer will close output as well.
        } finally {
            if (input != null) try {
                input.close();
            } catch (IOException logOrIgnore) {}
        }
        writer.append(CRLF).flush(); // CRLF is important! It indicates end of
                                     // binary boundary.
    }
}
You have to edit the FILENAME and URL fields and set up a servlet at that URL. Its doPost() method is:
@Override
protected void doPost(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException {
    Collection<Part> parts = req.getParts();
    for (Part part : parts) {
        File save = new File(uploadsDirName, getFilename(part) + "_"
                + System.currentTimeMillis() + ".zip");
        final String absolutePath = save.getAbsolutePath();
        log.debug(absolutePath);
        part.write(absolutePath);
    }
    sc.getRequestDispatcher(DATA_COLLECTION_JSP).forward(req, resp);
}
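Now, when the isServerGzip field is set to true, FILENAME is gzipped and sent to the server, but when I try to extract it, it is corrupted. I use 7z on Windows: it opens the gzip file as an archive, but when I try to extract the file inside, it reports a corrupted gzip archive, although it does extract the (indeed corrupted) file.
I tried various files. The larger ones end up corrupted, while the smaller ones extract as empty. The reported size of the larger files inside the archive is much larger than the real size, while for the smaller ones it is 0. I marked the parts that need attention with // ***. I may be missing some connection configuration, or my way of gzipping the stream may be plain wrong, or...?
Tweaking connection properties, buffers, caches etc. made no difference.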
【Comments】:
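Encoding issues?
@SotiriosDelimanolis: yes, you have to be on a recent servlet container (I am on Tomcat 7.0.32); getParts() has its share of bugs. I do get the file and it is saved on my filesystem, it is just corrupted.
I had forgotten about @MultipartConfig.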
【Answer 1】:
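You need to call
((GZIPOutputStream)output2).finish();
before flushing. See the javadoc here. It states:
Finishes writing compressed data to the output stream without closing the underlying stream. Use this method when applying multiple filters in succession to the same output stream.
Which is what you are doing. So: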
for (int length = 0; (length = input.read(buffer)) > 0;)
    output2.write(buffer, 0, length);
((GZIPOutputStream) output2).finish(); // write the remaining compressed data and the gzip trailer
// obviously make sure output2 is truly a GZIPOutputStream
output2.flush();
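To see the effect of finish() in isolation, here is a minimal sketch that gzips a few bytes into an in-memory sink, once with and once without finish(), and then tries to decompress the result. The class GzipFinishDemo and its helper methods are hypothetical names made up for this illustration, and a ByteArrayOutputStream stands in for the connection's output stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the original post.
final class GzipFinishDemo {

    /** Gzips data into a sink that is never closed; finish() is optional. */
    static byte[] gzip(byte[] data, boolean callFinish) throws IOException {
        // The byte array stands in for the connection's output stream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(sink);
        gzip.write(data);
        if (callFinish) {
            gzip.finish(); // emits the pending compressed data and the gzip trailer
        }
        gzip.flush(); // flushes the sink; does not complete the gzip stream by itself
        return sink.toByteArray();
    }

    /** Decompresses a gzip stream; throws IOException if it is truncated. */
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            for (int length; (length = in.read(buffer)) > 0;) {
                out.write(buffer, 0, length);
            }
        }
        return out.toByteArray();
    }

    /** True if the data survives a gzip/gunzip round trip unchanged. */
    static boolean roundTrips(byte[] data, boolean callFinish) {
        try {
            return Arrays.equals(data, gunzip(gzip(data, callFinish)));
        } catch (IOException e) {
            return false; // a truncated stream ends up here
        }
    }

    public static void main(String[] args) {
        byte[] data = "some file content\n".getBytes(StandardCharsets.UTF_8);
        System.out.println("without finish(): " + roundTrips(data, false));
        System.out.println("with finish():    " + roundTrips(data, true));
    }
}
```

Without finish() the sink holds an incomplete gzip stream, so the round trip should fail; that would be consistent with the small files in the question extracting as empty.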
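As for applying multiple filters in succession to the same output stream, this is how I understand it:
You have one OutputStream to the HTTP server, i.e. a socket connection. HttpURLConnection writes the headers and then you write the body directly. In this case (multipart), you send the boundary and the part headers as uncompressed bytes, then the compressed file content, and then the boundary again. So the stream ends up looking like this: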
                                      start writing with GZIPOutputStream
                                      v
|---boundary---|---the part headers---|---gzip encoded file content bytes---|---boundary---|
^                                                                           ^
write directly with PrintWriter                                             use PrintWriter again
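So you can see how the different parts are written in succession through different filters. Think of PrintWriter as a pass-through filter: whatever you give it is written straight to the stream (only encoded to bytes in the chosen charset). GZIPOutputStream is a gzip filter: it encodes (gzips) the bytes it is given.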
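The layout in the diagram can be sketched in memory as follows. This is only an illustration under stated assumptions: the class MultipartGzipSketch, its method names, the boundary value and the TMP.txt filename are hypothetical, and a ByteArrayOutputStream stands in for connection.getOutputStream(). It writes the delimiters through a PrintWriter, the file content through a GZIPOutputStream that is finished but not closed, and then checks that the gzipped part can be cut out and decompressed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; all names here are illustrative only.
final class MultipartGzipSketch {
    static final String CRLF = "\r\n";

    /** Writes one gzipped part between plain-text multipart delimiters. */
    static byte[] buildBody(String boundary, byte[] fileContent) throws IOException {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(
                new OutputStreamWriter(output, StandardCharsets.UTF_8), true);

        // 1. Boundary and part headers: plain text through the PrintWriter.
        writer.append("--" + boundary).append(CRLF);
        writer.append("Content-Disposition: form-data; name=\"binaryFile\"; "
                + "filename=\"TMP.txt\"").append(CRLF);
        writer.append("Content-Type: application/gzip").append(CRLF);
        writer.append(CRLF).flush(); // flush before writing to output directly

        // 2. File content: through the gzip filter, finished but not closed.
        GZIPOutputStream gzip = new GZIPOutputStream(output);
        gzip.write(fileContent);
        gzip.finish();
        gzip.flush();

        // 3. Closing boundary: plain text through the PrintWriter again.
        writer.append(CRLF).flush();
        writer.append("--" + boundary + "--").append(CRLF);
        writer.close();
        return output.toByteArray();
    }

    /** Cuts the part content out of the body and gunzips it. */
    static byte[] extractAndGunzip(byte[] body, String boundary) throws IOException {
        // ISO-8859-1 maps each byte to one char, so String indexes equal byte offsets.
        String text = new String(body, StandardCharsets.ISO_8859_1);
        int start = text.indexOf(CRLF + CRLF) + 4;
        int end = text.lastIndexOf(CRLF + "--" + boundary + "--");
        byte[] compressed = Arrays.copyOfRange(body, start, end);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            for (int length; (length = in.read(buffer)) > 0;) {
                out.write(buffer, 0, length);
            }
        }
        return out.toByteArray();
    }

    /** True if the content comes back unchanged from the multipart body. */
    static boolean roundTrips(String boundary, byte[] content) {
        try {
            return Arrays.equals(content,
                    extractAndGunzip(buildBody(boundary, content), boundary));
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] content = "hello multipart".getBytes(StandardCharsets.UTF_8);
        System.out.println(roundTrips("abc123", content));
    }
}
```

Note that the PrintWriter is flushed explicitly before the gzip filter writes to the shared stream; otherwise buffered header text could land after the compressed bytes.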
As for the source code, look in your Java JDK installation: you should have a src.zip file that contains the public sources for java.lang.*, java.util.*, java.io.*, javax.*, etc.
【Comments】:
Testing it now. In the meantime, could you elaborate on "applying multiple filters"? Where am I doing that? Also, is it important to call finish() before flush()?
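You have one OutputStream, the connection's. I think that part of the javadoc means that you are writing not only gzipped bytes but also unencoded bytes before and after the gzipped file content. I am not sure about the flush() order; I haven't looked at the source code yet.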
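Well, it seems the order does not matter (a flush is performed anyway when the wrapped stream (output) is closed, but better safe than sorry). Accepted. If you could elaborate on the "filters" part it would be much appreciated ;) (and a link to the source code)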
@Mr_and_Mrs_D See my last edit regarding that javadoc.
This may interest you: developer.android.com/reference/java/util/zip/…