How do I peek at the first two bytes in an InputStream?

Posted 2023-02-25

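【Title】: How do I peek at the first two bytes in an InputStream? 【Posted】: 2010-09-13 23:17:51 【Question】:

Should be pretty simple: I have an InputStream and I want to peek at (not read) its first two bytes, i.e. I want the "current position" of the InputStream to still be at 0 after my peeking. What is the best and safest way to do this?

Answer - As I had suspected, the solution is to wrap it in a BufferedInputStream, which provides markability. Thanks, Rasmus.

【Question comments】:

【Answer 1】:

For a general InputStream, I would wrap it in a BufferedInputStream and do something like this: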

BufferedInputStream bis = new BufferedInputStream(inputStream);
bis.mark(2);
int byte1 = bis.read();
int byte2 = bis.read();
bis.reset();
// note: you must continue using the BufferedInputStream instead of the inputStream
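
The snippet above can be wrapped in a small reusable helper. The following is only a sketch under a few assumptions: the class name `PeekDemo` and the method `peek` are made up for illustration, and the three sample bytes are arbitrary. It checks `markSupported()` first and loops over `read`, because a single `read` call may return fewer bytes than requested.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class PeekDemo {

    /**
     * Returns up to n leading bytes without consuming them.
     * The stream must support mark/reset, e.g. a BufferedInputStream.
     */
    static byte[] peek(InputStream in, int n) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream does not support mark/reset");
        }
        in.mark(n); // remember the current position for at least n bytes
        byte[] buf = new byte[n];
        int total = 0;
        // read() may deliver fewer bytes than requested, so loop until
        // n bytes have arrived or the stream ends
        while (total < n) {
            int r = in.read(buf, total, n - total);
            if (r == -1) {
                break;
            }
            total += r;
        }
        in.reset(); // rewind to the marked position
        return Arrays.copyOf(buf, total);
    }

    public static void main(String[] args) throws IOException {
        InputStream raw = new ByteArrayInputStream(new byte[] {0x1F, (byte) 0x8B, 0x08});
        InputStream in = new BufferedInputStream(raw);
        byte[] head = peek(in, 2);
        System.out.println(head.length);       // prints 2
        System.out.println(in.read() == 0x1F); // prints true: position is unchanged
    }
}
```

A stream shorter than `n` bytes simply yields a shorter array, so callers should check the returned length before inspecting the bytes.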

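【Comments】:

See also java.sun.com/javase/6/docs/api/java/io/…

【Answer 2】:

I found an implementation of a PeekableInputStream here:

http://www.heatonresearch.com/articles/147/page2.html

The idea of the implementation shown in the article is that it keeps an internal array of "peeked" values. When you call read, values are returned from the peeked array first and then from the input stream. When you call peek, values are read from the stream and stored in the "peeked" array.

Since the sample code is licensed under the LGPL, it can be attached to this post: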

package com.heatonresearch.httprecipes.html;

import java.io.*;

/**
 * The Heaton Research Spider Copyright 2007 by Heaton
 * Research, Inc.
 * 
 * HTTP Programming Recipes for Java ISBN: 0-9773206-6-9
 * http://www.heatonresearch.com/articles/series/16/
 * 
 * PeekableInputStream: This is a special input stream that
 * allows the program to peek one or more characters ahead
 * in the file.
 * 
 * This class is released under the:
 * GNU Lesser General Public License (LGPL)
 * http://www.gnu.org/copyleft/lesser.html
 * 
 * @author Jeff Heaton
 * @version 1.1
 */
public class PeekableInputStream extends InputStream
{

  /**
   * The underlying stream.
   */
  private InputStream stream;

  /**
   * Bytes that have been peeked at.
   */
  private byte peekBytes[];

  /**
   * How many bytes have been peeked at.
   */
  private int peekLength;

  /**
   * The constructor accepts an InputStream to setup the
   * object.
   * 
   * @param is
   *          The InputStream to parse.
   */
  public PeekableInputStream(InputStream is)
  {
    this.stream = is;
    this.peekBytes = new byte[10];
    this.peekLength = 0;
  }

  /**
   * Peek at the next character from the stream.
   * 
   * @return The next character.
   * @throws IOException
   *           If an I/O exception occurs.
   */
  public int peek() throws IOException
  {
    return peek(0);
  }

  /**
   * Peek at a specified depth.
   * 
   * @param depth
   *          The depth to check.
   * @return The character peeked at, or -1 at end of stream.
   * @throws IOException
   *           If an I/O exception occurs.
   */
  public int peek(int depth) throws IOException
  {
    // does the size of the peek buffer need to be extended?
    if (this.peekBytes.length <= depth)
    {
      byte temp[] = new byte[depth + 10];
      for (int i = 0; i < this.peekBytes.length; i++)
      {
        temp[i] = this.peekBytes[i];
      }
      this.peekBytes = temp;
    }

    // does more data need to be read? a single read() may return
    // fewer bytes than requested, so keep reading until the
    // requested depth is buffered or the stream ends
    while (depth >= this.peekLength)
    {
      int offset = this.peekLength;
      int length = (depth - this.peekLength) + 1;
      int lengthRead = this.stream.read(this.peekBytes, offset, length);

      if (lengthRead == -1)
      {
        return -1;
      }

      // count only the bytes that actually arrived
      this.peekLength += lengthRead;
    }

    // mask to 0..255 so that a byte such as 0xFF is not
    // mistaken for the end-of-stream value -1
    return this.peekBytes[depth] & 0xFF;
  }

  /**
   * Read a single byte from the stream.
   * 
   * @return The character that was read from the stream.
   * @throws IOException
   *           If an I/O exception occurs.
   */
  @Override
  public int read() throws IOException
  {
    if (this.peekLength == 0)
    {
      return this.stream.read();
    }

    // serve previously peeked bytes first, masked to 0..255
    int result = this.peekBytes[0] & 0xFF;
    this.peekLength--;
    for (int i = 0; i < this.peekLength; i++)
    {
      this.peekBytes[i] = this.peekBytes[i + 1];
    }

    return result;
  }
}
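【Comments】:

【Answer 3】:

When using a BufferedInputStream, make sure that the inputStream is not already buffered; double buffering will cause some seriously hard-to-find bugs. You also need to handle Readers differently: converting to a StreamReader and buffering it will cause bytes to be lost if the Reader is already buffered. And if you are using a Reader, remember that you are not reading bytes but characters in the default encoding (unless an explicit encoding was set). One example of a buffered input stream that you may not be aware of is URL url; url.openStream();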
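
To make the point about Readers concrete, here is a minimal sketch that peeks at one character instead of one byte. The class name `ReaderPeekDemo` and the helper `peekChar` are invented for this illustration. It assumes UTF-8 input and sets the charset explicitly, so the result does not depend on the platform default encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class ReaderPeekDemo {

    /** Returns the next character without consuming it, or -1 at end of stream. */
    static int peekChar(BufferedReader reader) throws IOException {
        reader.mark(1);        // the limit is counted in chars, not bytes
        int c = reader.read(); // one decoded char, or -1 at end of stream
        reader.reset();
        return c;
    }

    public static void main(String[] args) throws IOException {
        // U+00E9 followed by '!': three bytes in UTF-8, but only two characters
        byte[] bytes = "\u00e9!".getBytes(StandardCharsets.UTF_8);
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8));
        System.out.println(bytes.length);               // prints 3
        System.out.println(peekChar(reader) == 0x00E9); // prints true
        System.out.println(reader.read() == 0x00E9);    // prints true: nothing was consumed
    }
}
```

If the goal is to inspect raw bytes, such as a file signature, the peek should happen on the byte stream before any Reader is layered on top of it.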

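I do not have any references for this information; it comes from debugging code. The main case where the problem occurred for me was code that read from a file into a compressed stream. If I remember correctly, once you start debugging through the code you find comments in the Java source saying that certain things do not always work correctly. I do not remember where the information about using BufferedReader and BufferedInputStream comes from, but I think it fails straight away on even the simplest test. Remember that to test this you need to mark more than the buffer size (which differs between BufferedReader and BufferedInputStream); the problems occur when the bytes being read reach the end of the buffer. Note that the buffer size in the source code can differ from the buffer size you set in the constructor. It has been a while since I did this, so my recollection of the details may be a little off. Testing was done using a FilterReader/FilterInputStream: add one to the direct stream and one to the buffered stream to see the difference.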
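
The FilterInputStream testing approach described above could look roughly like the sketch below. It does not reproduce the reported bug. It only makes read-ahead visible, and read-ahead is also the reason to keep reading through the buffered wrapper rather than the original stream. `ReadAheadDemo` and `CountingInputStream` are hypothetical names, and the 100-byte source and 16-byte buffer are arbitrary choices.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; the names are illustrative only.
class ReadAheadDemo {

    /** Probe that counts how many bytes are pulled through it. */
    static class CountingInputStream extends FilterInputStream {
        long count = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1) {
                count++;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                count += n;
            }
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        CountingInputStream probe =
            new CountingInputStream(new ByteArrayInputStream(new byte[100]));
        InputStream buffered = new BufferedInputStream(probe, 16);
        buffered.read(); // the caller consumes a single byte
        // The probe normally reports more than one byte here, because the
        // buffered stream fills its internal buffer ahead of the caller.
        System.out.println("bytes taken from the source: " + probe.count);
    }
}
```

Bytes that the wrapper has already pulled into its buffer are no longer available from the underlying stream, so mixing reads between the two would skip data.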

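【Comments】:

Interesting! Do you have any details on the double buffering and on the problems when combining a BufferedInputStream with an InputStreamReader? I could not find anything on Google.

I think the concern about double buffering is wrong, and goes against the stream architecture in general. Streams are meant to be stacked on top of each other without knowing each other's internals. Unless you have concrete details, which you don't, I would say the problem you saw was probably in your own code.

【Answer 4】:

You might find PushbackInputStream useful: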

http://docs.oracle.com/javase/6/docs/api/java/io/PushbackInputStream.html
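
As a rough sketch of how that could look for the two-byte case: read the bytes, then hand them back with `unread`. The class name `PushbackPeekDemo`, the helper `peek`, and the sample bytes are assumptions made for this example. The PushbackInputStream must be constructed with a pushback buffer of at least as many bytes as you intend to peek.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class PushbackPeekDemo {

    /**
     * Reads up to n bytes and pushes them back again.
     * The stream needs a pushback buffer of at least n bytes.
     */
    static byte[] peek(PushbackInputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int total = 0;
        // read() may deliver fewer bytes than requested, so loop
        while (total < n) {
            int r = in.read(buf, total, n - total);
            if (r == -1) {
                break;
            }
            total += r;
        }
        if (total > 0) {
            in.unread(buf, 0, total); // buf[0] becomes the next byte to be read
        }
        return Arrays.copyOf(buf, total);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x50, 0x4B, 0x03};
        // the second argument is the pushback buffer size: two bytes are enough here
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data), 2);
        byte[] head = peek(in, 2);
        System.out.println(head.length);       // prints 2
        System.out.println(in.read() == 0x50); // prints true: the bytes were pushed back
    }
}
```

As with the buffered approach, subsequent reads have to go through the PushbackInputStream, because the pushed-back bytes live in the wrapper and not in the original stream.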

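【Comments】:

I think this is actually the ideal solution for simply peeking at a few bytes. If you only need to check two bytes, a BufferedInputStream is quite a waste of memory!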
