Java - reading, manipulating and writing WAV files

Posted 2023-02-16


【Title】Java - reading, manipulating and writing WAV files 【Posted】2011-03-18 21:15:28 【Question】:

In a Java program, how can I read an audio file (a WAV file) into an array of numbers (float[], short[], ...) and write a WAV file from an array of numbers?

【Comments】:

【Answer 1】:

I read WAV files via an AudioInputStream. The following snippet from the Java Sound Tutorials works well.

import java.io.File;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

int totalFramesRead = 0;
File fileIn = new File(somePathName);
// somePathName is a pre-existing string whose value was
// based on a user selection.
try {
  AudioInputStream audioInputStream =
    AudioSystem.getAudioInputStream(fileIn);
  int bytesPerFrame =
    audioInputStream.getFormat().getFrameSize();
  if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
    // some audio formats may have unspecified frame size
    // in that case we may read any amount of bytes
    bytesPerFrame = 1;
  }
  // Set an arbitrary buffer size of 1024 frames.
  int numBytes = 1024 * bytesPerFrame;
  byte[] audioBytes = new byte[numBytes];
  try {
    int numBytesRead = 0;
    int numFramesRead = 0;
    // Try to read numBytes bytes from the file.
    while ((numBytesRead =
      audioInputStream.read(audioBytes)) != -1) {
      // Calculate the number of frames actually read.
      numFramesRead = numBytesRead / bytesPerFrame;
      totalFramesRead += numFramesRead;
      // Here, do something useful with the audio data that's
      // now in the audioBytes array...
    }
  } catch (Exception ex) {
    // Handle the error...
  }
} catch (Exception e) {
  // Handle the error...
}
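
To connect this back to the question, here is a minimal sketch of the same read loop extended to produce a float[]. It assumes the file holds 16-bit signed PCM and rejects anything else; the class name WavToFloats and the normalisation by 32768 are illustrative choices, not part of the tutorial.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: reads 16-bit signed PCM into interleaved floats in [-1, 1).
final class WavToFloats {
    static float[] read(File file) throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(file)) {
            AudioFormat fmt = in.getFormat();
            // Assumption of this sketch: only 16-bit signed PCM is handled.
            if (!AudioFormat.Encoding.PCM_SIGNED.equals(fmt.getEncoding())
                    || fmt.getSampleSizeInBits() != 16) {
                throw new UnsupportedAudioFileException("expected 16-bit signed PCM, got " + fmt);
            }
            // Collect the raw sample bytes, 1024 frames at a time.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            byte[] buf = new byte[1024 * fmt.getFrameSize()];
            int n;
            while ((n = in.read(buf)) != -1) {
                bytes.write(buf, 0, n);
            }
            // Decode using the byte order the stream reports (WAV is normally little-endian).
            ByteOrder order = fmt.isBigEndian() ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
            ByteBuffer bb = ByteBuffer.wrap(bytes.toByteArray()).order(order);
            float[] samples = new float[bb.remaining() / 2];
            for (int i = 0; i < samples.length; i++) {
                samples[i] = bb.getShort() / 32768f;
            }
            return samples;
        }
    }
}
```

For stereo input the returned array is interleaved (left, right, left, right, ...), so the number of frames is the array length divided by the channel count.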

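Writing a WAV I found quite tricky. On the surface it looks like a circular problem: the command that writes relies on an AudioInputStream as a parameter.

But how do you write bytes to an AudioInputStream? Shouldn't there be an AudioOutputStream?

What I found was that one can define an object that has access to the raw audio byte data and implements TargetDataLine.

This requires a lot of methods to be implemented, but most can stay in dummy form, since they are not needed for writing data to a file. The key method to implement is read(byte[] buffer, int bufferoffset, int numberofbytestoread).

Since this method will probably be called multiple times, there should also be an instance variable that tracks how far through the data you have progressed, updated as part of the read method above.

Once you have implemented this method, your object can be used to create a new AudioInputStream, which in turn can be used with: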

AudioSystem.write(yourAudioInputStream, AudioFileFormat.Type.WAVE, yourFileDestination)

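As a reminder, an AudioInputStream can be created with a TargetDataLine as its source.

As for manipulating the data directly, I have had good success acting on the data in the buffer, audioBytes, in the innermost loop of the snippet above.

While you are in that inner loop, you can convert the bytes to integers or floats, multiply by a volume value (ranging from 0.0 to 1.0), and then convert them back to little-endian bytes.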
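
The byte-to-sample round trip described here could look roughly like the sketch below. It assumes 16-bit signed little-endian PCM, and the class and method names (VolumeScaler, scale) are made up for illustration.

```java
// Hypothetical helper: scales 16-bit signed little-endian PCM in place.
final class VolumeScaler {
    /**
     * @param audioBytes buffer filled by AudioInputStream.read
     * @param numBytes   number of valid bytes in the buffer
     * @param volume     gain factor, e.g. 0.0 to 1.0
     */
    static void scale(byte[] audioBytes, int numBytes, float volume) {
        for (int i = 0; i + 1 < numBytes; i += 2) {
            // Reassemble the sample: low byte first, high byte carries the sign.
            int sample = (audioBytes[i] & 0xFF) | (audioBytes[i + 1] << 8);
            int scaled = Math.round(sample * volume);
            // Clamp in case a gain above 1.0 is passed in.
            scaled = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
            // Split back into little-endian bytes.
            audioBytes[i] = (byte) scaled;
            audioBytes[i + 1] = (byte) (scaled >> 8);
        }
    }
}
```

Inside the read loop this would be called as VolumeScaler.scale(audioBytes, numBytesRead, 0.5f) before the buffer is passed on.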

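I believe that, since you have access to a series of samples in that buffer, you can also apply various forms of DSP filtering at that stage. In my experience it is better to make volume changes directly on the data in this buffer, because you can then use the smallest possible increment: one delta per sample, which minimises the chance of clicks caused by volume-induced discontinuities.

I find that the volume "control lines" provided by Java tend to produce clicks when the volume jumps. I believe this is because the deltas are only applied at the granularity of a single buffer read (often in the range of one change per 1024 samples) rather than being divided into smaller pieces and added one per sample. But I do not know how the volume controls are implemented, so please take that conjecture with a grain of salt.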
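
One way to picture the "one delta per sample" idea is a linear gain ramp across a buffer. This is only a sketch under the assumption that the samples have already been converted to floats; GainRamp and its parameters are hypothetical names.

```java
// Hypothetical helper: moves the gain a little on every sample instead of once per buffer.
final class GainRamp {
    /**
     * Multiplies each sample by a gain that moves linearly from startGain
     * towards endGain, reaching endGain on the last sample of the buffer.
     */
    static void apply(float[] samples, float startGain, float endGain) {
        int n = samples.length;
        if (n == 0) {
            return;
        }
        float step = (endGain - startGain) / n; // per-sample delta
        float gain = startGain;
        for (int i = 0; i < n; i++) {
            gain += step;
            samples[i] *= gain;
        }
    }
}
```

With a 1024-sample buffer, a volume change from 0.2 to 0.8 becomes 1024 steps of about 0.0006 each rather than a single jump of 0.6.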

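All in all, Java Sound has been a real headache. I think the tutorial does not include an explicit example of writing a file directly from bytes, and I fault it for burying the best example of play-a-file coding in the "How to Convert..." section. However, there is a lot of valuable, free information in that tutorial.

Edit: 2017-12-13

I have since used the following code to write audio from a PCM file in my own projects. Instead of implementing a TargetDataLine, one can extend InputStream and use it, wrapped in an AudioInputStream, as the argument to the AudioSystem.write method.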

import java.io.IOException;
import java.io.InputStream;

public class StereoPcmInputStream extends InputStream
{
    private float[] dataFrames;
    private int framesCounter;
    private int cursor;
    private int[] pcmOut = new int[2];
    private int[] frameBytes = new int[4];
    private int idx;

    private int framesToRead;

    public void setDataFrames(float[] dataFrames)
    {
        this.dataFrames = dataFrames;
        framesToRead = dataFrames.length / 2;
    }

    @Override
    public int read() throws IOException
    {
        if (available() <= 0)
        {
            return -1; // end of stream
        }

        idx &= 3;
        if (idx == 0) // set up next frame's worth of data
        {
            framesCounter++; // count elapsing frames

            // scale to 16 bits
            pcmOut[0] = (int)(dataFrames[cursor++] * Short.MAX_VALUE);
            pcmOut[1] = (int)(dataFrames[cursor++] * Short.MAX_VALUE);

            // output as unsigned bytes, in range [0..255]
            // (mask with 0xFF; a (char) cast would keep 16 bits)
            frameBytes[0] = pcmOut[0] & 0xFF;
            frameBytes[1] = (pcmOut[0] >> 8) & 0xFF;
            frameBytes[2] = pcmOut[1] & 0xFF;
            frameBytes[3] = (pcmOut[1] >> 8) & 0xFF;
        }

        return frameBytes[idx++];
    }

    @Override
    public int available()
    {
        // NOTE: not concurrency safe.
        // 1st half of sum: 4 bytes for every frame not yet started
        // 2nd half of sum: the # of bytes of the current frame that remain to be read
        // (counting from framesToRead, not framesToRead - 1, so the last frame is not cut short)
        return 4 * (framesToRead - framesCounter)
                + ((4 - idx) & 3);
    }

    @Override
    public void reset()
    {
        cursor = 0;
        framesCounter = 0;
        idx = 0;
    }

    @Override
    public void close()
    {
        System.out.println(
            "StereoPcmInputStream stopped after reading frames:"
                + framesCounter);
    }
}

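The source data to be exported here is in the form of stereo floats ranging from -1 to 1. The format of the resulting stream is 16-bit, stereo, little-endian.

I omitted the skip and markSupported methods for my particular application, but it should not be difficult to add them if they are needed.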

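【Comments】:

【Answer 2】:

This is the source code for writing directly to a WAV file. You only need to know the mathematics and sound engineering to produce the sound you want. In this example the equation calculates a binaural beat.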

import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;

public class Program {
    public static void main(String[] args) throws IOException {
        final double sampleRate = 44100.0;
        final double frequency = 440;
        final double frequency2 = 90;
        final double amplitude = 1.0;
        final double seconds = 2.0;
        final double twoPiF = 2 * Math.PI * frequency;
        final double piF = Math.PI * frequency2;

        float[] buffer = new float[(int)(seconds * sampleRate)];

        // Fill the buffer with a sine wave modulated by a slower cosine.
        for (int sample = 0; sample < buffer.length; sample++) {
            double time = sample / sampleRate;
            buffer[sample] = (float)(amplitude * Math.cos(piF * time) * Math.sin(twoPiF * time));
        }

        final byte[] byteBuffer = new byte[buffer.length * 2];

        // Convert each float to a 16-bit sample, low byte first (little-endian).
        int bufferIndex = 0;
        for (int i = 0; i < byteBuffer.length; i++) {
            final int x = (int)(buffer[bufferIndex++] * 32767.0);

            byteBuffer[i++] = (byte)x;
            byteBuffer[i] = (byte)(x >>> 8);
        }

        File out = new File("out10.wav");

        final boolean bigEndian = false;
        final boolean signed = true;

        final int bits = 16;
        final int channels = 1;

        AudioFormat format = new AudioFormat((float)sampleRate, bits, channels, signed, bigEndian);
        ByteArrayInputStream bais = new ByteArrayInputStream(byteBuffer);
        AudioInputStream audioInputStream = new AudioInputStream(bais, format, buffer.length);
        AudioSystem.write(audioInputStream, AudioFileFormat.Type.WAVE, out);
        audioInputStream.close();
    }
}

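【Comments】:

【Answer 3】:

Some more detail about what you would like to achieve would be helpful. If raw WAV data is fine for you, simply use a FileInputStream and probably a Scanner to turn it into numbers. But let me try to give you some meaningful sample code to get you started:

There is a class called com.sun.media.sound.WaveFileWriter for this purpose.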

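// Note: com.sun.media.sound.WaveFileWriter is an internal JDK class and may not be
// accessible on newer JDKs; AudioSystem.write is the public route to the same writer.
InputStream in = ...;
OutputStream outStream = ...;

AudioInputStream audioIn = AudioSystem.getAudioInputStream(in);

WaveFileWriter writer = new WaveFileWriter();
writer.write(audioIn, AudioFileFormat.Type.WAVE, outStream);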

You could implement your own AudioInputStream that does whatever voodoo is needed to turn your number arrays into audio data.

writer.write(new VoodooAudioInputStream(numbers), AudioFileFormat.Type.WAVE, outStream);

As @stacker mentioned, you should of course get familiar with the API.
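
For what it is worth, the "voodoo" does not necessarily need a custom AudioInputStream subclass. The sketch below is one possible approach, assuming the numbers are already 16-bit samples in a short[]: it packs them little-endian with a ByteBuffer and hands them to the public AudioSystem.write method. ShortsToWav and its parameters are hypothetical names.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: writes interleaved 16-bit samples to a WAV file.
final class ShortsToWav {
    static void write(short[] samples, float sampleRate, int channels, File out)
            throws IOException {
        // WAV stores PCM samples little-endian, so let ByteBuffer do the byte swapping.
        ByteBuffer bb = ByteBuffer.allocate(samples.length * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (short s : samples) {
            bb.putShort(s);
        }
        // 16 bits, signed, little-endian.
        AudioFormat format = new AudioFormat(sampleRate, 16, channels, true, false);
        long frames = samples.length / channels; // one frame = one sample per channel
        try (AudioInputStream ais = new AudioInputStream(
                new ByteArrayInputStream(bb.array()), format, frames)) {
            AudioSystem.write(ais, AudioFileFormat.Type.WAVE, out);
        }
    }
}
```

A call such as ShortsToWav.write(samples, 44100f, 1, new File("out.wav")) would write a mono file; for float input, each value would first be scaled to the short range.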

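【Comments】:

My main problem was the voodoo itself. I wanted to see whether there was ready-made code or a class that did it. I think I have succeeded now, using AudioSystem and AudioInputStream. The trick was to reverse the order of the bytes in each sound sample before converting it to a short, since WAV encodes numeric values in little-endian order. Thank you, Yonatan.

【Answer 4】: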

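If you need access to the actual sample values, the javax.sound.sampled package is not well suited to processing WAV files. The package lets you change volume, sample rate and so on, but if you want other effects (say, adding an echo), you are on your own. (The Java tutorial hints that it should be possible to process the sample values directly, but the tech writer overpromised.)

This site has a simple class for processing WAV files: http://www.labbookpages.co.uk/audio/javaWavFiles.html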

【Comments】:

【Answer 5】:

WAV file specification: https://ccrma.stanford.edu/courses/422/projects/WaveFormat/

There is an API that suits your purpose: http://code.google.com/p/musicg/

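【Comments】:

【Answer 6】:

First of all, you may need to know the header and data positions of the WAVE structure; you can find the specification here. Note that the data is little-endian.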
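
As a rough illustration of what "knowing the header and data positions" involves, the sketch below walks the RIFF chunks of a WAV stream and reports the "fmt " fields and where the "data" chunk starts, rather than assuming a fixed offset. It follows the commonly documented RIFF/WAVE layout; WavChunks and Info are hypothetical names, and only the basic 16-byte part of the fmt chunk is read.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: locates the "fmt " and "data" chunks of a RIFF/WAVE stream.
final class WavChunks {
    static final class Info {
        int audioFormat;    // 1 = PCM
        int channels;
        int sampleRate;
        int bitsPerSample;
        long dataOffset;    // byte offset of the first sample
        long dataSize;      // size of the sample data in bytes
    }

    /** Scans chunks until "data" is found; throws EOFException if there is none. */
    static Info scan(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        if (!"RIFF".equals(tag(in))) throw new IOException("not a RIFF file");
        readIntLE(in); // overall RIFF size, unused here
        if (!"WAVE".equals(tag(in))) throw new IOException("not a WAVE file");

        Info info = new Info();
        boolean haveFmt = false;
        long pos = 12; // bytes consumed so far
        while (true) {
            String id = tag(in);
            long size = readIntLE(in) & 0xFFFFFFFFL;
            long padded = size + (size & 1); // chunks are padded to even length
            pos += 8;
            if (id.equals("fmt ")) {
                info.audioFormat = readShortLE(in);
                info.channels = readShortLE(in);
                info.sampleRate = readIntLE(in);
                readIntLE(in);   // byte rate, unused
                readShortLE(in); // block align, unused
                info.bitsPerSample = readShortLE(in);
                skipFully(in, padded - 16); // any extension bytes
                haveFmt = true;
            } else if (id.equals("data")) {
                if (!haveFmt) throw new IOException("data chunk before fmt chunk");
                info.dataOffset = pos;
                info.dataSize = size;
                return info;
            } else {
                skipFully(in, padded); // e.g. LIST or fact chunks
            }
            pos += padded;
        }
    }

    private static String tag(DataInputStream in) throws IOException {
        byte[] b = new byte[4];
        in.readFully(b);
        return new String(b, StandardCharsets.US_ASCII);
    }

    // DataInputStream reads big-endian, so swap to get the little-endian value.
    private static int readIntLE(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt());
    }

    private static int readShortLE(DataInputStream in) throws IOException {
        return Short.reverseBytes(in.readShort()) & 0xFFFF;
    }

    private static void skipFully(DataInputStream in, long n) throws IOException {
        while (n > 0) {
            int k = in.skipBytes((int) Math.min(n, Integer.MAX_VALUE));
            if (k <= 0) throw new EOFException();
            n -= k;
        }
    }
}
```

Checking audioFormat and bitsPerSample before decoding avoids silently misreading files that are not 16-bit PCM.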

There is an API that can help you achieve your goal.

【Comments】:

【Answer 7】:

The javax.sound.sampled package supports WAVE files.

Since it is not a trivial API, you should read an article or tutorial that introduces it, such as

Java Sound, An Introduction

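【Comments】:

【Answer 8】:

If anyone still finds it useful, I am working on an audio framework that aims to solve this and similar problems. It is in Kotlin, though. You can find it on GitHub: https://github.com/WaveBeans/wavebeans

It looks like this: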

wave("file:///path/to/file.wav")
    .map { it.asInt() } // here `it` is of Sample type and needs converting to the desired type
    .asSequence(44100.0f) // framework processes everything as sequence/stream
    .toList() // read fully
    .toTypedArray() // convert to array

And it does not depend on Java Audio.

【Comments】:

【Answer 9】:

I used FileInputStream with some magic:

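    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    // Assumes a canonical 44-byte header followed by 16-bit little-endian PCM.
    byte[] byteInput = new byte[(int)file.length() - 44];
    short[] input = new short[byteInput.length / 2];

    try (DataInputStream dis = new DataInputStream(new FileInputStream(file))) {
        dis.readFully(new byte[44]); // skip the header
        dis.readFully(byteInput);    // read the sample bytes
        ByteBuffer.wrap(byteInput).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(input);
    } catch (IOException e) {
        e.printStackTrace();
    }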

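Your sample values are in short[] input!

【Comments】:

What does file.length() - 44 mean? How did you arrive at those numbers?

This is very bad code. WAV is a container that can hold almost any audio format (even MP3). There is no reason to assume that a WAV file contains 16-bit PCM. It is also wrong to assume that the sound data appears at a fixed position in the file.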
