Java merge Audio files with delay and overlapping

Posted 2023-02-16

[Title] Java merge Audio files with delay and overlapping [Posted] 2021-12-29 22:54:19 [Question]:

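My goal is neither to record my system's audio output nor to simply merge audio files.

I need an "AudioMerger" class. It should have a merge method that is given the total length. It should then create a silent wav file of that length and add the specified sounds to it. The added sounds may overlap, and each may start at its own offset.

Example: sound.wav is 3 seconds long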

Merger merger = new Merger();
merger.add("sound.wav", 2);
merger.add("sound.wav", 6);
merger.add("sound.wav", 7);

//creates a 10 seconds wav file with the contents of sound.wav inserted at the specific seconds
merger.merge(10);
merger.saveToFile(new File("out.wav"));

[Comments]:

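Have you already searched for tutorials on working with audio in Java, and in what way did they fail to get you to the point where you could apply the code or ideas they show?

That sounds like you still want to look at how to mix audio data. (There is of course the obvious "Java Record / Mix two audio streams", but there are also tutorials on the web about this that generally go into a bit more detail.)

Well, why are you iterating over every byte? You already know the offset of every insertion, so iterating only over the range each sound covers will save a lot of cycles. Also, note that you (a) are not stripping the RIFF header from sound.wav, and (b) are not reading its channels/sample rate/etc. to make sure your merged audio uses the same values.

You want to mix your sound.wav data, but you load the file as bytes, so you get all the bytes in the file: 44 RIFF header bytes followed by the actual audio data. So you want to read in the bytes, copy the sample rate, bit rate and channel data out of the first 44 bytes, and then strip the header. (And you want to make sure you mix your samples correctly. If your source files use 16 or 32 bits per sample, you almost certainly need to sum short values rather than byte values: ***.com/questions/32856836/…)

[Answer 1]: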

Thanks to the help of https://***.com/users/740553/mike-pomax-kamermans I now have working code.

whistle.wav: https://voca.ro/1iqDr3yVZ6uG out.wav: https://voca.ro/1jxlHkNUuH9r

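The major problem was creating an empty wav file. To do that I needed to write a proper header at the start of the file. You can read the details of the .wav header here: http://www.topherlee.com/software/pcm-tut-wavformat.html

While implementing this I struggled with reading and writing bytes in little/big endian. Basically, these terms specify the order in which the bytes of a number are stored. Big endian stores the most significant byte first, the way we write numbers and the way Java's DataInputStream and DataOutputStream handle them (left to right), while little endian stores them in reverse (right to left). A wav file expects all of its numbers in little endian. So when we load numbers from a wav file we need to convert them to big endian (our Java numbers), and when we write the file we need to convert them back to little endian. For this we can use the Integer.reverseBytes() and Short.reverseBytes() methods.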

One hundred and two: big endian: 102, little endian: 201
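To make the byte order concrete, here is a minimal sketch. The class name EndianDemo and its helper methods are made up for this illustration and are not part of the answer's code. It encodes a sample rate the way a WAV header stores it and shows that Integer.reverseBytes() produces the same byte pattern as a little-endian ByteBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class, for illustration only.
class EndianDemo {

    /** Encodes value as 4 little-endian bytes, the order a WAV header uses. */
    static byte[] toLittleEndian(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
    }

    /** Decodes 4 little-endian bytes back into a Java int. */
    static int fromLittleEndian(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        int sampleRate = 44100; // 0x0000AC44
        byte[] le = toLittleEndian(sampleRate);

        // Least significant byte comes first: prints "44 AC 00 00"
        System.out.printf("%02X %02X %02X %02X%n",
                le[0] & 0xFF, le[1] & 0xFF, le[2] & 0xFF, le[3] & 0xFF);

        // Reading those bytes in Java's default big-endian order gives the
        // same int as Integer.reverseBytes(sampleRate): prints "true"
        System.out.println(Integer.reverseBytes(sampleRate) == ByteBuffer.wrap(le).getInt());
    }
}
```

Setting the byte order on a ByteBuffer is an alternative to calling reverseBytes() on every field; both approaches yield the same bytes.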

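Another difficulty I ran into was merging the audio byte arrays. At first I added the arrays together byte by byte and computed the average. But my sample size is 16 bits, so I need to average every two bytes (one sample) as a unit, not every single byte.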
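The difference between averaging bytes and averaging 16-bit samples can be sketched as follows. SampleMixDemo and its methods are hypothetical names used only for this example, and the sketch assumes both inputs are equal-length, 16-bit, little-endian PCM data without a header.

```java
// Hypothetical helper class, for illustration only.
class SampleMixDemo {

    /** Reads one signed 16-bit little-endian sample starting at index i. */
    static short readSample(byte[] pcm, int i) {
        return (short) ((pcm[i] & 0xFF) | (pcm[i + 1] << 8));
    }

    /** Writes one signed 16-bit sample as little-endian bytes at index i. */
    static void writeSample(byte[] pcm, int i, short value) {
        pcm[i] = (byte) value;            // low byte first
        pcm[i + 1] = (byte) (value >> 8); // high byte second
    }

    /** Averages two equal-length 16-bit PCM arrays sample by sample. */
    static byte[] average(byte[] a, byte[] b) {
        byte[] out = new byte[a.length];
        for (int i = 0; i + 1 < a.length; i += 2) {
            // Sum in an int so two large samples cannot overflow a short.
            int sum = readSample(a, i) + readSample(b, i);
            writeSample(out, i, (short) Math.round(sum / 2f));
        }
        return out;
    }
}
```

For example, averaging the sample 256 (bytes 00 01) with silence (bytes 00 00) gives 128 (bytes 80 00). Averaging the same data byte by byte would give either 00 00 or 00 01 depending on rounding, neither of which is the sample 128.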

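When I first got this working, there was always a strange noise right before my inserted audio played. I had accidentally filled the byte array with the entire file content. When merging, my program also merged the header data and interpreted it as sound data, which produced the noise. After cutting off the header, my audio sounded fine.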
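Cutting off a fixed 44 bytes works for a canonical PCM file like the one used here, but some WAV files carry extra chunks (for example LIST) before the audio data. As a hedged alternative, the sketch below walks the RIFF chunks of a file already loaded into memory and returns only the payload of the first "data" chunk. The class WavDataChunk and the method extractPcm are invented for this example and do not appear in the answer's code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class WavDataChunk {

    /** Returns the payload of the first "data" chunk of a RIFF/WAVE file held in memory. */
    static byte[] extractPcm(byte[] file) {
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        int pos = 12; // skip "RIFF", the RIFF size field and "WAVE"
        while (pos + 8 <= file.length) {
            String id = new String(file, pos, 4, StandardCharsets.US_ASCII);
            int size = buf.getInt(pos + 4);
            int start = pos + 8;
            // Clamp bogus or oversized chunk sizes to what is actually in the array.
            if (size < 0 || size > file.length - start) {
                size = file.length - start;
            }
            if (id.equals("data")) {
                return Arrays.copyOfRange(file, start, start + size);
            }
            pos = start + size + (size & 1); // chunks are padded to an even length
        }
        throw new IllegalArgumentException("no data chunk found");
    }
}
```

With this approach none of the header bytes can end up in the mix, regardless of how long the header is.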

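However, when my streams overlapped they produced a lot of foreground noise. When computing the average I had not cast the divisor to float, so the integer division truncated some of the audio data: 3/2 became 1 instead of 1.5 rounded to 2.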
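The truncation is easy to reproduce in isolation. The following sketch uses a hypothetical class AverageRounding to contrast integer division with division by a float followed by Math.round().

```java
// Hypothetical helper class, for illustration only.
class AverageRounding {

    /** Integer division: the fractional part is simply dropped. */
    static short truncatedAverage(int sum, int count) {
        return (short) (sum / count);
    }

    /** Float division followed by rounding to the nearest integer. */
    static short roundedAverage(int sum, int count) {
        return (short) Math.round(sum / (float) count);
    }

    public static void main(String[] args) {
        System.out.println(truncatedAverage(3, 2)); // prints 1
        System.out.println(roundedAverage(3, 2));   // prints 2
    }
}
```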

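The one thing I actually got right was making sure my audio can only be inserted at offsets divisible by 2. Otherwise the merge would pair the last byte of one sample with the first byte of the next.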
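The answer aligns offsets to 2 bytes, which is the size of one 16-bit sample. As an assumption that goes beyond the answer: with two channels, one sample frame is 4 bytes (the header's block align), so aligning to the frame size would additionally keep the left and right channels from being swapped. A sketch with the hypothetical class OffsetAlign:

```java
// Hypothetical helper class, for illustration only.
class OffsetAlign {

    /** Rounds a byte offset up to the next multiple of frameSize (bytes per sample frame). */
    static int alignUp(int byteOffset, int frameSize) {
        int rem = byteOffset % frameSize;
        return rem == 0 ? byteOffset : byteOffset + (frameSize - rem);
    }

    /** Converts seconds to whole frames first, so the byte offset is always frame-aligned. */
    static int secondsToAlignedOffset(double seconds, int sampleRate, int frameSize) {
        long frames = (long) Math.ceil(seconds * sampleRate);
        return Math.toIntExact(frames * frameSize);
    }
}
```

For instance, alignUp(7, 2) yields 8 and alignUp(6, 4) yields 8, while an offset that is already aligned is returned unchanged.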

// Main.java
import java.io.File;
import java.io.IOException;

public class Main {

    public static void main(String[] args) throws IOException {

        AudioMerger merger = new AudioMerger();
        MergeSound sound = new MergeSound(new File("whistle.wav"));

        // insert the same sound at 2 s, 5 s and 5.5 s (the last two overlap)
        merger.addSound(2, sound);
        merger.addSound(5, sound);
        merger.addSound(5.5, sound);
        merger.merge(10); // total length of the output in seconds
        merger.saveToFile(new File("out.wav"));
    }
}

// MergeSound.java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class MergeSound {

    private short audioFormat;
    private int sampleRate;
    private short sampleSize;
    private short channels;

    private ByteBuffer buffer;

    public MergeSound(File file) throws IOException {

        // assumes a canonical 44-byte PCM header with no extra chunks
        byte[] sound = new byte[(int) file.length() - 44];

        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            // read header data (skipNBytes requires Java 12+)
            in.skipNBytes(20);
            audioFormat = Short.reverseBytes(in.readShort());
            channels = Short.reverseBytes(in.readShort());
            sampleRate = Integer.reverseBytes(in.readInt());
            in.skipNBytes(6); // byte rate (4) + block align (2)
            sampleSize = Short.reverseBytes(in.readShort());
            in.skipNBytes(8); // skip the rest of the header, or it is heard as noise

            in.readFully(sound); // read() alone may return before the array is full
        }
        buffer = ByteBuffer.wrap(sound);
    }

    public ByteBuffer getBuffer() {
        return buffer;
    }

    public short getAudioFormat() {
        return audioFormat;
    }

    public void setAudioFormat(short audioFormat) {
        this.audioFormat = audioFormat;
    }

    public int getSampleRate() {
        return sampleRate;
    }

    public void setSampleRate(int sampleRate) {
        this.sampleRate = sampleRate;
    }

    public short getSampleSize() {
        return sampleSize;
    }

    public void setSampleSize(short sampleSize) {
        this.sampleSize = sampleSize;
    }

    public short getChannels() {
        return channels;
    }

    public void setChannels(short channels) {
        this.channels = channels;
    }
}

// AudioMerger.java
import static java.lang.Math.ceil;
import static java.lang.Math.round;

import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;

public class AudioMerger {

    private short audioFormat = 1;
    private int sampleRate = 44100;
    private short sampleSize = 16;
    private short channels = 2;
    private short blockAlign = (short) (sampleSize * channels / 8);
    private int byteRate = sampleRate * sampleSize * channels / 8;
    private ByteBuffer audioBuffer;
    private ArrayList<MergeSound> sounds = new ArrayList<>();
    private ArrayList<Integer> offsets = new ArrayList<>();

    public void addSound(double offsetInSeconds, MergeSound sound) {

        // the exceptions must be thrown, not just constructed
        if (sound.getAudioFormat() != audioFormat)
            throw new RuntimeException("Incompatible AudioFormat");
        if (sound.getSampleRate() != sampleRate)
            throw new RuntimeException("Incompatible SampleRate");
        if (sound.getSampleSize() != sampleSize)
            throw new RuntimeException("Incompatible SampleSize");
        if (sound.getChannels() != channels)
            throw new RuntimeException("Incompatible amount of Channels");

        int offset = secondsToByte(offsetInSeconds);
        offset = offset % 2 == 0 ? offset : offset + 1;// ensure we start at a short when merging

        sounds.add(sound);
        offsets.add(offset);
    }

    public void merge(double durationInSeconds) {
        audioBuffer = ByteBuffer.allocate(secondsToByte(durationInSeconds));

        for (int i = 0; i < sounds.size(); i++) {

            ByteBuffer buffer = sounds.get(i).getBuffer();
            int offset1 = offsets.get(i);

            // iterate over all samples of this sound
            while (buffer.remaining() >= 2) {

                int position = offset1 + buffer.position();// the global position in audioBuffer

                // stop with this sound if it would play past the end of the output
                if (position + 2 > audioBuffer.capacity())
                    break;

                // sum in an int so overlapping samples cannot overflow a short
                int sum = Short.reverseBytes(buffer.getShort());
                int matches = 1;

                // make sure later entries don't overwrite previously merged data:
                // continue only if this position is still silent
                if (audioBuffer.getShort(position) == 0) {

                    // check the later sounds for data that overlaps this position
                    for (int j = i + 1; j < sounds.size(); j++) {// start at i + 1 to skip all previous sounds
                        ByteBuffer mergeBuffer = sounds.get(j).getBuffer();
                        int mergeOffset = offsets.get(j);

                        // check if this sound contains a full sample at this position
                        if (position >= mergeOffset && position + 1 < mergeOffset + mergeBuffer.capacity()) {
                            sum += Short.reverseBytes(mergeBuffer.getShort(position - mergeOffset));
                            matches++;
                        }
                    }
                    // divide by a float: 3 / 2 == 1, but round(3 / 2f) == 2
                    audioBuffer.putShort(position, Short.reverseBytes((short) round(sum / (float) matches)));
                }
            }
            buffer.rewind();// so the sound can be added again
        }
    }

    private int secondsToByte(double seconds) {
        return (int) ceil(seconds * byteRate);
    }

    public void saveToFile(File file) throws IOException {

        byte[] audioData = audioBuffer.array();

        int audioSize = audioData.length;
        int chunkSize = audioSize + 36;// file size minus the 8 bytes of "RIFF" and this field

        // the stream that writes the audio file to disk; closed automatically
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {

            // write the 44-byte header (comments give byte offsets)
            out.writeBytes("RIFF");// 0-3 ChunkID, always "RIFF"
            out.writeInt(Integer.reverseBytes(chunkSize));// 4-7 ChunkSize = 36 + audio length
            out.writeBytes("WAVE");// 8-11 Format, always "WAVE"
            out.writeBytes("fmt ");// 12-15 Subchunk1 ID, always "fmt " with trailing space
            out.writeInt(Integer.reverseBytes(16)); // 16-19 Subchunk1 size, 16 for PCM
            out.writeShort(Short.reverseBytes(audioFormat));// 20-21 audio format, 1 = PCM
            out.writeShort(Short.reverseBytes(channels));// 22-23 channels, 1 = mono, 2 = stereo
            out.writeInt(Integer.reverseBytes(sampleRate));// 24-27 sample rate
            out.writeInt(Integer.reverseBytes(byteRate));// 28-31 byte rate
            out.writeShort(Short.reverseBytes(blockAlign));// 32-33 block align
            out.writeShort(Short.reverseBytes(sampleSize));// 34-35 bits per sample
            out.writeBytes("data");// 36-39 Subchunk2 ID, always "data"
            out.writeInt(Integer.reverseBytes(audioSize));// 40-43 Subchunk2 size = audio length

            out.write(audioData);// append the merged data
        }
    }
}

[Comments]:

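Note that future visitors would benefit greatly from hearing you explain what had to be changed. Links are fine, but an answer should make sense even without clicking on them (of course, writing it up is a bit of work, but it is one-time work that will greatly benefit future visitors).

I have written down everything I could think of. Hope this helps someone in the future. Thanks.

Awesome, nice work!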
