How can I pass audio frames from an input .mp4 to an output .mp4 in libavcodec?

Posted 2023-03-13

【Title】: How can I pass audio frames from an input .mp4 to an output .mp4 in libavcodec? 【Asked】: 2021-08-10 08:13:01 【Question】:

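I have a project that correctly opens a .mp4, extracts video frames, modifies them, and dumps the modified frames to an output .mp4. Everything works (mostly: I have a video timing bug that pops up at random, but I'll kill it) except for the writing of audio. I don't want to modify the audio channel at all. I just want the audio from the input .mp4 to pass unaltered to the output .mp4.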

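There is too much code here to provide a working example, mostly because there is a lot of OpenGL and GLSL in it, but the most important part is where I advance a frame. This method is called in a loop. If the frame is a video frame, the loop sends the image data to the rendering hardware, does a bunch of GL magic on it, then writes out a frame of video. If the frame is an audio frame, the loop does nothing, but the advance_frame() method should just dump that frame to the output mp4. I don't know what libavcodec provides that will do this.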

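Note that here I'm decoding the audio packet into a frame, but that shouldn't be necessary. I'd rather work with packets and not burn CPU time on the decode at all. (I have tried it the other way, but this is where I ended up when trying to decode the data and then re-encode it to create the output stream.) I just need a way to pass the packets from the input to the output.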

bool MediaContainerMgr::advance_frame() {
    int ret; // Crappy naming, but I'm using ffmpeg's name for it.
    while (true) {
        ret = av_read_frame(m_format_context, m_packet);
        if (ret < 0) {
            // Do we actually need to unref the packet if it failed?
            av_packet_unref(m_packet);
            if (ret == AVERROR_EOF) {
                finalize_output();
                return false;
            }
            continue;
            //return false;
        }
        else {
            int response = decode_packet();
            if (response != 0) {
                continue;
            }
            // If this was an audio packet, the image renderer doesn't care about it - just push
            // the audio data to the output .mp4:
            if (m_packet->stream_index == m_audio_stream_index) {
                printf("m_packet->stream_index: %d\n", m_packet->stream_index);
                printf("  m_packet->pts: %lld\n", m_packet->pts);
                printf("  mpacket->size: %d\n", m_packet->size);
                // m_recording is true if we're writing a .mp4, as opposed to just letting OpenGL
                // display the frames onscreen.
                if (m_recording) {
                    int err = 0;
                    // I've tried everything I can find to try to push the audio frame to the
                    // output .mp4. This doesn't work, but neither do a half-dozen other libavcodec
                    // methods:
                    err = avcodec_send_frame(m_output_audio_codec_context, m_last_audio_frame);

                    if (err) {
                        printf("  encoding error: %d\n", err);
                    }
                }
            }
            av_packet_unref(m_packet);
            if (m_packet->stream_index == m_video_stream_index) {
                return true;
            }
        }
    }
}

The workhorse of advance_frame() is decode_packet(). All of this works great for video data:

int MediaContainerMgr::decode_packet() {
    // Supply raw packet data as input to a decoder
    // https://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html#ga58bc4bf1e0ac59e27362597e467efff3
    int             response;
    AVCodecContext* codec_context = nullptr;
    AVFrame*        frame         = nullptr;

    if (m_packet->stream_index == m_video_stream_index) {
        codec_context = m_video_input_codec_context;
        frame = m_last_video_frame;
    }
    if (m_packet->stream_index == m_audio_stream_index) {
        codec_context = m_audio_input_codec_context;
        frame = m_last_audio_frame;
    }

    // Packets from any other stream are ignored.
    if (codec_context == nullptr) {
        return -1;
    }

    response = avcodec_send_packet(codec_context, m_packet);
    if (response < 0) {
        char buf[256];
        av_strerror(response, buf, 256);
        printf("Error while sending a packet to the decoder: %s\n", buf);
        return response;
    }

    // Return decoded output data (into a frame) from a decoder
    // https://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html#ga11e6542c4e66d3028668788a1a74217c
    response = avcodec_receive_frame(codec_context, frame);
    if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {
        return response;
    } else if (response < 0) {
        char buf[256];
        av_strerror(response, buf, 256);
        printf("Error while receiving a frame from the decoder: %s\n", buf);
        return response;
    } else {
        printf(
            "Stream %d, Frame %d (type=%c, size=%d bytes), pts %lld, key_frame %d, [DTS %d]\n",
            m_packet->stream_index,
            codec_context->frame_number,
            av_get_picture_type_char(frame->pict_type),
            frame->pkt_size,
            frame->pts,
            frame->key_frame,
            frame->coded_picture_number
        );
    }
    return 0;
}

I can provide the setup for all of the contexts if necessary, but for brevity, maybe we can get away with what av_dump_format(m_output_format_context, 0, filename, 1) shows:

Output #0, mp4, to 'D:\yodeling_monkey_nuggets.mp4':
  Metadata:
    encoder         : Lavf58.64.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 1920x1080, q=-1--1, 20305 kb/s, 29.97 fps, 30k tbn
    Stream #0:1: Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 125 kb/s

【Comments】:

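Perhaps use or recreate the copy filter? github.com/FFmpeg/FFmpeg/blob/master/libavfilter/af_acopy.c

That doesn't look like a complete example. It looks like it copies the context (which is actually a great find, thanks), but it still lacks the final detail: how to send the input packet to the output context. Oddly, I've run into many examples similar to this, but every one of them is very old and doesn't work with the current libraries.

【Answer 1】: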

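To write an audio AVPacket to the output "as is", without a decode-encode step, you should use the av_write_frame function for such packets instead of avcodec_send_frame. Note that these functions operate on different contexts: av_write_frame takes an AVFormatContext, while avcodec_send_frame takes an AVCodecContext.

avcodec_send_frame supplies a raw video or audio frame to the encoder.

av_write_frame passes the packet directly to the muxer.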

【Comments】:

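I'll give it a try, but I can't say that having to change context types is exciting. Why the difference? Why is there no compatible interface? Is it possible to get an AVFormatContext from an AVCodecContext, or do I have to go through a whole different cycle of creating and binding to the output container?

AVCodecContext and AVFormatContext are different things, used at different layers of media processing. AVFormatContext is used when you work with files/streams, and AVCodecContext is used when you work with encoders/decoders. I think you already have an AVFormatContext in your code, since you are writing a media stream to the output. Find it and use it to write the audio packets.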
