Media Foundation webcam video H264 encode/decode produces artifacts when played back

Posted: 2017-01-09 11:12:56

Question: I have a solution in which I use Media Foundation's H264 encoder to encode video samples (YUY2) coming from a webcam. I then send them over a TCP connection to another application, which uses Media Foundation's H264 decoder to decode the stream back to YUY2. After decoding, the video samples/images are rendered on screen with DirectX.
The problem is that, between keyframes, the video image accumulates more and more artifacts. The artifacts disappear once a keyframe is received.
I took the TCP connection out of the equation and decode each sample immediately after encoding it, but the artifacts still persist.
Here is the callback method that receives samples from the webcam:
//-------------------------------------------------------------------
// OnReadSample
//
// Called when the IMFMediaSource::ReadSample method completes.
//-------------------------------------------------------------------
HRESULT CPreview::OnReadSample(
    HRESULT hrStatus,
    DWORD /* dwStreamIndex */,
    DWORD dwStreamFlags,
    LONGLONG llTimestamp,
    IMFSample *pSample // Can be NULL
    )
{
    HRESULT hr = S_OK;
    IMFMediaBuffer *pBuffer = NULL;

    EnterCriticalSection(&m_critsec);

    if (FAILED(hrStatus))
    {
        hr = hrStatus;
    }

    if (SUCCEEDED(hr))
    {
        if (pSample)
        {
            IMFSample *pEncodedSample = NULL;
            hr = m_pCodec->EncodeSample(pSample, &pEncodedSample);
            if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT || pEncodedSample == NULL)
            {
                hr = m_pReader->ReadSample((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0, NULL, NULL, NULL, NULL);
                LeaveCriticalSection(&m_critsec);
                return S_OK;
            }

            LONGLONG llEncodedSampleTimeStamp = 0;
            LONGLONG llEncodedSampleDuration = 0;
            pEncodedSample->GetSampleTime(&llEncodedSampleTimeStamp);
            pEncodedSample->GetSampleDuration(&llEncodedSampleDuration);

            pBuffer = NULL;
            hr = pEncodedSample->GetBufferByIndex(0, &pBuffer);
            if (hr != S_OK)
            {
                hr = m_pReader->ReadSample((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0, NULL, NULL, NULL, NULL);
                LeaveCriticalSection(&m_critsec);
                return hr;
            }

            BYTE *pOutBuffer = NULL;
            DWORD dwMaxLength, dwCurrentLength;
            hr = pBuffer->Lock(&pOutBuffer, &dwMaxLength, &dwCurrentLength);
            if (hr != S_OK)
            {
                hr = m_pReader->ReadSample((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0, NULL, NULL, NULL, NULL);
                LeaveCriticalSection(&m_critsec);
                return hr;
            }

            // Send encoded webcam data to connected clients
            //SendData(pOutBuffer, dwCurrentLength, llEncodedSampleTimeStamp, llEncodedSampleDuration);
            pBuffer->Unlock();
            SafeRelease(&pBuffer);

            IMFSample *pDecodedSample = NULL;
            m_pCodec->DecodeSample(pEncodedSample, &pDecodedSample);

            if (pDecodedSample != NULL)
            {
                pDecodedSample->SetSampleTime(llTimestamp);
                pDecodedSample->SetSampleDuration(llTimestamp - llLastSampleTimeStamp);
                llLastSampleTimeStamp = llTimestamp;

                hr = pDecodedSample->GetBufferByIndex(0, &pBuffer);
                //hr = pSample->GetBufferByIndex(0, &pBuffer);

                // Draw the frame.
                if (SUCCEEDED(hr))
                {
                    hr = m_draw.DrawFrame(pBuffer);
                }
                SafeRelease(&pDecodedSample);
            }

            SafeRelease(&pBuffer);
            SafeRelease(&pEncodedSample);
        }
    }

    // Request the next frame.
    if (SUCCEEDED(hr))
    {
        hr = m_pReader->ReadSample(
            (DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM,
            0,
            NULL,   // actual
            NULL,   // flags
            NULL,   // timestamp
            NULL    // sample
            );
    }

    if (FAILED(hr))
    {
        NotifyError(hr);
    }
    SafeRelease(&pBuffer);

    LeaveCriticalSection(&m_critsec);
    return hr;
}
And here is the encoder/decoder initialization code:
HRESULT Codec::InitializeEncoder()
{
    IMFMediaType *pMFTInputMediaType = NULL, *pMFTOutputMediaType = NULL;
    IUnknown *spTransformUnk = NULL;
    DWORD mftStatus = 0;

    UINT8 blob[] = { 0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0xc0, 0x1e, 0x96, 0x54, 0x05, 0x01,
                     0xe9, 0x80, 0x80, 0x40, 0x00, 0x00, 0x00, 0x01, 0x68, 0xce, 0x3c, 0x80 };

    CoInitializeEx(NULL, COINIT_APARTMENTTHREADED | COINIT_DISABLE_OLE1DDE);
    MFStartup(MF_VERSION);

    // Create H.264 encoder.
    CHECK_HR(CoCreateInstance(CLSID_CMSH264EncoderMFT, NULL, CLSCTX_INPROC_SERVER, IID_IUnknown, (void**)&spTransformUnk), "Failed to create H264 encoder MFT.\n");
    CHECK_HR(spTransformUnk->QueryInterface(IID_PPV_ARGS(&pEncoderTransform)), "Failed to get IMFTransform interface from H264 encoder MFT object.\n");

    // Transform output type
    MFCreateMediaType(&pMFTOutputMediaType);
    pMFTOutputMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    pMFTOutputMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264);
    pMFTOutputMediaType->SetUINT32(MF_MT_AVG_BITRATE, 500000);
    CHECK_HR(MFSetAttributeSize(pMFTOutputMediaType, MF_MT_FRAME_SIZE, 640, 480), "Failed to set frame size on H264 MFT out type.\n");
    CHECK_HR(MFSetAttributeRatio(pMFTOutputMediaType, MF_MT_FRAME_RATE, 30, 1), "Failed to set frame rate on H264 MFT out type.\n");
    CHECK_HR(MFSetAttributeRatio(pMFTOutputMediaType, MF_MT_PIXEL_ASPECT_RATIO, 1, 1), "Failed to set aspect ratio on H264 MFT out type.\n");
    pMFTOutputMediaType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_MixedInterlaceOrProgressive);
    pMFTOutputMediaType->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, TRUE);

    // Special attributes for H264 transform, if needed
    /*CHECK_HR(pMFTOutputMediaType->SetUINT32(MF_MT_MPEG2_PROFILE, eAVEncH264VProfile_Base), "Failed to set profile on H264 MFT out type.\n");
    CHECK_HR(pMFTOutputMediaType->SetUINT32(MF_MT_MPEG2_LEVEL, eAVEncH264VLevel4), "Failed to set level on H264 MFT out type.\n");
    CHECK_HR(pMFTOutputMediaType->SetUINT32(MF_MT_MAX_KEYFRAME_SPACING, 10), "Failed to set key frame interval on H264 MFT out type.\n");
    CHECK_HR(pMFTOutputMediaType->SetUINT32(CODECAPI_AVEncCommonQuality, 100), "Failed to set H264 codec quality.\n");
    CHECK_HR(pMFTOutputMediaType->SetUINT32(CODECAPI_AVEncMPVGOPSize, 1), "Failed to set CODECAPI_AVEncMPVGOPSize = 1\n");*/

    CHECK_HR(pEncoderTransform->SetOutputType(0, pMFTOutputMediaType, 0), "Failed to set output media type on H.264 encoder MFT.\n");

    // Transform input type
    MFCreateMediaType(&pMFTInputMediaType);
    pMFTInputMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    pMFTInputMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_YUY2);
    CHECK_HR(MFSetAttributeSize(pMFTInputMediaType, MF_MT_FRAME_SIZE, 640, 480), "Failed to set frame size on H264 MFT in type.\n");
    CHECK_HR(MFSetAttributeRatio(pMFTInputMediaType, MF_MT_FRAME_RATE, 30, 1), "Failed to set frame rate on H264 MFT in type.\n");
    CHECK_HR(MFSetAttributeRatio(pMFTInputMediaType, MF_MT_PIXEL_ASPECT_RATIO, 1, 1), "Failed to set aspect ratio on H264 MFT in type.\n");
    CHECK_HR(pEncoderTransform->SetInputType(0, pMFTInputMediaType, 0), "Failed to set input media type on H.264 encoder MFT.\n");

    CHECK_HR(pEncoderTransform->GetInputStatus(0, &mftStatus), "Failed to get input status from H.264 MFT.\n");
    if (MFT_INPUT_STATUS_ACCEPT_DATA != mftStatus)
    {
        printf("E: pEncoderTransform->GetInputStatus() not accept data.\n");
        goto done;
    }

    CHECK_HR(pEncoderTransform->ProcessMessage(MFT_MESSAGE_COMMAND_FLUSH, NULL), "Failed to process FLUSH command on H.264 MFT.\n");
    CHECK_HR(pEncoderTransform->ProcessMessage(MFT_MESSAGE_NOTIFY_BEGIN_STREAMING, NULL), "Failed to process BEGIN_STREAMING command on H.264 MFT.\n");
    CHECK_HR(pEncoderTransform->ProcessMessage(MFT_MESSAGE_NOTIFY_START_OF_STREAM, NULL), "Failed to process START_OF_STREAM command on H.264 MFT.\n");

    return S_OK;

done:
    SafeRelease(&pMFTInputMediaType);
    SafeRelease(&pMFTOutputMediaType);
    return S_FALSE;
}
HRESULT Codec::InitializeDecoder()
{
    IUnknown *spTransformUnk = NULL;
    IMFMediaType *pMFTOutputMediaType = NULL;
    IMFMediaType *pMFTInputMediaType = NULL;
    DWORD mftStatus = 0;

    // Create H.264 decoder.
    CHECK_HR(CoCreateInstance(CLSID_CMSH264DecoderMFT, NULL, CLSCTX_INPROC_SERVER, IID_IUnknown, (void**)&spTransformUnk), "Failed to create H264 decoder MFT.\n");

    // Query for the IMFTransform interface
    CHECK_HR(spTransformUnk->QueryInterface(IID_PPV_ARGS(&pDecoderTransform)), "Failed to get IMFTransform interface from H264 decoder MFT object.\n");

    // Create input media type for the decoder
    MFCreateMediaType(&pMFTInputMediaType);
    pMFTInputMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    pMFTInputMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264);
    CHECK_HR(MFSetAttributeSize(pMFTInputMediaType, MF_MT_FRAME_SIZE, 640, 480), "Failed to set frame size on H264 MFT in type.\n");
    CHECK_HR(MFSetAttributeRatio(pMFTInputMediaType, MF_MT_FRAME_RATE, 30, 1), "Failed to set frame rate on H264 MFT in type.\n");
    CHECK_HR(MFSetAttributeRatio(pMFTInputMediaType, MF_MT_PIXEL_ASPECT_RATIO, 1, 1), "Failed to set aspect ratio on H264 MFT in type.\n");
    pMFTInputMediaType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_MixedInterlaceOrProgressive);
    pMFTInputMediaType->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, TRUE);
    CHECK_HR(pDecoderTransform->SetInputType(0, pMFTInputMediaType, 0), "Failed to set input media type on H.264 decoder MFT.\n");

    CHECK_HR(pDecoderTransform->GetInputStatus(0, &mftStatus), "Failed to get input status from H.264 MFT.\n");
    if (MFT_INPUT_STATUS_ACCEPT_DATA != mftStatus)
    {
        printf("E: pDecoderTransform->GetInputStatus() not accept data.\n");
        goto done;
    }

    // Create output media type for the decoder
    MFCreateMediaType(&pMFTOutputMediaType);
    pMFTOutputMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    pMFTOutputMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_YUY2);
    CHECK_HR(MFSetAttributeSize(pMFTOutputMediaType, MF_MT_FRAME_SIZE, 640, 480), "Failed to set frame size on H264 MFT out type.\n");
    CHECK_HR(MFSetAttributeRatio(pMFTOutputMediaType, MF_MT_FRAME_RATE, 30, 1), "Failed to set frame rate on H264 MFT out type.\n");
    CHECK_HR(MFSetAttributeRatio(pMFTOutputMediaType, MF_MT_PIXEL_ASPECT_RATIO, 1, 1), "Failed to set aspect ratio on H264 MFT out type.\n");
    CHECK_HR(pDecoderTransform->SetOutputType(0, pMFTOutputMediaType, 0), "Failed to set output media type on H.264 decoder MFT.\n");

    CHECK_HR(pDecoderTransform->ProcessMessage(MFT_MESSAGE_COMMAND_FLUSH, NULL), "Failed to process FLUSH command on H.264 MFT.\n");
    CHECK_HR(pDecoderTransform->ProcessMessage(MFT_MESSAGE_NOTIFY_BEGIN_STREAMING, NULL), "Failed to process BEGIN_STREAMING command on H.264 MFT.\n");
    CHECK_HR(pDecoderTransform->ProcessMessage(MFT_MESSAGE_NOTIFY_START_OF_STREAM, NULL), "Failed to process START_OF_STREAM command on H.264 MFT.\n");

    return S_OK;

done:
    SafeRelease(&pMFTInputMediaType);
    SafeRelease(&pMFTOutputMediaType);
    return S_FALSE;
}
And here is the actual encode/decode part:
HRESULT Codec::EncodeSample(IMFSample *pSample, IMFSample **ppEncodedSample)
{
    return TransformSample(pEncoderTransform, pSample, ppEncodedSample);
}

HRESULT Codec::DecodeSample(IMFSample *pSample, IMFSample **ppDecodedSample)
{
    return TransformSample(pDecoderTransform, pSample, ppDecodedSample);
}

HRESULT Codec::TransformSample(IMFTransform *pTransform, IMFSample *pSample, IMFSample **ppSampleOut)
{
    IMFSample *pOutSample = NULL;
    IMFMediaBuffer *pBuffer = NULL;
    DWORD mftOutFlags;

    pTransform->ProcessInput(0, pSample, 0);
    CHECK_HR(pTransform->GetOutputStatus(&mftOutFlags), "H264 MFT GetOutputStatus failed.\n");

    // Note: the decoder does not report the MFT_OUTPUT_STATUS_SAMPLE_READY flag,
    // so for it we just rely on the S_OK return value.
    if (pTransform == pEncoderTransform && mftOutFlags == S_OK)
    {
        return S_OK;
    }
    else if ((pTransform == pEncoderTransform && mftOutFlags == MFT_OUTPUT_STATUS_SAMPLE_READY) ||
             (pTransform == pDecoderTransform && mftOutFlags == S_OK))
    {
        DWORD processOutputStatus = 0;
        MFT_OUTPUT_DATA_BUFFER outputDataBuffer;
        MFT_OUTPUT_STREAM_INFO StreamInfo;
        pTransform->GetOutputStreamInfo(0, &StreamInfo);

        CHECK_HR(MFCreateSample(&pOutSample), "Failed to create MF sample.\n");
        CHECK_HR(MFCreateMemoryBuffer(StreamInfo.cbSize, &pBuffer), "Failed to create memory buffer.\n");
        if (pTransform == pEncoderTransform)
            CHECK_HR(pBuffer->SetCurrentLength(StreamInfo.cbSize), "Failed SetCurrentLength.\n");
        CHECK_HR(pOutSample->AddBuffer(pBuffer), "Failed to add buffer to sample.\n");

        outputDataBuffer.dwStreamID = 0;
        outputDataBuffer.dwStatus = 0;
        outputDataBuffer.pEvents = NULL;
        outputDataBuffer.pSample = pOutSample;

        HRESULT hr = pTransform->ProcessOutput(0, 1, &outputDataBuffer, &processOutputStatus);
        if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT)
        {
            SafeRelease(&pBuffer);
            SafeRelease(&pOutSample);
            return hr;
        }

        LONGLONG llVideoTimeStamp, llSampleDuration;
        pSample->GetSampleTime(&llVideoTimeStamp);
        pSample->GetSampleDuration(&llSampleDuration);
        CHECK_HR(outputDataBuffer.pSample->SetSampleTime(llVideoTimeStamp), "Error setting MFT sample time.\n");
        CHECK_HR(outputDataBuffer.pSample->SetSampleDuration(llSampleDuration), "Error setting MFT sample duration.\n");

        if (pTransform == pEncoderTransform)
        {
            IMFMediaBuffer *pMediaBuffer = NULL;
            DWORD dwBufLength;
            CHECK_HR(pOutSample->ConvertToContiguousBuffer(&pMediaBuffer), "ConvertToContiguousBuffer failed.\n");
            CHECK_HR(pMediaBuffer->GetCurrentLength(&dwBufLength), "Get buffer length failed.\n");

            WCHAR strDebug[256];
            wsprintf(strDebug, L"Encoded sample ready: time %I64d, sample duration %I64d, sample size %i.\n", llVideoTimeStamp, llSampleDuration, dwBufLength);
            OutputDebugString(strDebug);
            SafeRelease(&pMediaBuffer);
        }
        else if (pTransform == pDecoderTransform)
        {
            IMFMediaBuffer *pMediaBuffer = NULL;
            DWORD dwBufLength;
            CHECK_HR(pOutSample->ConvertToContiguousBuffer(&pMediaBuffer), "ConvertToContiguousBuffer failed.\n");
            CHECK_HR(pMediaBuffer->GetCurrentLength(&dwBufLength), "Get buffer length failed.\n");

            WCHAR strDebug[256];
            wsprintf(strDebug, L"Decoded sample ready: time %I64d, sample duration %I64d, sample size %i.\n", llVideoTimeStamp, llSampleDuration, dwBufLength);
            OutputDebugString(strDebug);
            SafeRelease(&pMediaBuffer);
        }

        // Decoded sample out
        *ppSampleOut = pOutSample;

        SafeRelease(&pBuffer);
    }
    return S_OK;

done:
    SafeRelease(&pBuffer);
    SafeRelease(&pOutSample);
    return S_FALSE;
}
I have been searching for a solution to this for quite some time and found one question whose description is very similar to my problem, but since it concerns a different API it did not help me: FFMPEG decoding artifacts between keyframes
Best regards, Toni Riikonen
Comments:
I noticed that if I wait about 30-60 seconds after the stream starts, the artifacts disappear. Could this be some kind of buffering issue — should I buffer the samples a bit before feeding them to the decoder? Or is something wrong with my timestamps?

pMFTOutputMediaType->SetUINT32(MF_MT_AVG_BITRATE, 500000); — 500 kbps is far too low a bitrate. Encoding quality will be poor at that setting and will cause artifacts. Try 5000000 (5 Mbps) or even higher.

I tried a larger bitrate; the artifacts got smaller but did not disappear. I am not talking about the usual artifacts caused by lossy encoding, but about artifacts that look like dropped data or out-of-order samples. Is it possible that, because the first encoded sample requires several input samples, the IMFSamples passed to the OnReadSample callback get released? I am thinking of copying the samples handed to the callback and releasing them once I am done with them.

The IMFSample returned in OnReadSample is released after the function exits. If you need to keep the sample, you must AddRef it and put it in a queue.

I did copy the samples provided to the callback, queued them, and released them once done. However, that did not change anything — there are still artifacts during the first 30-60 seconds of the stream. What puzzles me is that the artifacts really do stop once the stream has been open for 30-60 seconds and never come back afterwards. But I need the picture to be clean from the very beginning.

Answer 1: I'm a bit late to the game here, but I can confirm that the answer on this page is the correct solution. I ran into the same problem, although I was only using the decoder part of this sample code. I was reading an MP4 file and saw growing artifacts between keyframes. Once a keyframe was received the image looked fine, then gradually got worse. Here is the code I added in Codec::InitializeDecoder():
// Set CODECAPI_AVLowLatencyMode
ICodecAPI *mpCodecAPI = NULL;
hr = pDecoderTransform->QueryInterface(IID_PPV_ARGS(&mpCodecAPI));
CHECK_HR(hr, "Failed to get ICodecAPI.\n");
VARIANT var;
var.vt = VT_BOOL;
var.boolVal = VARIANT_TRUE;
hr = mpCodecAPI->SetValue(&CODECAPI_AVLowLatencyMode, &var);
CHECK_HR(hr, "Failed to enable low latency mode.\n");
After adding these changes the program worked much better! Thanks to this software on GitHub for providing the necessary code: https://github.com/GameTechDev/ChatHeads/blob/master/VideoStreaming/EncodeTransform.cpp
Comments:
Good to hear that this solves the issue on Windows 8 and later. Unfortunately we still support Windows 7, where this LowLatencyMode option is not available. It is likely we will drop Windows 7 support soon, so this solution works.

Answer 2: This sounds like a quality/bitrate problem.
pMFTOutputMediaType->SetUINT32(MF_MT_AVG_BITRATE, 500000);
A bitrate value of 500 kbps is far too low; try much larger values such as 5, 10, or 20 Mbps.
I can suggest:

Since you create the H264 encoder yourself, you can query it for ICodecAPI and experiment with different settings, i.e. CODECAPI_AVEncCommonRateControlMode, CODECAPI_AVEncCommonQuality, CODECAPI_AVEncAdaptiveMode, CODECAPI_AVEncCommonQualityVsSpeed, CODECAPI_AVEncVideoEncodeQP.

You could also try creating a hardware H264 encoder and using IMFDXGIDeviceManager (Windows 8 and later?).
Comments:
Answer 3: This question seems to have an answer already, but I would still like to share my experience. Hopefully it helps someone who runs into a similar problem.

I also had a similar artifact problem when decoding H264. In my case, however, the stream came from a video capture device, and the artifacts did not disappear 30-60 seconds after the stream started.

In my view, a decoder with default settings cannot keep up with a live stream because of the low-latency requirement. So I tried enabling CODECAPI_AVLowLatencyMode, which switches decoding/encoding to a low-latency mode for real-time communication or live capture. (For more details, see the following link from MS: https://msdn.microsoft.com/zh-tw/library/windows/desktop/hh447590(v=vs.85).aspx ) Fortunately the problem was solved and the decoder works fine.
Although our problems differ somewhat, you can try enabling/disabling CODECAPI_AVLowLatencyMode for your situation; hopefully you will have good news too.
Comments:
Answer 4: This sounds like an IP(B) frame-ordering problem.

The encoded frame order is not the same as the decoded frame order. I have not tested your code, but I think the encoder delivers frames in encoding order and you need to reorder the frames before rendering.
Comments:
It seems that even when using the Microsoft API sample and saving the result as MP4, the artifacts persist. So my guess is either that Microsoft's H264 encoder/decoder does not cope well with the noise in webcam images (since most cheap webcams produce noisy images), or that the encoder/decoder is somewhat broken. I decided to switch to WMV3 instead. I am not sure which answer is the correct one, because the actual problem was never fully solved and may be something a developer cannot fix.