RTMP Video Stream Format Analysis

Posted 2022-11-11 贺二公子

Original article: https://blog.csdn.net/zero_sama/article/details/69802783

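FLV Format Analysis

An FLV file is a binary file made up of one FLV Header followed by a sequence of tags. There are three tag types: Video Tag, Audio Tag, and Script Tag, which carry the video stream, the audio stream, and the script (metadata) stream respectively.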

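FLV Header diagram

FLV Header:

Signature: always "FLV" (3 bytes)
Version: usually 0x01 (1 byte)
Stream flags: 0000 0101 means the file contains both audio and video, 0000 0001 means video only, 0000 0100 means audio only (1 byte)
Header length: length of the FLV file header, usually 3+1+1+4 = 9 bytes (stored in 4 bytes)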
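
To make the field layout concrete, here is a minimal C++ sketch that reads the 9-byte header. The names FlvHeader and ParseFlvHeader are hypothetical and introduced only for this example; they do not come from the article or from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical holder for the four header fields described above.
struct FlvHeader {
    uint8_t  version;
    bool     hasAudio;
    bool     hasVideo;
    uint32_t headerSize;
};

// Parses the 9-byte FLV header. Returns false if the buffer is too short
// or does not start with the "FLV" signature.
inline bool ParseFlvHeader(const uint8_t* buf, size_t len, FlvHeader& out) {
    if (len < 9 || std::memcmp(buf, "FLV", 3) != 0) {
        return false;
    }
    out.version  = buf[3];
    out.hasAudio = (buf[4] & 0x04) != 0;  // bit 2: audio present
    out.hasVideo = (buf[4] & 0x01) != 0;  // bit 0: video present
    // The header length is a 32-bit big-endian integer.
    out.headerSize = (uint32_t(buf[5]) << 24) | (uint32_t(buf[6]) << 16) |
                     (uint32_t(buf[7]) << 8)  |  uint32_t(buf[8]);
    return true;
}
```

A caller would normally skip headerSize bytes from the start of the file to reach the body, rather than hard-coding 9.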

FLV Body:

The body is a sequence of alternating Pre Tag Length fields and tags.

+----------------------------------------------------------------------+  
| Pre Tag Length | Tag Header | Tag Data | .... | Pre Tag Length | ... |
+----------------------------------------------------------------------+

Pre Tag Length: size of the previous tag, including its 11-byte header (4 bytes)
Tag Header: 1 + 3 + 3 + 1 + 3 = 11 bytes
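
The alternation of Pre Tag Length and tag can be checked mechanically. The sketch below walks a body buffer and counts the tags, assuming (as the FLV specification defines it) that each Pre Tag Length equals the 11-byte tag header plus the data length of the tag before it. CountFlvTags, ReadBE24 and ReadBE32 are hypothetical helper names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

inline uint32_t ReadBE24(const uint8_t* p) {
    return (uint32_t(p[0]) << 16) | (uint32_t(p[1]) << 8) | uint32_t(p[2]);
}

inline uint32_t ReadBE32(const uint8_t* p) {
    return (uint32_t(p[0]) << 24) | ReadBE24(p + 1);
}

// Walks the FLV body (everything after the file header) and counts tags.
// Returns -1 if the buffer is truncated or a Pre Tag Length field does not
// match the size of the tag that precedes it.
inline int CountFlvTags(const uint8_t* body, size_t len) {
    size_t   pos = 0;
    uint32_t expectedPrev = 0;  // the first Pre Tag Length is always 0
    int      count = 0;
    while (true) {
        if (len - pos < 4) return -1;
        if (ReadBE32(body + pos) != expectedPrev) return -1;
        pos += 4;
        if (pos == len) return count;  // a trailing Pre Tag Length ends the body
        if (len - pos < 11) return -1;
        uint32_t dataSize = ReadBE24(body + pos + 1);  // Tag Data Length
        if (len - pos - 11 < dataSize) return -1;
        pos += 11 + dataSize;
        expectedPrev = 11 + dataSize;
        ++count;
    }
}
```

The function only validates framing; it does not look inside the tag data.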

Tag:

tag header (11 bytes)

+-------------------------------------------------------------------------------------------------------------+
| Tag Type(1 byte) | Tag Data Length(3 bytes) | Timestamp(3 bytes) | TimestampExt(1 byte) | StreamID(3 bytes) |
+-------------------------------------------------------------------------------------------------------------+

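Field	Length	Value
Tag Type (type of the tag)	1 byte	0x08 audio, 0x09 video, 0x12 script
Tag Data Length (length of Tag Data)	3 bytes
Timestamp (in ms)	3 bytes
TimestampExt (extended timestamp, the upper 8 bits of a 32-bit timestamp)	1 byte
StreamID (stream ID, always 0)	3 bytes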
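
The following sketch decodes the 11-byte tag header described in the table. FlvTagHeader and ParseFlvTagHeader are hypothetical names for this example. One assumption goes beyond the table: in version 10.1 of the FLV specification only the lower 5 bits of the first byte hold the tag type, so the sketch masks with 0x1F; for the values 0x08, 0x09 and 0x12 this gives the same result as reading the whole byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical holder for the tag header fields.
struct FlvTagHeader {
    uint8_t  tagType;    // 0x08 audio, 0x09 video, 0x12 script
    uint32_t dataSize;   // length of Tag Data in bytes
    uint32_t timestamp;  // milliseconds, 32 bits after merging TimestampExt
    uint32_t streamId;   // always 0
};

inline uint32_t ReadBE24(const uint8_t* p) {
    return (uint32_t(p[0]) << 16) | (uint32_t(p[1]) << 8) | uint32_t(p[2]);
}

// Parses one tag header. Returns false if fewer than 11 bytes are available.
inline bool ParseFlvTagHeader(const uint8_t* buf, size_t len, FlvTagHeader& out) {
    if (len < 11) {
        return false;
    }
    out.tagType  = static_cast<uint8_t>(buf[0] & 0x1F);
    out.dataSize = ReadBE24(buf + 1);
    // TimestampExt supplies bits 31..24, the 3-byte Timestamp bits 23..0.
    out.timestamp = (uint32_t(buf[7]) << 24) | ReadBE24(buf + 4);
    out.streamId  = ReadBE24(buf + 8);
    return true;
}
```

Merging the extended byte into the top of the timestamp lets a stream run past the roughly 4.6 hours that 24 bits of milliseconds can express.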

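tag data

When the tag data is audio

The first byte carries the audio parameters. The upper 4 bits give the audio format (see the official specification for the full list):

value (decimal)	comment
0	Linear PCM (uncompressed)
1	ADPCM
2	MP3
4	Nellymoser 16 kHz mono
5	Nellymoser 8 kHz mono
10	AAC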

The next 2 bits give the sample rate:

value	comment
0	5.5 kHz
1	11 kHz
2	22 kHz
3	44 kHz

The next 1 bit gives the sample size:

value	comment
0	snd8Bit
1	snd16Bit

The last 1 bit gives the channel type:

value	comment
0	sndMono
1	sndStereo

The audio data follows.
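
The four bit fields of the audio byte can be pulled apart with shifts and masks. This is a small sketch; FlvAudioInfo and DecodeAudioByte are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the fields packed into the first audio byte.
struct FlvAudioInfo {
    uint8_t soundFormat;  // upper 4 bits, e.g. 2 = MP3, 10 = AAC
    uint8_t soundRate;    // next 2 bits: 0 = 5.5 kHz ... 3 = 44 kHz
    uint8_t soundSize;    // next 1 bit: 0 = 8-bit, 1 = 16-bit
    uint8_t soundType;    // last 1 bit: 0 = mono, 1 = stereo
};

// Splits the first byte of an audio tag's data into its fields.
inline FlvAudioInfo DecodeAudioByte(uint8_t b) {
    FlvAudioInfo info;
    info.soundFormat = static_cast<uint8_t>((b >> 4) & 0x0F);
    info.soundRate   = static_cast<uint8_t>((b >> 2) & 0x03);
    info.soundSize   = static_cast<uint8_t>((b >> 1) & 0x01);
    info.soundType   = static_cast<uint8_t>(b & 0x01);
    return info;
}
```

For example, the byte 0xAF decodes to format 10 (AAC), rate 3 (44 kHz), 16-bit samples, stereo.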

When the tag data is video

The first byte carries the video parameters. The upper 4 bits give the frame type:

value	comment
1	keyframe
2	inter frame
3	disposable inter frame (H.263 only)
4	generated keyframe

The lower 4 bits give the codec ID:

value	comment
2	Sorenson H.263
3	Screen video
4	On2 VP6
5	On2 VP6 with alpha channel
6	Screen video version 2
7	AVC (H.264)
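
The video byte splits the same way as the audio byte, into two 4-bit halves. The sketch below uses the hypothetical names FlvVideoInfo, DecodeVideoByte and IsAvcKeyframe.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the two nibbles of the first video byte.
struct FlvVideoInfo {
    uint8_t frameType;  // upper 4 bits: 1 = keyframe, 2 = inter frame, ...
    uint8_t codecId;    // lower 4 bits: 7 = AVC (H.264)
};

// Splits the first byte of a video tag's data into frame type and codec ID.
inline FlvVideoInfo DecodeVideoByte(uint8_t b) {
    FlvVideoInfo info;
    info.frameType = static_cast<uint8_t>((b >> 4) & 0x0F);
    info.codecId   = static_cast<uint8_t>(b & 0x0F);
    return info;
}

// True for an H.264 keyframe, i.e. the byte 0x17.
inline bool IsAvcKeyframe(uint8_t b) {
    FlvVideoInfo info = DecodeVideoByte(b);
    return info.frameType == 1 && info.codecId == 7;
}
```

With this layout 0x17 is an AVC keyframe and 0x27 is an AVC inter frame, the two values that matter most for H.264 streams.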

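Parsing an FLV file

Data	Description
00 00 00 00	Size of the previous tag; all zeros because this is the first tag.
12	Tag type: script
00 00 F6	Tag data length: 0xF6 bytes
00 00 00	Timestamp
00	Extended timestamp
00 00 00	Stream ID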

Video Tag

Audio Tag

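Video stream captured from RTMP:

The RTMP video stream format is very similar to FLV: the tag data of the video tags and audio tags is sent one piece after another, without the tag header or the Pre Tag Length.

Audio/video info: 1 byte; here 0x17 means keyframe (I-frame), AVC
AVC packet type: 1 byte
- 0x00: AVC Sequence Header
- 0x01: AVC NALU
Composition time: 3 bytes; the offset between presentation time and decoding time in ms. It is 0 for an AVC Sequence Header and for streams without B-frames.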
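
The five bytes listed above can be decoded as in the sketch below. The names AvcVideoPacketHeader and ParseAvcVideoPacketHeader are hypothetical. The sketch reads the 3-byte composition time as a signed 24-bit big-endian value, which is how the FLV specification defines that field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical holder for the 5-byte prefix of an AVC video payload.
struct AvcVideoPacketHeader {
    uint8_t frameType;        // 1 = keyframe, 2 = inter frame
    uint8_t codecId;          // 7 = AVC
    uint8_t avcPacketType;    // 0 = AVC Sequence Header, 1 = AVC NALU
    int32_t compositionTime;  // signed offset in ms
};

// Parses the prefix. Returns false if the buffer is shorter than 5 bytes
// or the codec is not AVC.
inline bool ParseAvcVideoPacketHeader(const uint8_t* buf, size_t len,
                                      AvcVideoPacketHeader& out) {
    if (len < 5) {
        return false;
    }
    out.frameType = static_cast<uint8_t>(buf[0] >> 4);
    out.codecId   = static_cast<uint8_t>(buf[0] & 0x0F);
    if (out.codecId != 7) {
        return false;
    }
    out.avcPacketType = buf[1];
    uint32_t raw = (uint32_t(buf[2]) << 16) | (uint32_t(buf[3]) << 8) | uint32_t(buf[4]);
    // Sign-extend from 24 bits to 32 bits.
    out.compositionTime = (raw & 0x800000u)
                              ? static_cast<int32_t>(raw) - 0x1000000
                              : static_cast<int32_t>(raw);
    return true;
}
```

Everything after these five bytes is either the AVCDecoderConfigurationRecord or the length-prefixed NALUs, depending on avcPacketType.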

When the AVC packet type is AVC Sequence Header (0x00):

The AVCDecoderConfigurationRecord follows:

type	length(byte)	value	comment
configurationVersion	1	0x01	version
AVCProfileIndication	1	0x4d	sps[1]
profile_compatibility	1	0x00	sps[2]
AVCLevelIndication	1	0x2a	sps[3]
lengthSizeMinusOne	1	0xff	number of bytes used for the NALU length field in FLV; length size = (lengthSizeMinusOne & 3) + 1
numOfSequenceParameterSets	1	0xe1	number of SPS, usually 0xe1; count = numOfSequenceParameterSets & 0x1F
sequenceParameterSetLength	2	0x0014	SPS length
sequenceParameterSetNALUnits			SPS content
numOfPictureParameterSets	1	0x01	number of PPS, usually 0x01
pictureParameterSetLength	2	0x0004	PPS length
pictureParameterSetNALUnits			PPS content
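
Reading the record back is the mirror image of the table. The sketch below extracts the NALU length size and the SPS/PPS payloads; AvcConfig and ParseAvcConfig are hypothetical names, and the sketch ignores any extension bytes that newer profiles may append after the PPS list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical holder for the parsed record.
struct AvcConfig {
    uint8_t profile;
    uint8_t compatibility;
    uint8_t level;
    int     naluLengthSize;  // (lengthSizeMinusOne & 3) + 1
    std::vector<std::vector<uint8_t> > sps;
    std::vector<std::vector<uint8_t> > pps;
};

// Parses an AVCDecoderConfigurationRecord. Returns false on truncated input
// or an unexpected configurationVersion.
inline bool ParseAvcConfig(const uint8_t* buf, size_t len, AvcConfig& out) {
    if (len < 6 || buf[0] != 0x01) {
        return false;
    }
    out.profile        = buf[1];
    out.compatibility  = buf[2];
    out.level          = buf[3];
    out.naluLengthSize = (buf[4] & 0x03) + 1;
    out.sps.clear();
    out.pps.clear();
    size_t pos = 5;
    for (int pass = 0; pass < 2; ++pass) {  // pass 0 reads SPS, pass 1 reads PPS
        if (pos >= len) {
            return false;
        }
        // The SPS count sits in the lower 5 bits; the PPS count uses the whole byte.
        int count = (pass == 0) ? (buf[pos] & 0x1F) : buf[pos];
        ++pos;
        std::vector<std::vector<uint8_t> >& dst = (pass == 0) ? out.sps : out.pps;
        for (int i = 0; i < count; ++i) {
            if (len - pos < 2) {
                return false;
            }
            size_t n = (size_t(buf[pos]) << 8) | buf[pos + 1];  // 16-bit big-endian length
            pos += 2;
            if (len - pos < n) {
                return false;
            }
            dst.push_back(std::vector<uint8_t>(buf + pos, buf + pos + n));
            pos += n;
        }
    }
    return true;
}
```

The returned naluLengthSize is what a receiver needs in order to split the AVC NALU packets that follow the sequence header.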

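When the AVC packet type is AVC NALU (0x01):

One or more NALUs follow, laid out like this:

NALU length: (lengthSizeMinusOne & 3) + 1 bytes holding the length of the NALU
NALU Data:
NALU length:
NALU Data:
…

NALU Data holds the H.264 coded data, explained below:

H.264 SPS (Sequence Parameter Set) and PPS (Picture Parameter Set) data structures. Take the following H.264 bitstream: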
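
A receiver typically turns this length-prefixed layout back into a start-code-delimited stream before handing it to a decoder. The sketch below does that conversion; AvccToAnnexB is a hypothetical name, and lengthSize is assumed to be the (lengthSizeMinusOne & 3) + 1 value taken from the sequence header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts a run of length-prefixed NALUs into start-code-delimited NALUs,
// appending the result to `out`. Returns false on truncated input.
inline bool AvccToAnnexB(const uint8_t* buf, size_t len, int lengthSize,
                         std::vector<uint8_t>& out) {
    static const uint8_t kStartCode[4] = {0x00, 0x00, 0x00, 0x01};
    if (lengthSize < 1 || lengthSize > 4) {
        return false;
    }
    size_t pos = 0;
    while (pos < len) {
        if (len - pos < static_cast<size_t>(lengthSize)) {
            return false;
        }
        size_t n = 0;
        for (int i = 0; i < lengthSize; ++i) {  // big-endian NALU length
            n = (n << 8) | buf[pos + i];
        }
        pos += lengthSize;
        if (len - pos < n) {
            return false;
        }
        out.insert(out.end(), kStartCode, kStartCode + 4);
        out.insert(out.end(), buf + pos, buf + pos + n);
        pos += n;
    }
    return true;
}
```

With a 4-byte length the output has exactly the same size as the input, since each length prefix is replaced by a 4-byte start code.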

/* h.264 bitstreams */
const uint8_t sps[] =
{ 0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0x00, 0x0a, 0xf8, 0x41, 0xa2 }; // 0x00 00 00 01 or 0x00 00 01 is the start code (separator)
const uint8_t pps[] =
{ 0x00, 0x00, 0x00, 0x01, 0x68, 0xce, 0x38, 0x80 };

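The SPS from the code above in binary, start code omitted (0x67 0x42 0x00 0x0a 0xf8 0x41 0xa2):

01100111 01000010 00000000 00001010 11111000 01000001 10100010

The SPS fields, bit by bit:

Parameter Name	Type (u: unsigned bits, ue: Exp-Golomb code)	Value	Comments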
forbidden_zero_bit	u(1)	0	Despite being forbidden, it must be set to 0!
nal_ref_idc	u(2)	3	3 means it is “important” (this is an SPS)
nal_unit_type	u(5)	7	Indicates this is a sequence parameter set
profile_idc	u(8)	66	Baseline profile
constraint_set0_flag	u(1)	0	We’re not going to honor constraints
constraint_set1_flag	u(1)	0	We’re not going to honor constraints
constraint_set2_flag	u(1)	0	We’re not going to honor constraints
constraint_set3_flag	u(1)	0	We’re not going to honor constraints
reserved_zero_4bits	u(4)	0	Better set them to zero
level_idc	u(8)	10	Level 1, sec A.3.1
seq_parameter_set_id	ue(v)	0	We’ll just use id 0.
log2_max_frame_num_minus4	ue(v)	0	Let’s have as few frame numbers as possible
pic_order_cnt_type	ue(v)	0	Keep things simple
log2_max_pic_order_cnt_lsb_minus4	ue(v)	0	Fewer is better.
num_ref_frames	ue(v)	0	We will only send I slices
gaps_in_frame_num_value_allowed_flag	u(1)	0	We will have no gaps
pic_width_in_mbs_minus1	ue(v)	7	SQCIF is 8 macroblocks wide
pic_height_in_map_units_minus1	ue(v)	5	SQCIF is 6 macroblocks high
frame_mbs_only_flag	u(1)	1	We will not do field/frame encoding
direct_8x8_inference_flag	u(1)	0	Used for B slices. We will not send B slices
frame_cropping_flag	u(1)	0	We will not do frame cropping
vui_parameters_present_flag	u(1)	0	We will not send VUI data
rbsp_stop_one_bit	u(1)	1	Stop bit. I missed this at first and it caused me much trouble.
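
The table can be reproduced in code with a small bit reader that understands u(n) and ue(v). The sketch below is deliberately limited and all names in it (BitReader, SpsInfo, ParseSimpleSps) are hypothetical: it assumes a Baseline, Main or Extended profile SPS with pic_order_cnt_type 0, no frame cropping, and no emulation prevention bytes (00 00 03) in the payload, which holds for the 7-byte sample above but not for SPS data in general.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Minimal MSB-first bit reader with Exp-Golomb support.
class BitReader {
public:
    BitReader(const uint8_t* data, size_t size)
        : data_(data), size_(size), pos_(0), overrun_(false) {}
    uint32_t ReadBit() {
        if (pos_ >= size_ * 8) { overrun_ = true; return 0; }
        uint32_t bit = (data_[pos_ >> 3] >> (7 - (pos_ & 7))) & 1u;
        ++pos_;
        return bit;
    }
    uint32_t ReadBits(int n) {  // u(n)
        uint32_t v = 0;
        for (int i = 0; i < n; ++i) v = (v << 1) | ReadBit();
        return v;
    }
    uint32_t ReadUE() {  // ue(v): count leading zeros, then read that many bits
        int zeros = 0;
        while (ReadBit() == 0) {
            if (++zeros > 31 || overrun_) { overrun_ = true; return 0; }
        }
        return ((1u << zeros) - 1) + ReadBits(zeros);
    }
    bool Overrun() const { return overrun_; }
private:
    const uint8_t* data_;
    size_t size_;
    size_t pos_;
    bool overrun_;
};

struct SpsInfo {
    uint8_t  profileIdc;
    uint8_t  levelIdc;
    uint32_t width;   // in pixels, before cropping
    uint32_t height;  // in pixels, before cropping
};

// Parses an SPS NALU (start code already removed) under the limits stated above.
inline bool ParseSimpleSps(const uint8_t* nalu, size_t size, SpsInfo& out) {
    BitReader br(nalu, size);
    br.ReadBits(1);                         // forbidden_zero_bit
    br.ReadBits(2);                         // nal_ref_idc
    if (br.ReadBits(5) != 7) return false;  // nal_unit_type must be SPS
    out.profileIdc = static_cast<uint8_t>(br.ReadBits(8));
    br.ReadBits(8);                         // constraint flags + reserved bits
    out.levelIdc = static_cast<uint8_t>(br.ReadBits(8));
    if (out.profileIdc != 66 && out.profileIdc != 77 && out.profileIdc != 88) {
        return false;                       // other profiles carry extra fields
    }
    br.ReadUE();                            // seq_parameter_set_id
    br.ReadUE();                            // log2_max_frame_num_minus4
    if (br.ReadUE() != 0) return false;     // pic_order_cnt_type: only 0 handled
    br.ReadUE();                            // log2_max_pic_order_cnt_lsb_minus4
    br.ReadUE();                            // num_ref_frames
    br.ReadBits(1);                         // gaps_in_frame_num_value_allowed_flag
    uint32_t widthMbs       = br.ReadUE() + 1;
    uint32_t heightMapUnits = br.ReadUE() + 1;
    uint32_t frameMbsOnly   = br.ReadBits(1);
    if (!frameMbsOnly) br.ReadBits(1);      // mb_adaptive_frame_field_flag
    br.ReadBits(1);                         // direct_8x8_inference_flag
    if (br.ReadBits(1)) return false;       // frame_cropping_flag: not handled
    if (br.Overrun()) return false;
    out.width  = widthMbs * 16;
    out.height = (2 - frameMbsOnly) * heightMapUnits * 16;
    return true;
}
```

Run on the 7-byte sample, the parser yields profile 66, level 10 and a 128x96 picture, which is the SQCIF size the comments in the table refer to.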

Here are a few code snippets that convert the raw H.264 stream delivered by an RTSP callback and send it to the client over RTMP.

// pData:   raw H.264 stream data
// Converts the raw H.264 stream into the RTMP payload format and sends it
void CRTMP::_DataCallBack( LONG nChannel, char* pData, LONG nSize, RTP_HEAD* pHead, RTP_FRAME_TYPE FrameType )
{
    char* p    = pData;
    char* pEnd = pData + nSize;
    if (m_bNeedUpdateMetadata)
    {   // on the first I-frame, extract the SPS and PPS and save them
        if (FrameType != RTP_FRAME_TYPE_I)
        {
            return;
        }
        Nalu sps, pps;
        p = ReadOneNalu(pData, nSize, sps); ASSERT(p != NULL);
        if (p == NULL) return;  // ASSERT compiles away in release builds
        p = ReadOneNalu(p, pEnd - p, pps);  ASSERT(p != NULL);
        if (p == NULL) return;
        h264_decode_sps((BYTE*)sps.data, sps.size, m_nWitdh, m_nHeight, m_nFps);

        m_MetaData.nSpsLen    = sps.size;
        m_MetaData.Sps        = (unsigned char*)calloc(sps.size, 1); memcpy(m_MetaData.Sps, sps.data, sps.size);
        m_MetaData.nPpsLen    = pps.size;
        m_MetaData.Pps        = (unsigned char *)calloc(pps.size, 1); memcpy(m_MetaData.Pps, pps.data, pps.size);
        m_MetaData.nWidth     = m_nWitdh;
        m_MetaData.nHeight    = m_nHeight;
        m_MetaData.nFrameRate = m_nFps;

        m_bNeedUpdateMetadata = false;
    }
    // 1000 ms / fps; guard against a frame rate of 0 (the 40 ms fallback, i.e. 25 fps, is an assumption)
    unsigned int tick_gap = m_MetaData.nFrameRate > 0 ? 1000 / m_MetaData.nFrameRate : 40;
    Nalu idr;
    while ((p = ReadOneNalu(p, pEnd - p, idr)) != NULL)
    {
        if (idr.type == 0x07 || idr.type == 0x08)
        {   // skip repeated SPS/PPS
            continue;
        }

        if (!SendH264Packet(idr, m_tick))
        {
            cout << "SendH264Packet Failed: " << WSAGetLastError() << endl;
        }
    }

    m_tick += tick_gap; // advance the timestamp
}
// Reads one NALU from the buffer
// Returns a pointer to the start of the next NALU, or NULL if none was found
static char* ReadOneNalu(char* pBuf, int nSize, Nalu& NaluUnit)
{
    // NALUs are separated by the start code 00 00 01 or 00 00 00 01
    static const char Sep1[3] = { 0x00, 0x00, 0x01 };
    static const char Sep2[4] = { 0x00, 0x00, 0x00, 0x01 };

    bool bFindHead = false;
    bool bFindTail = false;
    char* pStart   = pBuf;
    char* pEnd     = pBuf + nSize;
_FIND:
    while (pStart <= pEnd - 4)
    {
        // look for head
        if (!bFindHead
            && (::memcmp(Sep1, pStart, 3) == 0))
        {
            pStart       += 3;
            NaluUnit.data = pStart;
            NaluUnit.type = NaluUnit.data[0] & 0x1F;
            bFindHead     = true;
        }
        else if (!bFindHead
            && (::memcmp(Sep2, pStart, 4) == 0))
        {
            pStart       += 4;
            NaluUnit.data = pStart;
            NaluUnit.type = NaluUnit.data[0] & 0x1F;
            bFindHead     = true;
        }
        // look for tail
        if (bFindHead)
        {
            if (pEnd - pStart < 3)
            {
                break;
            }
            if (::memcmp(Sep1, pStart, 3) == 0)
            {
                NaluUnit.size = pStart - NaluUnit.data;
                bFindTail     = true;
                break;
            }
            else if (pEnd - pStart >= 4 && ::memcmp(Sep2, pStart, 4) == 0)
            {   // the length check keeps memcmp from reading past the buffer
                NaluUnit.size = pStart - NaluUnit.data;
                bFindTail     = true;
                break;
            }
        }

        ++pStart;
    }
    if (!bFindHead)
    {
        return NULL; // no start code found; NaluUnit is not valid
    }
    if (!bFindTail)
    {
        NaluUnit.size = pEnd - NaluUnit.data;
    }

    if ((NaluUnit.type & 0x1F) == 0x06)
    {   // skip SEI
        if (!bFindTail)
        {
            return NULL; // the SEI was the last NALU; retrying would loop forever
        }
        bFindHead = false;
        bFindTail = false;
        goto _FIND;
    }

    return NaluUnit.data + NaluUnit.size;
}
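// Wraps a NALU in an RTMP video payload and sends it
// Multi-byte fields are written big-endian
bool CRTMP::SendH264Packet(Nalu nalunit, unsigned int timestamp)
{
    CAMFBuffer buf(nalunit.size + 9);   // 1 + 1 + 3 + 4 header bytes plus the NALU
    RTMP_Packet p;
    if (nalunit.type == 5)
    {
        buf.WriteByte(0x17);    // keyframe (I-frame), AVC
        buf.WriteByte(0x1);     // AVC NALU
        buf.WriteInt24(0x0);    // composition time
        buf.WriteInt32(nalunit.size);   // 4-byte NALU length, matching lengthSizeMinusOne = 3
        buf.WriteBuffer(nalunit.data, nalunit.size);
        // the sequence header goes out first, because buf is only sent at the end of this function
        SendSpsPpsInfo(m_MetaData.Pps, m_MetaData.nPpsLen, m_MetaData.Sps, m_MetaData.nSpsLen);
    }
    else
    {
        buf.WriteByte(0x27);    // inter frame (P/B), AVC
        buf.WriteByte(0x1);     // AVC NALU
        buf.WriteInt24(0x0);    // composition time
        buf.WriteInt32(nalunit.size);
        buf.WriteBuffer(nalunit.data, nalunit.size);
    }

    return SendPacket(buf, 0x9, timestamp);
}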
// Sends the SPS/PPS information as an AVC sequence header
bool CRTMP::SendSpsPpsInfo(unsigned char *pps,int pps_len,unsigned char * sps,int sps_len)
{
    CAMFBuffer buffer(pps_len + sps_len + 16);  // 16 = 5-byte packet header + 11 bytes of record fields
    buffer.WriteByte(0x17);             // keyframe, AVC
    buffer.WriteByte(0x0);              // AVCSequenceHeader
    buffer.WriteInt24(0x0);             // composition time
    buffer.WriteByte(0x01);             // configurationVersion
    buffer.WriteByte(sps[1]);           // AVCProfileIndication
    buffer.WriteByte(sps[2]);           // profile_compatibility
    buffer.WriteByte(sps[3]);           // AVCLevelIndication
    buffer.WriteByte((char)0xFF);       // lengthSizeMinusOne; bytes per NALU length = (lengthSizeMinusOne & 3) + 1
    buffer.WriteByte((char)0xE1);       // numOfSequenceParameterSets; SPS count = numOfSequenceParameterSets & 0x1F
    buffer.WriteInt16(sps_len);         // SPS length
    buffer.WriteBuffer((char*)sps, sps_len);
    buffer.WriteByte((char)0x01);       // numOfPictureParameterSets; PPS count
    buffer.WriteInt16(pps_len);         // PPS length
    buffer.WriteBuffer((char*)pps, pps_len);

    return SendPacket(buffer, 0x9, 0);
}