WebRTC Audio System: peerconnection Initialization

Posted 2023-02-17 shichaog

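This chapter uses the WebRTC peerconnection native-layer example to analyze how a P2P video conference is implemented. When the build in section 1.5 of Chapter 1 produces the executable test programs for the individual modules, it also builds the executables for the WebRTC peerconnection example. The client application lives in examples/peerconnection/client and the server application in examples/peerconnection/server. The client provides basic audio and video functionality, and the server lets client programs start a conference through signaling.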

2.1 peerconnection conductor

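The conductor bridges the UI layer and the WebRTC conference logic layer. The core startup flow of a conference lives in the creation of the PeerConnectionFactory; this section first looks at how the UI layer triggers that flow. The whole conference requires the server to be started first, for example ./peerconnection_server --port=8888. Once the server has started successfully, the Ubuntu terminal prints the following: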

Server listening on port 8888

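In the connection phase, the two sides of the WebRTC P2P example use SDP negotiation (offer/answer) to agree on the multimedia information to be transmitted afterwards, the host candidate addresses, the network transport protocol, and so on. The SDP protocol itself is not the focus of this book; see RFC 4566 for details. After starting the server side of the peerconnection native example, start the client side. The client shows the window in Figure 2-1. Clicking the connect button on the right connects to the server, and the window then lists the peers available for communication, as shown in Figure 2-2. Both P2P sides show the same name (the two clients were started from different terminals on the same computer, so both see gsc@240). What follows happens after selecting the peer gsc@240 in the figure and pressing Enter.

Figure 2-1 Startup window of the peerconnection client

Figure 2-2 Selecting gsc@240 and pressing Enter

The action in Figure 2-2 triggers void Conductor::ConnectToPeer(int peer_id) in conductor.cc, which uses SDP to send the peer the metadata for the session (network and multimedia information). The peer handles the received message in void Conductor::OnMessageFromPeer(int peer_id, const std::string& message). Because this is peer-to-peer communication, both the sending side (offer) and the receiving side (answer) must create a peer_connection_ object, and that object calls CreateOffer and CreateAnswer to complete the SDP negotiation.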

//webrtc/examples/peerconnection/client/conductor.cc
void Conductor::ConnectToPeer(int peer_id) {
  RTC_DCHECK(peer_id_ == -1);
  RTC_DCHECK(peer_id != -1);
  // Mesh-style interconnection of several P2P links is not supported, only a
  // single P2P link. The connect button in the UI is expected to trigger the
  // first creation of the peer_connection_ object.
  if (peer_connection_.get()) {
    main_wnd_->MessageBox(
        "Error", "We only support connecting to one peer at a time", true);
    return;
  }
  // Initialize the peer_connection_ object and save peer_id.
  if (InitializePeerConnection()) {
    peer_id_ = peer_id;
    // Offer side of the SDP negotiation.
    peer_connection_->CreateOffer(
        this, webrtc::PeerConnectionInterface::RTCOfferAnswerOptions());
  } else {
    main_wnd_->MessageBox("Error", "Failed to initialize PeerConnection", true);
  }
}

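InitializePeerConnection is defined in the same file. It is not long: it creates signaling_thread_ and peer_connection_factory_ and calls AddTracks to finish initialization, and only after that is SDP used for signaling. In a P2P conference the transmitted media is audio and video, so creating the peer_connection_factory_ object requires the media details to be specified. The audio part mainly consists of APM, ADM and ACM; the video part mainly of VCM and VDM. To keep the hierarchy manageable, these are managed by two engine classes, the voice engine and the video engine, and on top of them the concepts of channel, track and stream add a further layer of abstraction for ease of use.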

//webrtc/examples/peerconnection/client/conductor.cc
#include "api/create_peerconnection_factory.h"
bool Conductor::InitializePeerConnection() {
  // On entry, neither peer_connection_ nor peer_connection_factory_ has been
  // created yet.
  RTC_DCHECK(!peer_connection_factory_);
  RTC_DCHECK(!peer_connection_);

  if (!signaling_thread_.get()) {
    signaling_thread_ = rtc::Thread::CreateWithSocketServer();
    signaling_thread_->Start();
  }
  // Create the peer_connection_factory_ object through the factory function.
  // This is step 1 of the typical WebRTC conference startup steps in section
  // 2.2; see section 2.2.1 for the implementation.
  peer_connection_factory_ = webrtc::CreatePeerConnectionFactory(
      nullptr /* network_thread */, nullptr /* worker_thread */,
      signaling_thread_.get(), nullptr /* default_adm */,
      webrtc::CreateBuiltinAudioEncoderFactory(),
      webrtc::CreateBuiltinAudioDecoderFactory(),
      webrtc::CreateBuiltinVideoEncoderFactory(),
      webrtc::CreateBuiltinVideoDecoderFactory(), nullptr /* audio_mixer */,
      nullptr /* audio_processing */);

  if (!peer_connection_factory_) {
    main_wnd_->MessageBox("Error", "Failed to initialize PeerConnectionFactory",
                          true);
    DeletePeerConnection();
    return false;
  }
  // Step 2 of the typical startup steps in section 2.2.
  if (!CreatePeerConnection()) {
    main_wnd_->MessageBox("Error", "CreatePeerConnection failed", true);
    DeletePeerConnection();
  }
  // Step 3 of the typical startup steps in section 2.2; see section 2.3.
  AddTracks();

  return peer_connection_ != nullptr;
}

2.2 PeerConnectionFactory和PeerConnection

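The PeerConnectionFactory class provides factory methods for creating PeerConnection, MediaStream and MediaStreamTrack objects.

The steps for starting a typical WebRTC conference are as follows:

1. Create a PeerConnectionFactoryInterface; see the constructors for the required parameters.

2. Create a PeerConnection object. Provide a configuration struct that points to the STUN/TURN servers used to generate ICE candidates, and an object implementing the PeerConnectionObserver interface to receive callbacks from the PeerConnection.

3. Use the PeerConnectionFactory to create local MediaStreamTracks and add them to the PeerConnection object from the previous step with the AddTrack method.

4. Create the SDP offer, call SetLocalDescription with it, and send it to the remote peer.

5. Once an ICE candidate has been gathered, the PeerConnection object calls the ICE observer function OnIceCandidate. The candidate must be passed on to the remote peer.

6. Once the SDP answer arrives from the remote peer, the local side calls SetRemoteDescription with the remote answer.

7. Once a candidate arrives from the remote peer, call AddIceCandidate to hand it to the PeerConnection object.

8. The receiver of a conference request can choose to accept or reject it. That decision belongs to the application, not to the PeerConnection object. If it accepts, the receiver does the following:

a. Create a PeerConnectionFactoryInterface object if one does not exist yet.

b. Create a new PeerConnection object.

c. Call SetRemoteDescription to set the offer received from the remote side on the new PeerConnection object.

d. Call CreateAnswer to create the SDP answer to the remote SDP offer, and send that answer to the offering side.

e. Call SetLocalDescription to set the newly created local answer on the new PeerConnection object.

f. Call AddIceCandidate to set the remote ICE candidates.

g. Once a candidate has been gathered, the PeerConnection object calls the observer function OnIceCandidate; send these candidates to the remote side.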
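
The ordering of the description calls in steps 4, 6 and 8c to 8e can be sketched as a small state machine. This is a minimal model, not WebRTC code: ToyPeer, ToyDescription, SignalingState and NegotiateOnce are hypothetical names, the SDP payload is an opaque string, and ICE candidates are left out entirely.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout; only the signaling states are modelled.
enum class SignalingState { kStable, kHaveLocalOffer, kHaveRemoteOffer };

struct ToyDescription {
  std::string type;  // "offer" or "answer"
  std::string sdp;   // opaque payload in this sketch
};

class ToyPeer {
 public:
  ToyDescription CreateOffer() const { return {"offer", "toy-sdp-offer"}; }

  // Only meaningful after a remote offer has been applied.
  ToyDescription CreateAnswer() const {
    assert(state_ == SignalingState::kHaveRemoteOffer);
    return {"answer", "toy-sdp-answer"};
  }

  bool SetLocalDescription(const ToyDescription& d) {
    if (d.type == "offer" && state_ == SignalingState::kStable) {
      state_ = SignalingState::kHaveLocalOffer;
      return true;
    }
    if (d.type == "answer" && state_ == SignalingState::kHaveRemoteOffer) {
      state_ = SignalingState::kStable;
      return true;
    }
    return false;  // call made in the wrong state
  }

  bool SetRemoteDescription(const ToyDescription& d) {
    if (d.type == "offer" && state_ == SignalingState::kStable) {
      state_ = SignalingState::kHaveRemoteOffer;
      return true;
    }
    if (d.type == "answer" && state_ == SignalingState::kHaveLocalOffer) {
      state_ = SignalingState::kStable;
      return true;
    }
    return false;  // call made in the wrong state
  }

  SignalingState state() const { return state_; }

 private:
  SignalingState state_ = SignalingState::kStable;
};

// Runs one offer/answer round; returns true if both peers end up stable.
bool NegotiateOnce(ToyPeer& caller, ToyPeer& callee) {
  ToyDescription offer = caller.CreateOffer();
  if (!caller.SetLocalDescription(offer)) return false;    // step 4
  if (!callee.SetRemoteDescription(offer)) return false;   // step 8c
  ToyDescription answer = callee.CreateAnswer();           // step 8d
  if (!callee.SetLocalDescription(answer)) return false;   // step 8e
  if (!caller.SetRemoteDescription(answer)) return false;  // step 6
  return caller.state() == SignalingState::kStable &&
         callee.state() == SignalingState::kStable;
}
```

Calling NegotiateOnce on two fresh ToyPeer objects walks both through one complete round. Applying an answer to a peer that never sent an offer is rejected, which is the kind of ordering mistake the step list is meant to prevent.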

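SDP is not analyzed in depth here; it mainly establishes communication through the offer/answer model. NAT traversal is not covered either. Most multi-party conferencing scenarios still need a media server; the WebRTC setup with only a signaling server is rarely used for multi-party video conferences.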

2.2.1 CreatePeerConnectionFactory

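In section 2.1, the conductor passes NULL for the following parameters of webrtc::CreatePeerConnectionFactory, because the factory function defines default ways of creating them.

    // See section 1.7
    rtc::scoped_refptr<AudioDeviceModule> default_adm,
    rtc::scoped_refptr<AudioMixer> audio_mixer,
    // Created by default with AudioProcessingBuilder().Create(); see section 1.6
    rtc::scoped_refptr<AudioProcessing> audio_processing,

A very important job of this function is creating the media engine with cricket::CreateMediaEngine(std::move(media_dependencies));. In the media_dependencies argument passed to it, audio_processing has already been created, while the mixer and the ADM are NULL and are left for the engine to create with its default methods. When the peer_connection_factory_ object is created, the parameter types of the factory function, including the audio and video codec factories, are as follows: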

rtc::scoped_refptr<PeerConnectionFactoryInterface> CreatePeerConnectionFactory(
    rtc::Thread* network_thread,
    rtc::Thread* worker_thread,
    rtc::Thread* signaling_thread,
    rtc::scoped_refptr<AudioDeviceModule> default_adm,
    rtc::scoped_refptr<AudioEncoderFactory> audio_encoder_factory,
    rtc::scoped_refptr<AudioDecoderFactory> audio_decoder_factory,
    std::unique_ptr<VideoEncoderFactory> video_encoder_factory,
    std::unique_ptr<VideoDecoderFactory> video_decoder_factory,
    rtc::scoped_refptr<AudioMixer> audio_mixer,
    rtc::scoped_refptr<AudioProcessing> audio_processing,
    AudioFrameProcessor* audio_frame_processor,
    std::unique_ptr<FieldTrialsView> field_trials) {
  if (!field_trials) {
    field_trials = std::make_unique<webrtc::FieldTrialBasedConfig>();
  }

  PeerConnectionFactoryDependencies dependencies;
  dependencies.network_thread = network_thread;
  dependencies.worker_thread = worker_thread;
  dependencies.signaling_thread = signaling_thread;
  dependencies.task_queue_factory =
      CreateDefaultTaskQueueFactory(field_trials.get());
  dependencies.call_factory = CreateCallFactory();
  dependencies.event_log_factory = std::make_unique<RtcEventLogFactory>(
      dependencies.task_queue_factory.get());
  dependencies.trials = std::move(field_trials);

  if (network_thread) {
    // TODO(bugs.webrtc.org/13145): Add an rtc::SocketFactory* argument.
    dependencies.socket_factory = network_thread->socketserver();
  }
  cricket::MediaEngineDependencies media_dependencies;
  media_dependencies.task_queue_factory = dependencies.task_queue_factory.get();
  media_dependencies.adm = std::move(default_adm);
  media_dependencies.audio_encoder_factory = std::move(audio_encoder_factory);
  media_dependencies.audio_decoder_factory = std::move(audio_decoder_factory);
  media_dependencies.audio_frame_processor = audio_frame_processor;
  if (audio_processing) {
    media_dependencies.audio_processing = std::move(audio_processing);
  } else {
    media_dependencies.audio_processing = AudioProcessingBuilder().Create();
  }
  media_dependencies.audio_mixer = std::move(audio_mixer);
  media_dependencies.video_encoder_factory = std::move(video_encoder_factory);
  media_dependencies.video_decoder_factory = std::move(video_decoder_factory);
  media_dependencies.trials = dependencies.trials.get();
  // Create the media engine; this step is very important.
  dependencies.media_engine =
      cricket::CreateMediaEngine(std::move(media_dependencies));

  return CreateModularPeerConnectionFactory(std::move(dependencies));
}
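
The "NULL means use the default" handling in the listing above can be isolated into a few lines. The sketch below uses hypothetical stand-in types (ToyAudioProcessing, ToyAudioMixer, ToyMediaDependencies) and std::shared_ptr in place of rtc::scoped_refptr; it only illustrates the pattern and does not reproduce any WebRTC API.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; none of these are WebRTC types.
struct ToyAudioProcessing {
  std::string origin;  // records who created the object
};

struct ToyAudioMixer {};

struct ToyMediaDependencies {
  std::shared_ptr<ToyAudioProcessing> audio_processing;
  std::shared_ptr<ToyAudioMixer> audio_mixer;
};

// Same shape as the listing: keep the caller's object if one was passed,
// otherwise build a default before the bundle goes to the engine.
ToyMediaDependencies BuildMediaDependencies(
    std::shared_ptr<ToyAudioProcessing> audio_processing,
    std::shared_ptr<ToyAudioMixer> audio_mixer) {
  ToyMediaDependencies deps;
  if (audio_processing) {
    deps.audio_processing = std::move(audio_processing);
  } else {
    deps.audio_processing =
        std::make_shared<ToyAudioProcessing>(ToyAudioProcessing{"default"});
  }
  // The mixer is forwarded as-is; a null value is left for the engine
  // to replace with its own default later.
  deps.audio_mixer = std::move(audio_mixer);
  return deps;
}
```

With both arguments null, the returned bundle carries a default audio-processing object and a null mixer, matching what the text says about audio_processing versus the mixer and the ADM.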

Taking CreateBuiltinAudioEncoderFactory as an example, here is a glimpse of how an audio factory is created.

//api/audio_codecs/builtin_audio_encoder_factory.cc
rtc::scoped_refptr<AudioEncoderFactory> CreateBuiltinAudioEncoderFactory() {
  return CreateAudioEncoderFactory<

#if WEBRTC_USE_BUILTIN_OPUS
      AudioEncoderOpus, NotAdvertised<AudioEncoderMultiChannelOpus>,
#endif

      AudioEncoderIsac, AudioEncoderG722,

#if WEBRTC_USE_BUILTIN_ILBC
      AudioEncoderIlbc,
#endif

      AudioEncoderG711, NotAdvertised<AudioEncoderL16>>();
}

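CreateAudioEncoderFactory is a function template that wraps the AudioEncoderFactoryT class in the audio_encoder_factory_template_impl namespace. What it finally returns is an object of type AudioEncoderFactory. That object's MakeAudioEncoder method is important: it is the API that actually creates an encoder, although the call itself is made inside the voice engine.

Every encoder has a MakeAudioEncoder method. For example, Opus creates its encoder object through this method as follows.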

std::unique_ptr<AudioEncoder> AudioEncoderOpus::MakeAudioEncoder(
    const AudioEncoderOpusConfig& config,
    int payload_type,
    absl::optional<AudioCodecPairId> /*codec_pair_id*/,
    const FieldTrialsView* field_trials) {
  if (!config.IsOk()) {
    RTC_DCHECK_NOTREACHED();
    return nullptr;
  }
  return AudioEncoderOpusImpl::MakeAudioEncoder(config, payload_type);
}
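
To make the idea of a factory parameterized by a list of codec types more concrete, here is a minimal sketch. It is only loosely modelled on the template factory: ToyEncoder, NamedEncoder, ToyOpus, ToyG722, Helper and MakeToyEncoder are hypothetical names, a plain string stands in for the SDP format, and the per-codec config structs are not modelled.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical toy types; nothing here is a WebRTC class.
struct ToyEncoder {
  virtual ~ToyEncoder() = default;
  virtual std::string Name() const = 0;
};

class NamedEncoder : public ToyEncoder {
 public:
  explicit NamedEncoder(std::string name) : name_(std::move(name)) {}
  std::string Name() const override { return name_; }

 private:
  std::string name_;
};

// Each codec type exposes the same two static functions.
struct ToyOpus {
  static bool Supports(const std::string& format) { return format == "opus"; }
  static std::unique_ptr<ToyEncoder> MakeAudioEncoder() {
    return std::unique_ptr<ToyEncoder>(new NamedEncoder("opus"));
  }
};

struct ToyG722 {
  static bool Supports(const std::string& format) { return format == "G722"; }
  static std::unique_ptr<ToyEncoder> MakeAudioEncoder() {
    return std::unique_ptr<ToyEncoder>(new NamedEncoder("G722"));
  }
};

// Compile-time recursion over the codec list, in declaration order.
template <typename... Ts>
struct Helper;

template <>
struct Helper<> {
  static std::unique_ptr<ToyEncoder> Make(const std::string&) {
    return nullptr;  // no codec in the list matched
  }
};

template <typename T, typename... Ts>
struct Helper<T, Ts...> {
  static std::unique_ptr<ToyEncoder> Make(const std::string& format) {
    if (T::Supports(format)) return T::MakeAudioEncoder();
    return Helper<Ts...>::Make(format);  // try the remaining codecs
  }
};

template <typename... Ts>
std::unique_ptr<ToyEncoder> MakeToyEncoder(const std::string& format) {
  return Helper<Ts...>::Make(format);
}
```

MakeToyEncoder<ToyOpus, ToyG722>("G722") skips ToyOpus and returns an encoder produced by ToyG722; an unknown format falls through the whole list and yields a null pointer. Adding a codec only means adding one more type to the template argument list, which is the appeal of this design.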

2.2.2 PeerConnection

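PeerConnection is the implementation class of the API defined by PeerConnectionInterface. The class is currently solely responsible for the following:

* Managing the session state machine (signaling state);
* Creating and initializing lower-level objects such as PortAllocator and BaseChannels;
* Owning and managing the life cycle of RtpSender/RtpReceiver and the audio and video track objects;
* Tracking the current and pending local/remote session descriptions;

The class is jointly responsible for the following:

* Parsing SDP;
* Generating SDP offers and answers based on the current state;
* The ICE state machine;
* Generating statistics;

SDP (Session Description Protocol) is a format for describing multimedia communication sessions for the purposes of announcement and invitation. Its main use is to support streaming media applications such as voice over IP (VoIP) and video conferencing. SDP does not carry any media itself; it is used between endpoints to negotiate network metrics, media types and other related properties. The set of properties and parameters is called a session profile. In the WebRTC implementation there are two SDP flavors, PlanB and UnifiedPlan. PlanB has only two media descriptions, an audio media description (m=audio…) and a video media description (m=video…); to transmit several video streams, they must be distinguished by SSRC inside the video media description.
UnifiedPlan allows multiple media descriptions, so several video streams are simply split into several video media descriptions. In terms of Stream and Track: a Stream may contain an AudioTrack and a VideoTrack, and more Streams mean more Tracks. If every Track maps to exactly one m= description of its own, that is UnifiedPlan; if one m= section describes multiple Tracks (track ids), that is Plan B. In the P2P implementation, UnifiedPlan uses the AddTrack API and PlanB uses the AddStream API.
WebRTC uses SDP as well. JSEP (JavaScript Session Establishment Protocol) describes the mechanisms that allow a JavaScript application to control the signaling plane of a multimedia session through the interfaces specified in the W3C RTCPeerConnection API, and discusses how this relates to existing signaling protocols. The native example implements this protocol in C++.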
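
The difference between the two flavors comes down to how tracks are grouped into m= sections, which a short sketch can show. ToySemantics, ToyTrack and BuildMediaSections are hypothetical names, and the output strings are a simplified notation rather than real SDP (real Plan B distinguishes the streams by SSRC, which is represented here only by listing the track ids).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class ToySemantics { kPlanB, kUnifiedPlan };

struct ToyTrack {
  std::string kind;  // "audio" or "video"
  std::string id;
};

// Returns one string per media section, listing the track ids it carries.
std::vector<std::string> BuildMediaSections(const std::vector<ToyTrack>& tracks,
                                            ToySemantics semantics) {
  std::vector<std::string> sections;
  if (semantics == ToySemantics::kUnifiedPlan) {
    // Unified Plan: every track gets an m= section of its own.
    for (const ToyTrack& t : tracks) {
      sections.push_back("m=" + t.kind + " [" + t.id + "]");
    }
    return sections;
  }
  // Plan B: all tracks of one kind share a single m= section.
  const char* kinds[] = {"audio", "video"};
  for (const char* kind : kinds) {
    std::string ids;
    for (const ToyTrack& t : tracks) {
      if (t.kind == kind) {
        if (!ids.empty()) ids += ",";
        ids += t.id;
      }
    }
    if (!ids.empty()) {
      sections.push_back(std::string("m=") + kind + " [" + ids + "]");
    }
  }
  return sections;
}
```

For one audio track and two video tracks, the Plan B branch produces two sections (the video one carrying both ids), while the Unified Plan branch produces three.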

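Figure 2-3 UML class diagram of PeerConnection

In section 2.1, once the peer_connection_factory_ object has been created successfully, the peer_connection_ object is created. This is done by the CreatePeerConnection method of the Conductor class, defined as follows: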

//examples/peerconnection/client/conductor.cc
bool Conductor::CreatePeerConnection() {
  // At this point peer_connection_factory_ has been created successfully and
  // peer_connection_ has not; this function creates it.
  RTC_DCHECK(peer_connection_factory_);
  RTC_DCHECK(!peer_connection_);

  webrtc::PeerConnectionInterface::RTCConfiguration config;
  config.sdp_semantics = webrtc::SdpSemantics::kUnifiedPlan;
  webrtc::PeerConnectionInterface::IceServer server;
  server.uri = GetPeerConnectionString();
  config.servers.push_back(server);

  webrtc::PeerConnectionDependencies pc_dependencies(this);
  auto error_or_peer_connection =
      peer_connection_factory_->CreatePeerConnectionOrError(
          config, std::move(pc_dependencies));
  if (error_or_peer_connection.ok()) {
    peer_connection_ = std::move(error_or_peer_connection.value());
  }
  return peer_connection_ != nullptr;
}

CreatePeerConnectionOrError mainly calls three functions to create the PeerConnection:

//webrtc/pc/peer_connection_factory.cc
RTCErrorOr<rtc::scoped_refptr<PeerConnectionInterface>>
PeerConnectionFactory::CreatePeerConnectionOrError(
    /* parameters elided in this excerpt */) {
  // ... (elided)
  std::unique_ptr<Call> call =
      worker_thread()->BlockingCall([this, &event_log, trials, &configuration] {
        return CreateCall_w(event_log.get(), *trials, configuration);
      });

  auto result = PeerConnection::Create(context_, options_, std::move(event_log),
                                       std::move(call), configuration,
                                       std::move(dependencies));
  // ... (elided)
  rtc::scoped_refptr<PeerConnectionInterface> result_proxy =
      PeerConnectionProxy::Create(signaling_thread(), network_thread(),
                                  result.MoveValue());
  // ... (elided)
}

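A Call object can contain multiple send/receive streams that correspond to the same remote endpoint and share bitrate estimates. The Call object provides the following:

* Send bitrate settings (minimum 30 kbps, initial 300 kbps, maximum 2000 kbps);
* Methods for obtaining transport statistics, and receive-side congestion control;
* Creation of the PacketReceiver object; all received RTP/RTCP packets pass through the Call module.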
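
One plausible reading of the three bitrate values is that an estimate starts at the initial value and is kept within the minimum and maximum. The sketch below shows only that bounding step, under that assumption; ToyBitrateConfig and ClampBitrate are hypothetical names, and the sketch says nothing about how WebRTC actually produces its estimates.

```cpp
#include <algorithm>
#include <cassert>

// Values from the text, expressed in bits per second.
struct ToyBitrateConfig {
  int min_bps = 30000;
  int start_bps = 300000;
  int max_bps = 2000000;
};

// Keeps an estimate inside [min_bps, max_bps].
int ClampBitrate(const ToyBitrateConfig& cfg, int estimate_bps) {
  assert(cfg.min_bps <= cfg.max_bps);
  return std::min(cfg.max_bps, std::max(cfg.min_bps, estimate_bps));
}
```

An estimate below the minimum is raised to 30000 bps, one above the maximum is lowered to 2000000 bps, and anything in between, including the initial 300000 bps, passes through unchanged.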

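PeerConnectionProxy is a proxy object for PeerConnection, derived to simplify multithreaded development. Its Create method wraps the underlying object so that it can be used safely from multiple threads. The proxy classes follow a common pattern; the Create method of the factory proxy, for instance, looks like this:

// The parameter c is the internal object being wrapped (the PeerConnection
// object in the case of PeerConnectionProxy).
static rtc::scoped_refptr<PeerConnectionFactoryProxyWithInternal> Create(
    rtc::Thread* signaling_thread, INTERNAL_CLASS* c) {
  return new rtc::RefCountedObject<PeerConnectionFactoryProxyWithInternal>(
      signaling_thread, c);
}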
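
The proxy idea, forwarding every call to one designated thread and blocking until it finishes, can be sketched with the standard library alone. ToyThread, ToyCounter and ToyCounterProxy are hypothetical names; ToyThread is a stand-in for rtc::Thread with a single BlockingCall method that returns int, and unlike a production implementation it would deadlock if BlockingCall were invoked from its own worker thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-thread task runner.
class ToyThread {
 public:
  ToyThread() : worker_([this] { Run(); }) {}
  ~ToyThread() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  // Runs fn on the owned thread and blocks the caller until it returns.
  int BlockingCall(std::function<int()> fn) {
    std::packaged_task<int()> task(std::move(fn));
    std::future<int> result = task.get_future();
    {
      std::lock_guard<std::mutex> lock(mu_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
    return result.get();
  }

  std::thread::id id() const { return worker_.get_id(); }

 private:
  void Run() {
    for (;;) {
      std::packaged_task<int()> task;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // shutting down, nothing left to run
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::packaged_task<int()>> tasks_;
  bool done_ = false;
  std::thread worker_;  // declared last so the members above exist first
};

// The "internal" object: not thread-safe by itself.
class ToyCounter {
 public:
  int Increment() {
    last_thread_ = std::this_thread::get_id();
    return ++value_;
  }
  std::thread::id last_thread() const { return last_thread_; }

 private:
  int value_ = 0;
  std::thread::id last_thread_;
};

// The proxy: same method, but always executed on the designated thread.
class ToyCounterProxy {
 public:
  ToyCounterProxy(ToyThread* signaling_thread, ToyCounter* c)
      : thread_(signaling_thread), c_(c) {}
  int Increment() {
    return thread_->BlockingCall([this] { return c_->Increment(); });
  }

 private:
  ToyThread* thread_;
  ToyCounter* c_;
};
```

A caller on any thread uses ToyCounterProxy::Increment exactly like the wrapped method, while ToyCounter itself only ever runs on the ToyThread worker, so it needs no locking of its own.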

2.2.3 PeerConnection::Create

This function completes a number of network-related setup steps, such as creating the SDP and RTP objects.

//webrtc/pc/peer_connection.cc
RTCErrorOr<rtc::scoped_refptr<PeerConnection>> PeerConnection::Create(
    rtc::scoped_refptr<ConnectionContext> context,
    const PeerConnectionFactoryInterface::Options& options,
    std::unique_ptr<RtcEventLog> event_log,
    std::unique_ptr<Call> call,
    const PeerConnectionInterface::RTCConfiguration& configuration,
    PeerConnectionDependencies dependencies) {
  // ... (elided in this excerpt, including the setup of is_unified_plan and
  // dtls_enabled)
  // The PeerConnection constructor depends on some of the dependencies; the
  // call argument is the object created in the previous section.
  auto pc = rtc::make_ref_counted<PeerConnection>(
      context, options, is_unified_plan, std::move(event_log), std::move(call),
      dependencies, dtls_enabled);
  RTCError init_error = pc->Initialize(configuration, std::move(dependencies));
  // ... (elided)
}

// The STUN/TURN server initialization is omitted from this excerpt.
RTCError PeerConnection::Initialize(
    const PeerConnectionInterface::RTCConfiguration& configuration,
    PeerConnectionDependencies dependencies) {
  // SDP handling.
  sdp_handler_ = SdpOfferAnswerHandler::Create(this, configuration,
                                               dependencies, context_.get());
  // RtpTransmissionManager manages the relationships and life cycles of the
  // RtpSender, RtpReceiver and RtpTransceiver objects.
  rtp_manager_ = std::make_unique<RtpTransmissionManager>(
      IsUnifiedPlan(), context_.get(), &usage_pattern_, observer_,
      legacy_stats_.get(), [this]() {
        RTC_DCHECK_RUN_ON(signaling_thread());
        sdp_handler_->UpdateNegotiationNeeded();
      });

  // With Plan B SDP, the audio and video transceivers are added at this point.
  // PlanB and UnifiedPlan are two different SDP negotiation schemes that
  // WebRTC uses when there are multiple media sources.
  if (!IsUnifiedPlan()) {
    rtp_manager()->transceivers()->Add(
        RtpTransceiverProxyWithInternal<RtpTransceiver>::Create(
            signaling_thread(), rtc::make_ref_counted<RtpTransceiver>(
                                    cricket::MEDIA_TYPE_AUDIO, context())));
    rtp_manager()->transceivers()->Add(
        RtpTransceiverProxyWithInternal<RtpTransceiver>::Create(
            signaling_thread(), rtc::make_ref_counted<RtpTransceiver>(
                                    cricket::MEDIA_TYPE_VIDEO, context())));
  }
  // ... (elided)
}

2.3 Conductor::AddTracks

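From the typical WebRTC conference startup steps in section 2.2, once the PeerConnection object has been created, step 3 calls its AddTrack method to add the audio and video Tracks. A Track in turn depends on a Source to supply its data. In the native example, AddTrack is reached from Conductor::InitializePeerConnection() in section 2.1: right after calling CreatePeerConnection() described in section 2.2, that function calls Conductor::AddTracks(), which is defined as follows: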

void Conductor::AddTracks() {
  // A non-empty sender list means the tracks have already been created,
  // because adding a track also adds a sender for it.
  if (!peer_connection_->GetSenders().empty()) {
    return;  // Already added tracks.
  }
  // Create the audio track; see section 2.4 for its AudioSource.
  rtc::scoped_refptr<webrtc::AudioTrackInterface> audio_track(
      peer_connection_factory_->CreateAudioTrack(
          kAudioLabel,
          peer_connection_factory_->CreateAudioSource(cricket::AudioOptions())
              .get()));
  // Add the audio track.
  auto result_or_error = peer_connection_->AddTrack(audio_track, kStreamId);
  if (!result_or_error.ok()) {
    RTC_LOG(LS_ERROR) << "Failed to add audio track to PeerConnection: "
                      << result_or_error.error().message();
  }

  // Create the video device.
  rtc::scoped_refptr<CapturerTrackSource> video_device =
      CapturerTrackSource::Create();
  // Create the video track.
  if (video_device) {
    rtc::scoped_refptr<webrtc::VideoTrackInterface> video_track_(
        peer_connection_factory_->