Today, real-time communications and streaming technologies are converging in ultra-low-latency use cases such as augmented reality/virtual reality (AR/VR), game streaming, and live streaming with fanout. These use cases may involve peer-to-peer interactions that traverse Network Address Translators (NATs), so they cannot be addressed solely with client/server APIs. Examples include streaming a game from a console to a mobile device, or a live streaming application that uses peer-to-peer fanout to improve scalability.
For these applications, today's WebRTC APIs may not be sufficient, due to:
-- Lack of support for custom metadata. In AR/VR applications, the metadata can be large enough to require custom packetization and rate control.
-- Lack of codec support. For music, the AAC codec is popular, but it is not supported in WebRTC implementations. However, AAC is supported in WebCodecs.
-- Lack of custom rate control. While WebRTC's built-in rate control is general purpose, it does not allow for rapid response to changes in bandwidth, as is possible with per-frame QP rate control in WebCodecs.
-- Inability to support custom RTCP messages. WebRTC implementations today do not support feedback messages such as LRR, RPSI or SLI, or extended statistics as provided by RTCP-XR.
Native applications can use raw UDP sockets, but those are not available on the web because they lack encryption, congestion control, and a mechanism for consent to send (to prevent DDoS attacks).
To enable new use cases, we think it would be useful to provide an API to send and receive RTP and RTCP packets.
The WebRTC-RtpTransport API enables web applications to support:
- Custom payloads (e.g., ML-based audio codecs)
- Custom packetization
- Custom FEC
- Custom RTX
- Custom Jitter Buffer
- Custom bandwidth estimate
- Custom rate control (with built-in bandwidth estimate)
- Custom bitrate allocation
- Custom metadata (header extensions)
- Custom RTCP messages
- Custom RTCP message timing
- RTP forwarding
This is not a UDP socket API: communication must remain encrypted and congestion-controlled.
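To make "custom packetization" from the list above concrete, here is a minimal TypeScript sketch of what an application might do before handing packets to the transport. It serializes the 12-byte fixed RTP header defined in RFC 3550 and naively splits a frame into payload-sized chunks. The names `RtpHeaderFields`, `serializeRtpPacket` and `packetizeFrame` are hypothetical helpers invented for this illustration; they are not part of the WebRTC-RtpTransport API, and a real payload format would usually add its own fragmentation header to each chunk.

```typescript
// Hypothetical helper types and functions for illustration only;
// none of these names come from the WebRTC-RtpTransport proposal.
interface RtpHeaderFields {
  payloadType: number;    // 7 bits
  sequenceNumber: number; // 16 bits, wraps around
  timestamp: number;      // 32 bits
  ssrc: number;           // 32 bits
  marker: boolean;        // set here on the last packet of a frame
}

const RTP_FIXED_HEADER_SIZE = 12;

// Serializes the RFC 3550 fixed header (no CSRCs, no extension) plus payload.
function serializeRtpPacket(header: RtpHeaderFields, payload: Uint8Array): Uint8Array {
  const packet = new Uint8Array(RTP_FIXED_HEADER_SIZE + payload.length);
  const view = new DataView(packet.buffer);
  view.setUint8(0, 0x80); // V=2, P=0, X=0, CC=0
  view.setUint8(1, (header.marker ? 0x80 : 0) | (header.payloadType & 0x7f));
  view.setUint16(2, header.sequenceNumber & 0xffff); // DataView defaults to big-endian
  view.setUint32(4, header.timestamp >>> 0);
  view.setUint32(8, header.ssrc >>> 0);
  packet.set(payload, RTP_FIXED_HEADER_SIZE);
  return packet;
}

// Splits one encoded frame into RTP packets of at most maxPayloadSize payload bytes.
// All packets share the frame's timestamp; the marker bit flags the final packet.
function packetizeFrame(
  frame: Uint8Array,
  maxPayloadSize: number,
  base: Omit<RtpHeaderFields, "marker">,
): Uint8Array[] {
  if (maxPayloadSize <= 0) {
    throw new RangeError("maxPayloadSize must be positive");
  }
  const packets: Uint8Array[] = [];
  let sequenceNumber = base.sequenceNumber & 0xffff;
  for (let offset = 0; offset < frame.length; offset += maxPayloadSize) {
    const end = Math.min(offset + maxPayloadSize, frame.length);
    packets.push(
      serializeRtpPacket(
        { ...base, sequenceNumber, marker: end === frame.length },
        frame.subarray(offset, end),
      ),
    );
    sequenceNumber = (sequenceNumber + 1) & 0xffff;
  }
  return packets;
}
```

For example, a 2500-byte frame packetized with `maxPayloadSize` of 1000 yields three packets, with the marker bit set only on the third. The caller is responsible for carrying the next sequence number over to the following frame.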
WebRTC-RtpTransport can be used to implement the following use cases:
- Use Case 1: Custom Packetization
- Use Case 2: Custom Congestion Control
WebRTC-RtpTransport enables these use cases by enabling applications to:
- Encode with a custom (WASM) codec, packetize and send
- Obtain frames from Encoded Transform API, packetize and send
- Obtain frames from Encoded Transform API, apply custom FEC, and send
- Observe incoming NACKs and resend with custom RTX behavior
- Observe incoming packets and customize when NACKs are sent
- Receive packets using a custom jitter buffer implementation
- Use WebCodecs for encode or decode, implement packetization/depacketization and a custom jitter buffer
- Receive packets, depacketize and inject into Encoded Transform (requires a constructor for EncodedAudioFrame/EncodedVideoFrame)
- Observe incoming feedback and/or estimates from built-in congestion control and implement custom rate control (as long as the sending rate is lower than the bandwidth estimate provided by built-in congestion control)
- Obtain frames from Encoded Transform API, packetize, attach custom metadata, and send
- Obtain a bandwidth estimate from RtpTransport, do bitrate allocation, and set bitrates of RtpSenders
- Forward RTP/RTCP packets from one PeerConnection to another, with full control over the entire packet (modulo SRTP and congestion control exceptions)
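
As an illustration of the "observe incoming NACKs and resend" item above, the following TypeScript sketch decodes the feedback control information (FCI) of an RTCP generic NACK as laid out in RFC 4585 (a 16-bit packet ID followed by a 16-bit bitmask of following lost packets), then applies one possible resend policy. The function names and the `Map`-based send history are assumptions made for this example, not part of the WebRTC-RtpTransport API, and the sketch does not show how the NACK bytes would be obtained from the transport.

```typescript
// Hypothetical helpers for illustration only; not part of any proposed API.

// Expands one generic NACK entry (PID + BLP) into lost sequence numbers.
// Bit i of BLP (least significant bit is i = 0) reports packet PID + i + 1 as lost.
function expandNackEntry(pid: number, blp: number): number[] {
  const lost: number[] = [pid & 0xffff];
  for (let bit = 0; bit < 16; bit++) {
    if (blp & (1 << bit)) {
      lost.push((pid + bit + 1) & 0xffff); // sequence numbers wrap at 16 bits
    }
  }
  return lost;
}

// Parses the FCI portion of a generic NACK: a sequence of 4-byte (PID, BLP) entries.
function parseGenericNackFci(fci: Uint8Array): number[] {
  if (fci.length % 4 !== 0) {
    throw new RangeError("generic NACK FCI length must be a multiple of 4");
  }
  const view = new DataView(fci.buffer, fci.byteOffset, fci.byteLength);
  const lost: number[] = [];
  for (let offset = 0; offset < fci.length; offset += 4) {
    lost.push(...expandNackEntry(view.getUint16(offset), view.getUint16(offset + 2)));
  }
  return lost;
}

// One possible custom policy: resend only packets still held in the
// application's own send history, and at most maxResends of them.
function selectRetransmissions(
  lost: number[],
  history: Map<number, Uint8Array>,
  maxResends: number,
): Uint8Array[] {
  const resend: Uint8Array[] = [];
  for (const sequenceNumber of lost) {
    if (resend.length >= maxResends) break;
    const packet = history.get(sequenceNumber);
    if (packet !== undefined) resend.push(packet);
  }
  return resend;
}
```

An application could tighten `maxResends` when the bandwidth estimate drops, or skip packets belonging to frames that are already too late to be useful; those choices are exactly what custom RTX behavior leaves open.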