Working with VP8 Simulcast - Twilio

Working with VP8 Simulcast

Overview

This guide introduces the Simulcast technique and explains how you can use it to enhance the video quality of your Group Room applications.

What is Simulcast

An SFU (Selective Forwarding Unit) is a media infrastructure component used for scaling videoconferences. Twilio’s Group Rooms are based on an SFU that lets developers add a large number of Participants to a video room by forwarding audio, video, and data from each publisher to each of its subscribers. Because this forwarding takes place in Twilio’s cloud, there is no additional client-side CPU or memory consumption as the number of Room Participants increases. The drawback of this architecture is that an SFU only forwards media; it can neither transcode nor modify the video. Hence, when some subscribers have limited receive bandwidth, publishers must reduce quality to suit the worst of them so that no subscriber is congested. As shown in the following figure, this is suboptimal because it constrains the quality for Participants who could otherwise communicate at much higher quality.

In Group Rooms, by default the video quality is constrained to the worst of the available bandwidths of the participants.

Simulcast is a standardized technique designed for solving this problem. Simulcast involves the simultaneous sending of different versions of the same video track encoded independently at different resolutions and framerates. With Simulcast, the SFU has several versions of the track with different qualities, so that it can forward higher qualities to higher bandwidth subscribers and lower qualities to lower bandwidth ones. In more technical jargon, we say that Simulcast is a mechanism for providing scalability to non-scalable video codecs such as VP8.

Simulcast involves the simultaneous sending of several versions of the same video track so that the SFU infrastructure can forward different qualities to different subscribers depending on their network status and capabilities.

Note that Simulcast involves the track publisher (which needs to send the different track qualities) and the SFU (which selects the most suitable quality for each subscriber). However, Participants that act only as subscribers are not aware of Simulcast, as they just receive a standard VP8-encoded video. Hence, they can neither enable nor disable the use of Simulcast.
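The selection behavior described above can be sketched as follows. This is a simplified illustration and not Twilio's actual SFU algorithm: the function names, the layer list, and every bitrate figure are hypothetical, and a real SFU also reacts to packet loss and changing network conditions.

```javascript
// Hypothetical simulcast layers published for one video track, sorted from
// lowest to highest quality. Bitrates are illustrative, not Twilio's values.
const layers = [
  { name: 'low', width: 320, height: 180, kbps: 150 },
  { name: 'medium', width: 640, height: 360, kbps: 500 },
  { name: 'high', width: 1280, height: 720, kbps: 1500 },
];

// With Simulcast: pick, per subscriber, the best layer that fits its
// estimated receive bandwidth. Fall back to the lowest layer if none fits.
function selectLayer(sortedLayers, subscriberKbps) {
  let chosen = sortedLayers[0];
  for (const layer of sortedLayers) {
    if (layer.kbps <= subscriberKbps) {
      chosen = layer;
    }
  }
  return chosen;
}

// Without Simulcast: a single encoding must fit the slowest subscriber.
function bitrateWithoutSimulcast(subscriberBandwidthsKbps) {
  return Math.min(...subscriberBandwidthsKbps);
}

const subscribers = [200, 800, 3000]; // hypothetical receive bandwidths (kbps)
const perSubscriber = subscribers.map((kbps) => selectLayer(layers, kbps).name);
console.log(perSubscriber); // one layer name per subscriber
console.log(bitrateWithoutSimulcast(subscribers)); // single shared bitrate
```

With Simulcast, each subscriber is served independently, whereas the single-encoding case ties every subscriber to the slowest link.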

Enabling Simulcast in your Twilio application

Simulcast can be enabled in Group Room clients sending media to Twilio’s SFU. The following table illustrates Twilio’s current support for Simulcast:

Twilio Video SDK | Browser (or N/A) | VP8 Simulcast Support (only Group Rooms)
JavaScript       | Chrome           | Yes (SDK v1.7.0+)
JavaScript       | Firefox          | No
JavaScript       | Safari           | No
Android          | N/A              | No
iOS              | N/A              | Yes (SDK v2.1.0+)

Enabling Simulcast in Chrome using the JavaScript SDK (requires v1.7.0+)

By default, Simulcast is disabled. You can enable Simulcast on a per-Participant basis at Room connect-time. This is done using the ConnectOptions as shown in the following code snippet:

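// Web JavaScript
// Remember that Simulcast only needs to be enabled for media publishers
const { connect } = require('twilio-video');

// `token` is the Access Token issued by your application server
const room = await connect(token, {
    preferredVideoCodecs: [
      { codec: 'VP8', simulcast: true }
    ]
});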

Any Group Room Participant with VP8 Simulcast enabled publishes all its video tracks using VP8 Simulcast. Once this is done, Twilio’s video infrastructure leverages Simulcast tracks to provide the best possible quality to any subscriber without requiring any additional action from you.
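Because the support table above lists Simulcast only for Chrome publishers in Group Rooms, an application may want to decide at runtime whether to request it. The sketch below shows one way to do that. The `buildConnectOptions` helper and its `browser` and `roomType` parameters (including their string values) are hypothetical and are not part of the Twilio Video SDK; only the `preferredVideoCodecs` shape comes from the snippet above.

```javascript
// Hypothetical helper: returns ConnectOptions with VP8 Simulcast requested
// only where the support table above says it is available.
function buildConnectOptions({ browser, roomType }) {
  const simulcast = browser === 'chrome' && roomType === 'group';
  return {
    preferredVideoCodecs: [{ codec: 'VP8', simulcast }],
  };
}

// Usage (detecting browser and roomType is left to the application):
// const room = await connect(token, buildConnectOptions({ browser: 'chrome', roomType: 'group' }));
console.log(JSON.stringify(buildConnectOptions({ browser: 'firefox', roomType: 'group' })));
```

Keeping the decision in one function makes it easy to update when SDK support changes.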

Enabling Simulcast using the iOS SDK (requires v2.1.0+)

By default, Simulcast is disabled. You can enable Simulcast on a per-Participant basis at Room connect-time. This is done using the ConnectOptions as shown in the following code snippet:

// Swift code
// Remember that Simulcast only needs to be enabled for media publishers

let connectOptions = TVIConnectOptions.init(token: accessToken) { (builder) in
    builder.preferredVideoCodecs = [TVIVp8Codec(simulcast: true)]
}

Any Group Room Participant with VP8 Simulcast enabled publishes all its video tracks using VP8 Simulcast. Once this is done, Twilio’s video infrastructure leverages Simulcast tracks to provide the best possible quality to any subscriber without requiring any additional action from you.

Resolution and Simulcast layers

Twilio SDKs encode up to three spatial layers when Simulcast is enabled. The following table illustrates which layers are typically generated for a given capture resolution. Note that this is just an approximation and the real behavior may differ slightly. In the table, disabled means that the layer is not sent under those conditions (i.e., that quality is not generated by the publisher and hence is not available at the SFU to be forwarded to subscribers).

Capture resolution | Layer 1 | Layer 2  | Layer 3
352x288            | 352x288 | disabled | disabled
480x360            | 240x180 | 480x360  | disabled
640x480            | 320x240 | 640x480  | disabled
960x540            | 240x135 | 480x270  | 960x540
1280x720           | 320x180 | 640x360  | 1280x720
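The rows above follow a simple pattern: each lower layer halves the width and height of the layer above it. The sketch below reproduces the table under that reading. The function name and the pixel-count thresholds that decide how many layers are produced are assumptions inferred only from the five rows listed, not documented SDK behavior, so the result should be treated as a rough guide.

```javascript
// Approximates the layer table above. Thresholds are assumptions inferred
// from the table rows, not documented SDK behavior.
function approximateSimulcastLayers(width, height) {
  const pixels = width * height;
  let layerCount = 1;
  if (pixels >= 960 * 540) {
    layerCount = 3;
  } else if (pixels >= 480 * 360) {
    layerCount = 2;
  }

  const layers = [];
  for (let i = 0; i < layerCount; i++) {
    // Layer 1 is the smallest; each later layer doubles both dimensions.
    const divisor = 2 ** (layerCount - 1 - i);
    layers.push(`${Math.round(width / divisor)}x${Math.round(height / divisor)}`);
  }
  return layers;
}

console.log(approximateSimulcastLayers(1280, 720));
console.log(approximateSimulcastLayers(640, 480));
console.log(approximateSimulcastLayers(352, 288));
```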

The media engine on the SDKs may choose to downscale simulcast video based upon feedback like available bandwidth and CPU usage. The following example shows the spatial layers generated by a mobile Client after downscaling the captured video from 1280x720 to 960x540.

Capture resolution | Layer 1 | Layer 2 | Layer 3 (Downscaled)
1280x720           | 240x135 | 480x270 | 960x540

Pros and cons of Simulcast

When enabling Simulcast in your Group Rooms application you enjoy the following advantages:

  • VP8 subscribers enjoy differentiated quality adapted to their available bandwidth. This significantly improves the quality on Group Rooms with many heterogeneous Participants.
  • VP8 subscribers are isolated from each other so that a subscriber with a degraded network link does not affect the reception quality of other subscribers.

On the other hand, Simulcast also has some drawbacks:

  • Simulcast only improves video quality in Group Rooms with 3 or more Participants.
  • Publisher battery consumption is higher due to the need to encode multiple versions of the same video track.
  • Publisher bandwidth consumption is higher (up to double in some cases) due to sending multiple versions of the same video track. Note that this increase does not impact your Programmable Video costs, as Twilio does not charge for upstream (i.e., from sender to Twilio’s cloud) bandwidth.

Limitations and known issues

  • Simulcast should only be used in Group Rooms. Using it in P2P Rooms does not improve quality and only degrades application performance.
  • Simulcast is only supported for the VP8 video codec.
  • The combination of Simulcast and oscillating bandwidth conditions at the publisher might produce suboptimal recording quality. If the primary objective of your application is optimal recording quality, you might prefer not to enable Simulcast.
Luis Lopez
