Managing Codecs

Introduction

The term codec is a portmanteau of encoder and decoder. An encoder is a device or piece of software that encodes a media signal, typically compressing it in the process. A decoder performs the opposite operation and decodes the media for playback. Intuitively, a codec can be seen as “the language in which the media is represented”. Hence, for a multimedia communication to take place, the parties must support at least one shared codec. There are many audio and video codecs on the market, each with different properties in terms of required computing resources, compression ratio and fidelity. In this guide, we show how Twilio’s Programmable Video Platform enables developers to select the most appropriate codec for two objectives:

  1. guaranteeing interoperability
  2. optimizing the quality of experience of their end-users.

Encoding and decoding video.

Twilio Video SDKs: Supported codecs

The codecs supported by Twilio’s client SDKs are platform dependent. In the JavaScript SDK, it's up to the browser vendors to provide codec implementations, while in the mobile SDKs support depends on the device’s capabilities.

Video Codecs Supported by Twilio’s Programmable Video SDKs.

SDK or Browser | VP8 | VP9 | H.264
Chrome 57+ | Yes | Yes | Yes
Chrome <57 | Yes | Yes | No (Enable with internal flag)
Safari 11 | No | No | Yes
Firefox 55+ | Yes | Yes | Yes
Video iOS 2.0+ | Yes | Yes | Yes *
Video iOS 1.x | Yes | Yes | No
Video Android 2.0+ | Yes | Yes | Yes ** (If hardware supports it)
Video Android 1.x | Yes | No | No

Check the browser support per platform table.

* Twilio Video for iOS relies on hardware support for H.264. Hardware encode/decode is supported on all iOS devices.

** Twilio Video for Android relies on hardware support for H.264, which depends on devices’ capabilities. Developers can evaluate H.264 support in Android SDK v2.0+ using the following code snippet.

boolean isH264Supported = MediaCodecVideoDecoder.isH264HwSupported() &&
            MediaCodecVideoEncoder.isH264HwSupported();
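
In browsers, a comparable check can be sketched with the standard WebRTC capabilities API. The helper below is a minimal illustration, not part of the Twilio SDK: `listCodecNames` and `sampleCapabilities` are hypothetical names, and the sample object only imitates the shape of what `RTCRtpSender.getCapabilities('video')` returns in browsers that implement it (older browser versions may not).

```javascript
// Extract the distinct codec names from an RTCRtpCapabilities-like object.
// In a browser that implements it, such an object can be obtained with
// RTCRtpSender.getCapabilities('video').
function listCodecNames(capabilities) {
  if (!capabilities || !Array.isArray(capabilities.codecs)) {
    return [];
  }
  // Retransmission and error-correction entries are not media codecs.
  const auxiliary = ['RTX', 'RED', 'ULPFEC', 'FLEXFEC-03'];
  const names = capabilities.codecs
    .map((codec) => (codec.mimeType || '').split('/').pop().toUpperCase())
    .filter((name) => name !== '' && !auxiliary.includes(name));
  // A codec may appear several times (one entry per profile); keep one.
  return [...new Set(names)];
}

// Hypothetical sample shaped like the browser API's return value.
const sampleCapabilities = {
  codecs: [
    { mimeType: 'video/VP8', clockRate: 90000 },
    { mimeType: 'video/rtx', clockRate: 90000 },
    { mimeType: 'video/H264', clockRate: 90000 },
    { mimeType: 'video/H264', clockRate: 90000 },
  ],
};

console.log(listCodecNames(sampleCapabilities)); // [ 'VP8', 'H264' ]
```

Passing the capabilities object in as a parameter keeps the helper usable outside a browser, for example in unit tests.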

Audio Codecs Supported by Twilio’s Programmable Video SDKs

SDK or Browser | iSAC | OPUS | PCMU | PCMA | G.722
Chrome | Yes | Yes | Yes | Yes | Yes
Safari 11 | Yes | Yes | Yes | Yes | Yes
Firefox 55+ | Yes | Yes | Yes | Yes | Yes
Video iOS 2.0+ | Yes | Yes | Yes | Yes | Yes
Video iOS 1.x | Yes | Yes | Yes | Yes | Yes
Android 2.0+ | Yes | Yes | Yes | Yes | Yes
Android 1.x | No | Yes | No | No | No

Twilio Video Rooms: Codec Interoperability

Before establishing a multimedia communication, the involved parties need to agree on the codecs to be used. Twilio manages this through an automatic codec negotiation that imposes some interoperability restrictions.

Codec interoperability in P2P Rooms

In P2P Rooms, communications take place in a “one-to-one” manner, meaning that if a client A is sharing media with clients B and C, then A effectively sends one media stream to B and a separate media stream to C. This means that A can use different codec suites for each of these streams. This is illustrated in the following figure, which represents a P2P video Room where participants A, B and C use Chrome, Firefox and Safari 11 browsers respectively. As can be observed, Chrome and Firefox support VP8, VP9 and H.264; however, by default they communicate using VP8. Safari 11, on the other hand, only supports H.264. Hence, Chrome and Firefox need to use H.264 when communicating with it.

In P2P rooms, Twilio's SDKs can combine different codecs in order to achieve interoperability. For example Chrome and Firefox support both VP8 and H.264 but by default communicate using VP8. On the other hand, Safari 11 only supports H.264.

Twilio’s Programmable Video default codecs for P2P Rooms are the following:

Default video codecs used by each pair of clients in P2P Rooms

Client | Chrome | Firefox | Safari 11 | Android | iOS
Chrome | VP8 | VP8 | H.264 | VP8 | VP8
Firefox | VP8 | VP8 | H.264 | VP8 | VP8
Safari 11 | H.264 | H.264 | H.264 | H.264* | H.264
Android | VP8 | VP8 | H.264* | VP8 | VP8
iOS | VP8 | VP8 | H.264 | VP8 | VP8

* Android H.264 relies on hardware support, which depends on devices’ capabilities.

Default audio codecs used by each pair of clients in P2P Rooms

Client | Chrome | Firefox | Safari 11 | Android | iOS
Chrome | OPUS | OPUS | OPUS | OPUS | OPUS
Firefox | OPUS | OPUS | OPUS | OPUS | OPUS
Safari 11 | OPUS | OPUS | OPUS | OPUS | OPUS
Android | OPUS | OPUS | OPUS | OPUS | OPUS
iOS | OPUS | OPUS | OPUS | OPUS | OPUS

Hence, in P2P Rooms, when using default settings, the following holds:

  • All the supported client SDKs interoperate their audio communications (using OPUS codec)
  • All the supported client SDKs interoperate their video communications (using either VP8 or H.264) except for Android-Safari 11 video links, where interoperability is only guaranteed when the Android device supports H.264 in hardware.
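
The per-link behavior of P2P Rooms can be approximated with a small model. This is a deliberately simplified sketch, not the actual negotiation performed by the SDKs: the `videoCodecs` data is transcribed by hand from the tables in this guide (the relative order of H.264 and VP9 is an assumption), and `selectP2PCodec` is a hypothetical helper.

```javascript
// Video codecs per client, in assumed default preference order.
// Illustrative data copied from the tables above, not read from any SDK.
const videoCodecs = {
  chrome: ['VP8', 'H264', 'VP9'],
  firefox: ['VP8', 'H264', 'VP9'],
  safari11: ['H264'],
};

// Pick the first codec in the sender's preference order that the receiver
// also supports. Returns null when the two clients share no codec.
function selectP2PCodec(senderCodecs, receiverCodecs) {
  const match = senderCodecs.find((codec) => receiverCodecs.includes(codec));
  return match === undefined ? null : match;
}

console.log(selectP2PCodec(videoCodecs.chrome, videoCodecs.firefox)); // VP8
console.log(selectP2PCodec(videoCodecs.chrome, videoCodecs.safari11)); // H264
console.log(selectP2PCodec(videoCodecs.safari11, videoCodecs.firefox)); // H264
```

Because each link is evaluated independently, Chrome can use VP8 towards Firefox and H.264 towards Safari 11 at the same time, which matches the default codec table for P2P Rooms.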

Codec interoperability in Group Rooms

In Group Rooms, an SFU (Selective Forwarding Unit) mediates among clients. This means that if client A wants to send media to clients B and C, then A effectively sends only one media stream to the SFU that, in turn, forwards it to B and C. Due to this, in Group Rooms the codec negotiation takes place based on the following principles:

  • For inbound tracks (i.e. tracks getting into the SFU), the SFU tries to accept the codec preferred by the client as long as this codec is supported by the SFU.
  • For outbound tracks (i.e. tracks leaving the SFU), the tracks are only offered in the codec in which they are being received.

These principles have interoperability implications. First, the only codecs allowed are the ones supported by the SFU, which the following tables summarize:

Video Codecs supported in Group Rooms SFU infrastructure

Codec | VP8 | VP9 | H.264
Group Rooms Support | Yes | No | Yes

Audio Codecs supported in Group Rooms SFU infrastructure

Codec | iSAC | OPUS | PCMU | PCMA | G.722
Group Rooms Support | No | Yes | Yes | No | No

Second, the codecs negotiated at some point in a call may limit interoperability with clients joining later. As a result, in Group Rooms (unlike in P2P Rooms), the fact that a set of clients supports a common codec does not guarantee that they will actually be able to communicate. To illustrate this, imagine a scenario in which two participants A and B, using Chrome and Firefox respectively, start communicating using VP8. If a Safari 11 browser joins later (as C), it will not be able to receive A's and B’s video tracks because Safari 11 only supports H.264. This is unfortunate because both A and B could communicate using H.264, but when they connected to the SFU they negotiated their default, which is VP8. This scenario is represented in the following figure:

The use of Safari 11 in Group Rooms may generate interoperability problems when using Twilio's default codec settings. For example, when Chrome and Firefox connect to a Group Room they negotiate VP8 by default. If a Safari 11 client connects later to that room, it will not be able to receive Chrome and Firefox VP8 video tracks. However, thanks to their H.264 codec support, Chrome and Firefox can seamlessly receive the video track published by Safari 11.

In order to predict when these types of interoperability problems emerge, observe the following tables that illustrate the default codec negotiated between the different clients and the SFU for publishing to a Group Room:

Default video codecs negotiated for publishing a video track to a Group Room

Default codec
Chrome VP8
Firefox VP8
Safari 11 H.264
Android VP8
iOS VP8

Default audio codecs negotiated for publishing an audio track to a Group Room

Default codec
Chrome OPUS
Firefox OPUS
Safari 11 OPUS
Android OPUS
iOS OPUS

As a conclusion, in Group Rooms, and using the default settings:

  • All the supported client SDKs will interoperate their audio communications (using OPUS codec)
  • All the supported client SDKs will interoperate their video communications (using VP8) except for Safari 11 clients.
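
The two negotiation principles for Group Rooms can be expressed as a small model that predicts who can receive whose video. This is a simplified sketch under stated assumptions: the codec lists are transcribed from the tables in this guide, and `publishCodec` and `canReceive` are hypothetical helpers, not SDK functions.

```javascript
// Video codecs accepted by the Group Rooms SFU (from the table above).
const SFU_VIDEO_CODECS = ['VP8', 'H264'];

// Clients with their codecs in assumed default preference order.
const chrome = { name: 'Chrome', codecs: ['VP8', 'H264', 'VP9'] };
const firefox = { name: 'Firefox', codecs: ['VP8', 'H264', 'VP9'] };
const safari11 = { name: 'Safari 11', codecs: ['H264'] };

// Inbound principle: the SFU accepts the client's most preferred codec
// among those the SFU itself supports.
function publishCodec(publisher, sfuCodecs) {
  const codec = publisher.codecs.find((c) => sfuCodecs.includes(c));
  return codec === undefined ? null : codec;
}

// Outbound principle: a track is only offered in the codec in which the
// SFU receives it, so the subscriber must support that exact codec.
function canReceive(subscriber, publisher, sfuCodecs) {
  const codec = publishCodec(publisher, sfuCodecs);
  return codec !== null && subscriber.codecs.includes(codec);
}

console.log(canReceive(safari11, chrome, SFU_VIDEO_CODECS)); // false
console.log(canReceive(chrome, safari11, SFU_VIDEO_CODECS)); // true
// Restricting the room to H.264 removes the asymmetry.
console.log(canReceive(safari11, chrome, ['H264'])); // true
```

The last call previews the effect of limiting the room to a single codec: once VP8 is no longer accepted, Chrome falls back to H.264 and Safari 11 can receive its track.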

Using Safari 11 in Group Rooms

To guarantee video interoperability with Safari 11, we can force all the clients of a Group Room to publish their video using H.264. This can be achieved using either of two mechanisms:

  • When the Group Room is created using the Rooms REST API, you can set the following parameter: VideoCodecs=H264.
  • You can use the Room Settings menu located at your Twilio Video Console and set H.264 as the Group Rooms video codec, as shown on the following figure:
    Default Room Settings in Programmable Video Console: Video Codec specification

When doing so, the SFU refuses to accept tracks published in VP8 and forces all the clients to use H.264, which guarantees interoperability with Safari 11. The resulting Group Room topology in this case is the one shown below:

If you want to guarantee video interoperability with Safari 11 you must configure the Group Rooms SFU to accept only H.264. This can be done using Room Settings at the Twilio Console or through the Rooms REST API.

For further details about using Safari 11 in Group Rooms see the Working with Safari 11 Guide.
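
As a rough illustration of the REST option, the sketch below builds (without sending) a form-encoded request that carries `VideoCodecs=H264`. Only the `VideoCodecs` parameter comes from this guide. The endpoint URL, the `Type` and `UniqueName` parameters, and the use of HTTP Basic authentication with an Account SID and Auth Token are assumptions to verify against the Rooms REST API reference, and `buildCreateRoomRequest` is a hypothetical helper.

```javascript
// Build the HTTP request description for creating an H.264-only Group Room.
// Nothing is sent over the network; the caller decides how to dispatch it.
function buildCreateRoomRequest(accountSid, authToken, uniqueName) {
  const body = new URLSearchParams();
  body.append('UniqueName', uniqueName); // assumed parameter name
  body.append('Type', 'group');          // assumed parameter name and value
  body.append('VideoCodecs', 'H264');    // parameter named in this guide

  const credentials = Buffer.from(`${accountSid}:${authToken}`).toString('base64');
  return {
    url: 'https://video.twilio.com/v1/Rooms', // assumed endpoint
    method: 'POST',
    headers: {
      Authorization: `Basic ${credentials}`,
      'Content-Type': 'application/x-www-form-urlencoded',
    },
    body: body.toString(),
  };
}

// Placeholder credentials and room name, for illustration only.
const request = buildCreateRoomRequest('ACxxxxxxxx', 'your_auth_token', 'DailyStandup');
console.log(request.body); // UniqueName=DailyStandup&Type=group&VideoCodecs=H264
```

The returned object can be handed to any HTTP client; keeping construction separate from sending makes the parameters easy to inspect before a room is actually created.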

Controlling codecs client side: Codec Preferences

To optimize an application for media quality or battery life, it is sometimes useful to override the codec defaults on the client side. For this purpose, Twilio’s APIs provide a Codec Preferences capability that is available in the SDKs listed in the table below:

Twilio SDKs supporting Codec Preferences
JavaScript SDK v1.3+
Android SDK v2.0+
iOS SDK v2.0+

Codec Preferences are based on the following principles:

  • It allows selecting both the preferred audio codecs and the preferred video codecs in which media tracks are published by a client SDK.
  • Selected codecs are provided as an ordered list of codec specifications, so that codecs specified first have higher preference.
  • Only codecs supported at a given SDK can be specified as preferred.
  • Codec Preferences are set when a participant connects to a room as part of the ConnectOptions.
  • Selecting a preferred codec does not guarantee that the SDK will choose that codec. The preference just means that the codec has higher priority among the list of possible codecs to use. The codec actually chosen may depend on other aspects, such as the codecs supported by the other parties of the communication.
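
The principles above amount to reordering the list of codecs a client offers. The following sketch shows one way to picture that reordering. It is an illustrative assumption about the behavior, not the SDK's implementation, and `applyCodecPreferences` is a hypothetical helper that silently ignores unsupported preferences.

```javascript
// Move the preferred codecs to the front of the supported list, keeping the
// order in which preferences were given. Unsupported preferences are dropped
// and the remaining supported codecs keep their original relative order.
function applyCodecPreferences(supported, preferred) {
  const usable = [...new Set(preferred)].filter((codec) => supported.includes(codec));
  const rest = supported.filter((codec) => !usable.includes(codec));
  return [...usable, ...rest];
}

console.log(applyCodecPreferences(['VP8', 'VP9', 'H264'], ['H264']));
// [ 'H264', 'VP8', 'VP9' ]
console.log(applyCodecPreferences(['H264'], ['VP8']));
// [ 'H264' ]
```

The second call shows why a preference is not a guarantee: a client that only supports H.264 still ends up with H.264, whatever was requested.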

The following code snippets illustrate how a Participant can connect to a room preferring iSAC as audio codec and H.264 as video codec.

Selecting preferred codecs in JavaScript SDK (required v1.3+)

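// Web JavaScript
// Assumes `connect` comes from the twilio-video package and that this
// code runs inside an async function.
const room = await connect(token, {
  preferredAudioCodecs: ['isac'],
  preferredVideoCodecs: ['H264']
});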

Selecting preferred codecs in Android SDK (required v2.0+)

// Android Java

//Check if H.264 is supported in this device
boolean isH264Supported = MediaCodecVideoDecoder.isH264HwSupported() &&
    MediaCodecVideoEncoder.isH264HwSupported();

// Prefer H264 if it is hardware available for encoding and decoding
VideoCodec videoCodec = isH264Supported ? (new H264Codec()) : (new Vp8Codec());
ConnectOptions connectOptions = new ConnectOptions.Builder(token)
    .preferAudioCodecs(Collections.singletonList(new IsacCodec()))
    .preferVideoCodecs(Collections.singletonList(videoCodec))
    .build();

Room room = Video.connect(context, connectOptions, listener);

Selecting preferred codecs in iOS SDK (required v2.0+)

// iOS Swift
let options = TVIConnectOptions.init(token: accessToken, block: { (builder: TVIConnectOptionsBuilder) -> Void in
    builder.preferredAudioCodecs = [ TVIIsacCodec() ]
    builder.preferredVideoCodecs = [ TVIH264Codec() ]
})

var room = TwilioVideo.connect(with: options, delegate: self)

// iOS Objective-C
TVIConnectOptions *options = [TVIConnectOptions optionsWithToken:self.accessToken
                                block:^(TVIConnectOptionsBuilder * _Nonnull builder) {
    builder.preferredAudioCodecs = @[ [TVIIsacCodec new] ];
    builder.preferredVideoCodecs = @[ [TVIH264Codec new] ];
}];

TVIRoom *room = [TwilioVideo connectWithOptions:options delegate:self];

For further information on how Codec Preferences work, please check the corresponding SDK reference documentation regarding the connect primitive and its ConnectOptions parameter.

Multi-codec rooms: using Codec Preferences for optimizing battery life

In Twilio Programmable Video, we say that a room is using Multi-codec capabilities when different participants send their video (or audio) using different codecs. Multi-codecs may be useful for optimizing the battery life of your applications, as many vendors provide hardware acceleration support for specific codec suites. For example, all iOS devices have H.264 hardware acceleration, while many modern Android devices provide it for both H.264 and VP8. When different devices have different hardware support, it may be useful to send media using one codec (i.e. the one supported by the local device hardware) while receiving in another (i.e. the one hardware-supported by the remote parties).

This can be achieved using Codec Preferences. For example, imagine three clients A, B and C being respectively Chrome, Firefox and an iOS smartphone. By default (see sections above) these clients negotiate VP8 as the video codec. However, we may prefer to use H.264 in C to optimize the battery life of the iOS smartphone. In this case, we just need to set H.264 as C’s preferred codec. The following code snippets illustrate how to do it:

Participant A - Codec Preferences

// We don't really need this: VP8 is Chrome's default preferred video codec
const room = await connect(token, {
    preferredVideoCodecs: ['VP8']
});

Participant B - Codec Preferences

// We don't really need this: VP8 is Firefox's default preferred video codec
const room = await connect(token, {
    preferredVideoCodecs: ['VP8']
});

Participant C - Codec Preferences

// iOS Swift
let options = TVIConnectOptions.init(token: accessToken, block: { (builder: TVIConnectOptionsBuilder) -> Void in
    builder.preferredVideoCodecs = [ TVIH264Codec() ]
})
var room = TwilioVideo.connect(with: options, delegate: self)

In this case, if A, B and C connect to a P2P Room, the generated communication topology is the following:

Multi-codecs example in a P2P Room

On the other hand, if A, B and C are Group Room participants, the topology is the following:

Multi-codecs example in a Group Room

Multi-codecs limitations and known issues

  • In Group Rooms, Multi-codec support for video is currently limited to H.264 and VP8.
  • In Group Rooms, Multi-codec support for audio is currently limited to Opus and PCMU.
  • In Group Rooms, support for H.264 video is currently limited to the Constrained Baseline profile at level 3.1. This means that the maximum resolution of a video track is 1280x720 at 30 fps.
  • The use of Multi-codecs in a Group Room with recording activated causes recordings to be also Multi-codec. This means that a recording keeps the codec of the track originating it.
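
The H.264 resolution limit can be checked before publishing a track. The sketch below simply compares a capture format against the 1280x720 at 30 fps maximum stated above. It is not a full evaluation of the H.264 level 3.1 constraints, and `fitsGroupRoomH264` and `H264_GROUP_ROOM_MAX` are hypothetical names.

```javascript
// Maximum H.264 video format for Group Rooms, as stated in this guide.
const H264_GROUP_ROOM_MAX = { width: 1280, height: 720, frameRate: 30 };

// Return true when the capture format stays within the stated maximum.
// Pixel count is compared instead of width and height separately so that
// portrait formats (for example 720x1280) are treated like landscape ones.
function fitsGroupRoomH264({ width, height, frameRate }) {
  const maxPixels = H264_GROUP_ROOM_MAX.width * H264_GROUP_ROOM_MAX.height;
  return width * height <= maxPixels && frameRate <= H264_GROUP_ROOM_MAX.frameRate;
}

console.log(fitsGroupRoomH264({ width: 1280, height: 720, frameRate: 30 })); // true
console.log(fitsGroupRoomH264({ width: 1920, height: 1080, frameRate: 30 })); // false
```

A check like this could be used to pick capture constraints for clients that prefer H.264 in a Group Room.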
