Managing Codecs

This page is for reference only. We are no longer onboarding new customers to Programmable Video. Existing customers can continue to use the product until December 5, 2024.

We recommend migrating your application to the API provided by our preferred video partner, Zoom. We've prepared this migration guide to assist you in minimizing any service disruption.

Introduction

Last updated 27 September 2023

The term codec is a portmanteau of encoder and decoder. An encoder is a device or piece of software that encodes a media signal, typically compressing it in the process. A decoder performs the opposite operation and decodes the media for playback. Intuitively, a codec can be seen as “the language in which the media is represented”. Hence, for a multimedia communication to take place, the parties must support at least one shared codec. There are many audio and video codecs on the market, each with different properties in terms of required computing resources, compression ratio and fidelity. In this guide, we show how Twilio’s Programmable Video Platform enables developers to select the most appropriate codec for two objectives:

  1. guaranteeing interoperability
  2. optimizing the quality of experience of their end-users.

Encoding and decoding video.

Twilio Video SDKs: Supported codecs

The codecs supported by Twilio’s client SDKs are platform dependent. In the JavaScript SDK it's up to the browser vendors to provide codec implementations, while on mobile SDKs it depends on the device’s capabilities.

Video Codecs Supported by Twilio’s Programmable Video SDKs.

SDK or Browser      VP8  VP9      H.264
Chrome 57+          Yes  Yes      Yes
Chrome < 57         Yes  Yes      No (enable with internal flag)
Safari 12.1+        Yes  Yes (1)  Yes
Safari < 12.1       No   No       Yes
Firefox 55+         Yes  Yes      Yes
Video iOS 2.0+      Yes  Yes      Yes (2)
Video iOS 1.x       Yes  Yes      No
Video Android 2.0+  Yes  Yes      Yes (3) (if hardware supports it)
Video Android 1.x   Yes  No       No

Check the browser support per platform table.

Note: Safari introduced WebRTC support in version 11. The rest of this document does not refer to earlier versions, as they are not supported.

1 VP9 support was added in Safari 15.

2 Twilio Video for iOS relies on hardware support for H.264. Hardware encode/decode is supported on all iOS devices.

3 Twilio Video for Android relies on hardware support for H.264, which depends on the device’s capabilities. Use the following snippets to check whether the device supports H.264:

Video Android 2.x - 5.x


boolean isH264Supported = MediaCodecVideoDecoder.isH264HwSupported() &&
            MediaCodecVideoEncoder.isH264HwSupported();

Video Android 6.x+


HardwareVideoEncoderFactory hardwareVideoEncoderFactory =
        new HardwareVideoEncoderFactory(null, true, true);
HardwareVideoDecoderFactory hardwareVideoDecoderFactory =
        new HardwareVideoDecoderFactory(null);

boolean h264EncoderSupported = false;
for (VideoCodecInfo videoCodecInfo : hardwareVideoEncoderFactory.getSupportedCodecs()) {
    if (videoCodecInfo.name.equalsIgnoreCase("h264")) {
        h264EncoderSupported = true;
        break;
    }
}
boolean h264DecoderSupported = false;
for (VideoCodecInfo videoCodecInfo : hardwareVideoDecoderFactory.getSupportedCodecs()) {
    if (videoCodecInfo.name.equalsIgnoreCase("h264")) {
        h264DecoderSupported = true;
        break;
    }
}

boolean isH264Supported =  h264EncoderSupported && h264DecoderSupported;

Furthermore, not all browsers support H.264. Developers can evaluate H.264 browser support using the following code snippet:

let isH264Supported;

/**
 * Test support for H264 codec.
 * @returns {Promise<boolean>} true if supported, false if not
 */
function testH264Support() {
  if (typeof isH264Supported === 'boolean') {
    return Promise.resolve(isH264Supported);
  }
  if (typeof RTCRtpSender !== 'undefined'
    && typeof RTCRtpSender.getCapabilities === 'function') {
    isH264Supported = !!RTCRtpSender.getCapabilities('video').codecs.find(({ mimeType }) => mimeType === 'video/H264');
    return Promise.resolve(isH264Supported);
  }
  if (typeof RTCPeerConnection === 'undefined') {
    isH264Supported = false;
    return Promise.resolve(isH264Supported);
  }

  let offerOptions = {}; 
  const pc = new RTCPeerConnection();
  try {
    pc.addTransceiver('video');
  } catch (e) {
    offerOptions.offerToReceiveVideo = true;
  }

  return pc.createOffer(offerOptions).then(offer => {
    isH264Supported = /^a=rtpmap:.+ H264/m.test(offer.sdp);
    pc.close();
    return isH264Supported;
  });
}

// Now we can call testH264Support to check if H.264 is supported
testH264Support().then(isSupported => {
  console.log(`This browser ${isSupported
    ? 'supports' : 'does not support'} H264 codec`);
});
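
The same capability query can be generalized to other codecs, including the audio codecs listed in the next table. The following is a minimal sketch rather than part of Twilio's SDK: `hasCodec` and `sampleCapabilities` are hypothetical names, and the sample object is a hand-written stand-in for what a browser returns from `RTCRtpSender.getCapabilities(kind)`.

```javascript
// Pure helper: checks a capabilities object (the shape returned by
// RTCRtpSender.getCapabilities(kind)) for a given MIME type, ignoring case.
function hasCodec(capabilities, mimeType) {
  // getCapabilities may return null, so guard before reading the codec list.
  if (!capabilities || !Array.isArray(capabilities.codecs)) return false;
  const wanted = mimeType.toLowerCase();
  return capabilities.codecs.some(codec => codec.mimeType.toLowerCase() === wanted);
}

// In a browser: hasCodec(RTCRtpSender.getCapabilities('audio'), 'audio/opus')
// Here a hand-written sample object (an assumption) stands in for the browser's answer.
const sampleCapabilities = {
  codecs: [
    { mimeType: 'audio/opus', clockRate: 48000, channels: 2 },
    { mimeType: 'audio/PCMU', clockRate: 8000, channels: 1 },
  ],
};

console.log(hasCodec(sampleCapabilities, 'audio/opus')); // true
console.log(hasCodec(sampleCapabilities, 'audio/G722')); // false
```

Keeping the helper free of browser globals makes it easy to unit test; only the call site needs to run in a WebRTC-capable environment.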

Audio Codecs Supported by Twilio’s Programmable Video SDKs

SDK or Browser  iSAC  OPUS  PCMU  PCMA  G.722
Chrome 110+     No    Yes   Yes   Yes   Yes
Safari          Yes   Yes   Yes   Yes   Yes
Firefox 55+     No    Yes   Yes   Yes   Yes
Video iOS 2.0+  Yes   Yes   Yes   Yes   Yes
Video iOS 1.x   Yes   Yes   Yes   Yes   Yes
Android 2.0+    Yes   Yes   Yes   Yes   Yes
Android 1.x     No    Yes   No    No    No

Twilio Video Rooms: Codec Interoperability

Before establishing a multimedia communication, the involved parties need to agree on the codecs to be used. Twilio manages this through an automatic codec negotiation that imposes some interoperability restrictions.

Codec interoperability in P2P Rooms

In P2P Rooms, communications take place in a “one-to-one” manner: if client A is sharing media with clients B and C, then A effectively sends one media stream to B and a separate media stream to C. This means that A can use a different codec suite for each of these streams. The following figure illustrates this with a P2P video Room in which participants A, B and C use Chrome, Firefox and Safari < 12.1 respectively. Chrome and Firefox support VP8, VP9 and H.264, but by default they communicate using VP8. Safari < 12.1, on the other hand, only supports H.264. Hence, Chrome and Firefox need to use H.264 when communicating with it.

In P2P rooms, Twilio's SDKs can combine different codecs in order to achieve interoperability. For example Chrome and Firefox support both VP8 and H.264 but by default communicate using VP8. On the other hand, Safari < 12.1 only supports H.264.

Twilio’s Programmable Video default codecs for P2P Rooms are the following:

Default video codecs used by each pair of clients in P2P Rooms

         Chrome  Firefox  Safari  Android  iOS
Chrome   VP8     VP8      H.264   VP8      VP8
Firefox  VP8     VP8      H.264   VP8      VP8
Safari   H.264   H.264    H.264   H.264*   H.264
Android  VP8     VP8      H.264*  VP8      VP8
iOS      VP8     VP8      H.264   VP8      VP8

* Android H.264 relies on hardware support, which depends on devices’ capabilities.

Default audio codecs used by each pair of clients in P2P Rooms

         Chrome  Firefox  Safari  Android  iOS
Chrome   OPUS    OPUS     OPUS    OPUS     OPUS
Firefox  OPUS    OPUS     OPUS    OPUS     OPUS
Safari   OPUS    OPUS     OPUS    OPUS     OPUS
Android  OPUS    OPUS     OPUS    OPUS     OPUS
iOS      OPUS    OPUS     OPUS    OPUS     OPUS

Hence, in P2P Rooms, when using default settings, the following holds:

  • All the supported client SDKs interoperate their audio communications (using OPUS codec)
  • All the supported client SDKs interoperate their video communications (using either VP8 or H.264), except for video links between Android and Safari < 12.1, where interoperability is only guaranteed when the Android hardware supports H.264.
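
The pairwise behaviour described above can be sketched as a simple "first shared codec" selection. This is a simplified model and not Twilio's implementation: real WebRTC clients negotiate through an SDP offer/answer exchange, and the names `videoCodecs` and `pickSharedCodec` are hypothetical. The codec lists are taken from the tables in this guide and ordered by assumed default preference.

```javascript
// Hypothetical model of pairwise codec selection in a P2P Room.
// Each list is ordered by (assumed) default preference.
const videoCodecs = {
  chrome: ['VP8', 'VP9', 'H264'],
  firefox: ['VP8', 'VP9', 'H264'],
  oldSafari: ['H264'], // Safari < 12.1
};

// Returns the first codec in the sender's list that the receiver also supports,
// or null when the two clients share no codec.
function pickSharedCodec(senderCodecs, receiverCodecs) {
  const shared = senderCodecs.find(codec => receiverCodecs.includes(codec));
  return shared === undefined ? null : shared;
}

console.log(pickSharedCodec(videoCodecs.chrome, videoCodecs.firefox));   // VP8
console.log(pickSharedCodec(videoCodecs.chrome, videoCodecs.oldSafari)); // H264
```

Because each pair of clients is evaluated independently, Chrome can use VP8 towards Firefox and H.264 towards Safari < 12.1 within the same Room.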

Codec interoperability in Group Rooms

In Group Rooms, an SFU (Selective Forwarding Unit) mediates among clients. This means that if client A wants to send media to clients B and C, then A effectively sends only one media stream to the SFU that, in turn, forwards it to B and C. Due to this, in Group Rooms the codec negotiation takes place based on the following principles:

  • For inbound tracks (i.e. tracks getting into the SFU), the SFU tries to accept the codec preferred by the client as long as this codec is supported by the SFU.
  • For outbound tracks (i.e. tracks leaving the SFU), the tracks are only offered in the codec in which they are being received.

These principles have interoperability implications. First, the only codecs allowed are the ones supported by the SFU. The following tables summarize them:

Video Codecs supported in Group Rooms SFU infrastructure

                     VP8  VP9  H.264
Group Rooms Support  Yes  No   Yes

Audio Codecs supported in Group Rooms SFU infrastructure

                     iSAC  OPUS  PCMU  PCMA  G.722
Group Rooms Support  No    Yes   Yes   No    No

Second, the codecs negotiated at some point in a call may limit interoperability with clients that join later. As a result, in Group Rooms (unlike P2P Rooms), the fact that a set of clients supports a common codec does not guarantee that they will actually be able to communicate. To illustrate this, imagine a scenario in which two participants A and B, using Chrome and Firefox respectively, start communicating using VP8. If a Safari < 12.1 browser joins later (as C), it will not be able to receive A's and B’s video tracks because Safari < 12.1 only supports H.264. This is unfortunate because both A and B could communicate using H.264, but when they connected to the SFU they negotiated their default, VP8. This scenario is represented in the following figure:

The use of Safari < 12.1 in Group Rooms may generate interoperability problems when using Twilio's default codec settings. For example, when Chrome and Firefox connect to a Group Room they negotiate VP8 by default. If a Safari < 12.1 client connects later to that room, it will not be able to receive the Chrome and Firefox VP8 video tracks. However, thanks to their H.264 codec support, Chrome and Firefox can seamlessly receive the video track published by Safari < 12.1.
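
The two negotiation principles and the late-joiner problem can be modelled in a few lines. This is a hedged sketch of the behaviour described in this section, not Twilio's actual SFU logic; `SFU_VIDEO_CODECS`, `publishedCodec` and `canReceive` are hypothetical names, and the client codec lists are assumptions based on the tables above.

```javascript
// Hypothetical model of Group Room negotiation; not Twilio's actual SFU logic.
const SFU_VIDEO_CODECS = ['VP8', 'H264'];

// Inbound: the SFU accepts the client's most preferred codec that it supports.
function publishedCodec(clientCodecs, allowed = SFU_VIDEO_CODECS) {
  const codec = clientCodecs.find(c => allowed.includes(c));
  return codec === undefined ? null : codec;
}

// Outbound: a track is offered only in the codec in which it was received,
// so a subscriber can consume it only if it supports that exact codec.
function canReceive(subscriberCodecs, trackCodec) {
  return trackCodec !== null && subscriberCodecs.includes(trackCodec);
}

const chrome = ['VP8', 'VP9', 'H264'];
const oldSafari = ['H264']; // Safari < 12.1

const trackFromA = publishedCodec(chrome);    // 'VP8' (Chrome's default)
const trackFromC = publishedCodec(oldSafari); // 'H264'

console.log(canReceive(oldSafari, trackFromA)); // false: C cannot decode A's VP8 track
console.log(canReceive(chrome, trackFromC));    // true: A can decode C's H.264 track
```

Passing a narrower `allowed` list, for example `publishedCodec(chrome, ['H264'])`, models a Room that only accepts H.264, in which case Chrome publishes a track that Safari < 12.1 can receive.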

Using Safari < 12.1 in Group Rooms

For guaranteeing video interoperability with Safari < 12.1 we can force all the clients of a Group Room to publish their videos using H.264. This can be achieved using two alternative mechanisms:

  • When the Group Room is created using the Rooms REST API, you can set the following parameter: VideoCodecs=H264.
  • You can use the Room Settings menu located at your Twilio Video Console and set H.264 as the Group Rooms video codec, as shown on the following figure:
    Default Room Settings in Programmable Video Console: Video Codec specification

When doing so, the SFU refuses to accept tracks published in VP8 and forces all the clients to use H.264, which guarantees interoperability with Safari < 12.1. The resulting Group Room topology is shown below:

If you want to guarantee video interoperability with Safari < 12.1 you must configure the Group Rooms SFU to accept only H.264. This can be done using Room Settings at the Twilio Console or through the Rooms REST API.
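
As an illustration of the REST option, the sketch below builds (but does not send) a request that creates a Group Room restricted to H.264. Only the `VideoCodecs=H264` parameter comes from this guide. The endpoint URL, the `UniqueName` and `Type` parameters and the Basic authentication scheme reflect the public Rooms REST API as commonly documented and should be verified against the current reference; the function name, room name and credentials are hypothetical placeholders.

```javascript
// Builds (but does not send) a request that creates a Group Room restricted to H.264.
// Assumptions: endpoint, UniqueName/Type parameters and Basic auth with an API key pair.
function buildCreateRoomRequest(apiKeySid, apiKeySecret, roomName) {
  const body = new URLSearchParams({
    UniqueName: roomName,
    Type: 'group',
    VideoCodecs: 'H264', // the parameter described in this guide
  });
  return {
    url: 'https://video.twilio.com/v1/Rooms',
    options: {
      method: 'POST',
      headers: {
        Authorization: 'Basic ' + btoa(`${apiKeySid}:${apiKeySecret}`),
        'Content-Type': 'application/x-www-form-urlencoded',
      },
      body: body.toString(),
    },
  };
}

// Placeholder credentials and room name, for illustration only.
const request = buildCreateRoomRequest('SKxxxx', 'your_api_secret', 'DailyStandup');
console.log(request.options.body);

// To send it from a server-side environment: fetch(request.url, request.options)
```

Credentials should never be shipped to browsers, so a request like this belongs in server-side code.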

For further details about using Safari < 12.1 in Group Rooms see the Working with Safari < 12.1 Guide.

Controlling codecs client side: Codec Preferences

To optimize applications for media quality or battery life, it is sometimes useful to override the codec defaults on the client side. For this purpose, Twilio’s APIs provide a Codec Preferences capability that is available in the SDKs listed in the table below:

Twilio SDKs supporting Codec Preferences
JavaScript SDK v1.3+
Android SDK v2.0+
iOS SDK v2.0+

Codec Preferences are based on the following principles:

  • It allows selecting both the preferred audio codecs and the preferred video codecs in which media tracks are published by a client SDK.
  • Selected codecs are provided as an ordered list of codec specifications, so that codecs specified first have higher preference.
  • Only codecs supported at a given SDK can be specified as preferred.
  • Codec Preferences are set when a participant connects to a room as part of the ConnectOptions.
  • Selecting a preferred codec does not guarantee that the SDK will choose it. The preference just means that the codec has higher priority in the list of possible codecs. The codec actually chosen may depend on other factors, such as the codecs supported by the other parties to the communication.
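
One way to picture these principles is as a reordering of the SDK's supported codec list. The sketch below is a conceptual model only, not the SDKs' internal algorithm: `applyCodecPreferences` is a hypothetical name, and dropping preferences that the SDK does not support is an assumption made for the sake of the example.

```javascript
// Hypothetical illustration of "preferred codecs get higher priority".
// Moves preferred codecs to the front (in the order given), keeps the remaining
// supported codecs in their original order, and drops unsupported preferences.
function applyCodecPreferences(supportedCodecs, preferredCodecs) {
  const preferred = [...new Set(preferredCodecs)]
    .filter(codec => supportedCodecs.includes(codec));
  const rest = supportedCodecs.filter(codec => !preferred.includes(codec));
  return [...preferred, ...rest];
}

const ordered = applyCodecPreferences(['VP8', 'VP9', 'H264'], ['H264']);
console.log(ordered.join(', ')); // H264, VP8, VP9
```

The non-preferred codecs stay in the list, which is why a preference raises a codec's priority without guaranteeing that it is the one finally negotiated.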

The following code snippets illustrate how a Participant can connect to a room preferring iSAC as audio codec and H.264 as video codec.

Selecting preferred codecs in JavaScript SDK (required v1.3+)

// Web Javascript
const room = await connect(token, {
  preferredAudioCodecs: ['isac'],
  preferredVideoCodecs: ['H264']
});

Selecting preferred codecs in Android SDK (required v2.0+)

Note that checking for H.264 hardware support changed from v5.x to v6.x. See this section to learn how to use the correct syntax.

// Android Java

// Prefer H264 if it is hardware available for encoding and decoding
VideoCodec videoCodec = isH264Supported ? (new H264Codec()) : (new Vp8Codec());
ConnectOptions connectOptions = new ConnectOptions.Builder(token)
    .preferAudioCodecs(Collections.singletonList(new IsacCodec()))
    .preferVideoCodecs(Collections.singletonList(videoCodec))
    .build();

Room room = Video.connect(context, connectOptions, listener);

Selecting preferred codecs in iOS SDK (required v2.0+)

// iOS Swift
let options = TVIConnectOptions.init(token: accessToken, block: { (builder: TVIConnectOptionsBuilder) -> Void in
    builder.preferredAudioCodecs = [ TVIIsacCodec() ]
    builder.preferredVideoCodecs = [ TVIH264Codec() ]
})

var room = TwilioVideo.connect(with: options, delegate: self)

// iOS Objective-C
TVIConnectOptions *options = [TVIConnectOptions optionsWithToken:self.accessToken
                                block:^(TVIConnectOptionsBuilder * _Nonnull builder) {
    builder.preferredAudioCodecs = @[ [TVIIsacCodec new] ];
    builder.preferredVideoCodecs = @[ [TVIH264Codec new] ];
}];

TVIRoom *room = [TwilioVideo connectWithOptions:options delegate:self];

For further information on how Codec Preferences work, please check the corresponding SDK reference documentation regarding the connect primitive and its ConnectOptions parameter.

Multi-codec rooms: using Codec Preferences for optimizing battery life

In Twilio Programmable Video, we say that a room is using Multi-codec capabilities when different participants send their video (or audio) using different codecs. Multi-codecs may be useful for optimizing the battery life of your applications as many vendors provide hardware acceleration support for specific codec suites. For example, all iOS devices have H.264 hardware acceleration while many modern Android devices provide it for both H.264 and VP8. When different devices have different hardware support, it may be interesting to send media using one codec (i.e. the one supported by the local device hardware) while receiving in another (i.e. the one hardware-supported by the remote parties).

This can be achieved using Codec Preferences. For example, imagine three clients A, B and C that are, respectively, Chrome, Firefox and an iOS smartphone. By default (see the sections above) these clients negotiate VP8 as the video codec. However, we may prefer to use H.264 in C to optimize the battery life of the iOS smartphone. In this case, we just need to set H.264 as C’s preferred codec. The following code snippets illustrate how to do it:

Participant A - Codec Preferences

// We don't really need this: VP8 is Chrome's default preferred video codec
const room = await connect(token, {
    preferredVideoCodecs: ['VP8']
});

Participant B - Codec Preferences

// We don't really need this: VP8 is Firefox's default preferred video codec
const room = await connect(token, {
    preferredVideoCodecs: ['VP8']
});

Participant C - Codec Preferences

// iOS Swift
let options = TVIConnectOptions.init(token: accessToken, block: { (builder: TVIConnectOptionsBuilder) -> Void in
    builder.preferredVideoCodecs = [ TVIH264Codec() ]
})
var room = TwilioVideo.connect(with: options, delegate: self)

In this case, if A, B and C connect to a P2P Room, the generated communication topology is the following:

Multi-codecs example in a P2P Room

On the other hand, if A, B and C are Group Room participants, the topology is the following:

Multi-codecs example in a Group Room

Multi-codecs limitations and known issues

  • In Group Rooms, Multi-codec support for video is currently limited to H.264 and VP8.
  • In Group Rooms, Multi-codec support for audio is currently limited to Opus and PCMU.
  • In Group Rooms, support for H.264 video is currently limited to the Constrained Baseline profile at level 3.1. This means that the maximum resolution of a video track is 1280x720 at 30 fps.
  • The use of Multi-codecs in a Group Room with recording activated causes recordings to be also Multi-codec. This means that a recording keeps the codec of the track originating it.
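
The 1280x720 at 30 fps figure follows from the level 3.1 limits in the H.264 specification, which caps a frame at 3,600 macroblocks (16x16 pixels each) and throughput at 108,000 macroblocks per second. The sketch below checks a resolution and frame rate against those two limits; `LEVEL_3_1` and `fitsLevel31` are hypothetical names, and the check ignores the level's bitrate and buffer constraints.

```javascript
// H.264 level 3.1 limits (ITU-T H.264, Annex A), expressed in 16x16 macroblocks.
const LEVEL_3_1 = { maxFrameSizeMbs: 3600, maxMbsPerSecond: 108000 };

// Returns true when the frame size and macroblock rate both fit within level 3.1.
// Bitrate and buffer limits of the level are deliberately not modelled here.
function fitsLevel31(width, height, fps) {
  const mbsPerFrame = Math.ceil(width / 16) * Math.ceil(height / 16);
  return mbsPerFrame <= LEVEL_3_1.maxFrameSizeMbs
    && mbsPerFrame * fps <= LEVEL_3_1.maxMbsPerSecond;
}

console.log(fitsLevel31(1280, 720, 30));  // true: 80 * 45 = 3600 macroblocks, 108000 per second
console.log(fitsLevel31(1920, 1080, 30)); // false: 120 * 68 = 8160 macroblocks per frame
console.log(fitsLevel31(1280, 720, 60));  // false: 216000 macroblocks per second
```

Smaller frames leave headroom for higher frame rates, so a 640x480 track would still fit at 60 fps under this check.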