
Announcing Programmable Video Recording Compositions API


We are excited to announce that the Recording Composition API is now available in Public Beta. It enables you to transcode, combine and mix your Group Room Video Recordings through an API.

What are compositions?

Twilio Group Room Video Recordings are stored as individual media files, each containing either audio or video tracks, because we don’t make any assumptions about how our customers want to use their recordings. Separating audio and video guarantees recording reliability and compactness. However, it also makes session playback complex, because the individual recorded tracks have to be mixed. This is not trivial: it requires appropriate synchronization, transcoding, scaling and layout operations. The Recording Composition API has been created to solve this problem.

How do compositions work?

The creation of a composition works as follows:

  • A Video Group Room is created with recording enabled, and the individual audio and video recordings are generated and stored.
  • A Recording Composition API call is made to create a new composition on that Group Room. The call needs to specify the sources, the video layout, the resolution and the remaining parameters required by the API.
  • Once the room is completed and all the recordings are available, the composition starts. The media engine reads the appropriate sources and applies the required media processing and transformation operations. A new composed media file is generated and stored in the Twilio Cloud, where it can be downloaded.
  • The individual recordings remain stored and can be used later to create new compositions. Recordings are only removed when the owner deletes them through the REST API. This guarantees the reliability of our storage-and-compose process.
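As a rough illustration of the second step above, the sketch below assembles the fields such a call might carry. The field names (`RoomSid`, `AudioSources`, `VideoLayout`, `Resolution`, `Format`), the default values, the layout shape and the room SID placeholder are assumptions made for this example; check the API reference for the exact parameters. No request is sent.

```python
import json


def build_composition_request(room_sid, audio_sources, video_layout,
                              resolution="640x480", media_format="mp4"):
    """Assemble form fields for a hypothetical 'create composition' call.

    Field names and defaults are assumptions for illustration only.
    """
    if not room_sid:
        raise ValueError("room_sid is required")
    return {
        "RoomSid": room_sid,
        "AudioSources": list(audio_sources),
        # The layout is serialized to JSON so it can travel as one field.
        "VideoLayout": json.dumps(video_layout, sort_keys=True),
        "Resolution": resolution,
        "Format": media_format,
    }


# Placeholder SID and a single-region layout that includes every video source.
request_fields = build_composition_request(
    room_sid="RM_EXAMPLE_ROOM_SID",
    audio_sources=["*"],
    video_layout={"grid": {"video_sources": ["*"]}},
)
```

Keeping the request as plain data makes it easy to inspect before handing it to whichever HTTP client or helper library you use.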


Getting started with compositions

Creating your first composition is simple. You just need to consider the following:

  • In a composition you can only include audio and video sources belonging to the same Group Room recording.
  • You can specify the included audio and video tracks through:
      • The unique ID of the stored recording (i.e. Recording SID).
      • The unique ID of the original track (i.e. Track SID) in the room.
      • The name given to the original track (i.e. Track Name).
  • The included audio sources are mixed through an adder.
  • The included video sources are mixed using the specified layout that is organized in terms of regions. You can decide how many regions your layout contains, where they are placed and how they are displayed.
  • You can also customize the format and resolution of your compositions so that they adapt to your target display and device.
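To make the idea of a region-based layout concrete, here is a minimal sketch of a full-frame region with a smaller picture-in-picture region on top. The property names (`x_pos`, `y_pos`, `width`, `height`, `z_pos`, `video_sources`), the track names and the 1280x720 frame are assumptions chosen for illustration, not a statement of the API's schema.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """One rectangular area of the composed frame (hypothetical schema)."""
    name: str
    x_pos: int
    y_pos: int
    width: int
    height: int
    z_pos: int = 0  # higher values are drawn on top
    video_sources: list = field(default_factory=list)

    def fits(self, frame_width, frame_height):
        """Return True if the region lies entirely inside the frame."""
        return (self.x_pos >= 0 and self.y_pos >= 0
                and self.x_pos + self.width <= frame_width
                and self.y_pos + self.height <= frame_height)


def build_layout(regions, frame_width, frame_height):
    """Turn regions into a layout dict, rejecting any that overflow the frame."""
    layout = {}
    for region in regions:
        if not region.fits(frame_width, frame_height):
            raise ValueError(f"region {region.name!r} does not fit in the frame")
        layout[region.name] = {
            "x_pos": region.x_pos,
            "y_pos": region.y_pos,
            "width": region.width,
            "height": region.height,
            "z_pos": region.z_pos,
            "video_sources": list(region.video_sources),
        }
    return layout


# A presenter fills the frame; a second track sits in the bottom-right corner.
layout = build_layout(
    [
        Region("main", 0, 0, 1280, 720, z_pos=0, video_sources=["presenter-cam"]),
        Region("pip", 940, 520, 320, 180, z_pos=1, video_sources=["guest-cam"]),
    ],
    frame_width=1280,
    frame_height=720,
)
```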

If you want to learn more, take a look at the examples illustrating different layout types.

Simple and predictable pricing model

Recording Compositions pricing is simple and easy to understand: $0.01 per composition minute. This means that the price depends only on the duration of the final composed media file. Composing a Group Room lasting 10 minutes will have a fixed cost of $0.10. If you decide to only include audio and video tracks that do not span the whole duration of the original recording, you pay only for the minutes output in the composed file.
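The arithmetic can be sketched in a few lines. The rate comes from the paragraph above; the function name is made up for this example, and how partial minutes are rounded is not stated here, so the sketch simply multiplies without rounding.

```python
from decimal import Decimal

# Rate quoted above: $0.01 per minute of composed output.
PRICE_PER_COMPOSITION_MINUTE = Decimal("0.01")


def composition_cost(composed_minutes):
    """Cost in dollars for a composed file of the given length in minutes.

    Assumption: no rounding of partial minutes is applied.
    """
    return PRICE_PER_COMPOSITION_MINUTE * Decimal(str(composed_minutes))


# A composition covering a whole 10-minute room.
full_room_cost = composition_cost(10)

# A composition built only from tracks spanning 4 of those 10 minutes.
partial_cost = composition_cost(4)
```

`Decimal` is used instead of floats so that cent amounts stay exact.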

Storage and download are charged at the normal video storage pricing:

  • $0.05 per GB stored per month
  • $0.10 per GB downloaded
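As a worked example of these two rates, the sketch below estimates a bill for a composed file that is kept for a few months and downloaded several times. The function name and the sample figures are illustrative, and it assumes charges scale linearly with size and time, which is not spelled out here.

```python
from decimal import Decimal

# Rates from the list above.
STORAGE_PER_GB_MONTH = Decimal("0.05")
DOWNLOAD_PER_GB = Decimal("0.10")


def storage_and_download_cost(stored_gb, months, downloaded_gb):
    """Estimate storage plus download charges in dollars.

    Assumption: both charges scale linearly, with no minimums or rounding.
    """
    storage = STORAGE_PER_GB_MONTH * Decimal(str(stored_gb)) * Decimal(str(months))
    download = DOWNLOAD_PER_GB * Decimal(str(downloaded_gb))
    return storage + download


# 2 GB kept for 3 months, with 5 GB downloaded in total:
# storage is 0.05 * 2 * 3 = 0.30, download is 0.10 * 5 = 0.50.
estimate = storage_and_download_cost(stored_gb=2, months=3, downloaded_gb=5)
```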

What’s next for compositions?

We’re going to keep innovating on compositions. Here is what we plan to add:

  • Console controls for automating the composition of recordings as soon as they finish without requiring any API calls
  • Console controls for searching and visualizing compositions
  • Full HD (1920×1080) resolution compositions
  • Support for further media formats including MP3, MOV and GIF

If you are missing something, or you think we can improve the compositions service, do not hesitate to reach out to us.

We can’t wait to see what you build!
