Compositions
Overview
The Twilio Recording Composition API lets you transcode and combine the individual Track Recordings stored by the Twilio Video Recordings API. This API relies on the following REST resources:
- The Composition Instance Resource: a Composition represents a media file created as a result of applying a set of media processing operations onto a number of source Recordings.
- The Compositions List Resource represents the list of previously created Compositions.
These REST resources are located beneath the following base URL:
https://video.twilio.com
Contents
- URI Schemes
- Composition Instance Resource
- Composition Instance Media Sub-Resource
URI Schemes
These are the URI schemes for the Recording Composition REST API and the supported methods:
/v1/Compositions/
- GET: List Composition resources.
- POST: Create a new Composition resource.
/v1/Compositions/{CompositionSid}
- GET: Retrieve a Composition resource.
- DELETE: Delete a Composition resource.
/v1/Compositions/{CompositionSid}/Media
- GET: Retrieve a Composition media file sub-resource.
Composition Instance Resource
Resource URI
/v1/Compositions/{CompositionSid}
Resource Properties
A Composition Instance Resource has the following properties:
Resource Properties in REST API format | Description |
---|---|
account_sid | The SID of the Account that created the Composition resource. |
status | The status of the composition. Can be: |
date_created | The date and time in GMT when the resource was created, specified in ISO 8601 format. |
date_completed | The date and time in GMT when the composition's media processing task finished, specified in ISO 8601 format. |
date_deleted | The date and time in GMT when the composition generated media was deleted, specified in ISO 8601 format. |
sid | The unique string that we created to identify the Composition resource. |
room_sid | The SID of the Group Room that generated the audio and video tracks used in the composition. All media sources included in a composition must belong to the same Group Room. |
audio_sources | The array of track names to include in the composition. The composition includes all audio sources specified in |
audio_sources_excluded | The array of track names to exclude from the composition. The composition includes all audio sources specified in |
video_layout | An object that describes the video layout of the composition in terms of regions. See Specifying Video Layouts for more info. |
resolution | The dimensions of the video image in pixels expressed as columns (width) and rows (height). The string's format is |
trim | Whether to remove intervals with no media, as specified in the POST request that created the composition. Compositions with |
format | The container format of the composition's media files as specified in the POST request that created the Composition resource. See POST Parameters for more information. |
bitrate | The average bit rate of the composition's media. |
size | The size of the composed media file in bytes. |
duration | The duration of the composition's media file in seconds. |
media_external_location | The URL of the media file associated with the composition when stored externally. See External S3 Compositions for more details. |
status_callback | The URL called using the |
status_callback_method | The HTTP method used to call |
url | The absolute URL of the resource. |
links | The URL of the media file associated with the composition. |
HTTP GET
Returns the single Composition identified by {CompositionSid}.
HTTP POST
Not supported.
HTTP DELETE
Deletes the media file associated with the Composition identified by {CompositionSid} and sets the Composition status to deleted. If the media file was stored in an external S3 bucket, this request has no effect on that file. Once a Composition has been deleted, its metadata (i.e. its REST resource record) is kept for 30 days.
Composition Instance Media Sub-Resource
Resource URI
/v1/Compositions/{CompositionSid}/Media
HTTP GET
Retrieves the Composition media file through an HTTP redirection. The format of the provided media file is the one specified in the format property of the Composition (see table above). By default, the redirection URL is available for 600 seconds, but this can be configured to a value between 1 and 3600 seconds via the Ttl request parameter. If the composition is not yet available, a 404 is returned.
The HTTP GET request accepts the following parameters:
Name | Description |
---|---|
ContentDisposition | Optional. Sets the Content-Disposition header of the redirect_to URL. Possible values are attachment or inline . Default value attachment%3B%20filename%3D%22CJxx.xxx%22 (not PII) |
Ttl | Optional. Duration in seconds for which the redirect_to URL can be used to retrieve the media file. The default Ttl is 600 seconds. The minimum supported Ttl value is 1 second and the maximum supported value is 3600 seconds. (not PII) |
Note that ContentDisposition affects the content disposition of the redirection URL. This parameter behaves as specified in RFC 6266:
- The value attachment indicates that browsers should prompt the user to store the file locally. In this case, specifying a filename is mandatory. As shown in the table above, this is the default, with filename set to the Composition SID followed by the format extension. For example, for an MP4 Composition it takes the form CJXXXX.mp4.
- The value inline indicates default processing based on the media type. Hence, whenever the browser supports the composition format, the file will be played. Otherwise, the file is downloaded. When inline is used, it is strongly recommended to provide a filename for the latter case. When doing so, remember that the ContentDisposition parameter must be URL-encoded. For example:
inline%3B%20filename%3D%22MyFile.mp4%22
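The encoding step above can be sketched in Python with the standard library. This is a minimal illustration, not an official client: the helper names `content_disposition` and `media_url` are hypothetical, and the code only builds the URL (it sends nothing).

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://video.twilio.com"

def content_disposition(kind, filename=None):
    """Return an unencoded value such as: inline; filename="MyFile.mp4"."""
    if kind not in ("attachment", "inline"):
        raise ValueError("kind must be 'attachment' or 'inline'")
    if filename is None:
        return kind
    return f'{kind}; filename="{filename}"'

def media_url(composition_sid, ttl=600, disposition=None):
    """Build the Media sub-resource URL with percent-encoded query parameters."""
    if not 1 <= ttl <= 3600:
        raise ValueError("Ttl must be between 1 and 3600 seconds")
    params = {"Ttl": str(ttl)}
    if disposition is not None:
        params["ContentDisposition"] = disposition
    # quote_via=quote encodes spaces as %20 instead of '+'
    return f"{BASE_URL}/v1/Compositions/{composition_sid}/Media?{urlencode(params, quote_via=quote)}"

raw = content_disposition("inline", "MyFile.mp4")
encoded = quote(raw, safe="")
print(encoded)  # inline%3B%20filename%3D%22MyFile.mp4%22
print(media_url("CJXXXX", ttl=3600, disposition=raw))
```

Passing `safe=""` makes `quote` encode every reserved character, which reproduces the encoded example shown above.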
HTTP DELETE
Not supported.
HTTP POST
Not supported.
Compositions List Resource
Resource URI
/v1/Compositions/
HTTP POST
Creates a new Composition Instance Resource and, when appropriate, launches a media processing task. The result of this task is a composed media file that, by default, is stored in Twilio’s cloud.
Developers can create a new Composition as soon as its associated Group Room exists. However, the processing task starts only when the Room status is Completed. This guarantees that all required recording sources are available.
This HTTP POST returns a 201 if the request is accepted (i.e. well formed), and a 4xx otherwise, depending on the type of error.
Supported POST Parameters
The following table shows all the parameters that can be used when creating a new Composition Instance Resource.
Parameters in REST API format | Required / Optional | Description |
---|---|---|
room_sid | Required | The SID of the Group Room with the media tracks to be used as composition sources. |
video_layout | Optional | An object that describes the video layout of the composition in terms of regions. See Specifying Video Layouts for more info. Note that either video_layout or audio_sources must be provided for the creation request to be valid. |
audio_sources | Optional | An array of track names from the same group room to merge into the new composition. Can include zero or more track names. The new composition includes all audio sources specified in |
audio_sources_excluded | Optional | An array of track names to exclude. The new composition includes all audio sources specified in |
resolution | Optional | A string that describes the columns (width) and rows (height) of the generated composed video in pixels. Defaults to Typical values are: Note that the |
format | Optional | The container format of the composition's media files. Can be: |
status_callback | Optional | The URL we should call using the |
status_callback_method | Optional | The HTTP method we should use to call |
trim | Optional | Whether to clip the intervals where there is no active media in the composition. The default is |
Specifying Video Layouts
Video layouts are organized in terms of regions. A region is a rectangular area where a set of video sources are displayed following the region placement rules. The VideoLayout of a Composition must contain at least one region, but it may contain many. Regions are independent, meaning that the way placement works in a region does not affect placement in other regions.
A Composition's VideoLayout is specified as a JSON dictionary of regions following this scheme:
VideoLayout = {
"a-region-name": {
region properties
},
"other-region-name": {
other region properties
}
...
}
The region properties define the size and position of the region, the video sources to include in the region and the placement rules. Regions support the following properties (a “Yes” in the “Default value / mandatory” column indicates a mandatory property):
Parameter | Default value / mandatory | Description |
---|---|---|
x_pos | 0 | X axis value (in pixels) of the region's upper left corner relative to the upper left corner of the Composition viewport. Regions cannot overflow the Composition's area, so x_pos has to be a positive integer less than or equal to the difference between the Composition's width and the width of the region. If the region’s width is missing from the request, it defaults to 16 pixels for this validation. |
y_pos | 0 | Y axis value (in pixels) of the region's upper left corner relative to the upper left corner of the Composition viewport. Regions cannot overflow the Composition's area, so y_pos has to be a positive integer less than or equal to the difference between the Composition's height and the height of this region. If the region’s height is missing from the request, it defaults to 16 pixels for this validation. |
z_pos | 0 | Z position controlling the region's visibility in case of overlaps. Regions with higher values are stacked on top of regions with lower values for visibility purposes. z_pos must be in the range [-99, 99]. |
width | Composition's width - x_pos | Region's width. It must be in the range [16, Composition's width - x_pos]. This constraint guarantees that the region fits into the Composition's viewport. |
height | Composition's height - y_pos | Region's height. It must be in the range [16, Composition's height - y_pos]. This constraint guarantees that the region fits into the Composition's viewport. |
max_columns | -- | Maximum number of columns of the region's placement grid. By default, the region has as many columns as needed to lay out all the specified video sources. max_columns must be in the range [1, 1000]. |
max_rows | -- | Maximum number of rows of the region's placement grid. By default, the region has as many rows as needed to lay out all the specified video sources. max_rows must be in the range [1, 1000]. |
cells_excluded | -- | A list of cell indices on the region's layout grid where no video sources can be assigned. The index of the first cell (upper left) is 0. Indices grow from left to right and from top to bottom. These values must be in the range [0, 999999]. |
reuse | show_oldest | Defines how the region's grid cells are reused for placement purposes. Possible values are: |
video_sources | Yes | The array of video sources that should be placed in this region. All the specified sources must belong to the same Room. It can include: |
video_sources_excluded | -- | An array of video sources to exclude from this region. This region will attempt to display all sources specified in video_sources except for the ones specified in video_sources_excluded. This parameter may include: |
A VideoLayout that does not comply with this specification causes the corresponding POST request to be answered with a 4xx code.
Region Positioning and Size
The following figure illustrates how regions are positioned:
You may find it useful to remember these rules:
- The Composition's width and height are defined through the Resolution parameter.
- Regions are positioned relative to the Composition's top-left corner using x_pos and y_pos.
- Region dimensions are defined through the width and height properties.
- Regions must fit inside their Composition. This makes the following mandatory:
  - x_pos + width must not exceed the Composition's width.
  - y_pos + height must not exceed the Composition's height.
- If width or height is not specified, the region by default occupies all the remaining available space in the Composition's viewport.
- When multiple regions overlap, their visibility depends on the z_pos property. Regions with higher z_pos are visible on top of regions with lower z_pos.
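These positioning rules can be checked locally before sending a request. The sketch below is a hypothetical client-side helper (`resolve_region` and `stacking_order` are illustrative names, not part of any Twilio library); it applies the defaults and ranges from the properties table and reports the bottom-to-top drawing order.

```python
def resolve_region(region, comp_width, comp_height):
    """Apply default values and verify the region fits the Composition viewport."""
    x = region.get("x_pos", 0)
    y = region.get("y_pos", 0)
    z = region.get("z_pos", 0)
    # Missing dimensions default to the remaining space of the viewport.
    width = region.get("width", comp_width - x)
    height = region.get("height", comp_height - y)
    if x < 0 or y < 0:
        raise ValueError("x_pos and y_pos must not be negative")
    if not -99 <= z <= 99:
        raise ValueError("z_pos must be in the range [-99, 99]")
    if width < 16 or x + width > comp_width:
        raise ValueError("width must be in [16, composition width - x_pos]")
    if height < 16 or y + height > comp_height:
        raise ValueError("height must be in [16, composition height - y_pos]")
    return {"x_pos": x, "y_pos": y, "z_pos": z, "width": width, "height": height}

def stacking_order(layout, comp_width, comp_height):
    """Return region names from bottom to top (lowest z_pos first)."""
    resolved = {name: resolve_region(r, comp_width, comp_height)
                for name, r in layout.items()}
    return sorted(resolved, key=lambda name: resolved[name]["z_pos"])

# Illustrative layout on a 1280x720 Composition.
layout = {
    "main": {"z_pos": 1},
    "pip": {"z_pos": 2, "x_pos": 1000, "y_pos": 30, "width": 240, "height": 180},
}
print(resolve_region(layout["main"], 1280, 720))
print(stacking_order(layout, 1280, 720))
```

The server-side validation may differ in details; this only mirrors the constraints stated in this section.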
The Region as a Grid
The placement of video_sources in a region takes place through a grid (i.e. matrix) where every cell is a container where one (and only one) video source may be displayed at a time. Region grids are static, meaning that their number of rows and columns does not change during the Composition duration. The specific number of rows and columns depends on the region's max_columns and max_rows properties. There are three different situations:
Unconstrained Grid
In this case, neither max_columns nor max_rows is specified in the VideoLayout. Twilio computes the grid dimensions for you to guarantee that all the provided video_sources are displayed. Due to this, the number of cells in the grid will be at least equal to the maximum number of simultaneous video sources in the Composition (video sources are considered simultaneous at a given time when their media is active at that time). For this, we try to keep the grid as square as possible, growing it first in columns and then in rows so that their difference is never over 1. The following table illustrates this:
Maximum simultaneous video sources | Region's grid dimensions (rows x columns) |
---|---|
1 | 1x1 |
2 | 1x2 |
3 | 2x2 |
4 | 2x2 |
5 | 2x3 |
6 | 2x3 |
7 | 3x3 |
9 | 3x3 |
10 | 3x4 |
12 | 3x4 |
17 | 4x5 |
20 | 4x5 |
Unconstrained Dimension
In this case, only one of max_columns or max_rows is specified. The grid dimensions are computed following the "Unconstrained Grid" algorithm (i.e. trying to keep the grid as square as possible) but without exceeding the specified maximum constraint. After that, the unconstrained dimension grows in order to guarantee that all the specified video sources are displayed. The following examples illustrate this:
Maximum simultaneous video sources | max_rows | max_columns | Region's grid dimensions (rows x columns) |
---|---|---|---|
1 | 1 | -- | 1x1 |
1 | -- | 1 | 1x1 |
2 | 1 | -- | 1x2 |
2 | -- | 1 | 2x1 |
3 | 1 | -- | 1x3 |
3 | -- | 1 | 3x1 |
4 | 2 | -- | 2x2 |
4 | -- | 2 | 2x2 |
5 | 2 | -- | 2x3 |
5 | -- | 2 | 3x2 |
6 | 2 | -- | 2x3 |
6 | -- | 2 | 3x2 |
7 | 2 | -- | 2x4 |
7 | -- | 2 | 4x2 |
9 | 2 | -- | 2x5 |
9 | -- | 2 | 5x2 |
12 | 2 | -- | 2x6 |
12 | -- | 2 | 6x2 |
Constrained Grid
In this case, both max_columns and max_rows are specified. The grid dimensions are computed following the algorithms specified above, but keeping both dimensions within their limits. Due to this, the maximum number of cells is max_columns * max_rows. If the number of simultaneous video sources exceeds that value, then some video sources are not displayed. The following examples illustrate the effect.
Maximum simultaneous video sources | max_rows | max_columns | Region's grid dimensions (rows x columns) |
---|---|---|---|
1 | 1 | 1 | 1x1 |
1 | 1 | 2 | 1x1 |
1 | 2 | 2 | 1x1 |
2 | 1 | 1 | 1x1 (1 source not displayed) |
2 | 1 | 2 | 1x2 |
2 | 2 | 2 | 1x2 |
3 | 1 | 1 | 1x1 (2 sources not displayed) |
3 | 1 | 2 | 1x2 (1 source not displayed) |
3 | 2 | 2 | 2x2 |
4 | 1 | 1 | 1x1 (3 sources not displayed) |
4 | 1 | 2 | 1x2 (2 sources not displayed) |
4 | 2 | 2 | 2x2 |
5 | 1 | 1 | 1x1 (4 sources not displayed) |
5 | 1 | 2 | 1x2 (3 sources not displayed) |
5 | 2 | 2 | 2x2 (1 source not displayed) |
9 | 1 | 7 | 1x7 (2 sources not displayed) |
9 | 2 | 7 | 2x5 |
9 | 3 | 7 | 3x3 |
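The three cases can be reproduced with one small function. This is a sketch inferred from the tables above, not Twilio's actual implementation: `grid_dimensions` is a hypothetical name, and the growth rule (columns first, then rows, within any limits) is an assumption that happens to match every row listed.

```python
def grid_dimensions(sources, max_rows=None, max_columns=None):
    """Return (rows, columns) for a region showing `sources` simultaneous videos."""
    rows, columns = 1, 1
    while rows * columns < sources:
        can_add_column = max_columns is None or columns < max_columns
        can_add_row = max_rows is None or rows < max_rows
        if can_add_column and (columns <= rows or not can_add_row):
            columns += 1   # grow in columns first, or when rows are capped
        elif can_add_row:
            rows += 1      # then grow in rows, or when columns are capped
        else:
            break          # both limits reached: some sources are not displayed
    return rows, columns

def hidden_sources(sources, rows, columns):
    """Number of simultaneous sources that do not get a cell."""
    return max(0, sources - rows * columns)

for n, r, c in [(7, None, None), (7, 2, None), (7, None, 2), (5, 1, 2), (9, 2, 7)]:
    rows, columns = grid_dimensions(n, max_rows=r, max_columns=c)
    print(n, r, c, f"{rows}x{columns}", hidden_sources(n, rows, columns))
```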
Displaying Video Sources
Video sources are displayed in region grid cells. Cell size and aspect ratio are controlled by:
- The region pixel dimensions, as defined by the width and height parameters.
- The region grid dimensions (i.e. number of rows and columns), as introduced in the section above.
Hence, cells have approximately (rounding effects not considered):
- Width equal to the region's width divided by the number of columns.
- Height equal to the region's height divided by the number of rows.
The display of the original video track sources in cells is performed using "object-fit contain" CSS semantics. This means that the original video is rescaled as necessary to fit into the target aspect ratio and that the remaining areas of the cell are filled in black. The following images illustrate how this happens:
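The cell geometry and the "contain" rescaling can also be worked out numerically. The sketch below uses hypothetical helper names and ignores rounding; centering the rescaled video inside the cell is an assumption borrowed from the CSS default, since the text above only states that the remaining area is black.

```python
def cell_size(region_width, region_height, rows, columns):
    """Approximate cell dimensions in pixels (rounding effects ignored)."""
    return region_width / columns, region_height / rows

def fit_contain(src_width, src_height, cell_width, cell_height):
    """Scale a source to fit a cell while preserving its aspect ratio.

    Returns (width, height, pad_x, pad_y); the padding is the black border
    on each side, assuming the video is centered in the cell.
    """
    scale = min(cell_width / src_width, cell_height / src_height)
    out_width, out_height = src_width * scale, src_height * scale
    return out_width, out_height, (cell_width - out_width) / 2, (cell_height - out_height) / 2

# A 640x480 region with a 1x2 grid, showing a 1280x720 source in one cell.
cell = cell_size(640, 480, rows=1, columns=2)
print(cell)                           # (320.0, 480.0)
print(fit_contain(1280, 720, *cell))  # (320.0, 180.0, 0.0, 150.0)
```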
Understanding Reuse
Video sources are assigned to cells from left to right and from top to bottom in the region grid. However, this assignment depends on the value of the reuse property. To understand how reuse works, we need some definitions:
- We say a cell is fresh at a given time when it has not displayed any video source up to that time.
- We say a cell is used at a given time when it is displaying a video source at that time.
- We say a cell is idle at a given time when it has been used previously but its video source media has ended at that time.
Based on this, the possible values for reuse are the following:
- none: in this case, used cells are never reused and stay idle until the composition ends. Newer video sources are assigned only to fresh cells following the left-to-right, top-to-bottom order. In constrained grids, we may run out of fresh cells. In that case, no further video sources are displayed.
- show_oldest: in this case, idle cells can be reused. Hence, newer video sources are assigned to both idle and fresh cells in left-to-right, top-to-bottom order. In constrained grids, when running out of fresh cells, newer video sources are displayed only as idle cells become available. In such a constrained situation, this model gives display priority to older video sources (i.e. to the video sources starting first), which justifies its show_oldest name.
- show_newest: when this value is specified, video sources are displayed first in idle and fresh cells in left-to-right, top-to-bottom order. In constrained grids it may happen that we run out of both fresh and idle cells. In that case, used cells are reused so that newer video sources are displayed on top of older video sources. When there are several available used cells, the one whose media ends first is selected. This model gives display priority to newer video sources (i.e. to video sources starting later), which justifies its show_newest name.
The difference between the reuse modes can be seen in the following figure:
In this figure we assume a Room with 5 video tracks, numbered from 0 to 4. The Room timeline shows the time intervals (relative to the Room starting time) when these tracks are in published state. Below it, we show a number of compositions subject to the following constraints:
- 1x1 Region Composition: in this case, the Composition has a single region where max_rows=1 and max_columns=1. When reuse=none, the first track (i.e. the one identified as 0) takes the single fresh cell of the region for display. In the gaps where this track is unpublished, the cell is never reused and the Composition stays black (if Trim=false) or terminates (if Trim=true). If reuse=show_newest, newer tracks have higher priority than older tracks. Due to this, track 1 is stacked on top of track 0 and displayed while it is active. Later, tracks 2, 3 and 4 take the single region cell as soon as their media is activated. When reuse=show_oldest, the first track occupying the single region cell keeps it until it ends. After that, newer tracks can take it (observe how the end of track 2 allows track 3 to be displayed and how the end of track 3 does the same with track 4).
- 1x2 Region Composition: now the Composition has a single region with two cells (max_rows=1 and max_columns=2). Due to this, tracks 0 and 1 can be displayed simultaneously. For tracks 2, 3 and 4, display is controlled by the reuse model. When reuse=none, tracks 0 and 1 take the two fresh cells. After that, no further tracks can reuse those cells and the Composition stays black (if Trim=false) or terminates (if Trim=true). If reuse=show_newest, tracks 2 and 3 can reuse the idle cells, but when track 4 arrives it reuses the used cell whose track ends first (in this case track 2). When reuse=show_oldest, tracks 2 and 3 have priority and hence track 4 needs to wait until an idle cell becomes available, which happens when track 2 ends.
- Unconstrained Region Composition: in this case, the Composition has a single region where neither max_rows nor max_columns has been specified. Hence, the region grid dimensions are automatically calculated to fit all the specified video sources. When reuse=none, we need 5 cells given that video sources can only occupy fresh cells. Due to this, the system uses a 2x3 grid. When reuse=show_newest or reuse=show_oldest, the required grid needs only 3 cells given that the maximum number of simultaneous tracks is 3 (i.e. what happens during the interval in which tracks 2, 3 and 4 are published). Hence, the system computes a 2x2 grid. Observe that when using unconstrained grids, both show_newest and show_oldest generate the same video placement for the region. This is because in an unconstrained grid it is guaranteed that the number of video sources never exceeds the number of cells in the grid. Hence, no used cell ever needs to be reused.
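The cell counts in the unconstrained case can be derived from the track timeline. The following sketch uses a made-up timeline (the exact intervals of the figure are not given in the text) in which tracks 2, 3 and 4 overlap; the function names are hypothetical, and treating a track that ends exactly when another starts as non-overlapping is an assumption.

```python
def max_simultaneous(intervals):
    """Peak number of tracks published at the same time (sweep over events)."""
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    # Sorting puts (t, -1) before (t, 1): a track ending at t frees its
    # cell before a track starting at t needs one.
    events.sort()
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak

def required_cells(intervals, reuse="show_oldest"):
    """Cells an unconstrained grid needs for the given reuse mode."""
    if reuse == "none":
        return len(intervals)  # every source needs its own fresh cell
    if reuse in ("show_oldest", "show_newest"):
        return max_simultaneous(intervals)
    raise ValueError(f"unknown reuse mode: {reuse}")

# Hypothetical (start, end) publication times in seconds for tracks 0..4.
tracks = [(0, 10), (5, 15), (20, 40), (25, 45), (30, 50)]
print(required_cells(tracks, "none"))         # 5 cells, so a 2x3 grid
print(required_cells(tracks, "show_oldest"))  # 3 cells, so a 2x2 grid
```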
Understanding Trim
The Trim Composition parameter controls what happens to the composition when there is no active media, that is, during the gaps in which neither audio tracks nor video tracks are published. We define two types of such gaps:
- Initial gap: Compositions have an initial gap when there is no active media at the beginning of the Room. The initial gap ends when the first recording included in the composition starts. The initial gap is always trimmed in Compositions.
- Later gap: any other gap that is not an initial gap.
Later gaps are trimmed depending on the value of the trim parameter.
- trim=false: the Composition keeps all the later gaps as black video with silent audio. Hence, these idle intervals with no media may appear at the end or in the middle of the Composition.
- trim=true (default): the Composition clips all gaps where there is no media published. Note that when an audio track is active in the composition during an interval, that interval is not clipped even if there are no active video tracks in it.
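The effect of trim on the resulting duration can be estimated from the publication intervals of the included tracks. This is a rough sketch with hypothetical names; it assumes that, with trim=false, the Composition runs from the first included recording until the end of the Room, which the text above implies but does not state exactly.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into disjoint spans of active media."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def composition_duration(intervals, room_end, trim=True):
    """Estimated duration in seconds of the composed media."""
    merged = merge_intervals(intervals)
    if not merged:
        return 0
    if trim:
        # Only the spans with active audio or video are kept.
        return sum(end - start for start, end in merged)
    # The initial gap is always removed; later gaps are kept.
    return room_end - merged[0][0]

# Hypothetical audio and video intervals, in seconds from the Room start.
media = [(10, 30), (20, 40), (60, 90)]
print(composition_duration(media, room_end=100, trim=True))   # 60
print(composition_duration(media, room_end=100, trim=False))  # 90
```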
The following figure illustrates how trimmed Compositions behave in the scenarios introduced in the section above.
HTTP GET
Retrieves the list of Composition Instance Resources belonging to the specified AccountSid, with paging data.
Supported GET Parameters
The following GET query string parameters allow you to limit the list returned. Note that parameters are case-sensitive.
Parameters in REST API format | Required / Optional | Description |
---|---|---|
status | Optional | Read only Composition resources with this status. Can be: |
date_created_after | Optional | Read only Composition resources created on or after this ISO 8601 date-time with time zone. |
date_created_before | Optional | Read only Composition resources created before this ISO 8601 date-time with time zone. |
room_sid | Optional | Read only Composition resources with this Room SID. |
Note: deleted Compositions are not returned by default. To retrieve the list of deleted Compositions, you must explicitly specify Status=deleted.
Examples
Creating Transcoding Compositions
Example: Transcode a Video Recording
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where a video Track with MediaTrackSid=MTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX is published and recorded with RecordingTrackSid=RTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.
We then want to generate a Composition containing that video Track, transcoded to the H.264/mp4 format. Considering that:
- Your application credentials are (SKXXXX:your_api_key_secret)
- You want to use the default resolution (VGA = 640x480)
You can create the desired Composition using the following:
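A minimal Python sketch of such a request, using only the standard library, could look as follows. Several details are assumptions rather than facts taken from this page: the PascalCase form parameter names (RoomSid, VideoLayout, Format) follow the casing this page uses for Ttl and Status, the credentials are assumed to be sent as HTTP Basic authentication, and the region name "transcode" is arbitrary. The code only builds the request; passing it to `urllib.request.urlopen` would send it.

```python
import base64
import json
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

COMPOSITIONS_URL = "https://video.twilio.com/v1/Compositions"

def build_transcode_request(api_key_sid, api_key_secret, room_sid, recording_track_sid):
    """Build (but do not send) a POST that composes a single video Recording as mp4."""
    # One region holding one video source; the region name is arbitrary.
    layout = {"transcode": {"video_sources": [recording_track_sid]}}
    body = urlencode({
        "RoomSid": room_sid,
        "VideoLayout": json.dumps(layout),
        "Format": "mp4",
    }).encode("ascii")
    token = base64.b64encode(f"{api_key_sid}:{api_key_secret}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return Request(COMPOSITIONS_URL, data=body, headers=headers, method="POST")

request = build_transcode_request(
    "SKXXXX", "your_api_key_secret",
    "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "RTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
)
form = parse_qs(request.data.decode())
print(request.get_method(), request.full_url)
print(form["VideoLayout"][0])
```

No Resolution parameter is sent, so the default resolution applies.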
Example: Transcode an Audio Recording
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where an audio Track with MediaTrackSid=MTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX is published and recorded with RecordingTrackSid=RTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.
We then want to generate a Composition containing that audio Track, transcoded to the AAC/mp4 format. Considering that:
- Your application credentials are (SKXXXX:your_api_key_secret)
You can create the desired Composition using the following:
Note that, even though this is an audio-only composition, the response shows default video settings in the corresponding parameters.
Creating Compositions with Simple Layouts
Example: Compose one Participant's Media
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where a Participant with ParticipantSid=PAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX publishes both audio and video tracks to the Room. In this Room there may be other Participants publishing audio and video without affecting this example's result.
We want to generate a Composition showing the Participant's video Track and having as audio the one of that Participant. Considering that:
- Your application credentials are (SKXXXX:your_api_key_secret)
- You want to use mp4 as target format
- You want to use the default resolution (VGA = 640x480)
You can create the desired Composition using the following:
Example: Compose One Participant's Video with all Room Participants' Audios
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where a Participant with ParticipantSid=PAXXXX publishes both audio and video Tracks. In that Room, other Participants also publish their audio and video Tracks.
We want to generate a Composition showing the video of PAXXXX but having as audio the complete set of Room audio Tracks mixed. Considering that:
- Your application credentials are (SKXXXX:your_api_key_secret)
- You want to use mp4 as target format
- You want to use the default resolution (VGA = 640x480)
You can create the desired Composition using the following:
Example: Compose the Complete Room in a Grid
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where multiple Participants publish both audio and video Tracks. We want to generate a Composition showing all the room videos in a grid, as shown in the figure below, and with all audio tracks mixed.
Considering that:
- Your application credentials are (SKXXXX:your_api_key_secret)
- You want to use mp4 as target format
- You want to use VGA resolution (640x480)
- You want an unconstrained grid composition with the default cell reuse strategy (show_oldest).
You can create the desired Composition using the following:
Example: Compose a Specific Set of Track Recordings in a Grid
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where different Participants publish their audio and video Tracks. Imagine that we have a special interest in some of these tracks, which we identify in the following way:
- RTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX: the RecordingTrackSid of a video Track.
- MTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX: the MediaTrackSid of a video Track.
- teacher-webcast: the Track name of a video Track.
We want to generate a composition showing the three video Tracks in a single row and with no audio, as shown in the figure below.
Considering that:
- Your application credentials are (SKXXXX:your_api_key_secret)
- You want to use mp4 as target format
- You want to use the default resolution (VGA = 640x480)
You can create the desired Composition using the following:
Example: Compose a Specific Set of Track Recordings as a Sequence
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where a teacher presents lessons to their students. These lessons consist of talks and screensharing presentations that occur in sequence. These may overlap or have idle intervals where no media is published. The Track names of the published media comply with the following pattern:
- For video: teacher-video-sess-1, teacher-video-sess-2, etc.
- For audio: teacher-audio-sess-1, teacher-audio-sess-2, etc.
In this context, we want to create a Composition showing only one video source at a time. At any given time, it must be the video track most recently published by the teacher. We want the teacher's audio to be included in the Composition. We don't want the Composition to contain the idle intervals where the teacher is not publishing media. In this context, considering that:
- Your application credentials are (SKXXXX:your_api_key_secret)
- You want to use mp4 as target format
- You want to use the default resolution (VGA = 640x480)
You can create the desired Composition using the following:
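One way to express these requirements is a single 1x1 region with reuse=show_newest plus Trim=true. The sketch below only builds the form body. The PascalCase parameter names, the region name "single" and the trailing `*` wildcard used to match the track-name pattern are assumptions; if wildcards are not available for your account, list the track names explicitly instead.

```python
import json
from urllib.parse import parse_qs, urlencode

# A single cell that always shows the most recently started video source.
layout = {
    "single": {
        "max_rows": 1,
        "max_columns": 1,
        "reuse": "show_newest",
        "video_sources": ["teacher-video-sess-*"],  # assumed wildcard syntax
    }
}

params = [
    ("RoomSid", "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"),
    ("VideoLayout", json.dumps(layout)),
    ("AudioSources", "teacher-audio-sess-*"),  # assumed wildcard syntax
    ("Trim", "true"),   # drop the intervals where no media is published
    ("Format", "mp4"),
]
body = urlencode(params)  # form body for POST /v1/Compositions
decoded = parse_qs(body)
print(decoded["VideoLayout"][0])
```

With a 1x1 grid and show_newest, a newer presentation replaces the older one in the only cell, which gives the sequence effect described above.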
Creating Compositions with Complex Layouts
Example: Creating a PiP (Picture-in-Picture) Composition
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where an expert presents a topic with two audio tracks:
- One track from their microphone, with MediaTrackSid=MTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- One acting as a background soundtrack, with Track name soundtrack
The expert also has two video tracks:
- A webcam video Track with MediaTrackSid=MTYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
- A screensharing video Track with Track name screen-presentation.
In this context, we want to create a PiP Composition including the above-mentioned audio Tracks and showing:
- The video track screen-presentation occupying the complete Composition background.
- The video track MTYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY in a small box located at the top-left corner of the Composition, overlaid on top of screen-presentation as shown in the figure below.
Assuming that:
- Your application credentials are (SKXXXX:your_api_key_secret)
- You want to use mp4 as target format
- You want to use HD resolution (1280x720)
You can create the desired Composition using the following:
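A possible layout uses two overlapping regions ordered by z_pos. In the sketch below, the region names ("main", "pip"), the box size (240x180) and its 20-pixel offset from the top-left corner are illustrative choices, and the PascalCase form parameter names are an assumption. The code builds the form body without sending it.

```python
import json
from urllib.parse import parse_qs, urlencode

COMPOSITION_WIDTH, COMPOSITION_HEIGHT = 1280, 720

layout = {
    # Background region: no position or size given, so it fills the viewport.
    "main": {"z_pos": 1, "video_sources": ["screen-presentation"]},
    # Small box near the top-left corner, stacked above the background.
    "pip": {
        "z_pos": 2,
        "x_pos": 20,
        "y_pos": 20,
        "width": 240,
        "height": 180,
        "video_sources": ["MTYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY"],
    },
}

# The box must fit inside the Composition viewport.
pip = layout["pip"]
fits = (pip["x_pos"] + pip["width"] <= COMPOSITION_WIDTH
        and pip["y_pos"] + pip["height"] <= COMPOSITION_HEIGHT)

params = [
    ("RoomSid", "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"),
    ("VideoLayout", json.dumps(layout)),
    ("AudioSources", "MTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"),  # microphone
    ("AudioSources", "soundtrack"),                          # background music
    ("Resolution", "1280x720"),
    ("Format", "mp4"),
]
body = urlencode(params)  # form body for POST /v1/Compositions
decoded = parse_qs(body)
print(fits, decoded["AudioSources"])
```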
Example: Composing a Room with Natural Layouts
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where a lecture is taking place. The Participants are the following:
- A teacher publishes the following Tracks:
  - A webcam video Track with name teacher-webcam-video.
  - A screensharing video Track with name teacher-screen-video.
  - An audio Track with name teacher-audio.
- A variable number of students, each publishing the following (i varies from student to student):
  - A webcam video Track with name student-i-video.
  - An audio Track with name student-i-audio.
In this context, imagine that we want to create the following Composition:
- Track teacher-screen-video must be shown as Composition background occupying its complete viewport. A track with such disposition is sometimes called "main".
- The rest of the video tracks should be shown in a row at the bottom of the Composition, as shown in the figure below.
- All the audio tracks of the Room should be mixed for the Composition.
Assuming that:
- Your application credentials are (SKXXXX:your_api_key_secret)
- You want to use mp4 as target format
- You want to use HD resolution (1280x720)
You can create the desired Composition using the following:
Imagine now that you want the Composition to have a slightly different layout:
- Track teacher-screen-video must be shown as Composition background occupying its complete viewport.
- Track teacher-webcam-video must be shown as a small window in the top-right corner of the Composition.
- The rest of the video tracks should be shown in a column on the left with no more than 5 rows, as shown in the figure below.
- All the audio tracks of the Room should be mixed for the Composition.
In this case, the required code would be the following:
Example: Composing a Room with Mosaic Layout
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where an interview is taking place through the following tracks:
- The interviewed publishes the following:
  - A video Track named interviewed-video.
  - An audio Track named interviewed-audio.
- A number of interviewers, each publishing:
  - A video Track named interviewer-i-video.
  - An audio Track named interviewer-i-audio.
- An advisor, who can be heard only by the interviewed, publishes:
  - An audio Track named advisor-audio.
We want to create the following Composition:
- Track interviewed-video must be shown centered in the middle of the composition.
- The rest of the video tracks should be shown around that one, as the figure indicates.
- All the audio tracks of the Room should be mixed for the Composition, except for advisor-audio, which should not be included.
Assuming that:
- Your application credentials are (SKXXXX:your_api_key_secret)
- You want to use mp4 as target format
- You want to use HD resolution (1280x720)
You can create the desired Composition using the following:
Example: Creating a Chess-Table Layout Composition
In this example we assume a Room with RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX where a number of participants publish their audio and video tracks. For fun, we want to create the layout depicted in the following figure.
Assuming that:
- Your application credentials are (SKXXXX:your_api_key_secret)
- You want to use mp4 as target format
- You want to use HD resolution (1280x720)
You can create the desired Composition using the following:
Getting compositions
Example: Get a Composition Instance Resource
To run this example you need:
- Your application credentials (SKXXXX:your_api_key_secret)
- The CompositionSid (CJXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX)
Example: List a Page of Completed Compositions
To run this example you need:
- Your application credentials (SKXXXX:your_api_key_secret)
Example: List all Compositions for a given Room
To run this example you need:
- Your application credentials (SKXXXX:your_api_key_secret)
- The target RoomSid (RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX)
Example: Get a Composition Media File
To run this example you need:
- Your application credentials (SKXXXX:your_api_key_secret)
- The CompositionSid (CJXXXX)
In this example we also specify a Ttl of 3600 seconds.
Deleting Compositions
Example: Delete a Composition Instance
To run this example you need:
- Your application credentials (SKXXXX:your_api_key_secret)
- The Composition SID to delete (CJXXXX)
Known limitations
- The time taken to process a Composition depends on the duration of the Room and the load that the Composition service is under at the time of the request. No specific delivery time is guaranteed.
- The only supported formats are MP4 and WebM.
- The maximum size of all selected Recordings for a Composition is 40 GB. To estimate a Recording's size, check this table.
- It is not allowed to delete a Composition Instance Resource with Status=failed. These instances do not count against your Storage capacity.
- Compositions and Composition Hooks may fail if one or more recordings have a duration of 2 seconds or less. We recommend removing such short recordings before creating compositions.