Understanding Video Rooms

Overview

This guide introduces the concept of a Video Room and helps developers decide which type of Room is most appropriate for their use case:

  • Peer-to-peer (P2P) Room.
  • Small Group Room.
  • Group Room.

This guide also introduces the different alternatives for creating Rooms as well as their advantages and drawbacks:

  • Rooms created using the REST API.
  • Ad-hoc Rooms.

Signaling and Media

RTC (Real-Time Communication) services are typically architected in two layers:

  • Signaling Plane: It deals with control information. The communicating entities typically exchange signaling messages to agree on what is to be communicated (e.g. audio, video) and how it is to be communicated (e.g. codecs, formats).
  • Media Plane: It deals with the media information itself. Media packets typically transport encoded and encrypted audio and video bits.

In Twilio Programmable Video, signaling always takes place between clients and Twilio’s cloud, which orchestrates the communication. Media, in turn, may be relayed through Twilio or exchanged directly among clients.

Twilio Rooms

The notion of a Room is central to Twilio Programmable Video. Intuitively, a Room represents a virtual space where end-users communicate. Technically, a Room is a computing resource that provides Real-time Communications (RTC) services to client applications through a set of APIs. More specifically, a Room provides:

  • A session service: so that end-users can connect to and disconnect from Rooms. Once connected, an end-user is called a Room Participant.
  • An RTC Service: so that Participants can communicate audio, video and data using WebRTC.

Twilio Rooms are based on a publish/subscribe model. This means that a Participant can publish media Tracks to the Room. The other Participants can then subscribe to those Tracks to start receiving the media.

Twilio Programmable Video exposes three types of Rooms with different capabilities: P2P (Peer-to-Peer) Rooms, Small Group Rooms, and Group Rooms.
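The publish/subscribe model can be pictured with a toy in-memory model. This is only an illustrative sketch and not the Twilio SDK: the `ToyRoom` class and its methods are hypothetical names, and it assumes the default behavior in which every Participant subscribes to every Track published by the others.

```javascript
// Toy in-memory model of publish/subscribe. NOT the Twilio SDK:
// ToyRoom and its methods are hypothetical, for illustration only.
class ToyRoom {
  constructor() {
    this.published = new Map();     // trackName -> publisher identity
    this.subscriptions = new Map(); // identity -> Set of subscribed trackNames
  }

  // A new Participant subscribes to every Track already published.
  connect(identity) {
    this.subscriptions.set(identity, new Set(this.published.keys()));
  }

  // Publishing makes the Track available to all other Participants.
  publish(identity, trackName) {
    if (!this.subscriptions.has(identity)) {
      throw new Error(`${identity} is not connected`);
    }
    this.published.set(trackName, identity);
    for (const [other, tracks] of this.subscriptions) {
      if (other !== identity) tracks.add(trackName);
    }
  }

  // Sorted list of Track names a Participant is receiving.
  tracksReceivedBy(identity) {
    return [...(this.subscriptions.get(identity) || [])].sort();
  }
}

const room = new ToyRoom();
room.connect('alice');
room.connect('bob');
room.publish('alice', 'alice-camera');
room.publish('bob', 'bob-microphone');
room.connect('carol'); // joins late, still receives both Tracks
console.log(room.tracksReceivedBy('carol'));
```

Note that a Participant never subscribes to its own Tracks in this model; only the other Participants receive them.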

Twilio P2P Rooms

In a P2P Room Participants exchange media directly so that:

  • Media is encrypted end-to-end (E2E) using WebRTC security protocols.
  • Twilio does not mediate in the media exchange, which takes place through direct communication among Participants. The only exception is when media exchange requires TURN. In that case, a TURN server will blindly relay the encrypted media bits to guarantee connectivity. The TURN server cannot decrypt or manipulate the media.
  • As Twilio does not intercept the media in P2P Rooms, it is not possible to record or to transcode the media or to make it interoperate with other RTC services.
  • Despite not being in the media path, Twilio manages the signaling path, which lets Participants discover each other and negotiate the communication according to the application and SDK requirements. Hence, signaling connectivity to Twilio’s cloud is still necessary.

The following picture illustrates the architecture of a P2P Room.

The Architecture of a P2P Room

As seen above, in a P2P Room, clients need to send their media streams once per subscriber. As a result, upstream bandwidth (and typically battery consumption) scales as n-1, where n is the number of Participants. Because of this, P2P Rooms do not scale well with n.

Twilio Group Rooms

In a Group Room, Participants exchange media through Twilio:

  • Participants publish media to a Twilio Selective Forwarding Unit (SFU). An SFU is a media server that decrypts the media, processes it, re-encrypts it, and routes the media Tracks to the correct destinations.
  • As a result, media is not E2E encrypted, because the SFU holds decrypted media in memory in order to process it.
  • As Twilio acts as media middleware, Group Rooms can provide services such as recordings and public switched telephone network (PSTN) interoperability.

The following picture illustrates the architecture of a Group Room.

The Architecture of a Group Room

As shown above, in a Group Room clients only need to publish their media tracks once to the SFU, which clones and routes the media to the correct subscribers. Because of this, upstream bandwidth and battery consumption are independent of the number of Participants.
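The difference in upstream load between the two architectures can be worked through numerically. The sketch below is illustrative only: the function names and the 500 kbps per-Track bitrate are assumptions chosen for the example, and it assumes every Participant subscribes to every other Participant.

```javascript
// Number of upstream copies of each published Track a client must send.
// 'p2p': one copy per remote subscriber (n - 1).
// 'sfu': a single copy to the media server, regardless of n.
function upstreamCopies(n, topology) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError('n must be a positive integer');
  }
  if (topology === 'p2p') return n - 1;
  if (topology === 'sfu') return 1;
  throw new Error(`Unknown topology: ${topology}`);
}

// Upstream bandwidth for one Track. trackKbps is a hypothetical bitrate.
function upstreamKbps(n, topology, trackKbps) {
  return upstreamCopies(n, topology) * trackKbps;
}

// Compare both architectures for a few Room sizes.
for (const n of [2, 4, 10]) {
  const p2p = upstreamKbps(n, 'p2p', 500);
  const sfu = upstreamKbps(n, 'sfu', 500);
  console.log(`n=${n}: P2P ${p2p} kbps, Group ${sfu} kbps`);
}
```

With two Participants both architectures send a single copy, so the gap only appears as the Room grows.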

Small Vs Regular Group Rooms

In a Group Room, the computing resources required by an SFU scale with the square of the number of Participants (P). Hence, a large Room on a busy media server might steal resources from other Rooms and degrade their quality. To avoid this, when creating an SFU, Twilio reserves the required resources for it to scale. As a result, computing costs depend on the maximum number of participants a Group Room might host.

However, there are many use cases such as a video contact center, e-health, and one-to-one communication that do not require over four Participants. With this in mind, Twilio offers two types of Group Rooms to accommodate your use case:

  • Regular Group Rooms (aka Group Rooms): scale up to 50 participants. Twilio charges for Regular Group Rooms at a standard price.
  • Small Group Rooms: scale up to four participants. For these Rooms, Twilio charges a reduced price.

Comparing Room types

The following table illustrates the main properties of the different Twilio Rooms:

| Property | P2P Room | Small Group Room | (Regular) Group Room |
|---|---|---|---|
| E2E encryption | Yes | No | No |
| Upstream BW scales with* | n-1 | Constant | Constant |
| Downstream BW scales with* | n-1 | n-1 | n-1 |
| Max Downstream BW per Participant | No limit | 4 Mbps | 8 Mbps |
| Screensharing supported | Yes | Yes | Yes |
| Audio/Video/Data Tracks | Yes | Yes | Yes |
| Max Participants | 10 | 4 | 50 |
| Rooms REST API | Yes | Yes | Yes |
| Ad-hoc Rooms | Yes | Yes | Yes |
| Participants API | Yes | Yes | Yes |
| Published Track API | Yes | Yes | Yes |
| Codec Preferences | Yes | Yes | Yes |
| VP8 Simulcast | No | Yes | Yes |
| Dominant Speaker Detection | No | Yes | Yes |
| Network Quality API | No | Yes | Yes |
| Track Subscription API | No | Yes | Yes |
| Recordings | No | Yes | Yes |
| PSTN Interoperability | No | Yes | Yes |

* n denotes the number of subscribers which, by default, is the same as the number of Participants.

Creating Rooms: REST Vs Ad-hoc

There are two alternatives for creating Rooms: the Rooms REST API and Ad-hoc Rooms.

The Rooms REST API

Developers can create Rooms by POSTing an HTTP message to Twilio. The Rooms REST API documentation provides reference information as well as examples of how to do this for all Room types.
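As a rough sketch of what such a POST looks like, the helper below builds the request without sending it. The endpoint `https://video.twilio.com/v1/Rooms` and the `Type` and `UniqueName` parameters reflect the Rooms REST API as I understand it and should be checked against the reference. The function name `buildCreateRoomRequest` is hypothetical, and authentication (HTTP Basic with your account credentials) is deliberately left out.

```javascript
// Builds (but does not send) the HTTP request that creates a Room.
// Endpoint and parameter names are assumptions to verify against the
// Rooms REST API reference; authentication headers are omitted.
const ROOMS_URL = 'https://video.twilio.com/v1/Rooms';
const ROOM_TYPES = ['peer-to-peer', 'group-small', 'group'];

function buildCreateRoomRequest({ uniqueName, type }) {
  if (!ROOM_TYPES.includes(type)) {
    throw new Error(`Unknown Room type: ${type}`);
  }
  // The API expects a form-encoded body.
  const body = new URLSearchParams({ Type: type });
  if (uniqueName) body.set('UniqueName', uniqueName);
  return {
    url: ROOMS_URL,
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: body.toString(),
  };
}

const request = buildCreateRoomRequest({
  uniqueName: 'myFancyRoomName',
  type: 'group-small',
});
console.log(request.method, request.url, request.body);
```

The returned object can be passed to any HTTP client once credentials are added. In practice the official server-side helper libraries wrap this call.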

Rooms created using the REST API comply with the following:

  • First join timeout: the first Participant must join within 5 minutes after Room creation. Otherwise the Room is destroyed.
  • Last leave timeout: the Room is destroyed 5 minutes after the last Participant leaves.
  • Max Participant duration: a Participant can be connected to the Room up to 4 hours. After that time the Participant is disconnected.
  • Max Room duration: a Room may exist up to 24 hours from creation time. After that time the Room is destroyed and all Participants get disconnected.

Ad-hoc Rooms

Rooms can also be created just-in-time when the first Participant connects. When a Room is created that way, we say it is an ad-hoc Room. In order to use ad-hoc Rooms, developers must enable "CLIENT-SIDE ROOM CREATION" in the Twilio Console Room Settings following these simple steps:

Programmable Video Console Room Settings

  • Set the STATUS CALLBACK URL to the URL where the status callbacks should be received (can be left empty).
  • Set the ROOM TYPE to the type of Room to be created: Group (for Group Rooms), Peer-to-peer (for P2P Rooms) or Group-Small (for Small Group Rooms)
  • Set the CLIENT-SIDE ROOM CREATION to ENABLED.
  • Press the Save button.

Once that’s done, a Room of the specified type will be created as soon as a Participant connects through the SDK. For example, the following code snippet illustrates how to do this in JavaScript:

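const { connect } = require('twilio-video');

// '$TOKEN' stands for an Access Token generated by your server.
connect('$TOKEN', { name: 'myFancyRoomName' }).then(room => {
  console.log(`Successfully joined a Room: ${room}`);
  // Fires each time another Participant joins the Room.
  room.on('participantConnected', participant => {
    console.log(`A remote Participant connected: ${participant}`);
  });
}, error => {
  console.error(`Unable to connect to Room: ${error.message}`);
});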

Notice that a Room name must be specified. Names of active Rooms must be unique. Hence, subsequent Participants connecting with that name will just join that Room instead of creating a new one.

Ad-hoc Rooms comply with the following:

  • First join timeout: there isn’t any, as the Room is created just when the first Participant connects.
  • Last leave timeout: the Room is destroyed just after the last Participant leaves. No waiting time here.
  • Max Participant duration: a Participant can be connected to the Room up to 4 hours. After that time the Participant is disconnected.
  • Max Room duration: a Room may exist up to 24 hours from creation time. After that time the Room is destroyed and all Participants get disconnected.

Comparing REST Vs Ad-hoc Rooms

The following table illustrates the main differences between Ad-hoc Rooms and Rooms created using the REST API:

| Property | REST Rooms | Ad-hoc Rooms |
|---|---|---|
| Room creation method | POST request | SDK connect primitive |
| Room creation time | When POST is received | When first Participant connects |
| First join timeout | 5 minutes | N/A |
| Last leave timeout | 5 minutes | 0 |
| Max Participant duration | 4 hours | 4 hours |
| Max Room duration | 24 hours | 24 hours |
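The lifecycle limits in the table can be expressed as a small calculation. The sketch below is only a model of the rules as stated in this guide: the names `LIFECYCLE`, `emptyRoomDestroyedAt` and `participantDisconnectedAt` are hypothetical, times are measured in milliseconds since Room creation, and it assumes nobody rejoins after the last Participant leaves.

```javascript
const MINUTE = 60 * 1000;
const HOUR = 60 * MINUTE;

// Limits taken from the comparison table; null means "not applicable".
const LIFECYCLE = {
  rest: { firstJoinTimeoutMs: 5 * MINUTE, lastLeaveTimeoutMs: 5 * MINUTE },
  adhoc: { firstJoinTimeoutMs: null, lastLeaveTimeoutMs: 0 },
};
const MAX_PARTICIPANT_MS = 4 * HOUR;
const MAX_ROOM_MS = 24 * HOUR;

// When an empty Room is destroyed, given when the last Participant left
// (assuming nobody rejoins). Never later than the max Room duration.
function emptyRoomDestroyedAt(kind, lastLeftAtMs) {
  const rules = LIFECYCLE[kind];
  if (!rules) throw new Error(`Unknown creation method: ${kind}`);
  return Math.min(lastLeftAtMs + rules.lastLeaveTimeoutMs, MAX_ROOM_MS);
}

// Latest moment a Participant who joined at joinedAtMs can stay connected.
function participantDisconnectedAt(joinedAtMs) {
  return Math.min(joinedAtMs + MAX_PARTICIPANT_MS, MAX_ROOM_MS);
}

console.log(emptyRoomDestroyedAt('rest', 10 * MINUTE) / MINUTE);
console.log(emptyRoomDestroyedAt('adhoc', 10 * MINUTE) / MINUTE);
console.log(participantDisconnectedAt(21 * HOUR) / HOUR);
```

A Participant who joins late in the life of a Room is cut off by the 24-hour Room limit before reaching the 4-hour Participant limit, which is why both functions clamp to `MAX_ROOM_MS`.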

P2P or Group Rooms: Which Room Should I Use?

The following diagram may help you decide which Room type is the most appropriate for a given use case:

What Room type do you need?

Do I need E2E Encryption?

E2E (End-to-End) encryption means only the sender and receiver can read what was sent. Twilio P2P Rooms are E2E Encrypted while Twilio Group Rooms are not. Twilio Group Rooms decrypt and re-encrypt the media to provide their routing capabilities. Some applications, because of policy or compliance reasons, require E2E Encryption. In that case, answer YES. Otherwise, answer NO.

Do I need Recordings?

If you need Twilio to store the data exchanged, answer YES. Otherwise, answer NO.

Do I need PSTN Interop?

If you need the PSTN (i.e. phone calls) to be able to dial into your room, answer YES. Otherwise, answer NO.

More than 2 Participants?

If your Room requires high quality video and can have up to 2 Participants, answer NO. If your Room can tolerate low quality video and can have up to 4 Participants, answer NO. If your Room does not require video (i.e. it is an audio-only Room) and can have up to 10 Participants, answer NO. Otherwise, answer YES.

More than 4 Participants?

If your application requires more than 4 Participants in the same Room, answer YES. Otherwise, answer NO.
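The questions above can be combined into a small decision helper. Since the diagram itself is not reproduced here, the ordering of the questions in this sketch is an assumption, chosen to be consistent with the comparison table (only P2P Rooms offer E2E encryption, and only Group Rooms offer Recordings and PSTN interop). The function name `chooseRoomType` and its option names are hypothetical.

```javascript
// Returns a Room type from the YES/NO answers to the questions above.
// The question order is an assumption; the original diagram may differ.
function chooseRoomType({ e2e, recordings, pstn, moreThanTwo, moreThanFour }) {
  const needsMediaServer = Boolean(recordings || pstn);

  if (e2e) {
    // E2E encryption exists only in P2P Rooms, which cannot record
    // or interoperate with the PSTN.
    if (needsMediaServer) {
      throw new Error('E2E encryption cannot be combined with Recordings or PSTN interop');
    }
    return 'peer-to-peer';
  }

  // Without media-server features, a small enough Room can stay P2P.
  if (!needsMediaServer && !moreThanTwo) return 'peer-to-peer';

  // Otherwise choose the Group Room size.
  return moreThanFour ? 'group' : 'group-small';
}

console.log(chooseRoomType({ e2e: true }));
console.log(chooseRoomType({ recordings: true }));
console.log(chooseRoomType({ moreThanTwo: true, moreThanFour: true }));
```

Unanswered questions default to NO, so a call with only `e2e: true` resolves to a P2P Room.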

Next steps

Want to get started with Rooms? The following links may help you:

Luis Lopez Alan Klein Craig Dennis Nahuel Sznajderhaus