Understanding Video Rooms


This guide introduces the concept of a Video Room and helps developers decide which type of Room is most appropriate for their use case:

  • WebRTC Go Room.
  • Peer-to-peer (P2P) Room.
  • Small Group Room.
  • Group Room.

This guide also introduces the different alternatives for creating Rooms as well as their advantages and drawbacks:

  • Rooms created using the REST API.
  • Ad-hoc Rooms.


Signaling and Media

RTC (Real-Time Communication) services are typically architected in two layers:

  • Signaling Plane: It deals with the control information. The communicating entities typically exchange signaling messages to agree on what is to be communicated (e.g. audio, video, etc.) and how it is to be communicated (e.g. codecs, formats, etc.).
  • Media Plane: It deals with the media information itself. Media packets typically transport encoded and encrypted audio and video bits.

In Twilio Programmable Video, signaling always takes place between clients and Twilio’s cloud, which orchestrates the communication. Media, in turn, may be mediated by Twilio or exchanged directly among clients.

Video Rooms

The notion of a Room is central to Twilio Programmable Video. Intuitively, a Room represents a virtual space where end-users communicate. Technically, a Room is a computing resource that provides Real-time Communications (RTC) services to client applications through a set of APIs. More specifically, a Room provides:

  • A session service: so that end-users can connect to and disconnect from Rooms. A connected end-user is called a Room Participant.
  • An RTC Service: so that Participants can communicate audio, video and data using WebRTC.

Video Rooms are based on a publish/subscribe model: a Participant can publish media Tracks to the Room, and the other Participants can then subscribe to those Tracks to start receiving the media.
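The publish/subscribe model can be pictured with a small in-memory sketch. This is not the Twilio SDK: `ToyRoom` and its `publish`, `subscribe`, and `subscribersOf` methods are hypothetical names used only to illustrate the idea.

```javascript
// Toy model of a publish/subscribe Room. NOT the Twilio SDK:
// ToyRoom and its methods are hypothetical names for illustration only.
class ToyRoom {
  constructor() {
    this.publishers = new Map();    // track name -> publisher identity
    this.subscriptions = new Map(); // track name -> Set of subscriber identities
  }

  // A Participant publishes a Track to the Room.
  publish(identity, trackName) {
    this.publishers.set(trackName, identity);
    this.subscriptions.set(trackName, new Set());
  }

  // Another Participant subscribes to a Track that has been published.
  subscribe(identity, trackName) {
    if (!this.publishers.has(trackName)) {
      throw new Error(`Track not published: ${trackName}`);
    }
    if (this.publishers.get(trackName) === identity) {
      throw new Error('A Participant does not subscribe to its own Track');
    }
    this.subscriptions.get(trackName).add(identity);
  }

  // Identities currently receiving the media of a Track, sorted by name.
  subscribersOf(trackName) {
    return [...(this.subscriptions.get(trackName) || [])].sort();
  }
}

const room = new ToyRoom();
room.publish('alice', 'alice-camera');
room.subscribe('bob', 'alice-camera');
room.subscribe('carol', 'alice-camera');
console.log(room.subscribersOf('alice-camera'));
```

In this sketch the Room only keeps track of who publishes and who subscribes; how the media actually travels depends on the Room type.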

Twilio Programmable Video exposes four types of Rooms with different capabilities: WebRTC Go Rooms, Peer-to-Peer (P2P) Rooms, Small Group Rooms, and Group Rooms.

Video WebRTC Go Rooms

Go Rooms can be used for one-on-one video calls. Participant minutes are free, and 25 GB of TURN server usage per month is included. Go Rooms use a peer-to-peer topology and are similar to P2P Rooms, but a Go Room can hold at most 2 Participants. An account can have at most 100 concurrent Participants, for example, 50 Rooms with 2 Participants each.

Video P2P Rooms

In a P2P Room Participants exchange media directly so that:

  • Media is encrypted end-to-end (E2E) using WebRTC security protocols.
  • Twilio does not mediate in the media exchange, which takes place through direct communication among Participants. The only exception is when media exchange requires TURN. In that case, a TURN server will blindly relay the encrypted media bits to guarantee connectivity. The TURN server cannot decrypt or manipulate the media.
  • As Twilio does not intercept the media in P2P Rooms, it is not possible to record or to transcode the media or to make it interoperate with other RTC services.
  • Despite not being in the media path, Twilio manages the signaling path making it possible for Participants to discover each other and to negotiate the communications in agreement with the application and SDK requirements. Hence, signaling connectivity to Twilio’s cloud is still necessary.

The following picture illustrates the architecture of a P2P Room.

The Architecture of a P2P Room

As seen above, in a P2P Room, clients need to send their media streams once per subscriber. As a result, upstream bandwidth (and typically battery consumption) scales as n-1, where n is the number of Participants. Because of this, P2P Rooms do not scale well with n.

Video Group Rooms

In a Group Room, Participants exchange media through Twilio:

  • Participants publish media to a Twilio Selective Forwarding Unit (SFU). An SFU is a Media Server that decrypts, processes, and re-encrypts the media, and routes the media tracks to the correct destinations.
  • As a result, media is not E2E encrypted: the SFU keeps media unencrypted in memory in order to process it.
  • As Twilio acts as media middleware, Group Rooms can provide services such as recordings and public switched telephone network (PSTN) interoperability.

The following picture illustrates the architecture of a Group Room.

The Architecture of a Group Room

As shown above, in a Group Room clients only need to publish their media tracks once to the SFU, which clones and routes the media to the correct subscribers. Because of this, upstream bandwidth and battery consumption are independent of the number of Participants.
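The difference in upstream scaling between the two topologies can be made concrete with a little arithmetic. The sketch below is illustrative only: `upstreamCopies` is a hypothetical helper, and the 1000 kbps per-track figure is an assumed number, not a Twilio value.

```javascript
// Copies of its own media that one client uploads, following the scaling
// described above. n is the number of Participants, and every Participant
// is assumed to subscribe to all the others.
function upstreamCopies(topology, n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError('n must be a positive integer');
  }
  switch (topology) {
    case 'p2p':
      return n - 1; // one copy per remote subscriber
    case 'group':
      return 1;     // a single copy sent to the SFU, whatever n is
    default:
      throw new Error(`Unknown topology: ${topology}`);
  }
}

// Assumed figure for illustration: 1000 kbps per published video track.
const TRACK_KBPS = 1000;

for (const n of [2, 4, 10]) {
  const p2p = upstreamCopies('p2p', n) * TRACK_KBPS;
  const group = upstreamCopies('group', n) * TRACK_KBPS;
  console.log(`n=${n}: P2P upstream ${p2p} kbps, Group upstream ${group} kbps`);
}
```

Under these assumptions the P2P upstream grows linearly with the number of Participants while the Group Room upstream stays flat.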

Small Vs Regular Group Rooms

In a Group Room, the computing resources required by an SFU scale with the square of the number of Participants (P). Hence, a large Room on a busy media server might steal resources from other Rooms and degrade their quality. To avoid this, when creating an SFU, Twilio reserves the required resources for it to scale. As a result, computing costs depend on the maximum number of participants a Group Room might host.

However, there are many use cases such as a video contact center, e-health, and one-to-one communication that do not require over four Participants. With this in mind, Twilio offers two types of Group Rooms to accommodate your use case:

  • Regular Group Rooms (aka Group Rooms): scale up to 50 participants. Twilio charges for Regular Group Rooms at a standard price.
  • Small Group Rooms: scale up to four participants. For these Rooms, Twilio charges a reduced price.

Comparing Room types

The following table illustrates the main properties of the different Twilio Rooms:

Property                        Go Room  P2P Room  Small Group Room  (Regular) Group Room
E2E encryption                  Yes      Yes       No                No
Upstream BW scales with [1]     n-1      n-1       Constant          Constant
Downstream BW scales with [1]   n-1      n-1       n-1               n-1
Screensharing supported         Yes      Yes       Yes               Yes
Audio/Video/Data Tracks         Yes      Yes       Yes               Yes
Max Participants                2        3 [2]     4                 50
Rooms REST API                  Yes      Yes       Yes               Yes
Ad-hoc Rooms                    Yes      Yes       Yes               Yes
Participants API                Yes      Yes       Yes               Yes
Published Track API             Yes      Yes       Yes               Yes
Codec Preferences               Yes      Yes       Yes               Yes
VP8 Simulcast                   No       No        Yes               Yes
Dominant Speaker Detection      No       No        Yes               Yes
Network Quality API             No       No        Yes               Yes
Track Subscription API          No       No        Yes               Yes
Recordings                      No       No        Yes               Yes
PSTN Interoperability           No       No        Yes               Yes
Track Priority API              No       No        Yes               Yes
Network Bandwidth Profile API   No       No        Yes               Yes

[1] n denotes the number of Subscribers which, by default, is the same as the number of Participants.
[2] Can support up to 10 audio-only Participants, but a maximum of 3 Participants is recommended when video is published.

Creating Rooms: REST Vs Ad-hoc

There are two alternatives for creating Rooms: the Rooms REST API and Ad-hoc Rooms.

The Rooms REST API

Developers can create Rooms by POSTing an HTTP message to Twilio. The Rooms REST API documentation provides reference information as well as examples of how this can be done for all our Room types.
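As a rough illustration of what such a POST could look like, the sketch below only builds the request and does not send it. The endpoint URL and the `UniqueName` and `Type` parameter names are assumptions to be checked against the Rooms REST API documentation; `buildCreateRoomRequest` is a hypothetical helper, and the credentials are placeholders.

```javascript
// Builds (but does not send) an HTTP request that would create a Room.
// Assumptions: the endpoint and the UniqueName/Type parameter names should be
// verified in the Rooms REST API reference before relying on this sketch.
function buildCreateRoomRequest({ accountSid, authToken, uniqueName, type }) {
  // The REST API takes form-encoded parameters.
  const body = new URLSearchParams({ UniqueName: uniqueName, Type: type });
  // HTTP Basic authentication with the account credentials.
  const credentials = Buffer.from(`${accountSid}:${authToken}`).toString('base64');
  return {
    url: 'https://video.twilio.com/v1/Rooms',
    method: 'POST',
    headers: {
      Authorization: `Basic ${credentials}`,
      'Content-Type': 'application/x-www-form-urlencoded',
    },
    body: body.toString(),
  };
}

const request = buildCreateRoomRequest({
  accountSid: 'ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX', // placeholder
  authToken: 'your_auth_token',                    // placeholder
  uniqueName: 'myFancyRoomName',
  type: 'group-small',
});
console.log(request.method, request.url, request.body);
```

With real credentials, the request could then be sent with any HTTP client, for example `fetch(request.url, request)` on Node.js 18 or later.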

Rooms created using the REST API comply with the following:

  • First join timeout: the first Participant must join within 5 minutes after Room creation. Otherwise the Room is destroyed.
  • Last leave timeout: the Room is destroyed 5 minutes after the last Participant leaves.
  • Max Participant duration: a Participant can be connected to the Room up to 4 hours. After that time the Participant is disconnected.
  • Max Room duration: a Room may exist up to 24 hours from creation time. After that time the Room is destroyed and all Participants get disconnected.

Ad-hoc Rooms

Rooms can also be created just-in-time when the first Participant connects. When a Room is created that way, we say it is an ad-hoc Room. In order to use ad-hoc Rooms, developers must enable "CLIENT-SIDE ROOM CREATION" in the Twilio Console Room Settings following these simple steps:

Programmable Video Console Room Settings

  • Set the STATUS CALLBACK URL to the URL where the status callbacks should be received (can be left empty).
  • Set the ROOM TYPE to the type of Room to be created: Go (for WebRTC Go Rooms), Peer-to-peer (for P2P Rooms), Group (for Group Rooms), or Group-Small (for Small Group Rooms).
  • Press the Save button.

Once that’s done, a Room of the specified type will be created as soon as a Participant SDK connects. For example, the following code snippet illustrates how to do this in JavaScript:

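const { connect } = require('twilio-video');

// Connecting with a Room name creates the ad-hoc Room if it does not exist yet.
connect('$TOKEN', { name: 'myFancyRoomName' }).then(room => {
  console.log(`Successfully joined a Room: ${room}`);
  // Fired each time another Participant joins the same Room.
  room.on('participantConnected', participant => {
    console.log(`A remote Participant connected: ${participant}`);
  });
}, error => {
  console.error(`Unable to connect to Room: ${error.message}`);
});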

Notice that a Room name must be specified. Names of active Rooms must be unique. Hence, subsequent Participants connecting with that name will just join that Room instead of creating a new one.

Ad-hoc Rooms comply with the following:

  • First join timeout: there isn’t any, as the Room is created only when the first Participant connects.
  • Last leave timeout: the Room is destroyed immediately after the last Participant leaves, with no waiting time.
  • Max Participant duration: a Participant can be connected to the Room up to 4 hours. After that time the Participant is disconnected.
  • Max Room duration: a Room may exist up to 24 hours from creation time. After that time the Room is destroyed and all Participants get disconnected.
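The create-on-first-connect and destroy-on-last-leave behavior described above can be modeled with a toy registry. This is a sketch of the semantics only, not of how Twilio implements them: `activeRooms`, `connectAdHoc`, and `disconnect` are hypothetical names.

```javascript
// Toy model of ad-hoc Room semantics. NOT the Twilio SDK.
const activeRooms = new Map(); // Room name -> Set of Participant identities

// Joins the Room with the given name, creating it if no active Room has that name.
function connectAdHoc(name, identity) {
  const created = !activeRooms.has(name);
  if (created) {
    activeRooms.set(name, new Set());
  }
  activeRooms.get(name).add(identity);
  return { created, participants: activeRooms.get(name).size };
}

// Leaves the Room; an ad-hoc Room is destroyed as soon as it becomes empty.
function disconnect(name, identity) {
  const participants = activeRooms.get(name);
  if (!participants) return;
  participants.delete(identity);
  if (participants.size === 0) {
    activeRooms.delete(name);
  }
}

const first = connectAdHoc('myFancyRoomName', 'alice');  // creates the Room
const second = connectAdHoc('myFancyRoomName', 'bob');   // joins the same Room
console.log(first.created, second.created, second.participants);

disconnect('myFancyRoomName', 'alice');
disconnect('myFancyRoomName', 'bob');                    // Room is destroyed here
console.log(activeRooms.has('myFancyRoomName'));
```

Because the name is free again once the Room is destroyed, a later connection with the same name would create a new Room in this model.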

Comparing REST Vs Ad-hoc Rooms

The following table illustrates the main differences between Ad-hoc Rooms and Rooms created using the REST API.

Property                  REST Rooms              Ad-hoc Rooms
Room creation method      POST request            SDK connect primitive
Room creation time        When POST is received   When first participant connects
First join timeout        5 minutes               N/A
Last leave timeout        5 minutes               0
Max Participant duration  4 hours                 4 hours
Max Room duration         24 hours                24 hours
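To see how these limits combine, the following sketch computes the latest time at which a Room would be destroyed, given what has happened so far. It is a simplified model built from the values in the table: `roomDestroyedBy` and `LIMITS` are hypothetical names, times are in milliseconds, and the model assumes nobody rejoins after the last Participant leaves.

```javascript
const MINUTE = 60 * 1000;
const HOUR = 60 * MINUTE;

// Timeouts from the comparison table (null means "not applicable").
const LIMITS = {
  rest: { firstJoinTimeout: 5 * MINUTE, lastLeaveTimeout: 5 * MINUTE },
  adhoc: { firstJoinTimeout: null, lastLeaveTimeout: 0 },
};
const MAX_ROOM_DURATION = 24 * HOUR;

// Latest time at which the Room is destroyed, assuming no further events.
// createdAt: creation time; firstJoinAt / lastLeaveAt: null if not happened yet.
function roomDestroyedBy(method, { createdAt, firstJoinAt = null, lastLeaveAt = null }) {
  const limits = LIMITS[method];
  if (!limits) {
    throw new Error(`Unknown creation method: ${method}`);
  }
  const hardLimit = createdAt + MAX_ROOM_DURATION;
  if (firstJoinAt === null) {
    // Nobody has joined yet: only REST Rooms have a first join timeout.
    return limits.firstJoinTimeout === null
      ? hardLimit
      : Math.min(hardLimit, createdAt + limits.firstJoinTimeout);
  }
  if (lastLeaveAt !== null) {
    // The Room is empty: the last leave timeout applies.
    return Math.min(hardLimit, lastLeaveAt + limits.lastLeaveTimeout);
  }
  // Participants are still connected: only the max Room duration applies.
  return hardLimit;
}

console.log(roomDestroyedBy('rest', { createdAt: 0 }) / MINUTE, 'minutes');
console.log(roomDestroyedBy('rest', { createdAt: 0, firstJoinAt: MINUTE }) / HOUR, 'hours');
```

The 4-hour maximum Participant duration is left out of this model because it disconnects a Participant rather than destroying the Room.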

P2P or Group Rooms: Which Room Should I Use?

The following diagram may help you decide which Room type is the most appropriate for a given use case:

What Room type do you need?

Do I need E2E Encryption?

E2E (End-to-End) encryption means only the sender and receiver can read what was sent. P2P Rooms are E2E Encrypted while Group Rooms are not. Group Rooms decrypt and re-encrypt the media to provide their routing capabilities. Some applications, because of policy or compliance reasons, require E2E Encryption. In that case, answer YES. Otherwise, answer NO.

Do I need Recordings?

If you need Twilio to store the data exchanged, answer YES. Otherwise, answer NO.

Do I need PSTN Interop?

If you need the PSTN (i.e. phone calls) to be able to dial into your room, answer YES. Otherwise, answer NO.

More than 2 Participants?

If your Room requires high quality video and can have up to 2 Participants, answer NO. If your Room can tolerate low quality video and can have up to 4 Participants, answer NO. If your Room does not require video (i.e. it is an audio-only Room) and can have up to 10 Participants, answer NO. Otherwise, answer YES.

More than 4 Participants?

If your application requires more than 4 Participants in the same Room, answer YES. Otherwise, answer NO.
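The five questions above can also be expressed as a small function. The ordering of the questions is an assumption, reconstructed from the questions and the comparison table rather than copied from the diagram, and `chooseRoomType` is a hypothetical helper. The returned strings mirror the Room type names used in the Console settings.

```javascript
// Maps the answers to the five questions above to a Room type.
// Assumption: the branching order is a reconstruction, not the official diagram.
function chooseRoomType({ needsE2E, needsRecordings, needsPstn, moreThan2, moreThan4 }) {
  const needsGroupFeatures = needsRecordings || needsPstn;
  if (needsE2E) {
    if (needsGroupFeatures) {
      // Per the comparison table, no Room type offers E2E encryption
      // together with Recordings or PSTN interoperability.
      throw new Error('E2E encryption cannot be combined with Recordings or PSTN');
    }
    return 'peer-to-peer';
  }
  if (!needsGroupFeatures && !moreThan2) {
    return 'peer-to-peer';
  }
  return moreThan4 ? 'group' : 'group-small';
}

console.log(chooseRoomType({
  needsE2E: false,
  needsRecordings: true,
  needsPstn: false,
  moreThan2: true,
  moreThan4: false,
}));
```

For one-on-one calls that fit the Go Room limits, a WebRTC Go Room is an alternative to the 'peer-to-peer' result, since both use the same topology.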

Next steps

Want to get started with Rooms? The following links may help you:

Luis Lopez Alan Klein Craig Dennis Chris Barrow Donal Toomey Nahuel Sznajderhaus