TwiML™️ Voice: <Stream>

This TwiML verb is not currently available when using the Twilio Regions Ireland (IE1) or Australia (AU1); it is supported only in the default US1 region. A full list of products and features that are unsupported with Twilio Regions is available in the Twilio Regions documentation.

The <Stream> instruction allows you to receive raw audio streams from a live phone call over WebSockets in near real-time.

The most basic use of <Stream>:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
   <Start>
       <Stream url="wss://mystream.ngrok.io/audiostream" />
   </Start>
</Response>

This TwiML will instruct Twilio to fork the audio stream of the current call and send it in real-time over WebSocket to wss://mystream.ngrok.io/audiostream.

The <Start> verb starts the audio <Stream> asynchronously and immediately continues with the next TwiML instruction. If there is no instruction, the call will be disconnected. In order to avoid this, provide a TwiML instruction to continue the call.
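As a minimal sketch of the point above, the following Python snippet builds TwiML that starts a stream and then keeps the call alive with a further instruction. It uses only the standard library; the function name build_start_stream_twiml and the choice of a 60-second <Pause> as the follow-up instruction are illustrative assumptions, not part of the original sample.

```python
import xml.etree.ElementTree as ET


def build_start_stream_twiml(stream_url: str, pause_seconds: int = 60) -> str:
    """Return TwiML that forks call audio to stream_url and then pauses.

    The <Pause> is an illustrative choice of follow-up instruction; any verb
    that keeps the call going would serve the same purpose.
    """
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", {"url": stream_url})
    # Without a further instruction after <Start>, the call would be disconnected.
    ET.SubElement(response, "Pause", {"length": str(pause_seconds)})
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_start_stream_twiml("wss://mystream.ngrok.io/audiostream"))
```

Building the document with an XML library rather than string concatenation keeps attribute values correctly escaped.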

If you'd prefer a synchronous bi-directional stream, you should use the <Connect> verb.

There is a one-to-one mapping between a stream and a WebSocket connection, so at most one call is streamed over a single WebSocket connection. Metadata is provided so that you can handle multiple inbound connections and map each one to its unique stream identifier, the StreamSid.

If communication issues are encountered with your WebSocket server, they will be reported in the Twilio Debugger with additional information about the failure.

A maximum of 4 forked streams is allowed per call. Each track, inbound or outbound, is a forked stream.
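One way to reason about this limit before generating TwiML is to count forks per track setting. The sketch below reads "each track is a forked stream" literally, so both_tracks is assumed to consume two forks; the names FORKS_PER_TRACK_SETTING, count_forked_streams and fits_within_limit are hypothetical helpers, not part of any Twilio API.

```python
# Forks consumed by each value of the track attribute, under the assumption
# that both_tracks counts as two forked streams (one per track).
FORKS_PER_TRACK_SETTING = {
    "inbound_track": 1,
    "outbound_track": 1,
    "both_tracks": 2,
}
MAX_FORKED_STREAMS_PER_CALL = 4


def count_forked_streams(track_settings: list[str]) -> int:
    """Total forks used by the streams planned for one call."""
    return sum(FORKS_PER_TRACK_SETTING[setting] for setting in track_settings)


def fits_within_limit(track_settings: list[str]) -> bool:
    """True if the planned streams stay within the per-call maximum."""
    return count_forked_streams(track_settings) <= MAX_FORKED_STREAMS_PER_CALL


planned = ["both_tracks", "inbound_track"]
print(count_forked_streams(planned), fits_within_limit(planned))
```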

Attributes

<Stream> supports the following attributes:

Attribute Name | Allowed Values | Default Value
url | Relative or absolute URL | none
name | Optional. Unique name for the Stream | none
track | Optional. inbound_track, outbound_track, both_tracks | inbound_track
statusCallback | Optional. Relative or absolute URL | none
statusCallbackMethod | Optional. GET or POST | POST

url

The url attribute accepts a relative or absolute URL. On successful execution, a WebSocket connection to the URL will be established and audio will start streaming. wss is the only supported protocol.

The url does not support query string parameters. To pass custom key value pairs to the WebSocket, make use of Custom Parameters instead.
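The two constraints on url described above (wss only, no query string) can be checked before the TwiML is served. This is a sketch using the standard library's urllib.parse; the function name check_stream_url and the wording of the problem messages are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def check_stream_url(url: str) -> list[str]:
    """Return a list of problems found in a <Stream> url (empty if none)."""
    problems = []
    parts = urlsplit(url)
    # Relative URLs have no scheme, so only absolute URLs are checked here.
    if parts.scheme and parts.scheme != "wss":
        problems.append(f"unsupported protocol {parts.scheme!r}; only wss is supported")
    if parts.query:
        problems.append("query string parameters are not supported; use <Parameter> instead")
    return problems


print(check_stream_url("wss://mystream.ngrok.io/audiostream"))
print(check_stream_url("ws://mystream.ngrok.io/audiostream?user=jane"))
```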

name

Providing a name allows you to reference the stream directly, for instance to stop a Stream by name. This name must be unique per Call.

track

The track attribute allows you to optionally request to receive a specific track of a call. On any given active call, there are inbound and outbound tracks. Inbound represents the audio Twilio receives from the call and outbound represents the audio generated by Twilio to the call.

By default, Twilio always streams the inbound track of a call. When a Stream is started using the synchronous bi-directional <Connect> verb, this is your only option.

However, if you are using an asynchronous Stream by using the <Start> verb, you can request the audio that Twilio generates to the call by choosing outbound_track. To receive both tracks of a call, use both_tracks. If both_tracks is used, you will receive both inbound and outbound media events.

statusCallback

The statusCallback attribute takes an absolute or relative URL as value. Whenever a stream is started or stopped, Twilio will make a request to this URL with the following parameters:

Parameter | Description
AccountSid | The unique identifier of the Account responsible for this Stream
CallSid | The unique identifier of the Call
StreamSid | The unique identifier for this Stream
StreamName | If defined, this is the unique name of the Stream. Defaults to the StreamSid
StreamEvent | One of stream-started, stream-stopped, or stream-error (see StreamError for the message)
StreamError | If an error has occurred, this will contain a detailed error message
Timestamp | The time of the event in ISO 8601 format
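A status callback handler mostly needs to read these parameters and branch on StreamEvent. The sketch below assumes the request body arrives form-encoded (application/x-www-form-urlencoded); that encoding, the helper names, and the sample body values are assumptions for illustration rather than details stated on this page.

```python
from urllib.parse import parse_qs


def parse_stream_status(body: str) -> dict[str, str]:
    """Turn a form-encoded callback body into a flat dict of parameters."""
    return {name: values[0] for name, values in parse_qs(body).items()}


def describe_stream_status(fields: dict[str, str]) -> str:
    """Summarise a status callback, surfacing StreamError on failures."""
    label = fields.get("StreamName") or fields.get("StreamSid", "unknown")
    event = fields.get("StreamEvent")
    if event == "stream-error":
        return f"{label}: error: {fields.get('StreamError', 'no details')}"
    return f"{label}: {event}"


# Hypothetical sample body; the values are made up for the example.
sample = (
    "AccountSid=AC123&CallSid=CA123&StreamSid=MZ123"
    "&StreamName=my_first_stream&StreamEvent=stream-started"
)
print(describe_stream_status(parse_stream_status(sample)))
```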

statusCallbackMethod

The HTTP method to use when requesting the statusCallback URL. Default is POST.

Custom Parameters

It is possible to include additional key-value pairs that will be passed along with your stream. You can do this by using the nested <Parameter> TwiML noun.

<?xml version="1.0" encoding="UTF-8"?>
<Response>
   <Start>
       <Stream url="wss://mystream.ngrok.io/example">
           <Parameter name="FirstName" value="Jane" />
           <Parameter name="LastName" value="Doe" />
           <Parameter name="RemoteParty" value="Bob" />
       </Stream>
   </Start>
</Response>

These values will be sent along with the Start WebSocket message.

There is currently a limit on the amount of data that can be sent in custom parameters. Aim to keep the total number of characters in your parameters (names and values combined) under 500.
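The character budget can be checked before the TwiML is generated. This is a small sketch that counts name and value characters as described above; the helper names and the strict "under 500" comparison are assumptions about how to apply the guideline.

```python
MAX_PARAMETER_CHARS = 500  # guideline from the text above


def parameter_chars(parameters: dict[str, str]) -> int:
    """Total characters across all parameter names and values."""
    return sum(len(name) + len(value) for name, value in parameters.items())


def within_parameter_budget(parameters: dict[str, str]) -> bool:
    """True if the parameters stay under the suggested character budget."""
    return parameter_chars(parameters) < MAX_PARAMETER_CHARS


params = {"FirstName": "Jane", "LastName": "Doe", "RemoteParty": "Bob"}
print(parameter_chars(params), within_parameter_budget(params))
```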

Stopping a Stream

It is possible to stop an asynchronous stream by name. For instance, name the Stream my_first_stream when you start it:

<Start>
    <Stream name="my_first_stream" url="wss://mystream.ngrok.io/audiostream" />
</Start>

You can later use the unique name of my_first_stream to stop the stream.

<Stop>
   <Stream name="my_first_stream" />
</Stop>

WebSocket Messages - From Twilio

There are separate types of events that occur during the Stream's life cycle. These events are represented via WebSocket Messages: Connected, Start, Media, Stop, and the bi-directional only Mark.

Each message sent is a JSON string. You can determine which type of event is occurring by using the event property of every JSON object.
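Dispatching on the event property might look like the following sketch. It only parses the JSON and summarises each message; the function name describe_message and the summary strings are illustrative, and a real handler would act on the messages instead of describing them.

```python
import json


def describe_message(raw: str) -> str:
    """Parse one WebSocket text message and summarise it by event type."""
    message = json.loads(raw)
    event = message.get("event")
    if event == "connected":
        return f"connected: protocol {message['protocol']} {message['version']}"
    if event == "start":
        return f"start: stream {message['start']['streamSid']}"
    if event == "media":
        return f"media: {message['media']['track']} chunk {message['media']['chunk']}"
    if event == "stop":
        return f"stop: stream {message['streamSid']}"
    if event == "mark":
        return f"mark: {message['mark']['name']}"
    # Unknown events are reported rather than raising, so new types do not crash the handler.
    return f"unhandled event: {event!r}"


print(describe_message('{"event": "connected", "protocol": "Call", "version": "1.0.0"}'))
```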

The following messages are sent from Twilio to your application:

Connected Message

The first message sent once a WebSocket connection is established is the Connected event. This message describes the protocol to expect in the following messages.

Parameter | Description
event | The value of connected
protocol | Defines the protocol for the WebSocket connection's lifetime, e.g. "Call"
version | Semantic version of the protocol

Example Connected Message

{ 
 "event": "connected",  
 "protocol": "Call", 
 "version": "1.0.0"
}

Start Message

This message contains important metadata about the stream and is sent immediately after the Connected message. It is only sent once at the start of the Stream.

Parameter | Description
event | The value of start
sequenceNumber | Number used to keep track of message sending order. First message starts with "1" and then is incremented.
start | An object containing Stream metadata
start.streamSid | The unique identifier of the Stream
start.accountSid | The Account identifier that created the Stream
start.callSid | The Call identifier from where the Stream was started
start.tracks | An array of values that indicates what media flows to expect in subsequent messages. Values include inbound, outbound.
start.customParameters | An object that represents the Custom Parameters that were set when defining the Stream
start.mediaFormat | An object containing the format of the payload in the Media Messages
start.mediaFormat.encoding | The encoding of the data in the upcoming payload. Value will always be audio/x-mulaw.
start.mediaFormat.sampleRate | The sample rate in Hertz of the upcoming audio data. Value is always 8000.
start.mediaFormat.channels | The number of channels in the input audio data. Value will always be 1.
streamSid | The unique identifier of the Stream

Example Start Message

{
 "event": "start",
 "sequenceNumber": "2",
 "start": {
   "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
   "accountSid": "AC123",
   "callSid": "CA123",
   "tracks": [
     "inbound",
     "outbound"
   ],
   "customParameters": {
     "FirstName": "Jane",
     "LastName": "Doe",
     "RemoteParty": "Bob"
   },
   "mediaFormat": {
     "encoding": "audio/x-mulaw",
     "sampleRate": 8000,
     "channels": 1
   }
 },
 "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"
}

Media Message

This message type encapsulates the raw audio data.

Parameter | Description
event | The value of media
sequenceNumber | Number used to keep track of message sending order. First message starts with "1" and then is incremented for each message.
media | An object containing media metadata and payload
media.track | One of inbound or outbound
media.chunk | The chunk for the message. The first message will begin with "1" and increment with each subsequent message.
media.timestamp | Presentation timestamp in milliseconds from the start of the stream
media.payload | Raw audio encoded in base64
streamSid | The unique identifier of the Stream

Example Media Messages

Outbound Track
{ 
 "event": "media",
 "sequenceNumber": "3", 
 "media": { 
   "track": "outbound", 
   "chunk": "1", 
   "timestamp": "5",
   "payload": "no+JhoaJjpzSHxAKBgYJDhtEopGKh4aIjZm7JhILBwYIDRg1qZSLh4aIjJevLBUMBwYHDBUsr5eMiIaHi5SpNRgNCAYHCxImu5mNiIaHipGiRBsOCQYGChAf0pyOiYaGiY+e/x4PCQYGCQ4cUp+QioaGiY6bxCIRCgcGCA0ZO6aSi4eGiI2YtSkUCwcGCAwXL6yVjIeGh4yVrC8XDAgGBwsUKbWYjYiGh4uSpjsZDQgGBwoRIsSbjomGhoqQn1IcDgkGBgkPHv+ej4mGhomOnNIfEAoGBgkOG0SikYqHhoiNmbsmEgsHBggNGDWplIuHhoiMl68sFQwHBgcMFSyvl4yIhoeLlKk1GA0IBgcLEia7mY2IhoeKkaJEGw4JBgYKEB/SnI6JhoaJj57/Hg8JBgYJDhxSn5CKhoaJjpvEIhEKBwYIDRk7ppKLh4aIjZi1KRQLBwYIDBcvrJWMh4aHjJWsLxcMCAYHCxQptZiNiIaHi5KmOxkNCAYHChEixJuOiYaGipCfUhwOCQYGCQ8e/56PiYaGiY6c0h8QCgYGCQ4bRKKRioeGiI2ZuyYSCwcGCA0YNamUi4eGiIyXrywVDAcGBwwVLK+XjIiGh4uUqTUYDQgGBwsSJruZjYiGh4qRokQbDgkGBgoQH9KcjomGhomPnv8eDwkGBgkOHFKfkIqGhomOm8QiEQoHBggNGTumkouHhoiNmLUpFAsHBggMFy+slYyHhoeMlawvFwwIBgcLFCm1mI2IhoeLkqY7GQ0IBgcKESLEm46JhoaKkJ9SHA4JBgYJDx7/no+JhoaJjpzSHxAKBgYJDhtEopGKh4aIjZm7JhILBwYIDRg1qZSLh4aIjJevLBUMBwYHDBUsr5eMiIaHi5SpNRgNCAYHCxImu5mNiIaHipGiRBsOCQYGChAf0pyOiYaGiY+e/x4PCQYGCQ4cUp+QioaGiY6bxCIRCgcGCA0ZO6aSi4eGiI2YtSkUCwcGCAwXL6yVjIeGh4yVrC8XDAgGBwsUKbWYjYiGh4uSpjsZDQgGBwoRIsSbjomGhoqQn1IcDgkGBgkPHv+ej4mGhomOnNIfEAoGBgkOG0SikYqHhoiNmbsmEgsHBggNGDWplIuHhoiMl68sFQwHBgcMFSyvl4yIhoeLlKk1GA0IBgcLEia7mY2IhoeKkaJEGw4JBgYKEB/SnI6JhoaJj57/Hg8JBgYJDhxSn5CKhoaJjpvEIhEKBwYIDRk7ppKLh4aIjZi1KRQLBwYIDBcvrJWMh4aHjJWsLxcMCAYHCxQptZiNiIaHi5KmOxkNCAYHChEixJuOiYaGipCfUhwOCQYGCQ8e/56PiYaGiY6c0h8QCgYGCQ4bRKKRioeGiA=="
 },
 "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"
}

Inbound Track

{ 
 "event": "media",
 "sequenceNumber": "4",
 "media": { 
   "track": "inbound", 
   "chunk": "2", 
   "timestamp": "5",
   "payload": "no+JhoaJjpzSHxAKBgYJDhtEopGKh4aIjZm7JhILBwYIDRg1qZSLh4aIjJevLBUMBwYHDBUsr5eMiIaHi5SpNRgNCAYHCxImu5mNiIaHipGiRBsOCQYGChAf0pyOiYaGiY+e/x4PCQYGCQ4cUp+QioaGiY6bxCIRCgcGCA0ZO6aSi4eGiI2YtSkUCwcGCAwXL6yVjIeGh4yVrC8XDAgGBwsUKbWYjYiGh4uSpjsZDQgGBwoRIsSbjomGhoqQn1IcDgkGBgkPHv+ej4mGhomOnNIfEAoGBgkOG0SikYqHhoiNmbsmEgsHBggNGDWplIuHhoiMl68sFQwHBgcMFSyvl4yIhoeLlKk1GA0IBgcLEia7mY2IhoeKkaJEGw4JBgYKEB/SnI6JhoaJj57/Hg8JBgYJDhxSn5CKhoaJjpvEIhEKBwYIDRk7ppKLh4aIjZi1KRQLBwYIDBcvrJWMh4aHjJWsLxcMCAYHCxQptZiNiIaHi5KmOxkNCAYHChEixJuOiYaGipCfUhwOCQYGCQ8e/56PiYaGiY6c0h8QCgYGCQ4bRKKRioeGiI2ZuyYSCwcGCA0YNamUi4eGiIyXrywVDAcGBwwVLK+XjIiGh4uUqTUYDQgGBwsSJruZjYiGh4qRokQbDgkGBgoQH9KcjomGhomPnv8eDwkGBgkOHFKfkIqGhomOm8QiEQoHBggNGTumkouHhoiNmLUpFAsHBggMFy+slYyHhoeMlawvFwwIBgcLFCm1mI2IhoeLkqY7GQ0IBgcKESLEm46JhoaKkJ9SHA4JBgYJDx7/no+JhoaJjpzSHxAKBgYJDhtEopGKh4aIjZm7JhILBwYIDRg1qZSLh4aIjJevLBUMBwYHDBUsr5eMiIaHi5SpNRgNCAYHCxImu5mNiIaHipGiRBsOCQYGChAf0pyOiYaGiY+e/x4PCQYGCQ4cUp+QioaGiY6bxCIRCgcGCA0ZO6aSi4eGiI2YtSkUCwcGCAwXL6yVjIeGh4yVrC8XDAgGBwsUKbWYjYiGh4uSpjsZDQgGBwoRIsSbjomGhoqQn1IcDgkGBgkPHv+ej4mGhomOnNIfEAoGBgkOG0SikYqHhoiNmbsmEgsHBggNGDWplIuHhoiMl68sFQwHBgcMFSyvl4yIhoeLlKk1GA0IBgcLEia7mY2IhoeKkaJEGw4JBgYKEB/SnI6JhoaJj57/Hg8JBgYJDhxSn5CKhoaJjpvEIhEKBwYIDRk7ppKLh4aIjZi1KRQLBwYIDBcvrJWMh4aHjJWsLxcMCAYHCxQptZiNiIaHi5KmOxkNCAYHChEixJuOiYaGipCfUhwOCQYGCQ8e/56PiYaGiY6c0h8QCgYGCQ4bRKKRioeGiA=="                        
 },
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0" 
}
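To work with the audio in a Media message, the payload is base64-decoded into mu-law bytes. The sketch below decodes a payload, estimates its duration from the 8000 Hz mono format announced in the Start message, and expands a single mu-law byte to 16-bit linear PCM using the standard G.711 expansion; the function names are illustrative.

```python
import base64
import json

SAMPLE_RATE_HZ = 8000  # from start.mediaFormat.sampleRate


def decode_media_payload(raw: str) -> bytes:
    """Return the raw mu-law bytes carried by one media message."""
    message = json.loads(raw)
    return base64.b64decode(message["media"]["payload"])


def duration_ms(audio: bytes) -> float:
    """Duration of mono mu-law audio, which uses one byte per sample."""
    return len(audio) * 1000 / SAMPLE_RATE_HZ


def mulaw_byte_to_pcm16(value: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear sample."""
    value = ~value & 0xFF  # mu-law bytes are stored bit-inverted
    sign = value & 0x80
    exponent = (value >> 4) & 0x07
    mantissa = value & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def mulaw_to_pcm16(audio: bytes) -> list[int]:
    """Expand a whole mu-law buffer to linear samples."""
    return [mulaw_byte_to_pcm16(byte) for byte in audio]
```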

Stop Message

A stop message will be sent when the Stream is stopped with <Stop> or when the Call has ended.

Parameter | Description
event | The value of stop
sequenceNumber | Number used to keep track of message sending order. First message starts with "1" and then is incremented for each message.
stop | An object containing Stream metadata
stop.accountSid | The Account identifier that created the Stream
stop.callSid | The Call identifier that started the Stream
streamSid | The unique identifier of the Stream

Example Stop Message

{ 
 "event": "stop",
 "sequenceNumber": "5",
 "stop": {
    "accountSid": "AC123",
    "callSid": "CA123"
  },
  "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0" 
}

Mark Message

The mark event is sent only during bi-directional streaming by using the <Connect> verb. It is used to track, or label, when media playback has completed.

Parameter | Description
event | The value of mark
sequenceNumber | Number used to keep track of message sending order. First message starts with "1" and then is incremented for each message.
streamSid | The unique identifier of the Stream
mark | An object containing the mark metadata
mark.name | The name you specified when sending the mark message to Twilio

Example Mark Message

{ 
 "event": "mark",
 "sequenceNumber": "4",
 "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
 "mark": {
   "name": "my label"
 }
}

WebSocket Messages - To Twilio

The events that you can send back to Twilio are Media, Mark, and Clear.

In order to send data back to Twilio, your stream must have been initialized using the <Connect> TwiML verb. This will give you a bi-directional Stream which allows you to pipe audio and control the flow.

Media Message

To send media back to Twilio, you must provide a similarly formatted media message. The payload must be audio/x-mulaw audio with a sample rate of 8000, encoded in base64. The audio can be of any size.

The media messages will be buffered and played in the order received. If you'd like to interrupt the buffered audio, see the clear event message.

The media payload should not contain audio file type header bytes. Providing header bytes will cause the media to be streamed incorrectly.

Parameter | Description
event | The value of media
streamSid | The SID of the Stream that should play back the audio
media | An object containing media metadata and payload
media.payload | Raw mulaw/8000 audio encoded in base64

{
  "event": "media",
  "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
  "media": {
    "payload": "a3242sadfasfa423242... (a base64 encoded string of 8000/mulaw)"
  }
}
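Building an outbound media message is mostly a matter of base64-encoding header-free mu-law bytes. The sketch below also rejects buffers that begin with the WAV ("RIFF") or AU (".snd") magic bytes, as a rough guard against sending file headers; that check is a heuristic added for illustration, and build_media_message is a hypothetical helper name.

```python
import base64
import json

# Magic bytes that usually indicate an audio file header rather than raw samples.
FILE_HEADER_PREFIXES = (b"RIFF", b".snd")


def build_media_message(stream_sid: str, mulaw_audio: bytes) -> str:
    """Return the JSON text of a media message carrying raw mu-law audio."""
    if mulaw_audio[:4] in FILE_HEADER_PREFIXES:
        raise ValueError("payload appears to start with an audio file header")
    return json.dumps(
        {
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": base64.b64encode(mulaw_audio).decode("ascii")},
        }
    )


# 20 ms of mu-law silence (0xFF) at 8000 Hz is 160 bytes.
print(build_media_message("MZ18ad3ab5a668481ce02b83e7395059f0", bytes([0xFF]) * 160)[:80])
```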

Mark Message

Send a mark event message after sending a media event message to be notified when the audio that you have sent has been completed. You'll receive a mark event with a matching name from Twilio when the audio ends (or if there is no audio buffered).

You will also receive an incoming mark event message if the buffer was cleared using the clear event message.

Parameter | Description
event | The value of mark
streamSid | The SID of the Stream that should receive the mark
mark | An object containing mark metadata and payload
mark.name | A name of your choosing that helps you recognize the mark event when Twilio sends it back

Example Mark Message

{ 
 "event": "mark",
 "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
 "mark": {
   "name": "my label"
 }
}
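Since Twilio echoes the mark name back, an application can keep a set of names it is still waiting on. The following is a sketch of that bookkeeping; MarkTracker and its method names are hypothetical, and the class only builds and parses messages without sending anything over a socket.

```python
import json


class MarkTracker:
    """Tracks mark names sent to Twilio until the matching mark returns."""

    def __init__(self, stream_sid: str) -> None:
        self.stream_sid = stream_sid
        self.pending: set[str] = set()

    def build_mark(self, name: str) -> str:
        """Return the JSON text of a mark message and remember its name."""
        self.pending.add(name)
        return json.dumps(
            {"event": "mark", "streamSid": self.stream_sid, "mark": {"name": name}}
        )

    def on_message(self, raw: str) -> bool:
        """Return True if raw is a mark event that completes a pending mark."""
        message = json.loads(raw)
        if message.get("event") != "mark":
            return False
        name = message["mark"]["name"]
        if name in self.pending:
            self.pending.discard(name)
            return True
        return False
```

A typical use is to send a mark right after the last media message of a prompt, then wait for on_message to return True before treating the prompt as played.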

Clear Message

Send the clear event message if you would like to interrupt the audio that has been sent in previous media event messages. This will empty all buffered audio and cause any pending mark event messages to be sent back to you.

Parameter | Description
event | The value of clear
streamSid | The SID of the Stream whose buffered audio should be cleared

Example Clear Message

{
 "event": "clear",
 "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"
}
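The interaction between media, mark and clear can be easier to follow with a toy model. The class below simulates the behaviour described in this section: media plays in order, a mark is returned once the audio before it has finished (or immediately if nothing is buffered), and clear flushes the buffer and returns any pending marks. It is an illustration of the documented semantics only, not Twilio's implementation, and PlaybackBufferModel is a hypothetical name.

```python
from collections import deque


class PlaybackBufferModel:
    """Toy model of the buffered playback semantics described above."""

    def __init__(self) -> None:
        # Each item is ("media", payload) or ("mark", name), in arrival order.
        self._queue: deque[tuple[str, str]] = deque()

    def send_media(self, payload: str) -> None:
        self._queue.append(("media", payload))

    def send_mark(self, name: str) -> list[str]:
        """Queue a mark; it is returned at once if no audio is buffered."""
        self._queue.append(("mark", name))
        return self._release_leading_marks()

    def play_next(self) -> list[str]:
        """Finish one buffered media item and return any marks now reached."""
        if self._queue and self._queue[0][0] == "media":
            self._queue.popleft()
        return self._release_leading_marks()

    def clear(self) -> list[str]:
        """Drop all buffered audio and return the names of pending marks."""
        marks = [value for kind, value in self._queue if kind == "mark"]
        self._queue.clear()
        return marks

    def _release_leading_marks(self) -> list[str]:
        released = []
        while self._queue and self._queue[0][0] == "mark":
            released.append(self._queue.popleft()[1])
        return released
```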

Examples

Start a new asynchronous MediaStream named "Example Audio Stream". The stream will begin sending messages to the url specified over a WebSocket connection.

You can send additional custom information by using the <Parameter> TwiML noun. These values will be delivered in the Start WebSocket message in the customParameters section.

Bi-directional Media Streams

If you want to send media back to the call, the Stream must be bi-directional. To do this, initialize the stream using the <Connect> TwiML verb as opposed to the <Start> verb. The <Stream> noun's url attribute must be set to a secure WebSocket server (wss).
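A bi-directional stream differs from the asynchronous case only in the wrapping verb. The sketch below generates such TwiML with the Python standard library and refuses URLs that are not wss; the function name build_connect_stream_twiml and the example URL path are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def build_connect_stream_twiml(stream_url: str) -> str:
    """Return TwiML that connects the call to a bi-directional stream."""
    if not stream_url.startswith("wss://"):
        raise ValueError("bi-directional streams require a wss:// url")
    response = ET.Element("Response")
    # <Connect> is used instead of <Start> so audio can flow in both directions.
    connect = ET.SubElement(response, "Connect")
    ET.SubElement(connect, "Stream", {"url": stream_url})
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_connect_stream_twiml("wss://mystream.ngrok.io/audiostream"))
```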

The <Connect> verb sets up a synchronous bi-directional stream.

Media Servers

Media Streams must communicate with Twilio's cloud in order to function. Listed below are the IP address ranges used to communicate with Twilio's cloud. If necessary, use this information to configure your firewall to enable communication with Twilio.

These IP address ranges will change on November 16, 2023. See Twilio's notice about this change to learn what you need to do.

Edge Location | Server | IP Address Range
ashburn | US East Coast (Virginia) | 34.203.254.0/24, 3.235.111.128/25

Traffic is secure WebSocket over TCP.
