TwiML™️ Voice: <Stream>
This TwiML verb is not currently available when using the Twilio Regions Ireland (IE1) or Australia (AU1). It is currently supported only in the default US1 region. A full list of unsupported products and features with Twilio Regions is documented here.
The <Stream> instruction allows you to receive raw audio streams from a live phone call over WebSockets in near real-time.
The most basic use of <Stream>:
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Start>
<Stream url="wss://mystream.ngrok.io/audiostream" />
</Start>
</Response>
This TwiML will instruct Twilio to fork the audio stream of the current call and send it in real-time over WebSocket to wss://mystream.ngrok.io/audiostream.
The <Start> verb starts the audio <Stream> asynchronously and immediately continues with the next TwiML instruction. If there is no next instruction, the call will be disconnected. To avoid this, provide a TwiML instruction that continues the call.
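As a minimal sketch of the point above, the following Python builds TwiML in which <Start> is followed by further instructions so the call stays up. It uses only the standard library rather than a Twilio helper library; the function name, the spoken text, and the choice of <Say> plus a 30-second <Pause> as the follow-up instructions are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def build_start_stream_twiml(stream_url: str, say_text: str) -> str:
    """Return TwiML that starts a Stream and then keeps the call going."""
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", {"url": stream_url})
    # Follow-up instructions (assumed here: <Say> then <Pause>) so the call
    # is not disconnected right after <Start> returns.
    say = ET.SubElement(response, "Say")
    say.text = say_text
    ET.SubElement(response, "Pause", {"length": "30"})
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


twiml = build_start_stream_twiml(
    "wss://mystream.ngrok.io/audiostream", "This call is being streamed."
)
print(twiml)
```

Any instruction that keeps the call alive would serve the same purpose; the two used here are only placeholders.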
If you'd prefer a synchronous bi-directional stream, you should use the <Connect> verb.
There is a one-to-one mapping of a stream to a WebSocket connection, so at most one call is streamed over a single WebSocket connection. Metadata will be provided so that you can handle multiple inbound connections and map each connection to its unique stream identifier, or StreamSid.
If communication issues are encountered with your WebSocket server, they will be reported in the Twilio Debugger with additional information about the failure.
A maximum of 4 forked streams is allowed per call. Each track, inbound or outbound, is a forked stream.
Attributes
<Stream> supports the following attributes:
Attribute Name | Allowed Values | Default Value |
---|---|---|
url | relative or absolute URL | none |
name | Optional. Unique name for the Stream | none |
track | Optional. inbound_track, outbound_track, both_tracks | inbound_track |
statusCallback | Optional. Relative or absolute URL | none |
statusCallbackMethod | Optional. GET or POST | POST |
url
The url attribute accepts a relative or absolute URL. On successful execution, a WebSocket connection to the URL will be established and audio will start streaming. wss is the only supported protocol.
The url does not support query string parameters. To pass custom key-value pairs to the WebSocket, use Custom Parameters instead.
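The two constraints above (wss only, no query string) can be checked before the TwiML is served. The helper below is a sketch using Python's standard urllib.parse; the function name and the wording of the returned problems are hypothetical.

```python
from urllib.parse import urlsplit


def check_stream_url(url: str) -> list[str]:
    """Return a list of problems found in a <Stream> url value."""
    problems = []
    parts = urlsplit(url)
    # A relative URL has no scheme, so only absolute URLs are checked for wss.
    if parts.scheme and parts.scheme != "wss":
        problems.append(f"unsupported scheme: {parts.scheme}")
    if parts.query:
        problems.append("query string is not supported; use <Parameter> instead")
    return problems


print(check_stream_url("wss://mystream.ngrok.io/audiostream"))
print(check_stream_url("ws://mystream.ngrok.io/audiostream?user=jane"))
```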
name
Providing a name will allow you to reference the stream directly, for instance to stop a Stream by name. This name must be unique per Call.
track
The track attribute allows you to optionally request a specific track of a call. On any given active call, there are inbound and outbound tracks. Inbound represents the audio Twilio receives from the call and outbound represents the audio generated by Twilio to the call.
By default, Twilio always streams the inbound track of a call. When a Stream is started using the synchronous bi-directional <Connect> verb, this is your only option.
However, if you are using an asynchronous Stream started with the <Start> verb, you can request the audio Twilio generates by choosing outbound_track. To receive both tracks of a call, use both_tracks. If both_tracks is used, you will receive both inbound and outbound media events.
statusCallback
The statusCallback attribute takes an absolute or relative URL as its value. Whenever a stream is started or stopped, Twilio will make a request to this URL with the following parameters:
Parameter | Description |
---|---|
AccountSid | The unique identifier of the Account responsible for this Stream. |
CallSid | The unique identifier of the Call |
StreamSid | The unique identifier for this Stream |
StreamName | If defined, this is the unique name of the Stream. Defaults to the StreamSid |
StreamEvent | One of stream-started , stream-stopped , or stream-error (see StreamError for the message) |
StreamError | If an error has occurred, this will contain a detailed error message. |
Timestamp | The time of the event in ISO 8601 format |
statusCallbackMethod
The HTTP method to use when requesting the statusCallback URL. Default is POST.
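The sketch below shows one way a status callback handler might read the parameters listed above. It assumes the POST body arrives form-encoded (application/x-www-form-urlencoded), which this page does not state; the function name and the sample body are made up for illustration.

```python
from urllib.parse import parse_qs


def parse_stream_status(body: str) -> dict:
    """Extract the Stream status fields from a form-encoded request body."""
    # parse_qs returns a list per key; keep the first value of each.
    fields = {key: values[0] for key, values in parse_qs(body).items()}
    event = fields.get("StreamEvent")
    return {
        "stream_sid": fields.get("StreamSid"),
        "call_sid": fields.get("CallSid"),
        "event": event,
        # StreamError is only meaningful for stream-error events.
        "error": fields.get("StreamError") if event == "stream-error" else None,
    }


sample = "AccountSid=AC123&CallSid=CA123&StreamSid=MZ123&StreamEvent=stream-started"
print(parse_stream_status(sample))
```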
Custom Parameters
It is possible to include additional key-value pairs that will be passed along with your stream. You can do this by using the nested <Parameter> TwiML noun.
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Start>
<Stream url="wss://mystream.ngrok.io/example">
<Parameter name="FirstName" value="Jane" />
<Parameter name="LastName" value="Doe" />
<Parameter name="RemoteParty" value="Bob" />
</Stream>
</Start>
</Response>
These values will be sent along with the Start WebSocket message.
There is currently a limit on the amount of data that can be sent in custom parameters. Aim to keep the total number of characters in your parameters (both name and value) under 500.
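A quick way to stay within this guideline is to add up the name and value lengths before generating the TwiML. The sketch below treats 500 as the budget taken from the paragraph above; the function and constant names are hypothetical.

```python
MAX_PARAMETER_CHARS = 500  # guideline from the text above, not a hard API constant


def custom_parameter_size(params: dict[str, str]) -> int:
    """Total characters across all parameter names and values."""
    return sum(len(name) + len(value) for name, value in params.items())


params = {"FirstName": "Jane", "LastName": "Doe", "RemoteParty": "Bob"}
size = custom_parameter_size(params)
print(size, size < MAX_PARAMETER_CHARS)
```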
Stopping a Stream
It is possible to stop an asynchronous stream by name. For instance, start by naming the Stream my_first_stream.
<Start>
<Stream name="my_first_stream" url="wss://mystream.ngrok.io/audiostream" />
</Start>
You can later use the unique name of my_first_stream to stop the stream.
<Stop>
<Stream name="my_first_stream" />
</Stop>
WebSocket Messages - From Twilio
Several types of events occur during the Stream's life cycle. These events are represented via WebSocket Messages: Connected, Start, Media, Stop, and the bi-directional only Mark.
Each message sent is a JSON string. You can determine which type of event is occurring by using the event property of every JSON object.
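A receiving application would typically parse each text frame as JSON and branch on the event property. The function below is a sketch of that dispatch; its name and the summary strings it returns are illustrative, and the WebSocket server that would call it is left out.

```python
import json


def describe_message(raw: str) -> str:
    """Return a short description of one WebSocket message from Twilio."""
    message = json.loads(raw)
    event = message.get("event")
    if event == "connected":
        return f"connected (protocol {message['protocol']} {message['version']})"
    if event == "start":
        return f"start of stream {message['streamSid']}"
    if event == "media":
        return f"media on track {message['media']['track']}"
    if event == "stop":
        return f"stop of stream {message['streamSid']}"
    if event == "mark":
        return f"mark {message['mark']['name']}"
    # Unknown events are reported rather than raising, so new types do not crash the handler.
    return f"unhandled event: {event}"


print(describe_message('{"event": "connected", "protocol": "Call", "version": "1.0.0"}'))
```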
The following messages are sent from Twilio to your application:
Connected Message
The first message sent once a WebSocket connection is established is the Connected event. This message describes the protocol to expect in the following messages.
Parameter | Description |
---|---|
event | The value of connected |
protocol | Defines the protocol for the WebSocket connection's lifetime, e.g. "Call" |
version | Semantic version of the protocol. |
Example Connected Message
{
"event": "connected",
"protocol": "Call",
"version": "1.0.0"
}
Start Message
This message contains important metadata about the stream and is sent immediately after the Connected message. It is only sent once at the start of the Stream.
Parameter | Description |
---|---|
event | The value of start |
sequenceNumber | Number used to keep track of message sending order. First message starts with "1" and then is incremented. |
start | An object containing Stream metadata |
start.streamSid | The unique identifier of the Stream |
start.accountSid | The Account identifier that created the Stream |
start.callSid | The Call identifier from where the Stream was started. |
start.tracks | An array of values that indicates what media flows to expect in subsequent messages. Values include inbound , outbound . |
start.customParameters | An object that represents the Custom Parameters that were set when defining the Stream |
start.mediaFormat | An object containing the format of the payload in the Media Messages. |
start.mediaFormat.encoding | The encoding of the data in the upcoming payload. Value will always be audio/x-mulaw . |
start.mediaFormat.sampleRate | The Sample Rate in Hertz of the upcoming audio data. Value is always 8000 |
start.mediaFormat.channels | The number of channels in the input audio data. Value will always be 1 |
streamSid | The unique identifier of the Stream |
Example Start Message
{
"event": "start",
"sequenceNumber": "2",
"start": {
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
"accountSid": "AC123",
"callSid": "CA123",
"tracks": [
"inbound",
"outbound"
],
"customParameters": {
"FirstName": "Jane",
"LastName": "Doe",
"RemoteParty": "Bob"
},
"mediaFormat": {
"encoding": "audio/x-mulaw",
"sampleRate": 8000,
"channels": 1
}
},
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"
}
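Because each WebSocket connection carries exactly one stream, a server handling many connections can record the Start metadata under its StreamSid. The sketch below shows one possible shape for that bookkeeping; the StreamInfo class, the register_stream function, and the plain dict used as a registry are all assumptions.

```python
import json
from dataclasses import dataclass, field


@dataclass
class StreamInfo:
    stream_sid: str
    call_sid: str
    tracks: list
    custom_parameters: dict = field(default_factory=dict)


def register_stream(registry: dict, raw: str) -> StreamInfo:
    """Parse a Start message and store its metadata under the StreamSid."""
    message = json.loads(raw)
    if message.get("event") != "start":
        raise ValueError("expected a start message")
    start = message["start"]
    info = StreamInfo(
        stream_sid=start["streamSid"],
        call_sid=start["callSid"],
        tracks=list(start["tracks"]),
        custom_parameters=dict(start.get("customParameters", {})),
    )
    registry[info.stream_sid] = info
    return info


streams = {}
start_message = json.dumps({
    "event": "start",
    "sequenceNumber": "2",
    "start": {
        "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
        "accountSid": "AC123",
        "callSid": "CA123",
        "tracks": ["inbound", "outbound"],
        "customParameters": {"FirstName": "Jane"},
    },
    "streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
})
print(register_stream(streams, start_message))
```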
Media Message
This message type encapsulates the raw audio data.
Parameter | Description |
---|---|
event | The value of media |
sequenceNumber | Number used to keep track of message sending order. First message starts with "1" and then is incremented for each message. |
media | An object containing media metadata and payload |
media.track | One of inbound or outbound |
media.chunk | The chunk for the message. The first message will begin with "1" and increment with each subsequent message. |
media.timestamp | Presentation Timestamp in Milliseconds from the start of the stream. |
media.payload | Raw audio encoded in base64 |
streamSid | The unique identifier of the Stream |
Example Media Messages
Outbound Track
{
"event": "media",
"sequenceNumber": "3",
"media": {
"track": "outbound",
"chunk": "1",
"timestamp": "5",
"payload": "no+JhoaJjpzSHxAKBgYJDhtEopGKh4aIjZm7JhILBwYIDRg1qZSLh4aIjJevLBUMBwYHDBUsr5eMiIaHi5SpNRgNCAYHCxImu5mNiIaHipGiRBsOCQYGChAf0pyOiYaGiY+e/x4PCQYGCQ4cUp+QioaGiY6bxCIRCgcGCA0ZO6aSi4eGiI2YtSkUCwcGCAwXL6yVjIeGh4yVrC8XDAgGBwsUKbWYjYiGh4uSpjsZDQgGBwoRIsSbjomGhoqQn1IcDgkGBgkPHv+ej4mGhomOnNIfEAoGBgkOG0SikYqHhoiNmbsmEgsHBggNGDWplIuHhoiMl68sFQwHBgcMFSyvl4yIhoeLlKk1GA0IBgcLEia7mY2IhoeKkaJEGw4JBgYKEB/SnI6JhoaJj57/Hg8JBgYJDhxSn5CKhoaJjpvEIhEKBwYIDRk7ppKLh4aIjZi1KRQLBwYIDBcvrJWMh4aHjJWsLxcMCAYHCxQptZiNiIaHi5KmOxkNCAYHChEixJuOiYaGipCfUhwOCQYGCQ8e/56PiYaGiY6c0h8QCgYGCQ4bRKKRioeGiI2ZuyYSCwcGCA0YNamUi4eGiIyXrywVDAcGBwwVLK+XjIiGh4uUqTUYDQgGBwsSJruZjYiGh4qRokQbDgkGBgoQH9KcjomGhomPnv8eDwkGBgkOHFKfkIqGhomOm8QiEQoHBggNGTumkouHhoiNmLUpFAsHBggMFy+slYyHhoeMlawvFwwIBgcLFCm1mI2IhoeLkqY7GQ0IBgcKESLEm46JhoaKkJ9SHA4JBgYJDx7/no+JhoaJjpzSHxAKBgYJDhtEopGKh4aIjZm7JhILBwYIDRg1qZSLh4aIjJevLBUMBwYHDBUsr5eMiIaHi5SpNRgNCAYHCxImu5mNiIaHipGiRBsOCQYGChAf0pyOiYaGiY+e/x4PCQYGCQ4cUp+QioaGiY6bxCIRCgcGCA0ZO6aSi4eGiI2YtSkUCwcGCAwXL6yVjIeGh4yVrC8XDAgGBwsUKbWYjYiGh4uSpjsZDQgGBwoRIsSbjomGhoqQn1IcDgkGBgkPHv+ej4mGhomOnNIfEAoGBgkOG0SikYqHhoiNmbsmEgsHBggNGDWplIuHhoiMl68sFQwHBgcMFSyvl4yIhoeLlKk1GA0IBgcLEia7mY2IhoeKkaJEGw4JBgYKEB/SnI6JhoaJj57/Hg8JBgYJDhxSn5CKhoaJjpvEIhEKBwYIDRk7ppKLh4aIjZi1KRQLBwYIDBcvrJWMh4aHjJWsLxcMCAYHCxQptZiNiIaHi5KmOxkNCAYHChEixJuOiYaGipCfUhwOCQYGCQ8e/56PiYaGiY6c0h8QCgYGCQ4bRKKRioeGiA=="
},
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"
}
Inbound Track
{
"event": "media",
"sequenceNumber": "4",
"media": {
"track": "inbound",
"chunk": "2",
"timestamp": "5",
"payload": "no+JhoaJjpzSHxAKBgYJDhtEopGKh4aIjZm7JhILBwYIDRg1qZSLh4aIjJevLBUMBwYHDBUsr5eMiIaHi5SpNRgNCAYHCxImu5mNiIaHipGiRBsOCQYGChAf0pyOiYaGiY+e/x4PCQYGCQ4cUp+QioaGiY6bxCIRCgcGCA0ZO6aSi4eGiI2YtSkUCwcGCAwXL6yVjIeGh4yVrC8XDAgGBwsUKbWYjYiGh4uSpjsZDQgGBwoRIsSbjomGhoqQn1IcDgkGBgkPHv+ej4mGhomOnNIfEAoGBgkOG0SikYqHhoiNmbsmEgsHBggNGDWplIuHhoiMl68sFQwHBgcMFSyvl4yIhoeLlKk1GA0IBgcLEia7mY2IhoeKkaJEGw4JBgYKEB/SnI6JhoaJj57/Hg8JBgYJDhxSn5CKhoaJjpvEIhEKBwYIDRk7ppKLh4aIjZi1KRQLBwYIDBcvrJWMh4aHjJWsLxcMCAYHCxQptZiNiIaHi5KmOxkNCAYHChEixJuOiYaGipCfUhwOCQYGCQ8e/56PiYaGiY6c0h8QCgYGCQ4bRKKRioeGiI2ZuyYSCwcGCA0YNamUi4eGiIyXrywVDAcGBwwVLK+XjIiGh4uUqTUYDQgGBwsSJruZjYiGh4qRokQbDgkGBgoQH9KcjomGhomPnv8eDwkGBgkOHFKfkIqGhomOm8QiEQoHBggNGTumkouHhoiNmLUpFAsHBggMFy+slYyHhoeMlawvFwwIBgcLFCm1mI2IhoeLkqY7GQ0IBgcKESLEm46JhoaKkJ9SHA4JBgYJDx7/no+JhoaJjpzSHxAKBgYJDhtEopGKh4aIjZm7JhILBwYIDRg1qZSLh4aIjJevLBUMBwYHDBUsr5eMiIaHi5SpNRgNCAYHCxImu5mNiIaHipGiRBsOCQYGChAf0pyOiYaGiY+e/x4PCQYGCQ4cUp+QioaGiY6bxCIRCgcGCA0ZO6aSi4eGiI2YtSkUCwcGCAwXL6yVjIeGh4yVrC8XDAgGBwsUKbWYjYiGh4uSpjsZDQgGBwoRIsSbjomGhoqQn1IcDgkGBgkPHv+ej4mGhomOnNIfEAoGBgkOG0SikYqHhoiNmbsmEgsHBggNGDWplIuHhoiMl68sFQwHBgcMFSyvl4yIhoeLlKk1GA0IBgcLEia7mY2IhoeKkaJEGw4JBgYKEB/SnI6JhoaJj57/Hg8JBgYJDhxSn5CKhoaJjpvEIhEKBwYIDRk7ppKLh4aIjZi1KRQLBwYIDBcvrJWMh4aHjJWsLxcMCAYHCxQptZiNiIaHi5KmOxkNCAYHChEixJuOiYaGipCfUhwOCQYGCQ8e/56PiYaGiY6c0h8QCgYGCQ4bRKKRioeGiA=="
},
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"
}
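To work with the audio, the payload has to be base64-decoded and, for most processing, expanded from 8-bit mu-law to 16-bit linear PCM. The sketch below does this in pure Python using the standard G.711 mu-law expansion; the function names are hypothetical, and a production service would more likely use an optimized audio library.

```python
import base64
import json


def mulaw_byte_to_pcm16(value: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear sample."""
    value = ~value & 0xFF  # mu-law bytes are stored bit-inverted
    sign = value & 0x80
    exponent = (value >> 4) & 0x07
    mantissa = value & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def decode_media(raw: str) -> tuple[str, list[int]]:
    """Return (track, PCM samples) for one Media message."""
    media = json.loads(raw)["media"]
    audio = base64.b64decode(media["payload"])
    return media["track"], [mulaw_byte_to_pcm16(b) for b in audio]


# 160 bytes of mu-law silence: 20 ms at 8000 samples per second, one byte per sample.
silence = base64.b64encode(bytes([0xFF]) * 160).decode("ascii")
track, samples = decode_media(json.dumps(
    {"event": "media", "media": {"track": "inbound", "payload": silence}}
))
print(track, len(samples), len(samples) / 8000 * 1000, "ms")
```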
Stop Message
A stop message will be sent when the Stream is stopped with <Stop> or the Call has ended.
Parameter | Description |
---|---|
event | The value of stop |
sequenceNumber | Number used to keep track of message sending order. First message starts with "1" and then is incremented for each message. |
stop | An object containing Stream metadata |
stop.accountSid | The Account identifier that created the Stream |
stop.callSid | The Call identifier that started the Stream |
streamSid | The unique identifier of the Stream |
Example Stop Message
{
"event": "stop",
"sequenceNumber": "5",
"stop": {
"accountSid": "AC123",
"callSid": "CA123"
},
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"
}
Mark Message
The mark event is sent only during bi-directional streaming, which requires the <Connect> verb. It is used to track, or label, when media has completed.
Parameter | Description |
---|---|
event | The value of mark |
sequenceNumber | Number used to keep track of message sending order. First message starts with "1" and then is incremented for each message. |
mark | An object containing the mark metadata |
mark.name | The value specified when creating the mark message to Twilio |
Example Mark Message
{
"event": "mark",
"sequenceNumber": "4",
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
"mark": {
"name": "my label"
}
}
WebSocket Messages - To Twilio
The events that you can send back to Twilio are Media, Mark, and Clear.
In order to send data back to Twilio, your stream must have been initialized using the <Connect> TwiML verb. This will give you a bi-directional Stream which allows you to pipe audio and control the flow.
Media Message
To send media back to Twilio, you must provide a similarly formatted media message. The payload must be audio/x-mulaw encoded with a sample rate of 8000 and then base64 encoded. The audio can be of any size.
The media messages will be buffered and played in the order received. If you'd like to interrupt the buffered audio, see the clear event message.
The media payload should not contain audio file type header bytes. Providing header bytes will cause the media to be streamed incorrectly.
Parameter | Description |
---|---|
event | The value of media |
streamSid | The SID of the Stream that should play back the audio |
media | An object containing media metadata and payload |
media.payload | Raw mulaw/8000 audio encoded in base64 |
{
"event": "media",
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
"media": {
"payload": "a3242sadfasfa423242... (a base64 encoded string of 8000/mulaw)"
}
}
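Building such a message amounts to base64-encoding the raw mu-law bytes and wrapping them in the JSON structure shown above. The helper below is a sketch; its name is hypothetical, and it assumes the caller already holds headerless mulaw/8000 audio bytes.

```python
import base64
import json


def build_media_message(stream_sid: str, mulaw_audio: bytes) -> str:
    """Wrap headerless mulaw/8000 audio bytes in a media message for Twilio."""
    payload = base64.b64encode(mulaw_audio).decode("ascii")
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": payload},
    })


# 20 ms of mu-law silence as stand-in audio.
outgoing = build_media_message("MZ18ad3ab5a668481ce02b83e7395059f0", bytes([0xFF]) * 160)
print(outgoing[:80])
```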
Mark Message
Send a mark event message after sending a media event message to be notified when the audio that you have sent has finished playing. You'll receive a mark event with a matching name from Twilio when the audio ends (or if there is no audio buffered).
You will also receive an incoming mark event message if the buffer was cleared using the clear event message.
Parameter | Description |
---|---|
event | The value of mark |
streamSid | The SID of the Stream that should receive the mark |
mark | An object containing mark metadata and payload |
mark.name | A name specific to your needs that will help you recognize the matching mark event when you receive it |
Example Mark Message
{
"event": "mark",
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0",
"mark": {
"name": "my label"
}
}
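One way to use marks is to remember the names you have sent and tick them off as the matching events come back. The class below is a sketch of that bookkeeping; MarkTracker and its method names are hypothetical, and sending or receiving over the WebSocket is left to the caller.

```python
import json


class MarkTracker:
    """Track mark names sent to Twilio until the matching event returns."""

    def __init__(self, stream_sid: str) -> None:
        self.stream_sid = stream_sid
        self.pending: list[str] = []

    def build_mark(self, name: str) -> str:
        """Record the name as pending and return the mark message to send."""
        self.pending.append(name)
        return json.dumps({
            "event": "mark",
            "streamSid": self.stream_sid,
            "mark": {"name": name},
        })

    def handle_incoming(self, raw: str):
        """Return the mark name if raw is a mark event, otherwise None."""
        message = json.loads(raw)
        if message.get("event") != "mark":
            return None
        name = message["mark"]["name"]
        if name in self.pending:
            self.pending.remove(name)
        return name


tracker = MarkTracker("MZ18ad3ab5a668481ce02b83e7395059f0")
sent = tracker.build_mark("my label")
print(sent, tracker.pending)
```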
Clear Message
Send the clear event message if you would like to interrupt the audio that has been sent in various media event messages. This will empty all buffered audio and cause any mark event messages to be sent back to you.
Parameter | Description |
---|---|
event | The value of clear |
streamSid | The SID of the Stream whose buffered audio should be cleared |
Example Clear Message
{
"event": "clear",
"streamSid": "MZ18ad3ab5a668481ce02b83e7395059f0"
}
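Generating the message with a JSON serializer keeps it valid without hand-written punctuation. A minimal sketch, with a hypothetical function name:

```python
import json


def build_clear_message(stream_sid: str) -> str:
    """Return a clear message that asks Twilio to drop buffered audio."""
    return json.dumps({"event": "clear", "streamSid": stream_sid})


print(build_clear_message("MZ18ad3ab5a668481ce02b83e7395059f0"))
```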
Examples
Start a new asynchronous MediaStream named "Example Audio Stream". The stream will begin sending messages to the url specified over a WebSocket connection.
You can send additional custom information by using the <Parameter> TwiML noun. These values will be delivered in the Start WebSocket message in the customParameters section.
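The two cases described here can be sketched together by generating the TwiML from Python's standard library. The function name and the example.com URL are placeholders, and this is not a Twilio helper library call.

```python
import xml.etree.ElementTree as ET


def build_named_stream_twiml(name: str, url: str, parameters: dict[str, str]) -> str:
    """Return TwiML that starts a named Stream carrying custom parameters."""
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    stream = ET.SubElement(start, "Stream", {"name": name, "url": url})
    for key, value in parameters.items():
        # Each pair becomes a nested <Parameter> noun.
        ET.SubElement(stream, "Parameter", {"name": key, "value": value})
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


example = build_named_stream_twiml(
    "Example Audio Stream",
    "wss://example.com/audiostream",  # placeholder URL
    {"FirstName": "Jane", "LastName": "Doe"},
)
print(example)
```

A complete response would normally place another instruction after <Start>, since the call is disconnected when no instruction follows.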
Bi-directional Media Streams
If you want to send media back to the call, the Stream *must* be bi-directional. To do this, initialize the stream using the <Connect> TwiML verb as opposed to the <Start> verb. The <Stream> noun's url attribute must be set to a secure WebSocket server (wss).
Media Servers
Media Streams must communicate with Twilio's cloud in order to function. Listed below are the IP address ranges used to communicate with Twilio's cloud. If necessary, use this information to configure your firewall to enable communication with Twilio.
These IP address ranges will change November 16, 2023. See what you need to do for this change and learn more here.
Edge | Location | Server IP Address Range |
---|---|---|
ashburn | US East Coast (Virginia) | 34.203.254.0/24 3.235.111.128/25 |
Traffic is secure WebSocket (wss) over TCP.
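When an allowlist is enforced in application code rather than in a firewall, the ranges in the table can be checked with Python's standard ipaddress module. The sketch below hard-codes the two ranges listed above, which the note says are subject to change; the names are hypothetical.

```python
import ipaddress

# Ranges copied from the table above; they may be out of date.
MEDIA_SERVER_RANGES = [
    ipaddress.ip_network("34.203.254.0/24"),
    ipaddress.ip_network("3.235.111.128/25"),
]


def is_media_server(address: str) -> bool:
    """Return True if the address falls inside one of the listed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in MEDIA_SERVER_RANGES)


print(is_media_server("34.203.254.17"), is_media_server("3.235.111.5"))
```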