The <Play> verb plays an audio file back to the caller. Twilio retrieves
the file from a URL that you provide.
The media URL is provided between
<Play>'s opening and closing tags, as shown in the example below.
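A minimal sketch of such a document, using the cowbell.mp3 URL that appears elsewhere on this page as the media file:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- The media URL is the text content of the Play element -->
    <Play>https://api.twilio.com/cowbell.mp3</Play>
</Response>
```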
The <Play> verb supports the following attributes that modify its behavior:
|Attribute Name||Allowed Values||Default Value|
|loop||integer >= 0||1|
|digits||integer >= 0, w||no default digits for Play|
The loop attribute specifies how many times the audio file is played.
The default behavior is to play the audio once. Specifying 0 will cause the
<Play> verb to loop until either the call is hung up or 1000 iterations are performed.
The example below causes Twilio to play the audio from
https://api.twilio.com/cowbell.mp3 10 times.
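A TwiML document along these lines would produce that behavior:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- loop="10" plays the file ten times; loop="0" would repeat until
         hangup or 1000 iterations -->
    <Play loop="10">https://api.twilio.com/cowbell.mp3</Play>
</Response>
```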
The digits attribute lets you play DTMF tones during a call.
For example, if you need to test an IVR system, you can use this feature to simulate digits being pressed to navigate through the menu options.
Use w to introduce a 0.5s pause between DTMF tones. For example,
1w2 will tell Twilio to pause 0.5s before playing DTMF tone 2. To include a one-second pause, use ww.
If you are dialing a phone number and need to play DTMF tones to enter the extension, you should use <Number>'s sendDigits attribute.
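As a rough sketch of that alternative, the document below dials a number and then sends an extension. The phone number and the extension 1928 are placeholders, and the leading w characters are assumed to add 0.5s pauses in the same way they do for <Play>:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Dial>
        <!-- Placeholder number and extension; four leading w characters
             are intended as a two-second pause before the digits -->
        <Number sendDigits="wwww1928">415-123-4567</Number>
    </Dial>
</Response>
```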
Twilio does not send its standard HTTP parameters when making requests to the media URL.
The TwiML example below shows the use of <Play> and its digits attribute. The value of the
digits attribute is www3, which causes Twilio to pause for 1.5 seconds before playing the DTMF tone for 3.
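A sketch of that document, with no media URL because only tones are played:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- Three w characters = 3 x 0.5s = 1.5s pause, then DTMF tone 3 -->
    <Play digits="www3"></Play>
</Response>
```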
Twilio supports the following audio MIME types for audio files retrieved by the <Play> verb:
|MIME Type||Description|
|audio/mpeg||mpeg layer 3 audio|
|audio/wav||wav format audio|
|audio/wave||wav format audio|
|audio/x-wav||wav format audio|
|audio/aiff||audio interchange file format|
|audio/x-aifc||audio interchange file format|
|audio/x-aiff||audio interchange file format|
|audio/x-gsm||GSM audio format|
|audio/gsm||GSM audio format|
|audio/ulaw||μ-law audio format|
You can't nest any verbs within <Play>. You can nest
<Play> within a <Gather> verb, with one major exception: you can't play "digits" within a <Gather>.
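A minimal sketch of the permitted nesting follows. The action path /handle-key and the numDigits value are illustrative assumptions, not values from this page:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- Hypothetical action URL; collects a single keypress -->
    <Gather numDigits="1" action="/handle-key">
        <!-- Audio plays while Twilio waits for input; note that the
             digits attribute is not used inside Gather -->
        <Play>https://api.twilio.com/cowbell.mp3</Play>
    </Gather>
</Response>
```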
In the simplest case, a TwiML document whose only verb is a <Play> containing the cowbell.mp3 URL tells Twilio to download the file and play the audio to the caller.
We are going to test our IVR menu to make sure users can navigate properly. We know the length of the initial greeting and the menu number we need to enter, so we can add a few leading 'w' characters to add a pause. Each 'w' character tells Twilio to wait 0.5 seconds instead of playing a digit. This lets you adjust the timing of when the digits begin playing to suit the phone system you are dialing.
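To make the timing concrete, suppose (hypothetically) that the greeting lasts about 3 seconds and the menu option to select is 2. Six 'w' characters give a 6 x 0.5s = 3s wait before the tone, which could be written as:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- Assumed 3-second greeting: six w characters, then menu option 2 -->
    <Play digits="wwwwww2"></Play>
</Response>
```

Adding or removing 'w' characters shifts the tone in 0.5s steps if the target system's greeting turns out to be longer or shorter.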
- Twilio will attempt to cache the audio file the first time it is played. This means the first attempt may be slow to play due to the time spent downloading the file from your remote server. Twilio may play a processing sound while the file is being downloaded.
- Twilio will cache files when HTTP headers allow it (via ETag and Last-Modified headers). Responding with Cache-Control: no-cache will ensure Twilio always checks if the file has changed, allowing your web server to respond with a new version or with a 304 Not Modified to instruct Twilio to use its cached version.
- We recommend hosting your media in AWS S3 in us-east-1, eu-west-1, or ap-southeast-2 depending on which Twilio Region you are using. No matter where you host your media files, always ensure that you're setting appropriate Cache-Control headers. Twilio uses a caching proxy in its webhook pipeline and will cache media files that have cache headers. Serving media out of Twilio's cache can take 10ms or less. Keep in mind that we run a fleet of caching proxies, so it may take multiple requests before all of the proxies have a copy of your file in cache.
- Audio played over the telephone network is transcoded to a format the telephone network understands. Regardless of the quality of the file you provide us, we will transcode it so it plays correctly. This may result in lower quality because the telephone network does not support high bitrate audio.
- High bitrate, lossy encoded files, such as 128kbps mp3 files, will take longer to transcode and potentially sound worse than files that are in lossless 8kbps formats. This is due to the inevitable degradation that occurs when converting from lossy compressed formats and the processing involved in converting from higher bit rates to low bit rates.
<Play>ing a file that is longer than 40 minutes can result in a dropped call. If you need to
<Play> a file longer than 40 minutes, consider splitting it up into smaller chunks.
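One way to do this, sketched here with hypothetical example.com URLs, is to list the chunks as consecutive <Play> verbs. Twilio executes the verbs in a TwiML document in order, so the parts play back to back:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- Hypothetical URLs: one long recording split into three files,
         each shorter than 40 minutes -->
    <Play>https://example.com/audio/recording-part1.mp3</Play>
    <Play>https://example.com/audio/recording-part2.mp3</Play>
    <Play>https://example.com/audio/recording-part3.mp3</Play>
</Response>
```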