TwiML™ Voice: <Gather>
You can use TwiML's <Gather> verb to collect digits or transcribe speech during a call.
The following example shows the most basic use of <Gather> TwiML:
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Gather/>
</Response>
You can always send Twilio plain TwiML, or use the helper libraries to add TwiML to your web applications.
When Twilio executes this TwiML, it will pause for up to five seconds to wait for the caller to enter digits on their keypad. A few things might happen next:
- The caller enters digits followed by a # symbol or 5 seconds of silence. Twilio then sends the user's input as a parameter in a POST request to the URL hosting the <Gather> TwiML (if no action attribute is provided) or to the action URL.
- The caller doesn't enter any digits and 5 seconds pass. Twilio will move on to the next TwiML verb in the document. Since there are no more verbs here, Twilio will end the call.
By nesting <Say> or <Play> in your <Gather>, you can read some text or play music for your caller while waiting for their input. See "Nest other verbs" below for examples and more information.
<Gather> Attributes
<Gather> supports the following attributes that change its behavior:
Attribute name | Allowed values | Default value |
---|---|---|
action | URL (relative or absolute) | current document URL |
finishOnKey | 0-9, #, *, and '' (the empty string). Note: finishOnKey only works when a digit sequence precedes it. It cannot be used as a mechanism to stop a <Gather> without previous digit input. | # |
hints | "words, phrases that have many words", class tokens | none |
input | dtmf, speech, dtmf speech | dtmf |
language | BCP-47 language tags | en-US |
method | GET, POST | POST |
numDigits | positive integer | unlimited |
partialResultCallback | URL (relative or absolute) | none |
partialResultCallbackMethod | GET, POST | POST |
profanityFilter | true, false | true |
speechTimeout | positive integer or auto | timeout attribute value |
timeout | positive integer | 5 |
speechModel | default, numbers_and_commands, phone_call, experimental_conversations, experimental_utterances | default |
enhanced | true, false | false |
actionOnEmptyResult | true, false | false |
Use one or more of these attributes in a <Gather> verb like so:
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Gather input="speech dtmf" timeout="3" numDigits="1">
<Say>Please press 1 or say sales for sales.</Say>
</Gather>
</Response>
action
The action attribute takes an absolute or relative URL as a value. When the caller finishes entering digits (or the timeout is reached), Twilio will make an HTTP request to this URL. That request will include the user's data and Twilio's standard request parameters.
If you do not provide an action attribute, Twilio will POST to the URL that houses the active TwiML document.
Twilio may send some extra parameters with its request after the <Gather> ends:
If you gather digits from the caller, Twilio will include the Digits parameter containing the numbers your caller entered.
If you specify speech as an input with input="speech", Twilio will include SpeechResult and Confidence:
- SpeechResult contains the transcribed result of your caller's speech.
- Confidence contains a confidence score between 0.0 and 1.0. A higher confidence score means a better likelihood that the transcription is accurate.
Note: Your code should not expect Confidence as a required field, as it is not guaranteed to be accurate, or even set, in any of the results.
After <Gather> ends and Twilio sends its request to your action URL, the current call will continue using the TwiML you send back from that URL. Because of this, any TwiML verbs that occur after your <Gather> are unreachable.
However, if the caller did not enter any digits or speech, call flow continues in the original TwiML document.
Without an action URL, Twilio will re-request the URL that hosts the TwiML you just executed. This can lead to unwanted looping behavior if you're not careful. See our example below for more information.
If you started or updated a call with a twiml parameter, the action URLs for <Record>, <Gather>, and <Pay> must be absolute.
The Call Resource API Docs have language-specific examples of creating and updating Calls with TwiML:
- See "Create a Call resource with TwiML" under Create a Call Resource to see examples of creating a call with a twiml parameter.
- See "Update a Call resource with TwiML" under Update a Call Resource to see examples of updating a call with a twiml parameter.
Imagine you have the following TwiML hosted at http://example.com/complex_gather.xml:
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Gather>
<Say>
Please enter your account number,
followed by the pound sign
</Say>
</Gather>
<Say>We didn't receive any input. Goodbye!</Say>
</Response>
Scenario 1: If the caller:
- does not press the keypad or say anything for five seconds, or
- enters '#' (the default finishOnKey value) before entering any other digits
then they will hear, "We didn't receive any input. Goodbye!"
Scenario 2: If the caller:
- enters a digit while Twilio is speaking "Please enter your account number..."
then the <Say> verb will stop speaking and wait for the user's action.
Scenario 3: If the caller:
- enters 12345 and then presses #, or
- enters 12345 and then allows 5 seconds to pass
then Twilio will submit the digits and request parameters to the URL hosting this TwiML (http://example.com/complex_gather.xml). Twilio will fetch this same TwiML again and execute it, getting the caller stuck in this <Gather> loop.
To avoid this behavior, it's best practice to point your action URL to a new URL that hosts some other TwiML for handling the duration of the call.
The following code sample is almost identical to the TwiML above, but we've added the action and method attributes:
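A minimal sketch of such a document, assuming the handler is served from /process_gather.php on the same host and that GET is the chosen method (both values are assumptions made here for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- action and method values are illustrative assumptions -->
    <Gather action="/process_gather.php" method="GET">
        <Say>
            Please enter your account number,
            followed by the pound sign
        </Say>
    </Gather>
    <!-- Reached only if the caller provides no input -->
    <Say>We didn't receive any input. Goodbye!</Say>
</Response>
```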
Now when the caller enters their input, Twilio will submit the digits and request parameters to the process_gather.php URL.
If we wanted to read back this input to the caller, our code hosted at /process_gather.php might look like:
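<?php
// page located at http://yourserver/process_gather.php
// Escape the submitted value so unexpected characters cannot break the XML.
$digits = htmlspecialchars($_REQUEST['Digits'] ?? '', ENT_XML1, 'UTF-8');
header('Content-Type: text/xml');
echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
echo "<Response><Say>You entered " . $digits . "</Say></Response>";
?>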
finishOnKey
finishOnKey lets you set a value that your caller can press to submit their digits.
For example, if you set finishOnKey to # and your caller enters 1234#, Twilio will immediately stop waiting for more input after they press #.
Twilio will then submit Digits=1234 to your action URL (note that the # is not included).
Allowed values for this attribute are:
- # (this is the default value)
- *
- Single digits 0-9
- An empty string ('')
If you use an empty string, <Gather> will capture all user input and no key will end the <Gather>. In this case, Twilio submits the user's digits to the action URL only after the timeout is reached.
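As a sketch of the empty-string case, the following TwiML disables the finish key so that only the timeout ends the <Gather>. The /process_code path and the 10-second timeout are hypothetical values chosen for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- finishOnKey="" means # and * are collected like any other key -->
    <!-- /process_code is a hypothetical handler URL -->
    <Gather finishOnKey="" timeout="10" action="/process_code">
        <Say>Enter your code. You may use the star and pound keys.</Say>
    </Gather>
</Response>
```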
If the following TwiML is used, finishOnKey has no impact once the caller starts speaking.
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Gather input="speech dtmf" finishOnKey="#" timeout="5">
<Say>
Please say something or press * to access the main menu
</Say>
</Gather>
<Say>We didn't receive any input. Goodbye!</Say>
</Response>
hints
You can improve Twilio's recognition of the words or phrases you expect from your callers by adding hints to your <Gather>.
The hints attribute contains a list of words or phrases that Twilio should expect during recognition.
You may provide up to 500 words or phrases in this list, separating each entry with a comma. Your hints may be up to 100 characters each, and you should separate each word in a phrase with a space, e.g.:
hints="this is a phrase I expect to hear, keyword, product name, name"
We have also implemented Google's class token list to improve recognition. You can pass a class token directly in the hints.
hints="$OOV_CLASS_ALPHANUMERIC_SEQUENCE"
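Placed in a complete document, a hints list might look like the sketch below. The department names and the /route_call handler are hypothetical; each comma-separated entry is one hint, and multi-word hints use spaces between words:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- Three hints: "sales", "technical support", "billing department" -->
    <!-- /route_call is a hypothetical handler URL -->
    <Gather input="speech" hints="sales, technical support, billing department" action="/route_call">
        <Say>Which department would you like to reach?</Say>
    </Gather>
</Response>
```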
input
Specify which inputs (DTMF or speech) Twilio should accept with the input attribute.
The default input for <Gather> is dtmf. You can set input to dtmf, speech, or dtmf speech.
If you set input to speech, Twilio will gather speech from the caller for a maximum duration of 60 seconds.
Please note that <Gather> speech recognition is not yet optimized for alphanumeric inputs (e.g. ABC123). This could lead to inaccurate results, so we do not recommend it.
If you set dtmf speech for your input, the first detected input (speech or dtmf) will take precedence. If speech is detected first, finishOnKey will be ignored.
The following example shows a <Gather> that specifies speech input from the user. When this TwiML executes, the caller will hear the <Say> prompt. Twilio will then collect speech input for up to 60 seconds.
Once the caller stops speaking for five seconds, Twilio posts their transcribed speech to your action URL.
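A minimal sketch of a speech-only <Gather> matching that description, assuming a hypothetical /completed handler and the default five-second timeout:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- /completed is a hypothetical handler that receives SpeechResult -->
    <Gather input="speech" action="/completed">
        <Say>Welcome to Twilio, please tell us why you're calling</Say>
    </Gather>
</Response>
```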
language
The language attribute specifies the language Twilio should recognize from your caller.
This value defaults to en-US, but you can set your language to any of our supported languages: see the full list.
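For instance, a <Gather> that listens for Spanish (Spain) could be sketched as follows. The es-ES tag comes from the Language appendix; the /completed handler and the language attribute on the nested <Say> are assumptions added for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- language on <Gather> controls recognition; /completed is hypothetical -->
    <Gather input="speech" language="es-ES" action="/completed">
        <!-- Assumes the prompt should be spoken in the same language -->
        <Say language="es-ES">Por favor, díganos el motivo de su llamada.</Say>
    </Gather>
</Response>
```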
method
The method you set on <Gather> tells Twilio whether to request your action URL via HTTP GET or POST.
POST is <Gather>'s default method.
numDigits
You can set the number of digits you expect from your caller by including numDigits in <Gather>.
The numDigits attribute only applies to DTMF input.
For example, you might wish to set numDigits="5" when asking your caller to enter their 5-digit zip code. Once the caller enters the final digit of 94117, Twilio will immediately submit the data to your action URL.
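The zip code case can be sketched like this, assuming a hypothetical /process_zip handler:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- Twilio submits as soon as the fifth digit is pressed -->
    <!-- /process_zip is a hypothetical handler URL -->
    <Gather numDigits="5" action="/process_zip">
        <Say>Please enter your five digit zip code.</Say>
    </Gather>
</Response>
```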
partialResultCallback
If you provide a partialResultCallback URL, Twilio will make requests to this URL in real-time as it recognizes speech. These requests will contain a parameter labeled UnstableSpeechResult which contains partial transcriptions. These transcriptions may change as the speech recognition progresses.
The webhooks Twilio makes to your partialResultCallback are asynchronous. They do not accept any TwiML in response. If you want to take more actions based on this partial result, you need to use the REST API to modify the call.
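A sketch of a <Gather> that reports partial results while still sending the final transcription to a separate handler. Both the /partial and /completed paths are hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- /partial receives UnstableSpeechResult; its response is not treated as TwiML -->
    <!-- /completed receives the final SpeechResult and returns the next TwiML -->
    <Gather input="speech"
            partialResultCallback="/partial"
            partialResultCallbackMethod="POST"
            action="/completed">
        <Say>Please describe the issue you are having.</Say>
    </Gather>
</Response>
```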
profanityFilter
The profanityFilter attribute specifies whether Twilio should filter profanities out of your speech transcription. This attribute defaults to true, which replaces all but the initial character in each filtered profane word with asterisks, e.g., 'f***'.
If you set this attribute to false, Twilio will no longer filter any profanities in your transcriptions.
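To receive unfiltered transcriptions, the attribute can be set explicitly, as in this sketch (the /completed handler is a hypothetical URL):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- profanityFilter="false" leaves the transcription unmasked -->
    <Gather input="speech" profanityFilter="false" action="/completed">
        <Say>Please leave your feedback after the tone.</Say>
    </Gather>
</Response>
```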
speechTimeout
When collecting speech from your caller, speechTimeout sets the limit (in seconds) that Twilio will wait after a pause in speech before it stops its recognition. After this timeout is reached, Twilio will post the SpeechResult to your action URL.
If you use both timeout and speechTimeout in your <Gather>, timeout will take precedence for DTMF input and speechTimeout will take precedence for speech.
If you set speechTimeout to auto, Twilio will stop speech recognition when there is a pause in speech and return the results immediately.
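The following sketch combines both attributes on one <Gather>, assuming a hypothetical /completed handler: timeout governs keypad input, while speechTimeout="auto" ends recognition at the first pause in speech.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- timeout applies to DTMF; speechTimeout applies to speech -->
    <Gather input="dtmf speech" timeout="10" speechTimeout="auto" action="/completed">
        <Say>Say or enter your order number.</Say>
    </Gather>
</Response>
```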
timeout
timeout allows you to set the limit (in seconds) that Twilio will wait for the caller to press another digit or say another word before it sends data to your action URL.
For example, if timeout is 3, Twilio will wait three seconds for the caller to press another key or say another word before submitting their data.
Twilio will wait until all nested verbs execute before it begins the timeout period.
The default timeout value is 5.
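A short sketch of a three-second timeout, assuming a hypothetical /process_digits handler. The countdown starts only after the nested <Say> has finished:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- Submits three seconds after the last key press -->
    <Gather timeout="3" action="/process_digits">
        <Say>Please enter your extension.</Say>
    </Gather>
</Response>
```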
speechModel
speechModel allows you to select a specific model that is best suited for your use case to improve the accuracy of speech to text. The attribute currently supports default, numbers_and_commands, phone_call, experimental_conversations, and experimental_utterances.
numbers_and_commands and phone_call are best suited for the use cases where you'd expect to receive short queries such as voice commands or voice search. phone_call is best for audio that originated from a PSTN phone call (typically an 8 kHz sample rate).
<Gather input="speech" enhanced="true" speechModel="phone_call">
<Say>Please tell us why you're calling.</Say>
</Gather>
The phone_call value for speechModel currently supports only the following languages: en-US, en-GB, en-AU, fr-FR, fr-CA, ja-JP, ru-RU, es-US, es-ES, and pt-BR. You must also set the speechTimeout value to a positive integer, rather than using auto.
Experimental models are designed to give access to the latest speech technology and machine learning research, and can provide higher accuracy for speech recognition over other available models. However, some features supported by other available models, such as confidence scores, are not yet supported by the experimental models.
The experimental_utterances model is for short utterances that are a few seconds in length and is useful for trying to capture commands or other single shot directed speech use cases; think "press 0 or say 'support' to speak with an agent". The experimental_conversations model supports spontaneous speech and conversations; think "tell us why you're calling today".
Both the experimental_conversations and experimental_utterances values for speechModel currently support only a set of languages, which are listed here.
Please explore all options to see which works best for your use case.
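As a sketch of the short-utterance case described above, the following TwiML selects experimental_utterances for a directed prompt. The /route_call handler is hypothetical, and the default en-US language is assumed:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- A handler for this model should not rely on Confidence being present -->
    <Gather input="dtmf speech" speechModel="experimental_utterances" numDigits="1" action="/route_call">
        <Say>Press 0 or say support to speak with an agent.</Say>
    </Gather>
</Response>
```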
enhanced
The enhanced attribute instructs <Gather> to use a premium speech model that will improve the accuracy of transcription results. The premium speech model is only supported with the phone_call speechModel. It costs 50% more, at $0.03 per 15s of <Gather>, than the standard phone_call model. The premium phone_call model was built using thousands of hours of training data. It produces 54% fewer errors when transcribing phone conversations than the basic phone_call model.
The following TwiML instructs <Gather> to use the premium phone_call model:
<Gather input="speech" enhanced="true" speechModel="phone_call">
<Say>Please tell us why you're calling.</Say>
</Gather>
<Gather> will ignore the enhanced attribute if any speechModel other than phone_call is used.
For example, in the following TwiML, <Gather> ignores the enhanced attribute and applies the standard numbers_and_commands speechModel.
<Gather input="speech" enhanced="true" speechModel="numbers_and_commands">
<Say>Please tell us why you're calling.</Say>
</Gather>
The premium enhanced phone_call model is priced at $0.03 per utterance while the standard phone_call model is priced at $0.02 per utterance.
Read the Language appendix to see if the enhanced model is available for your language.
actionOnEmptyResult
actionOnEmptyResult allows you to force <Gather> to send a webhook to the action URL even when there is no DTMF input. By default, if <Gather> times out while waiting for DTMF input, it will continue on to the next TwiML instruction.
For example, in the following TwiML, when <Gather> times out, the <Say> instruction is executed.
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Gather>
<Say>
Please enter your account number,
followed by the pound sign
</Say>
</Gather>
<Say>We didn't receive any input. Goodbye!</Say>
</Response>
To always force <Gather> to send a webhook to the action URL, use the following TwiML:
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Gather actionOnEmptyResult="true" action="/gather-action">
<Say>
Please enter your account number,
followed by the pound sign
</Say>
</Gather>
</Response>
Nest other verbs
You can nest the following verbs within <Gather>:
<Say>
The following example shows a <Gather> with a nested <Say>. This will read some text to the caller, and allows the caller to enter input at any time while that text is read to them:
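A minimal sketch of a <Gather> with a nested <Say>, reusing the attribute values from the earlier example on this page:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- The caller can press a key or speak while the prompt is still playing -->
    <Gather input="speech dtmf" timeout="3" numDigits="1">
        <Say>Please press 1 or say sales for sales.</Say>
    </Gather>
</Response>
```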
When a <Gather> contains nested <Say> or <Play> verbs, the timeout begins either after the audio completes or when the caller presses their first key. If <Gather> contains multiple <Play> verbs, the contents of all files will be retrieved before the <Play> begins.
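The multiple-<Play> case can be sketched as follows. The audio URLs and the /menu handler are hypothetical placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <!-- Both files are fetched before playback starts -->
    <!-- The timeout begins after the second file finishes or on the first key press -->
    <Gather numDigits="1" action="/menu">
        <Play>https://example.com/audio/welcome.mp3</Play>
        <Play>https://example.com/audio/menu-options.mp3</Play>
    </Gather>
</Response>
```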
Cache static media for <Play> verbs
If you are using <Play> verbs, we recommend hosting your media in AWS S3 in us-east-1, eu-west-1, or ap-southeast-2 depending on which Twilio Region you are using. No matter where you host your media files, always ensure that you’re setting appropriate Cache Control headers. Twilio uses a caching proxy in its webhook pipeline and will cache media files that have cache headers. Serving media out of Twilio’s cache can take 10ms or less. Keep in mind that we run a fleet of caching proxies so it may take multiple requests before all of the proxies have a copy of your file in cache.
Manage timeouts
When a <Gather> reaches its timeout without any user input, call control will fall through to the next verb in your original TwiML document.
If you wish to have Twilio submit a request to your action URL even if <Gather> times out, include a <Redirect> after the <Gather> like this:
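A sketch of that pattern, assuming the /process_gather.php handler from the earlier example and a placeholder Digits value of TIMEOUT that the handler would need to recognize:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Gather action="/process_gather.php" method="GET">
        <Say>Enter something, or not</Say>
    </Gather>
    <!-- Runs only when the Gather times out with no input -->
    <Redirect method="GET">/process_gather.php?Digits=TIMEOUT</Redirect>
</Response>
```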
With this code, Twilio will move to the next verb in the document (<Redirect>) when <Gather> times out. In our example, we instruct Twilio to make a new GET request to /process_gather.php?Digits=TIMEOUT.
Troubleshooting
A few common problems users face when working with <Gather>:
Problem: <Gather> doesn't receive caller input when the caller is using a VoIP phone.
Solution: Some VoIP phones have trouble sending DTMF digits. This is usually because these phones use compressed bandwidth-conserving audio protocols that interfere with the transmission of the digit's signal. Consult your phone's documentation on DTMF problems.
Problem: Twilio does not send the Digits parameter to your <Gather> URL.
Solution: Check to ensure your application is not responding to the action URL with an HTTP 3xx redirect. Twilio will follow this redirect, but won't resend the Digits parameter.
If you encounter other issues with <Gather>, please reach out to our support team for assistance.
Language appendix
Click here to download a .csv file of all available languages.
Language | Language Tag | Supports enhanced model | Supports experimental models |
---|---|---|---|
Afrikaans (South Africa) | af-ZA | No | No |
Albanian (Albania) | sq-AL | No | No |
Amharic (Ethiopia) | am-ET | No | No |
Arabic (Algeria) | ar-DZ | No | Yes |
Arabic (Bahrain) | ar-BH | No | Yes |
Arabic (Egypt) | ar-EG | No | Yes |
Arabic (Iraq) | ar-IQ | No | Yes |
Arabic (Israel) | ar-IL | No | Yes |
Arabic (Jordan) | ar-JO | No | Yes |
Arabic (Kuwait) | ar-KW | No | Yes |
Arabic (Lebanon) | ar-LB | No | Yes |
Arabic (Mauritania) | ar-MR | No | Yes |
Arabic (Morocco) | ar-MA | No | Yes |
Arabic (Oman) | ar-OM | No | Yes |
Arabic (Qatar) | ar-QA | No | Yes |
Arabic (Saudi Arabia) | ar-SA | No | Yes |
Arabic (State of Palestine) | ar-PS | No | Yes |
Arabic (Tunisia) | ar-TN | No | Yes |
Arabic (United Arab Emirates) | ar-AE | No | Yes |
Arabic (Yemen) | ar-YE | No | Yes |
Armenian (Armenia) | hy-AM | No | No |
Azerbaijani (Azerbaijan) | az-AZ | No | No |
Basque (Spain) | eu-ES | No | No |
Bengali (Bangladesh) | bn-BD | No | No |
Bengali (India) | bn-IN | No | No |
Bosnian (Bosnia and Herzegovina) | bs-BA | No | No |
Bulgarian (Bulgaria) | bg-BG | No | No |
Burmese (Myanmar) | my-MM | No | No |
Catalan (Spain) | ca-ES | No | No |
Chinese, Cantonese (Traditional, Hong Kong) | yue-Hant-HK | No | No |
Chinese, Mandarin (Simplified, China) | cmn-Hans-CN | No | No |
Chinese, Mandarin (Simplified, Hong Kong) | cmn-Hans-HK | No | No |
Chinese, Mandarin (Traditional, Taiwan) | cmn-Hant-TW | No | No |
Croatian (Croatia) | hr-HR | No | No |
Czech (Czech Republic) | cs-CZ | No | No |
Danish (Denmark) | da-DK | No | Yes |
Dutch (Netherlands) | nl-NL | No | Yes |
Dutch (Belgium) | nl-BE | No | No |
English (Australia) | en-AU | No | Yes |
English (Canada) | en-CA | No | No |
English (Ghana) | en-GH | No | No |
English (Hong Kong) | en-HK | No | No |
English (India) | en-IN | No | Yes |
English (Ireland) | en-IE | No | No |
English (Kenya) | en-KE | No | No |
English (New Zealand) | en-NZ | No | No |
English (Nigeria) | en-NG | No | No |
English (Pakistan) | en-PK | No | No |
English (Philippines) | en-PH | No | No |
English (Singapore) | en-SG | No | No |
English (South Africa) | en-ZA | No | No |
English (Tanzania) | en-TZ | No | No |
English (United Kingdom) | en-GB | Yes | Yes |
English (United States) | en-US | Yes | Yes |
Estonian (Estonia) | et-EE | No | No |
Filipino (Philippines) | fil-PH | No | No |
Finnish (Finland) | fi-FI | No | Yes |
French (Belgium) | fr-BE | No | No |
French (Canada) | fr-CA | No | Yes |
French (France) | fr-FR | Yes | Yes |
French (Switzerland) | fr-CH | Yes | No |
Galician (Spain) | gl-ES | No | No |
Georgian (Georgia) | ka-GE | No | No |
German (Austria) | de-AT | No | No |
German (Germany) | de-DE | No | Yes |
German (Switzerland) | de-CH | No | No |
Greek (Greece) | el-GR | No | No |
Gujarati (India) | gu-IN | No | No |
Hebrew (Israel) | he-IL | No | No |
Hindi (India) | hi-IN | No | Yes |
Hungarian (Hungary) | hu-HU | No | No |
Icelandic (Iceland) | is-IS | No | No |
Indonesian (Indonesia) | id-ID | No | No |
Italian (Italy) | it-IT | No | No |
Italian (Switzerland) | it-CH | No | No |
Japanese (Japan) | ja-JP | Yes | Yes |
Javanese (Indonesia) | jv-ID | No | No |
Kannada (India) | kn-IN | No | No |
Kazakh (Kazakhstan) | kk-KZ | No | No |
Khmer (Cambodia) | km-KH | No | No |
Korean (South Korea) | ko-KR | No | Yes |
Lao (Laos) | lo-LA | No | No |
Latvian (Latvia) | lv-LV | No | No |
Lithuanian (Lithuania) | lt-LT | No | No |
Macedonian (North Macedonia) | mk-MK | No | Yes |
Malay (Malaysia) | ms-MY | No | No |
Malayalam (India) | ml-IN | No | No |
Marathi (India) | mr-IN | No | No |
Mongolian (Mongolia) | mn-MN | No | No |
Nepali (Nepal) | ne-NP | No | No |
Norwegian Bokmål (Norway) | nb-NO | No | Yes |
Persian (Iran) | fa-IR | No | No |
Polish (Poland) | pl-PL | No | Yes |
Portuguese (Brazil) | pt-BR | No | Yes |
Portuguese (Portugal) | pt-PT | No | Yes |
Punjabi (Gurmukhi India) | pa-guru-IN | No | No |
Romanian (Romania) | ro-RO | No | Yes |
Russian (Russia) | ru-RU | Yes | Yes |
Serbian (Serbia) | sr-RS | No | No |
Sinhala (Sri Lanka) | si-LK | No | No |
Slovak (Slovakia) | sk-SK | No | No |
Slovenian (Slovenia) | sl-SI | No | No |
Spanish (Argentina) | es-AR | No | No |
Spanish (Bolivia) | es-BO | No | No |
Spanish (Chile) | es-CL | No | No |
Spanish (Colombia) | es-CO | No | No |
Spanish (Costa Rica) | es-CR | No | No |
Spanish (Dominican Republic) | es-DO | No | No |
Spanish (Ecuador) | es-EC | No | No |
Spanish (El Salvador) | es-SV | No | No |
Spanish (Guatemala) | es-GT | No | No |
Spanish (Honduras) | es-HN | No | No |
Spanish (Mexico) | es-MX | No | No |
Spanish (Nicaragua) | es-NI | No | No |
Spanish (Panama) | es-PA | No | No |
Spanish (Paraguay) | es-PY | No | No |
Spanish (Peru) | es-PE | No | No |
Spanish (Puerto Rico) | es-PR | No | No |
Spanish (Spain) | es-ES | Yes | Yes |
Spanish (United States) | es-US | Yes | Yes |
Spanish (Uruguay) | es-UY | No | No |
Spanish (Venezuela) | es-VE | No | No |
Sundanese (Indonesia) | su-ID | No | No |
Swahili (Kenya) | sw-KE | No | No |
Swahili (Tanzania) | sw-TZ | No | No |
Swedish (Sweden) | sv-SE | No | No |
Tamil (India) | ta-IN | No | No |
Tamil (Malaysia) | ta-MY | No | No |
Tamil (Singapore) | ta-SG | No | No |
Tamil (Sri Lanka) | ta-LK | No | No |
Telugu (India) | te-IN | No | No |
Thai (Thailand) | th-TH | No | Yes |
Turkish (Turkey) | tr-TR | No | Yes |
Ukrainian (Ukraine) | uk-UA | No | Yes |
Urdu (India) | ur-IN | No | No |
Urdu (Pakistan) | ur-PK | No | No |
Uzbek (Uzbekistan) | uz-UZ | No | No |
Vietnamese (Vietnam) | vi-VN | No | Yes |
Zulu (South Africa) | zu-ZA | No | No |