The <Say> verb converts text to speech that is read back to the caller.
<Say> is useful for development or saying dynamic text that is difficult to pre-record.
The "noun" of a TwiML verb is the body of its element: the thing the verb acts upon. For <Say>, the noun is the text you want spoken to the caller.
The <Say> verb does not submit any information. It always moves to the next verb after completing.
| Attribute Name | Allowed Values | Default Value |
|---|---|---|
| voice | man, woman | man |
| language | en, es, fr, de | en |
| loop | integer >= 0 | 1 |
The voice attribute allows you to choose a male or female voice to read the text back. The default value is "man".
The language attribute allows you to pick a voice with a specific language's accent and pronunciations. The currently supported languages are "en" (English), "es" (Spanish), "fr" (French), and "de" (German). The default is "en".
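As a minimal sketch of the language attribute, the document below requests the French voice. The phrase itself is only an illustrative choice; any text in the selected language would be handled the same way.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<Response>
    <!-- language="fr" selects the French voice; the phrase is an illustrative example -->
    <Say language="fr">Bonjour tout le monde</Say>
</Response>
```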
The loop attribute specifies how many times the text is repeated. The default is 1, so the text is spoken once.
Specifying 0 will cause the <Say> verb to loop until the call is hung up.
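A short sketch of the looping behavior described above, using loop="0" so the message repeats for as long as the caller stays connected. The message text is an assumption chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<Response>
    <!-- loop="0" repeats the message until the call is hung up -->
    <Say loop="0">Please stay on the line.</Say>
</Response>
```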
The <Say> verb can be nested in the following elements
The following verbs can be nested within <Say>
<?xml version="1.0" encoding="UTF-8" ?>
<Response>
<Say>Hello World</Say>
</Response>
When a call is directed to the TwiML document above, the caller will hear "Hello World" spoken once in a male voice.
<?xml version="1.0" encoding="UTF-8" ?>
<Response>
<Say voice="woman" loop="2">Hello</Say>
</Response>
This TwiML document tells Twilio to say "Hello" to the caller twice in a row in a female voice.
When translating text to speech, the <Say> tag will make assumptions about how to pronounce numbers, dates, times, amounts of money, and other abbreviations.
When saying numbers: 12345 will be spoken as "twelve thousand three hundred forty five". 1 2 3 4 5 will be spoken as "one two three four five".
Punctuation, such as commas and periods, will be interpreted as natural pauses by the speech engine.
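The number-handling rules above can be sketched as follows. The surrounding sentences ("balance", "confirmation code") are hypothetical; only the way the digits are written changes how they are read.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<Response>
    <!-- Digits written together are read as one number:
         "twelve thousand three hundred forty five" -->
    <Say>Your balance is 12345 points.</Say>
    <!-- Digits separated by spaces are read one at a time:
         "one two three four five" -->
    <Say>Your confirmation code is 1 2 3 4 5.</Say>
</Response>
```

Writing digits with spaces between them is a simple way to have codes or phone numbers read out digit by digit.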
<Say> is useful for saying dynamic text that would be difficult to pre-record. In cases where
the contents of <Say> are static, you might consider recording a live person saying the phrase
and using the <Play> verb instead.
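As a rough illustration of that split, the sketch below plays a recorded greeting for the static part and uses <Say> only for text that changes from call to call. The audio URL is a hypothetical placeholder, and the order number stands in for a value your application would generate.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<Response>
    <!-- Static greeting: a pre-recorded file. This URL is a hypothetical placeholder. -->
    <Play>https://example.com/audio/greeting.mp3</Play>
    <!-- Dynamic text: differs per call, so it is spoken with Say -->
    <Say>Your order number is 4 5 6 7.</Say>
</Response>
```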
If you want to insert a longer pause, try using the <Pause>
verb. <Pause> should be placed outside <Say> tags, not nested inside them.
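A minimal sketch of that placement, assuming the <Pause> verb's length attribute, which gives the pause duration in seconds. The spoken phrases are illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<Response>
    <Say>Please listen carefully.</Say>
    <!-- Pause sits between the Say elements, not inside them; length is in seconds -->
    <Pause length="2"/>
    <Say>Your appointment is tomorrow.</Say>
</Response>
```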