Picking a voice for your ConversationRelay application helps create an engaging user experience. Twilio supports text-to-speech (TTS) voices from Google, Amazon Polly, and ElevenLabs. Voice quality varies significantly by provider and voice type. Generative voices often offer higher fidelity and more natural-sounding responses, but they might increase response latency and process TTS at a slower rate.
Default voice settings
If your <ConversationRelay> lacks a voice attribute, ConversationRelay applies a default voice based on the language setting defined with the language or ttsLanguage attribute and the selected TTS provider. Twilio uses ElevenLabs as the default TTS provider.
View list of default voice settings by language and locale
| Language (Locale) | ISO code | Voice ID | TTS provider | Speech model | Transcriber |
| --- | --- | --- | --- | --- | --- |
| Bulgarian | bg-BG | AB9XsbSA4eLG12t2myjN | ElevenLabs | long | Google |
| Czech | cs-CZ | uYFJyGaibp4N2VwYQshk | ElevenLabs | long | Google |
| Danish | da-DK | ygiXC2Oa1BiHksD3WkJZ | ElevenLabs | long | Google |
| German | de-DE | FTNCalFNG5bRnkkaP5Ug | ElevenLabs | telephony | Google |
| English (Australia) | en-AU | 9Ft9sm9dzvprPILZmLJl | ElevenLabs | telephony | Google |
| English (UK) | en-GB | Fahco4VZzobUeiPqni1S | ElevenLabs | telephony | Google |
| English (India) | en-IN | mCQMfsqGDT6IDkEKR20a | ElevenLabs | long | Google |
| English (US) | en-US | UgBBYS2sOqTuMpoF3BR0 | ElevenLabs | telephony | Google |
| Castilian (Spain) | es-ES | 6xftrpatV0jGmFHxDjUv | ElevenLabs | telephony | Google |
| Spanish (US) | es-US | CaJslL1xziwefCeTNzHv | ElevenLabs | telephony | Google |
| Finnish | fi-FI | 6xPz2opT0y5qtoRh1U1Y | ElevenLabs | long | Google |
| French (Québec) | fr-CA | IPgYtHTNLjC7Bq7IPHrm | ElevenLabs | telephony | Google |
| French (France) | fr-FR | a5n9pJUnAhX4fn7lx3uo | ElevenLabs | telephony | Google |
| Hindi | hi-IN | IvLWq57RKibBrqZGpQrC | ElevenLabs | long | Google |
| Hungarian | hu-HU | TumdjBNWanlT3ysvclWh | ElevenLabs | long | Google |
| Indonesian | id-ID | 1k39YpzqXZn52BgyLyGO | ElevenLabs | long | Google |
| Italian | it-IT | uScy1bXtKz8vPzfdFsFw | ElevenLabs | telephony | Google |
| Japanese | ja-JP | 3JDquces8E8bkmvbh6Bc | ElevenLabs | telephony | Google |
| Kannada | kn-IN | kn-IN-Standard-A | Google | long | Google |
| Korean | ko-KR | uyVNoMrnUku1dZyVEXwD | ElevenLabs | telephony | Google |
| Malayalam | ml-IN | ml-IN-Standard-A | Google | long | Google |
| Marathi | mr-IN | mr-IN-Standard-A | Google | long | Google |
| Dutch (Belgium) | nl-BE | s7Z6uboUuE4Nd8Q2nye6 | ElevenLabs | telephony | Google |
| Dutch (Netherlands) | nl-NL | UNBIyLbtFB9k7FKW8wJv | ElevenLabs | telephony | Google |
| Polish | pl-PL | W0sqKm1Sfw1EzlCH14FQ | ElevenLabs | long | Google |
| Portuguese (Brazil) | pt-BR | CstacWqMhJQlnfLPxRG4 | ElevenLabs | telephony | Google |
| Portuguese (Portugal) | pt-PT | TsZfI8Nbn2Xd7ArC76n9 | ElevenLabs | telephony | Google |
| Romanian | ro-RO | OlBp4oyr3FBAGEAtJOnU | ElevenLabs | long | Google |
| Russian | ru-RU | AB9XsbSA4eLG12t2myjN | ElevenLabs | long | Google |
| Swedish | sv-SE | 4xkUqaR9MYOJHoaC1Nak | ElevenLabs | long | Google |
| Tamil | ta-IN | ZhJ5LanYnCmLKQUXvsV7 | ElevenLabs | long | Google |
| Telugu | te-IN | te-IN-Standard-A | Google | long | Google |
| Thai | th-TH | th-TH-Standard-A | Google | long | Google |
| Turkish | tr-TR | IuRRIAcbQK5AQk1XevPj | ElevenLabs | long | Google |
| Ukrainian | uk-UA | nCqaTnIbLdME87OuQaZY | ElevenLabs | long | Google |
| Vietnamese | vi-VN | foH7s9fX31wFFH2yqrFa | ElevenLabs | long | Google |
Twilio's internal configuration defines these default settings, and Twilio updates them periodically. For a complete and current list of supported languages, default voices, and detailed settings, see the Twilio TTS Voices documentation.
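As a rough illustration of the default behavior, the following sketch sets only a language and omits the voice and ttsProvider attributes. The WebSocket URL is a placeholder, and the <Response> root element is added so that the fragment forms a complete TwiML document. With language set to es-ES, the table above lists 6xftrpatV0jGmFHxDjUv from ElevenLabs as the default voice, though that default might have changed since.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Connect>
    <!-- No voice attribute: ConversationRelay applies the default voice
         for the language set below. The URL is a placeholder. -->
    <ConversationRelay
      url="wss://example.com/websocket"
      language="es-ES" />
  </Connect>
</Response>
```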
Choose a specific voice for your app
Each provider offers a variety of languages and styles. Choose one that best reflects your app's voice experience.
On the Text-to-Speech page, you can choose the voices that apply when your TwiML file doesn't set specific values for:
A language or voice attribute, as the Default Voice
A voice attribute for a combination of language and locale, as Language Mappings
When setting these default values, you can listen to a sample of a specific voice. This procedure explains how to listen to a voice sample while setting the Default Voice.
To filter the available choices to Amazon Polly only, set Provider to Polly.
You can filter the available choices using these parameters:
Language and locale
Type
Gender
Highlight the voice ID in the Voice column of the table, for example Joey-Neural. Don't copy the * in that column. That symbol marks a bilingual voice.
Copy this value.
Use an Amazon Polly voice in your TwiML file
Create or open an existing TwiML file.
Create a new <Connect> tag or go to an existing one.
In the <Connect> tag, add an unpaired <ConversationRelay> noun.
Add a url attribute set to the URL of your WebSocket server.
Add a ttsProvider attribute set to the value of Polly.
Add a voice attribute set to the value of the voice ID copied in the previous section.
Example TwiML app that uses Amazon Polly for ConversationRelay
<Connect>
  <ConversationRelay
    url="wss://example.com/websocket"
    ttsProvider="Polly"
    voice="Joey-Neural"
    ... />
</Connect>
To learn more about the <ConversationRelay> noun, see the <ConversationRelay> documentation.
Save your TwiML file.
Choose an ElevenLabs voice
ElevenLabs voices default to the Flash 2.5 model.
Sample an ElevenLabs voice
Browse the list of ElevenLabs voices.
Find a voice that matches your requirements. You can filter the list by language, accent, voice quality (category), age, gender, and any tags applied to the language.
Click the play button (▶) next to the voice row.
Use an ElevenLabs voice in your TwiML file
From the table, click Copy VoiceID in the row of your chosen voice.
Create or open an existing TwiML file.
Create a new <Connect> tag or go to an existing one.
In the <Connect> tag, add an unpaired <ConversationRelay> noun.
Add a url attribute set to the URL of your WebSocket server.
Add a ttsProvider attribute set to the value of ElevenLabs.
Add a voice attribute set to the value of the copied VoiceID.
Add one or both optional settings for the voice attribute.
Add an audio model. The voices from ElevenLabs default to the Flash 2.5 model. Depending on your use case, a different model might improve the quality or performance of your application. To use a different model, append a hyphen to the VoiceID, followed by the model ID. The supported model IDs include flash_v2, turbo_v2_5, turbo_v2, and the default, flash_v2_5. Some models only work with a specific set of languages. To learn about the strengths and the supported languages of each model, see the ElevenLabs website.
Set the speed, stability, and similarity of your chosen ElevenLabs voice. At the end of your voice ID or audio model value, add a hyphen, followed by an underscore-separated string with values for speed, stability, and similarity, in that order. You can't set these values individually. If you set one, you must set all three.
The speed value must be between 0.7 and 1.2, inclusive.
The stability and similarity values must be between 0.0 and 1.0, inclusive.
For example: A voice value of XrExE9yKIg1WjnnlVkGX-1.2_0.6_0.8 sets its speed to 1.2, its stability to 0.6, and its similarity to 0.8.
| Setting | Type | Necessity | Accepted values | Separator |
| --- | --- | --- | --- | --- |
| Voice ID | String | Required | 20-character alphanumeric identifier from ElevenLabs | - |
| Audio Model | String | Optional | flash_v2, flash_v2_5, turbo_v2, turbo_v2_5 | _ |
| Speed | Number | Optional | 0.7 to 1.2 inclusive | _ |
| Stability | Number | Optional | 0.0 to 1.0 inclusive | _ |
| Similarity | Number | Optional | 0.0 to 1.0 inclusive | _ |
To learn more about how these settings affect your application's voice, see the ElevenLabs documentation.
Save your TwiML file.
Example TwiML app that uses ElevenLabs for ConversationRelay
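A minimal sketch of such a TwiML file, modeled on the Amazon Polly example earlier on this page. The voice ID is the one used in the voice value example above, the WebSocket URL is a placeholder, and the <Response> root element is added so that the fragment forms a complete TwiML document. Combining a model ID with the speed, stability, and similarity values in a single voice attribute follows the format described in the steps above and is an assumption here, so confirm it against the <ConversationRelay> documentation before relying on it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Connect>
    <!-- voice = VoiceID, then a hyphen and the model ID, then a hyphen and
         speed_stability_similarity. Shorter forms described above:
         "XrExE9yKIg1WjnnlVkGX" (default model, default settings)
         "XrExE9yKIg1WjnnlVkGX-1.2_0.6_0.8" (default model, custom settings) -->
    <ConversationRelay
      url="wss://example.com/websocket"
      ttsProvider="ElevenLabs"
      voice="XrExE9yKIg1WjnnlVkGX-turbo_v2_5-1.2_0.6_0.8" />
  </Connect>
</Response>
```

In this sketch, the voice uses the turbo_v2_5 model with a speed of 1.2, a stability of 0.6, and a similarity of 0.8.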