Generative Voices Available for <Say> in Public Beta

June 23, 2025

Written by

Ramón Ulldemolins

Twilion

Reviewed by

Michael Carpenter

Twilion

We’re excited to announce new Generative Voices for <Say> in Public Beta, starting with Google’s Chirp3-HD voices and Amazon Polly Generative voices.

The Generative voices are powered by the latest advances in synthesized speech. They offer human-like, emotionally engaged, context-aware speech by "interpreting" the text input and adjusting the speech accordingly (e.g., rendering context-dependent prosody, tone, emotion, pausing, spelling, dialectal properties, and foreign word pronunciation). These synthetic voices are remarkably similar to a human voice, which makes them well suited to Conversational AI applications and Virtual Agents.

You can try the new voices by selecting them in the Text-to-Speech section of the Twilio Console, by setting them in your TwiML <Say> attributes, or by using the Studio Say/Play Widget.

The Value of Text-to-Speech

Improving efficiency is top of mind for service-based companies, and technology is a key part of how they get there. Developers and business users use Text-to-Speech to turn traditional human-to-human interactions into machine-to-human interactions that still feel smooth and natural over voice.

Twilio Voice aims to make it effortless for all organizations to create delightful and personalized experiences over the voice channel. That doesn’t mean every interaction has to be two humans talking on the phone. Text-to-Speech can help scale Contact Centers efficiently, using AI and automation with Interactive Voice Response (IVR) or Virtual Agents, or deliver critical messages over a phone call with Voice Notifications. From startups to global enterprises, Twilio is the one-stop shop for continuous innovation in the digital era to drive sustainable growth and efficiency at scale without compromising CX.

For more information about available voices, please visit the Twilio Text-to-Speech docs.

Get started with Text-to-Speech and Generative voices

Text-to-Speech (TTS) converts text into a human-sounding voice. Recording audio files with human voices to play back in a call offers limited flexibility and does not scale. With Text-to-Speech, prompts can instead be generated programmatically from raw text in response to events in your application.

You can provide text in a TwiML <Say> instruction, and Twilio will synthesize speech in real time and play the audio into any call or conference. To use a Google voice, the voice attribute must start with the prefix Google. followed by the official Google voice name. Likewise, to use an Amazon Polly voice, the voice attribute must start with the prefix Polly. followed by the official Amazon Polly voice name.

The TwiML sample below causes Twilio to play audio of a Google synthesized voice:

<Response>
<Say language="en-US" voice="Google.en-US-Chirp3-HD-Leda">Hello I am a Google Generative voice and I speak American English!</Say>
</Response>

The TwiML sample below causes Twilio to play audio of an Amazon Polly synthesized voice:

<Response>
<Say language="en-US" voice="Polly.Joanna-Generative">Hello I am an Amazon Generative voice and I speak American English!</Say>
</Response>
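
If your application generates prompts in response to events, you will probably want to build this TwiML in code rather than by hand. The sketch below is one minimal way to do that, using only Python's standard library. The helper names (qualified_voice, say_twiml) and the PROVIDER_PREFIXES table are our own illustrations and not part of any Twilio SDK; the only rules taken from this post are the Google. and Polly. prefixes and the voice names used in the samples above.

```python
import xml.etree.ElementTree as ET

# Provider prefixes described in this post. The dictionary itself is an
# illustrative helper, not a Twilio API.
PROVIDER_PREFIXES = {"google": "Google.", "polly": "Polly."}


def qualified_voice(provider: str, voice_name: str) -> str:
    """Return the <Say> voice attribute value: provider prefix + official voice name."""
    try:
        prefix = PROVIDER_PREFIXES[provider.lower()]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return prefix + voice_name


def say_twiml(text: str, provider: str, voice_name: str, language: str) -> str:
    """Build a <Response><Say>...</Say></Response> document and return it as a string."""
    response = ET.Element("Response")
    say = ET.SubElement(
        response,
        "Say",
        {"language": language, "voice": qualified_voice(provider, voice_name)},
    )
    # ElementTree escapes characters such as & and < when serializing.
    say.text = text
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(say_twiml("Your order has shipped.", "google", "en-US-Chirp3-HD-Leda", "en-US"))
    print(say_twiml("Your order has shipped.", "polly", "Joanna-Generative", "en-US"))
```

Building the document with an XML library instead of string concatenation means that dynamic text (customer names, order details, and so on) is escaped for you, so a stray ampersand does not produce invalid TwiML.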

Alternatively, the Studio Say/Play Widget lets you add Text-to-Speech capabilities to your application with ease, including the new Google voices. To learn more, see the TTS in Studio documentation.

Text-to-Speech Settings: Set a default provider, voice and language

The Text-to-Speech page in Twilio Console allows you to define a default voice and language for your Account. These defaults are used when no language or voice attribute is provided in your <Say> TwiML.

In addition, you can define a Language Mapping that sets a voice for every locale, and then specify only the language and the text in your TwiML. Twilio will automatically select the mapped voice when your application uses <Say>.

Changing the default provider updates your default voice and language, as well as the Language Mapping for all locales supported by that provider. For each locale, Twilio automatically selects the best available voice, preferring the Generative tier, then Neural, then Standard.

To learn more, see the Default TTS documentation.
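
To make the interplay between these settings easier to reason about, here is a small mental model in Python. It is a sketch under stated assumptions, not Twilio's implementation: the TtsSettings class, the function names, and the exact precedence (explicit voice first, then the Language Mapping, then the account default) are our own reading of the behavior described above. Only the Generative, Neural, Standard preference order is taken directly from this post.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Tier preference stated in this post: Generative, then Neural, then Standard.
TIER_ORDER = ("generative", "neural", "standard")


@dataclass
class TtsSettings:
    """Hypothetical stand-in for the account-level Text-to-Speech settings."""
    default_voice: str
    default_language: str
    # Locale -> voice, as configured in the Language Mapping.
    language_mapping: Dict[str, str] = field(default_factory=dict)


def best_available(voices_by_tier: Dict[str, str]) -> Optional[str]:
    """Return the voice from the highest-preference tier present, or None."""
    for tier in TIER_ORDER:
        if tier in voices_by_tier:
            return voices_by_tier[tier]
    return None


def resolve_voice(
    settings: TtsSettings,
    voice: Optional[str] = None,
    language: Optional[str] = None,
) -> Tuple[str, str]:
    """Return the (voice, language) pair a <Say> would use under the assumed precedence."""
    if voice is not None:
        # Assumption: an explicit voice attribute is used as given.
        return voice, language or settings.default_language
    if language is not None and language in settings.language_mapping:
        # Only a language was given: use the voice mapped to that locale.
        return settings.language_mapping[language], language
    # Nothing usable was given: fall back to the account defaults.
    return settings.default_voice, language or settings.default_language


if __name__ == "__main__":
    settings = TtsSettings(
        default_voice="Polly.Joanna-Generative",
        default_language="en-US",
        language_mapping={"en-US": "Google.en-US-Chirp3-HD-Leda"},
    )
    print(resolve_voice(settings))
    print(resolve_voice(settings, language="en-US"))
```

The practical takeaway is that once a Language Mapping is in place, your TwiML can carry only the language and the text, and voice changes can be made in the Console without touching application code.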

What’s next

Here at Twilio, we believe in the power of voice to connect us intimately. We aim to ensure that voice interactions strengthen human relationships in our global economy. We want to make it effortless for all organizations to create delightful and personalized experiences over the voice channel that make every one of their customers feel valued and understood.

Twilio Text-to-Speech helps organizations like yours design and deliver efficient, automated, and personalized voice experiences for your customers. We’re committed to evolving our Text-to-Speech functionality and services and to regularly updating our voice catalog, so Twilio continues to be a trusted platform that efficiently powers your customer engagement innovation.

Stay tuned for more to come, and we can’t wait to see what you build!
