Speech recognition

Convert speech to text and analyze its intent during any voice call. Available with pay-as-you-go pricing.

How speech-to-text works

      

<Response>
    <Gather input="speech"
            action="/finalresult"
            partialResultCallback="/partialResult">
        <Say>Say ahoy to Twilio Speech Recognition!</Say>
    </Gather>
</Response>
		

Using a simple <Gather> command, the Speech Recognition API captures your speech in real-time, transcribes it, and returns text.
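
As a rough illustration of what the /finalresult endpoint might do with that text, here is a minimal Python sketch. It assumes the transcript and its confidence score arrive as the form fields SpeechResult and Confidence in the POST body. The function name handle_final_result and the 0.5 threshold are hypothetical choices for this example, not part of the Twilio API.

```python
from urllib.parse import parse_qs
from xml.sax.saxutils import escape

# Hypothetical cut-off; tune it against real call data.
MIN_CONFIDENCE = 0.5


def handle_final_result(body: str) -> str:
    """Build a TwiML reply from the form-encoded body posted to the action URL."""
    params = parse_qs(body)
    # Assumed field names: SpeechResult (transcript) and Confidence (0.0 to 1.0).
    transcript = params.get("SpeechResult", [""])[0]
    confidence = float(params.get("Confidence", ["0"])[0])

    if not transcript or confidence < MIN_CONFIDENCE:
        message = "Sorry, I did not catch that."
    else:
        message = f"You said: {transcript}"

    # Escape the transcript so it cannot break the XML response.
    return f"<Response><Say>{escape(message)}</Say></Response>"
```

Any web framework can wrap this function: pass it the raw request body and return the string with a text/xml content type.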


Real-time transcription

Add automatic speech recognition (ASR) the easy way.

No training required

Transcribe a wide range of industry-specific words and phrases out of the box, without any pre-training.

Streaming results

Build responsive voice applications that act on partial recognition results as your customer speaks.
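
One way to act on partial results is to watch for a trigger phrase before the caller finishes speaking. The sketch below assumes each request to the partialResultCallback URL carries the form fields SequenceNumber, StableSpeechResult, and UnstableSpeechResult; check those names against the current <Gather> documentation before relying on them. The phrase list and the handle_partial_result function are hypothetical.

```python
from urllib.parse import parse_qs

# Hypothetical phrases that should trigger an early reaction.
URGENT_PHRASES = ("agent", "operator", "cancel")


def handle_partial_result(body: str, state: dict):
    """Return the first urgent phrase heard so far, or None.

    `state` is a per-call dict that remembers the last sequence number seen,
    so stale or duplicate callbacks are ignored.
    """
    params = parse_qs(body)
    sequence = int(params.get("SequenceNumber", ["0"])[0])
    if sequence <= state.get("last_sequence", -1):
        return None  # arrived out of order or already processed
    state["last_sequence"] = sequence

    stable = params.get("StableSpeechResult", [""])[0]
    unstable = params.get("UnstableSpeechResult", [""])[0]
    heard = f"{stable} {unstable}".lower()

    # Plain substring matching; a real application would likely do more.
    for phrase in URGENT_PHRASES:
        if phrase in heard:
            return phrase
    return None
```

Keeping the state per call (for example, keyed by call identifier) lets the handler ignore callbacks that arrive late.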

Multiple languages

Recognize 119 languages and dialects, with more coming soon, to support your global user base.
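
Selecting a recognition language is typically a matter of setting the language attribute on <Gather>. The following sketch generates that TwiML with Python's standard library. The build_gather helper is hypothetical, and es-MX is only an example tag, so confirm the supported codes in the language list.

```python
import xml.etree.ElementTree as ET


def build_gather(prompt: str, language: str = "en-US",
                 action: str = "/finalresult") -> str:
    """Return TwiML that prompts the caller and listens in `language`."""
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {"input": "speech", "action": action, "language": language},
    )
    # Speak the prompt in the same language the recognizer expects.
    say = ET.SubElement(gather, "Say", {"language": language})
    say.text = prompt
    return ET.tostring(response, encoding="unicode")


twiml = build_gather("Hola, ¿en qué puedo ayudarte?", language="es-MX")
```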

Use cases

Let customers speak naturally to navigate menus and provide information.

IVR

Turn nested phone trees into simple “what can I help you with” voice prompts.

Voice search

Allow customers to dial into your knowledge base and get the answers they need.

Form fills

Ask customers questions and capture their answers using ASR to fill out forms and qualify leads.
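
To make the IVR use case concrete, here is a minimal sketch of replacing a nested phone tree with keyword routing on the transcript. The departments, keywords, and /departments/ paths are all hypothetical; a production system would more likely use a proper intent classifier than substring matching.

```python
# Hypothetical departments and the keywords that map to them.
ROUTES = {
    "billing": ("bill", "invoice", "payment"),
    "support": ("broken", "error", "help"),
}


def route_intent(transcript: str) -> str:
    """Pick a department from a transcript, falling back to a human operator."""
    text = transcript.lower()
    for department, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return department
    return "operator"


def build_redirect(transcript: str) -> str:
    """Return TwiML that sends the call to the chosen department's handler."""
    department = route_intent(transcript)
    return f"<Response><Redirect>/departments/{department}</Redirect></Response>"
```

The same pattern extends to form fills: swap the routing table for the fields you want to capture and store each answer instead of redirecting.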

The Twilio difference


Reliability

Experience a 99.95% uptime SLA made possible with automated failover and zero maintenance windows.

Scalability

Extend the same app you write once to new markets with configurable features for localization and compliance.

Multichannel

Use the same platform you know for voice, SMS, video, chat, two-factor authentication, and more.

No shenanigans

Get to market faster with pay-as-you-go pricing, free support, and the freedom to scale up or down without contracts.


*Only phone_call model is available for premium
