Twilio ConversationRelay
New

Human-friendly voice AI that keeps customers from shouting "live agent!"

Easily integrate voice AI into your stack for smooth, personalized customer conversations—no complicated infrastructure or awkward AI moments.

Smiling woman speaking on the phone with a virtual agent interface overlay displayed.

How Twilio ConversationRelay works

Diagram showing integration of Twilio Voice with a ConversationRelay API connecting to an app with TTS and STT components.

Our scalable solution combines fast speech-to-text (STT) and text-to-speech (TTS) capabilities with the AI of your choice, seamlessly orchestrated through a WebSocket API.

  • Avoid long pauses or waiting for voice AI to finish

    Low latency keeps conversations flowing with quick responses, while interruption handling lets users jump in for instant answers.

  • Deliver natural voice, pacing, and intonation

    Provide customer interactions that sound like a real person, with the option to seamlessly transfer to a live agent for complex issues.

  • Add context for personalized support

    Enable smooth input/output with your live language model (LLM) so your AI agent can recognize customers and recall interactions.

Build AI support that understands your customers

Create customer experiences that are engaging, friendly, and always on point.

Deliver effortless self-service support

Enable context-aware, intelligent virtual agents that handle inquiries
efficiently—and know exactly when to bring in a human.

  • Handle routine inquiries while keeping customers engaged and frustration-free. 

  • Escalate complex or sensitive issues to live agents when needed.

  • Orchestrate customer data to provide personalized, context-rich interactions at scale.

Flowchart showing an incoming call, user data collection, virtual agent interaction, and sentiment analysis.

ConversationRelay features

Streamline complexity while enabling human-like interactions at scale.

A dashboard showing a virtual agent and sentiment analysis results indicating positive sentiment.

Sign up now

Measure and optimize AI virtual agent performance

ConversationRelay now integrates with Conversational Intelligence for natively supported AI agent observability. Transform unstructured conversational data into actionable insights that can be used to assess and enhance the performance of your voice AI agent.

  • LLM integration that works for you

    Get the flexibility to bring your own LLM so you can control your UX, manage costs, and adopt new tech as it's released.

  • Speech recognition STT

    Convert spoken words into text in real time to supply your LLM with accurate transcription for responsive conversations.

  • Natural human-sounding TTS

    Our TTS models analyze generated text to get pronunciation, intonation, and rhythm just right, like a real human agent. 

  • Interruption handling

    Utilize our proprietary orchestration algorithm to manage interruptions so you don’t have to handle them yourself.

  • Global connectivity

    Access flexible, secure connectivity that includes number provisioning porting compliance.

  • Low-latency infrastructure

    Minimize latency to improve the quality of voice AI interactions and ensure a better customer experience.

  • HIPAA-eligible framework

    We handle the complexities of voice within a HIPAA-eligible framework—so you can deploy AI models and build compliant solutions faster.

Start bringing your voice AI agent to life

Explore our comprehensive APIs, documentation, and other developer-friendly tools for creating custom voice AI solutions.

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Connect action="https://myhttpserver.com/connect_action">
    <ConversationRelay url="wss://mywebsocketserver.com/websocket" welcomeGreeting="Hi! Ask me anything!" />
  </Connect>
</Response>

Need help setting up ConversationRelay?

Work with one of our trusted partners to set up your voice AI solution and start delivering amazing engagement. View partners

Your AI powers the conversations. We handle the voice.

Start building with ConversationRelay today. Discover voice STT, TTS, and seamless GenAI orchestration—so you can focus on designing the smart, meaningful interactions your virtual agents need to deliver.

A smiling man holding a phone to his ear, wearing a dark jacket and green shirt with a red background.

ConversationRelay FAQ

Customers often face:

  • High Complexity: Managing real-time communications, websockets, and codecs.
  • Latency Issues: Balancing performance with user experience.
  • Integration Pain Points: Orchestrating TTS, STT, and LLM solutions while maintaining scalability.
  • ConversationRelay addresses these issues with a streamlined, ready-to-use platform that minimizes technical barriers.

Latency directly impacts the quality of voice AI interactions. High latency causes unnatural pauses and disruptions, which can frustrate customers and undermine trust. VoxRay is optimized to minimize latency, ensuring smooth, human-like conversations that are critical for high-stakes interactions in customer support and sales.

  • Best-of-breed providers integrated directly into the Twilio platform
  • Dedicated, single-tenant, customized infrastructure colocated with call and media edges.
  • Proprietary orchestration algorithms for handling interruptions, prefetching results, and batching text tokens.

ConversationRelay is a conversational AI product offering designed to make building production-quality voice AI agents easy. It simplifies the development process by integrating key components like Speech-to-Text (STT), Text-to-Speech (TTS), and Large Language Model (LLM) orchestration. Unlike Media Streams, which requires customers to manage their own media servers, orchestration, and integrations, ConversationRelay provides a ready-to-use websocket interface with lower latency and greater control, making it easier to build and scale voice AI solutions.

Tex-to-Speech Providers

  • Google Voices
  • Amazon Voice
  • ElevenLabs Voices

Automated Speech Recognition

  • Google Speech API
  • Amazon Speech
  • DeepGram

ConversationRelay provides pre-configured packages and APIs that simplify the setup process, allowing customers to focus on their AI models and user experiences instead of dealing with the underlying infrastructure. These quickstart configurations are tailored to common use cases, enabling faster time-to-value.

It depends on which provider options are selected.

  • Regionalized: Amazon, Google
  • US1: Deepgram
  • Conversation relay is now HIPAA eligible 
  • We are currently evaluating PCI compliance