Build Multimodal Conversational AI Experiences with Twilio and the OpenAI Realtime API

August 28, 2025

Today, we’re celebrating with our friends at OpenAI as they bring their Realtime API, powered by their flagship multilingual and multimodal GPT Realtime model, to general availability. With this milestone, we’re excited to see the innovative Voice AI experiences you can create using OpenAI and Twilio together.

OpenAI’s Realtime API reduces latency, delivers advanced analysis of tone and pitch, and quickly enables sophisticated conversational features like pacing, interruption handling, and turn-taking. These capabilities provide richer context for understanding sentiment, emotional undertones, and even sarcasm – elevating the quality of every conversation. 

Voice AI continues to gain traction as technologies advance, helping developers create increasingly natural and human-like virtual agent experiences. At Twilio, we’re obsessed with helping you deliver exceptional voice experiences for your customers at scale. Advancements in multimodal models such as gpt-realtime, OpenAI's most capable speech-to-speech model, together with the Realtime API, represent an exciting shift in the Voice AI space and will enable organizations to build richer customer experiences. We’re excited to see what you’ll build, and that’s why we’re thrilled to release tutorials, sample applications, and a new TwiML noun so you can take advantage of OpenAI’s Realtime API GA release with Twilio today!

Integrating Twilio's APIs and OpenAI's Realtime API

Ready to dive in?

Here are all of the tutorials, sample apps, and repositories we have available that show how to build with Twilio APIs and the OpenAI Realtime API:

Tutorials

In these tutorials, learn how to integrate OpenAI's Realtime API with Twilio Voice to build a GenAI-powered virtual agent using Media Streams.
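To make the Media Streams piece concrete, here is a minimal sketch of the TwiML a voice webhook might return to hand call audio to your own WebSocket server, which would then relay it to the Realtime API. The helper name `build_stream_twiml` and the `wss://example.com/media-stream` URL are placeholders chosen for illustration, not names from the tutorials; the `<Connect>` verb and `<Stream>` noun are the standard TwiML for a bidirectional Media Stream.

```python
from xml.etree import ElementTree as ET


def build_stream_twiml(websocket_url: str) -> str:
    """Return TwiML that connects the call to a bidirectional Media Stream.

    `websocket_url` is a placeholder for your own wss:// endpoint.
    """
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    # <Stream> tells Twilio where to open the WebSocket for call audio.
    ET.SubElement(connect, "Stream", {"url": websocket_url})
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


if __name__ == "__main__":
    print(build_stream_twiml("wss://example.com/media-stream"))
```

In a real application you would return this string from the webhook configured on your Twilio phone number, with a `text/xml` content type.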

Sample apps

Over on Code Exchange, we have two sample applications that use the OpenAI Realtime API: our Flex integration, which demonstrates how the Realtime API might be used to translate between a caller and an agent, and a sample app that wraps our Voice and Media Streams demo.

Integration repos

You can access our repos demonstrating the integrations directly here:

AiSession

  • We are hard at work on a new TwiML feature called AiSession (currently in pilot) that simplifies your connection to the OpenAI Realtime API, offloading the orchestration of some components needed for multimodal conversational AI experiences, such as the WebSocket server and interruption handling.

If you’re interested in joining the pilot, learn more about AiSession here.
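For context on what that orchestration involves when you run your own WebSocket server with Media Streams, here is a rough sketch of the relay and interruption logic. It is not how AiSession works internally. The function name `to_twilio_messages` is hypothetical, the Realtime API event names are assumptions that you should check against the API version you target, and the sketch assumes the Realtime session is already configured to emit audio in the format Twilio expects (base64-encoded 8 kHz mu-law).

```python
import json
from typing import Any

# Assumed Realtime API event names. The audio delta event has been named
# differently across API versions, so both spellings are accepted here.
AUDIO_DELTA_EVENTS = {"response.audio.delta", "response.output_audio.delta"}
SPEECH_STARTED_EVENT = "input_audio_buffer.speech_started"


def to_twilio_messages(openai_event: dict[str, Any], stream_sid: str) -> list[str]:
    """Translate one Realtime API server event into Twilio Media Streams messages."""
    event_type = openai_event.get("type")

    if event_type in AUDIO_DELTA_EVENTS:
        # Forward model audio to the caller as a Media Streams "media" message.
        message = {
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": openai_event["delta"]},
        }
        return [json.dumps(message)]

    if event_type == SPEECH_STARTED_EVENT:
        # The caller started talking: ask Twilio to drop any buffered audio
        # so the agent stops speaking (a simple form of interruption handling).
        return [json.dumps({"event": "clear", "streamSid": stream_sid})]

    # Every other event is ignored in this sketch.
    return []
```

A server loop would call this for each event received from the Realtime API and write the returned strings to the Twilio WebSocket. A fuller implementation would also truncate the in-progress response on the OpenAI side so the model's context matches what the caller actually heard.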

Get started today

We're thrilled to be on this journey with OpenAI, helping to unlock the full potential of this technology for our customers. AI virtual agents, especially when personalized, deliver much better customer experiences, increased efficiencies for business, and better outcomes.

The world of conversational AI is moving at an incredible pace, and this is just the beginning. We can’t wait to see what you build with Twilio and OpenAI!