Live Translation with OpenAI's Realtime API

Created by: Twilio

Flex
JavaScript

This application demonstrates how to use Twilio and OpenAI's Realtime API for bidirectional voice language translation between a caller and a contact center agent.

The AI Assistant intercepts voice audio from one party, translates it, and speaks the translation in the other party's preferred language. OpenAI's Realtime API significantly reduces latency, which makes a natural two-way voice conversation possible.
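As a rough illustration of the translation step, each direction of the conversation could be given its own Realtime session with instructions to translate only. The sketch below is not taken from the sample's code: `buildTranslationSession` is a hypothetical helper, the prompt wording is invented, and the `session.update` fields follow the beta Realtime API event shape, which may differ in newer API versions. The `g711_ulaw` format is assumed because Twilio Media Streams carries 8 kHz mu-law audio.

```javascript
// Hypothetical helper (not from the sample): builds a Realtime `session.update`
// client event for one translation direction.
function buildTranslationSession(sourceLanguage, targetLanguage) {
  return {
    type: 'session.update',
    session: {
      // Assumed prompt wording; the sample's actual prompt may differ.
      instructions:
        `You are a translator. Translate everything you hear from ${sourceLanguage} ` +
        `into ${targetLanguage}. Speak only the translation and never answer on your own behalf.`,
      // Twilio Media Streams audio is 8 kHz mu-law, so no transcoding is needed.
      input_audio_format: 'g711_ulaw',
      output_audio_format: 'g711_ulaw',
    },
  };
}

// One session per direction: caller -> agent and agent -> caller.
const callerToAgent = buildTranslationSession('Spanish', 'English');
const agentToCaller = buildTranslationSession('English', 'Spanish');
```

Keeping one session per direction means each model instance only ever hears one speaker, which avoids having to detect who is talking.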

A video demo shows the real-time translation app in action.

Below is a high-level architecture diagram of how this application works:

Architecture of Live Translation with OpenAI's Realtime API in Twilio Flex.

This application uses the following Twilio products in conjunction with OpenAI's Realtime API, orchestrated by this middleware application:

Voice
Studio
Flex
TaskRouter

Two separate Voice calls are initiated, both proxied by this middleware service. The caller is asked to choose a preferred language, and the conversation is then queued for the next available agent in Twilio Flex. Once the agent is connected, the middleware intercepts the audio from both parties via Media Streams and forwards it to OpenAI's Realtime API for translation. The translated audio is then forwarded to the other party.
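The audio relay described above can be sketched as two small message-mapping functions. This is a minimal sketch, not the sample's actual implementation: `twilioToRealtime` and `realtimeToTwilio` are hypothetical names, and the Realtime event names (`input_audio_buffer.append`, `response.audio.delta`) are those of the beta Realtime API and may be named differently in other API versions. The Twilio side uses the Media Streams `media` message with a base64 `payload`.

```javascript
// Hypothetical helper: converts one Twilio Media Streams WebSocket message
// into a Realtime client event. Returns null for non-audio messages
// (for example "connected", "start", or "stop").
function twilioToRealtime(rawMessage) {
  const msg = JSON.parse(rawMessage);
  if (msg.event !== 'media') return null;
  // Both sides carry base64-encoded audio, so the payload passes through as-is.
  return { type: 'input_audio_buffer.append', audio: msg.media.payload };
}

// Hypothetical helper: converts one Realtime server event into a Media Streams
// message addressed to the OTHER party's stream. Returns null for events that
// carry no translated audio.
function realtimeToTwilio(rawEvent, targetStreamSid) {
  const evt = JSON.parse(rawEvent);
  if (evt.type !== 'response.audio.delta') return null;
  return {
    event: 'media',
    streamSid: targetStreamSid,
    media: { payload: evt.delta },
  };
}
```

In a running middleware, each function would sit inside a WebSocket `message` handler, with the non-null result serialized via `JSON.stringify` and sent on the opposite socket.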


Currently available only in JavaScript.

Get the code for this project
The code for this sample is available on GitHub to view.
Get Twilio credentials
You will need an Account SID and Auth Token in order to run this code.
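Twilio's REST API uses HTTP Basic authentication, with the Account SID as the username and the Auth Token as the password. The sketch below shows one way to read the two values and build the header; `buildTwilioAuthHeader` is a hypothetical helper, and the variable names `TWILIO_ACCOUNT_SID` and `TWILIO_AUTH_TOKEN` are a common convention assumed here, so check the README for the names the sample actually expects.

```javascript
// Hypothetical helper: builds an HTTP Basic Authorization header value from
// Twilio credentials. The environment variable names are assumptions.
function buildTwilioAuthHeader(env = process.env) {
  const { TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN } = env;
  if (!TWILIO_ACCOUNT_SID || !TWILIO_AUTH_TOKEN) {
    throw new Error('TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN must both be set');
  }
  // Basic auth: base64("username:password") with the SID as the username.
  const encoded = Buffer.from(`${TWILIO_ACCOUNT_SID}:${TWILIO_AUTH_TOKEN}`).toString('base64');
  return `Basic ${encoded}`;
}
```

Reading the credentials from the environment keeps them out of source control; failing fast when one is missing avoids confusing authentication errors later.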
Set up the code sample locally
Follow the setup instructions in the README to get the sample up and running.