Add Voice AI to your website with the Twilio Voice JavaScript SDK and ConversationRelay

July 14, 2025
Written by
Reviewed by
Zack Pitts
Twilion
Paul Kamp
Twilion

Spin up a Demo Application that shows you how you can deploy talk-to-AI-agent buttons on your websites and mobile applications.

Voice is quickly becoming a primary interface for applications using Generative Artificial Intelligence. We at Twilio are seeing massive demand for Voice AI Agents in most corporate verticals.

That’s why we built ConversationRelay, which helps you build real-time, human-like voice experiences using the LLM of your choice on any channel: WebRTC, PSTN, or SIP. Twilio has been a leader in voice communications for over a decade, powering applications in everything from the browser to the call center to the PSTN – and now, powering AI-driven conversations.

In this demo, I’ll show you how to build a simple web interface with a TALK-TO-AGENT button that utilizes Twilio’s Voice JavaScript SDK. Calls from a web interface connect to Twilio ConversationRelay, which combines fast speech-to-text (STT) and text-to-speech (TTS) capabilities with the OpenAI API or your choice of model on Amazon Bedrock. Additionally, you’ll be able to try out multi-channel capabilities with Twilio Messaging and emails with Twilio SendGrid.

Sound good? Well, you make the call when you talk to the agent. Let’s get started.

Diagram showing integration of AI agents with Twilio and business applications.
Diagram showing voice communication process from users to application via Twilio's Programmable Voice and SDKs.

One of the key benefits of our ConversationRelay Voice AI solution is the range of choices it gives you.

Twilio handles the things we are good at: voice channels (WebRTC, PSTN, SIP), scale, latency, orchestration, and direct relationships with speech-to-text and text-to-speech providers. We leave the important business decisions to the people building on our platform: the AI provider, LLM interaction, choice of STT and TTS providers, and the overall “agentic” or “conversational” experiences (applications) that you want to build.

Want to see what we are building?

Here is a diagram of the components of this demo application, along with a high-level sequence that shows how Twilio can connect a WebRTC client to your application and the LLM provider of your choice:

Diagram of a demo application architecture using React, Express, and a datastore with notes on setup and usage steps.

A web or mobile application using the Twilio Voice JavaScript SDK needs to get an authentication token, and then it can place a call into Twilio Programmable Voice where you can unlock the magic of the Twilio Platform!
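To make that flow concrete, here is a minimal sketch of the browser side. It assumes a hypothetical `/api/token` endpoint on your own server that returns JSON shaped like `{ "token": "..." }`; the demo's real route and response shape may differ. `Device` is the class exported by the Twilio Voice JavaScript SDK (`@twilio/voice-sdk`), passed in as a parameter here so the sketch can also run outside a browser.

```javascript
// Sketch only: `Device` stands in for the class exported by @twilio/voice-sdk,
// and `/api/token` is a hypothetical endpoint on your own server.
async function startAgentCall({ Device, fetchFn, tokenUrl = "/api/token" }) {
  // 1. Ask your backend for a short-lived access token.
  const response = await fetchFn(tokenUrl);
  if (!response.ok) {
    throw new Error(`Token request failed with status ${response.status}`);
  }
  const { token } = await response.json();

  // 2. Create a Device with the token, then place the outbound call.
  //    Twilio routes the call using the TwiML App granted in the token.
  const device = new Device(token);
  const call = await device.connect();

  // 3. Log when the call ends so the UI can reset its button.
  call.on("disconnect", () => console.log("Call ended"));

  return { device, call };
}
```

In a browser you would import `Device` from `@twilio/voice-sdk` and call `startAgentCall({ Device, fetchFn: fetch })`, typically from the click handler of your button.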

The Demo Application is NOT production-ready. It is meant to be straightforward to deploy and to run on your local machine. The Demo Application can point to OpenAI or Amazon Bedrock (and it’s extendable to other LLM providers!), and comes with a few existing AI use cases and the ability to add your own.

Let’s get started…


Prerequisites

This is not a beginner-level build! You should be comfortable with basic programming and command-line skills to spin up this demo application.

Let’s Build it!

1. Download the Code for this Application

Download the code from this repo, and then open up the folder in your preferred development environment.

GitHub repository page showing the main branch with initial commit files for ConversationRelay-WebRTC-Demo

The repo is divided into three folders:

  • /client ⇒ Contains a React Single Page Application (SPA) with the code needed to show a button on your website that initiates a call. The client also contains tools to help you build and experiment.
  • /data ⇒ We use a simple file system JSON object database for the Demo Application. You can certainly choose your own database for your application, but this simple solution will allow you to play with the functionality.
  • /server ⇒ We chose Express.js to power the backend of our Demo Application. Express is written in familiar Node.js and can easily deploy and run locally. It handles REST APIs and WebSockets well and is straightforward to follow. It should be very approachable for anyone interested in experimenting and building POCs!

2. Build the Client and the Server

From the root directory of the project, run the following:

$ cd client
$ npm install
$ npm run build

This will install all of the Node.js dependencies and build a complete version of the client application (Single Page Application). The Express.js app, in the /server directory, uses the client application built in this step.

3. Start ngrok

In order for this demo application to work locally, Twilio needs to be able to communicate with your local machine. Ngrok is a good option for this task, but you can use any alternative that does the same thing.

We will assume that you are using ngrok. Open a new terminal window and start ngrok on your local machine, pointing it to port 3000 with `ngrok http 3000`. Copy down your URL – you will need it later!

Terminal window showing a command to start Ngrok with a specified URL.

4. Create a Twilio API Key

The Twilio Voice JavaScript SDK requires a Twilio API Key. From your Twilio Console, find Admin in the upper right corner and then select Account management.

Dropdown menu with options for account billing, account management, trust hub, admin, and nonprofit benefits.

On the next screen, select API keys & tokens in the left column. Create a Standard API Key.

Be sure to take note of API Key and Secret! Twilio will only show you the secret once.

5. Create a Twilio TwiML App

The Twilio Voice JavaScript SDK needs to point to a TwiML app so that your client application knows where to get instructions to handle inbound calls.

From your Twilio Console, find Voice in the left column under Develop. If Voice is not in your left column, you can add it by clicking on Explore Products at the bottom of the left column.

Under Voice, select Manage → TwiML apps:

Screenshot of Twilio Voice console showing options like Overview, Try it out, and Manage with TwiML apps highlighted.

Select Create new TwiML App and enter a friendly name. Then, in Request URL under Voice Configuration, enter your ngrok URL copied in step 3 with the path /twiml appended. It should look like this:

Screenshot of the 'Create new TwiML App' page with fields for Friendly Name, Voice URL, and Messaging URL

Enter in the following information, and be sure to change these fields for your local setup (and taste):

  • Friendly Name: Your Awesome Twilio WebRTC Demo Application
  • Voice Configuration
      • Request URL: https://<your-ngrok-domain>/twiml

Click Create.

Be sure to copy the TwiML App SID from your newly created TwiML App because you will need it shortly.

Twilio console displaying TwiML app SID and friendly name.

Your client application references this TwiML App when placing a call. The TwiML App directs calls placed from your client application (hosted on your local machine) to a URL of your choice. In the case of this app, the destination URL is also on your local machine.

Your local server (the Express.js application) accepts the inbound POST from Twilio and returns the TwiML needed to spin up a ConversationRelay session.
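As a rough illustration of what that TwiML can look like, the sketch below builds a `<Connect><ConversationRelay>` response as a string. The `url` and `welcomeGreeting` attributes belong to the ConversationRelay TwiML noun; the WebSocket path `/conversation-relay`, the example domain, and the greeting text are assumptions made for this example, and the demo's own handler may set different values and more attributes.

```javascript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXmlAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build TwiML that hands the call to ConversationRelay.
// `wsPath` is a hypothetical WebSocket route on your server.
function buildConversationRelayTwiml({
  ngrokDomain,
  wsPath = "/conversation-relay",
  welcomeGreeting,
}) {
  // ConversationRelay connects back to your server over a secure WebSocket.
  const wsUrl = `wss://${ngrokDomain}${wsPath}`;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <ConversationRelay url="${escapeXmlAttr(wsUrl)}" welcomeGreeting="${escapeXmlAttr(welcomeGreeting)}" />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}

// Example: print the TwiML for a placeholder tunnel domain.
console.log(
  buildConversationRelayTwiml({
    ngrokDomain: "example.ngrok.app",
    welcomeGreeting: "Hello! How can I help you today?",
  })
);
```

An Express route for /twiml could send this string back with a `text/xml` content type.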

6. Configure your environment variables

All of the environment variables for this Demo Application are stored in a single “shell” script. You need to edit this shell script with your environment variables. Start by doing the following from a command prompt in the root directory of your project.

$ cd server
$ npm install
$ cp start-local-server.sh.sample start-local-server.sh

Open the file ‘start-local-server.sh’ and you will see the following:

Code snippet showing export commands for configuring Twilio, NGROK, OpenAI, and AWS environment variables.

In the # TWILIO ENVIRONMENT VARIABLES section, enter your Twilio Account SID, the Twilio API Key details from step 4, and the TwiML App SID from step 5.

In the # NGROK ENVIRONMENT VARIABLES section, enter the domain of your ngrok (or similar) tunnel to your local machine.

In the # AI PLATFORM SELECTION section, you can specify whether you want your application to call AWS Bedrock (invokeBedrock) or OpenAI (invokeOpenAI) for LLM processing. Whatever choice you make, you will need to have the necessary account and API credentials for the chosen platform and you will need to fill out the details in the corresponding section: respectively, the OPENAI ENVIRONMENT VARIABLES for OpenAI, and AWS ENVIRONMENT VARIABLES for AWS Bedrock. You can leave the default values for the AI Platform that you do not choose.

The STACK_USE_CASE variable is the default use case. You can change this as needed here – but also know that switching use cases is easy to do once you have the UI.
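A missing variable is an easy mistake to make in this step, so a small startup check can save some debugging. The sketch below is not part of the demo. Apart from `STACK_USE_CASE` and the `invokeOpenAI` / `invokeBedrock` values, every variable name in it (including `AI_PLATFORM`) is a hypothetical placeholder; substitute the names that appear in your copy of start-local-server.sh.

```javascript
// All names except STACK_USE_CASE are hypothetical placeholders.
const ALWAYS_REQUIRED = [
  "TWILIO_ACCOUNT_SID",
  "TWILIO_API_KEY",
  "TWILIO_API_SECRET",
  "TWILIO_TWIML_APP_SID",
  "NGROK_DOMAIN",
  "STACK_USE_CASE",
];

// Extra variables needed only for the selected AI platform (also placeholders).
const PLATFORM_REQUIRED = {
  invokeOpenAI: ["OPENAI_API_KEY"],
  invokeBedrock: ["AWS_REGION"],
};

// Return a list of human-readable problems; an empty list means the env looks complete.
function findEnvProblems(env) {
  const problems = [];
  const platform = env.AI_PLATFORM;
  const knownPlatform = Object.prototype.hasOwnProperty.call(PLATFORM_REQUIRED, platform);
  if (!knownPlatform) {
    problems.push(`AI_PLATFORM must be one of: ${Object.keys(PLATFORM_REQUIRED).join(", ")}`);
  }
  const required = knownPlatform
    ? [...ALWAYS_REQUIRED, ...PLATFORM_REQUIRED[platform]]
    : ALWAYS_REQUIRED;
  for (const name of required) {
    if (typeof env[name] !== "string" || env[name].trim() === "") {
      problems.push(`${name} is not set`);
    }
  }
  return problems;
}
```

Calling `findEnvProblems(process.env)` at the top of the server entry point and logging the result would surface gaps before the first call is placed.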

6a. (Optional) Configure Twilio Messaging and SendGrid environment variables

In the # TWILIO SEND MESSAGE ENVIRONMENT VARIABLES section, optionally enter Twilio account credentials and a phone number that is verified to send SMS messages. This will allow you to send messages (SMS) via tool calls. This could be the same Twilio account as the first variable block. You can add in Messaging Capabilities later if you just want to get up and running.

In the # TWILIO SENDGRID EMAILS ENVIRONMENT VARIABLES section, optionally enter a Twilio SendGrid API Key and verified FROM email address. This will allow you to send emails via tool calls. You can add in Email Capabilities later if you just want to get up and running.

7. Make a copy of the users.json file

The app allows you to set and save settings for your local user. Create a copy of the sample file for your local use.

From the root directory:

$ cp data/users.json.sample data/users.json

8. Start the Server and open a browser

With those settings in place, you are ready to fire it up! Start the server by executing the shell script you edited in step 6.

From the root directory:

$ cd server
$ ./start-local-server.sh

Your local server should start up at http://localhost:3000. Let’s start using it!

Start talking to your Voice AI Agents

Phew, you’re done loading the app – now it’s time for the fun part, using it. I’ll walk you through the features you’ve now got at your fingertips (or your ears).

Use your new TALK-TO-AGENT button

From your web browser pointed to http://localhost:3000, click on the Talk to ConversationRelay Agent button to kick things off!

Webpage with ConversationRelay WebRTC Quickstart and a button labeled Talk to ConversationRelay Agent

By default, the application loads an Albert Einstein use case. Try asking a few questions to get a feel for the responsiveness and cadence. You will notice the conversation audio visualized as you and your agent converse. The conversation transcription runs in real time and even shows interruptions, and latency metrics can be toggled on and off.
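Behind that live transcript, ConversationRelay and your server exchange JSON messages over a WebSocket. The sketch below shows one way to handle three of the incoming message types (`setup`, `prompt`, and `interrupt`) and reply with a `text` message. It is a simplified illustration rather than the demo's actual handler: `generateReply` is a hypothetical stand-in for your LLM call, and the session object is an assumption of this example.

```javascript
// Per-call state; a real server would keep one of these per WebSocket connection.
function createSession() {
  return { callSid: null, history: [] };
}

// Handle one JSON message from ConversationRelay and return the messages to send back.
// `generateReply` is a hypothetical async function that wraps your LLM of choice.
async function handleRelayMessage(rawMessage, session, generateReply) {
  const message = JSON.parse(rawMessage);
  switch (message.type) {
    case "setup":
      // First message on the socket: remember which call this is.
      session.callSid = message.callSid;
      return [];
    case "prompt": {
      // The caller finished speaking; `voicePrompt` holds the transcribed text.
      session.history.push({ role: "user", content: message.voicePrompt });
      const reply = await generateReply(session.history);
      session.history.push({ role: "assistant", content: reply });
      // A `text` message is spoken to the caller; `last: true` ends the turn.
      return [{ type: "text", token: reply, last: true }];
    }
    case "interrupt": {
      // The caller spoke over the agent: keep only what was actually heard.
      const lastTurn = session.history[session.history.length - 1];
      if (lastTurn && lastTurn.role === "assistant") {
        lastTurn.content = message.utteranceUntilInterrupt;
      }
      return [];
    }
    default:
      // Ignore message types this sketch does not cover.
      return [];
  }
}
```

Trimming the stored reply on `interrupt` keeps the conversation history aligned with what the caller really heard, which helps the next LLM turn stay coherent.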

Try a different pre-loaded Use Case or build your own

Talking about Albert Einstein with a voice AI Agent is exhilarating – but you’ll probably want to keep exploring what agents can do when you’re Einsteined out.

The demo application comes with pre-loaded use cases. From the UI, select a different use case, and save your configuration to launch a different experience:

Screenshot of a dropdown menu with options for different demo configurations like Albert Einstein Guide.

Click on the Use Cases tab to review the pre-configured use cases and then clone a use case to build your own experience. A working POC is within your reach!

Navigation bar featuring Demo, Use Cases, and Call History options with red arrow pointing to Use Cases.

Try different Text-to-Speech Voices

The voices that you choose for your voice AI Applications are a central component of the user experience that your business provides to your customers. ConversationRelay has relationships with top text-to-speech providers and allows you to choose – and you definitely should sample and test many voices to get this element correct.

This application has preloaded English voices across three providers (Google, Amazon, and ElevenLabs). The available voices are increasing rapidly. Refer to the Twilio Docs for the latest voices.

It is straightforward to add or edit the available voices. Open the JSON file located at /data/tts-providers.json and follow the convention to add the voices and languages required for your use cases.

Call your new application using the PSTN

As diagrammed in Image 2 earlier in this blog post, ConversationRelay works with Twilio Voice and you can connect to Twilio Programmable Voice via WebRTC, SIP, and the PSTN.

Here is how to connect to the PSTN.

From the Phone Numbers -> Manage -> Active numbers page in your Twilio Console, select the number that you want to use, set A call comes in to Webhook, and then enter your ngrok URL as shown below:

Screenshot of Twilio Voice Configuration page showing the webhook URL setup with an arrow pointing to the URL.

Give the phone number a call and it will connect to your ConversationRelay app, where you can chat with an AI Agent trained (well, prompted) on the use case you picked!

Note that it will select the default use case – you can add a user in /data/users.json if you want to control the experience for specific phone numbers calling into your application.
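The idea of per-caller configuration can be sketched as a simple lookup keyed on the caller's number. The record shape used here (`phone` and `useCase` fields) and the use case names are hypothetical; check data/users.json.sample for the fields the demo really uses.

```javascript
// Strip spaces, dashes, and parentheses so "+1 (555) 010-0000" matches "+15550100000".
function normalizePhone(value) {
  return String(value || "").replace(/[^\d+]/g, "");
}

// Pick the use case for an inbound caller, falling back to the default.
// Hypothetical record shape: { phone: "+15550100000", useCase: "some-use-case" }.
function resolveUseCase(users, callerNumber, defaultUseCase) {
  const caller = normalizePhone(callerNumber);
  const match = users.find((user) => normalizePhone(user.phone) === caller);
  return match && match.useCase ? match.useCase : defaultUseCase;
}

// Example with made-up records and use case names.
const exampleUsers = [{ phone: "+15550100000", useCase: "custom-use-case" }];
console.log(resolveUseCase(exampleUsers, "+1 (555) 010-0000", "default-use-case"));
console.log(resolveUseCase(exampleUsers, "+15550199999", "default-use-case"));
```

Twilio sends the caller's number in the `From` parameter of the inbound webhook, so that value is a natural input for a lookup like this.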

You can create a similar setup using Twilio SIP Domains.

Tool Calling

Agentic and conversational AI applications ultimately need to use tool calling to be able to accomplish real work. Review the code in server/lib/tools to get a sense of how tool calling works in this application.

Note that tools need to be configured in each use case and there are examples that you can follow in the tool-calling-example directory.
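If tool calling is new to you, the general pattern is a registry of named functions plus a dispatcher that runs whichever one the LLM asks for. The sketch below illustrates that pattern only. The tool name, its stubbed behavior, and the `{ name, arguments }` call shape (with `arguments` as a JSON string) are assumptions for this example, and the code in server/lib/tools may be organized differently.

```javascript
// Registry of tools the LLM may call. The tool here is a hypothetical stub
// that only echoes its input; it does not send anything.
const toolRegistry = {
  sendSms: async ({ to, body }) => ({ status: "stubbed", to, body }),
};

// Run one tool call and always return a result object the LLM can read,
// even when the tool is unknown, the arguments are malformed, or the tool throws.
async function runToolCall(toolCall, registry = toolRegistry) {
  const known = Object.prototype.hasOwnProperty.call(registry, toolCall.name);
  if (!known) {
    return { ok: false, error: `Unknown tool: ${toolCall.name}` };
  }

  let args;
  try {
    args = JSON.parse(toolCall.arguments || "{}");
  } catch {
    return { ok: false, error: "Tool arguments were not valid JSON" };
  }

  try {
    return { ok: true, result: await registry[toolCall.name](args) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}
```

Returning errors as data instead of throwing lets the agent tell the caller that something went wrong and keep the conversation going.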

This is a Demo Application and not intended for production use! It is intended to show you how awesome it is to use Twilio to build WebRTC applications that connect to AI-backed voice applications. As shown in the section above, you aren’t limited to a single channel: it can also be used with PSTN and SIP.

ConversationRelay and WebRTC: Not just for JavaScript

This post has focused on adding AI-powered voice agent capabilities to your website, but this same concept applies to mobile applications. Check out our Voice Android and iOS SDKs to accomplish similar functionality in your mobile apps, with the same TwiML application.

Conclusion

In this blog post, we covered how you can build Voice-AI-backed applications using Twilio ConversationRelay and Twilio Programmable Voice to handle the elements that Twilio is good at while leaving important elements that differentiate your business up to you. You can be confident that your application can be reliably connected to the key voice channels – WebRTC, SIP, and the PSTN – via Twilio’s industry-leading communications platform.

Dan Bartlett has been building web applications since the first dotcom wave. The core principles from those days remain the same but these days you can build cooler things faster. He can be reached at dbartlett [at] twilio.com.

Charlie Avila has been guiding customers to enhanced CX solutions that unlock improved efficiency, cost benefits and customer satisfaction… He can be reached at cavila [at] twilio.com.

Ben Johnstone is a Principal Solutions Engineer at Twilio, based in Toronto. He works with enterprise retail customers to design and deliver scalable communications solutions, drawing on experience in cloud platforms, contact center technologies and complex system integrations. He can be reached at bjohnstone [at] twilio.com.