Integrate ElevenLabs Voices with Twilio's ConversationRelay

May 29, 2025

Twilio’s ConversationRelay service allows you to integrate Speech-to-Text (STT), Text-to-Speech (TTS) and AI Large Language Models (LLMs) with your Twilio Voice Applications. It handles the complexities of synchronous voice calls so that you can focus on processing conversational AI logic in the backend. With this setup you can create AI chatbots or personalized assistants that can respond to customer queries through natural voice interactions without having to connect with an agent.

For a high-level overview of ConversationRelay, take a look at this Twilio Blog post: Ride the AI Wave with ConversationRelay: Effortless Voice AI, Made Human.

ConversationRelay was recently updated to include ElevenLabs as a Text-to-Speech provider. With over 1000 voices to choose from, you can deliver and customize expressive, human-like voices that feel natural and engaging. You can find a list of ElevenLabs voices available with ConversationRelay here.

In this tutorial, you’ll learn how to integrate and customize ElevenLabs voices with ConversationRelay in Node.js.

Prerequisites

Set up ConversationRelay Service

Clone Quickstart Repo

To speed things up, we’ll build on a ConversationRelay demo application that you can find here: Voice Assistant with Twilio and Open AI (Node.js) GitHub repository.

If you're curious about how this demo was built, I highly recommend the tutorial it's based on: Integrate OpenAI with Twilio Voice Using ConversationRelay blog post.

To clone this repo, open your terminal/shell, navigate to your preferred directory and paste the following command:

git clone https://github.com/robinske/cr-demo.git

Run Ngrok Tunnel

Your ConversationRelay demo application spins up a WebSocket server using Fastify, which handles communication between your application and Twilio. However, this server runs only on your computer, so it isn't publicly accessible. ngrok connects your Fastify server to the internet by generating a public URL that tunnels all requests to your local server.

Open a new tab in your terminal and execute the following command to open a tunnel to port 8080 (the port your server will listen on):

ngrok http 8080

After executing the command, your terminal should look like the following:

Terminal screenshot showing ngrok tunnel with Forwarding URL

Copy the Forwarding URL as you’ll need it for the next section.

Import Environment Variables

Open the /cr-demo project directory in VS Code (or your preferred IDE) and navigate to the .env.example file. Rename this file (by right-clicking and selecting Rename) to .env. Replace the OPENAI_API_KEY placeholder with your actual API key and replace the NGROK_URL placeholder with the Forwarding URL from the previous section. After pasting the URL, remove the https:// prefix, as the application doesn't need it.
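To make the role of NGROK_URL concrete, here is a minimal sketch of how a host-only value could be turned into the WebSocket URL that Twilio connects to. The helper names toHost and buildWsUrl are hypothetical, and the wss:// scheme is an assumption; only the /ws route comes from the demo described in this tutorial.

```javascript
// Hypothetical helper: normalize the ngrok value stored in .env.
function toHost(forwardingUrl) {
  return forwardingUrl
    .trim()
    .replace(/^[a-z]+:\/\//i, "") // drop https:// (or any scheme) if it was left in
    .replace(/\/+$/, ""); // drop trailing slashes
}

// Assumption: the app serves its WebSocket on /ws and Twilio connects over wss.
function buildWsUrl(ngrokUrl) {
  return `wss://${toHost(ngrokUrl)}/ws`;
}

console.log(buildWsUrl("https://example.ngrok.app/"));
```

Stripping the scheme in code is only a safety net for a value pasted with https:// still attached; the demo itself expects the bare host in .env.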

Now that your environment variables are set, let's add an ElevenLabs voice to your ConversationRelay application.

Integrating ElevenLabs

Within your project directory, navigate to the server.js file. This file handles the core communication setup between your phone call, Twilio, and the WebSocket it will set up.

This file defines a /twiml route that returns TwiML instructions for Twilio to initiate a WebSocket connection via the /ws route. Through this WebSocket, messages are exchanged between Twilio and your app which will enable both receiving input and sending text responses for Text-to-Speech.
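The exchange over the WebSocket can be sketched as a small message handler. This is an illustration rather than the demo's actual code: the message shapes ("setup", "prompt" with voicePrompt, "interrupt", and the outgoing "text" message) are assumptions based on the ConversationRelay WebSocket protocol, and generateReply is a hypothetical stand-in for the LLM call.

```javascript
// Sketch of a handler for messages arriving on the /ws route.
// Returns the JSON string to send back to Twilio, or null when nothing is sent.
async function handleRelayMessage(raw, generateReply) {
  const message = JSON.parse(raw);

  switch (message.type) {
    case "setup":
      // Sent once when the call connects; nothing to speak yet.
      return null;
    case "prompt": {
      // Assumption: voicePrompt carries the caller's transcribed speech.
      const reply = await generateReply(message.voicePrompt);
      // A "text" message asks ConversationRelay to speak the reply via TTS.
      return JSON.stringify({ type: "text", token: reply, last: true });
    }
    case "interrupt":
      // The caller spoke over the assistant; no response is needed here.
      return null;
    default:
      return null;
  }
}

// Usage with a canned reply instead of a real model:
handleRelayMessage(
  JSON.stringify({ type: "prompt", voicePrompt: "What are your hours?" }),
  async (text) => `You asked: ${text}`
).then((out) => console.log(out));
```

Keeping the handler free of any socket code makes it easy to exercise without placing a call; in a real server it would be invoked from the WebSocket's message event.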

Locate lines 27 to 32 where you’ll see the following TwiML instructions that will connect the phone call to the ConversationRelay service:

<?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Connect>
        <ConversationRelay url="${WS_URL}" welcomeGreeting="${WELCOME_GREETING}" />
      </Connect>
    </Response>

Line 4 in the above code snippet is where you’ll adjust the TwiML to add an ElevenLabs voice to your service. Before we add ElevenLabs as a TTS provider for your service, let's choose a voice and customize it.

Customizing your ElevenLabs Voice

With ElevenLabs, you have access to over 1000 voices from their library. In addition to providing realistic and natural-sounding voice synthesis, they also allow you to customize voice settings, including the model, speed, similarity and stability of the voices.

You’ll customize the voice through the voice attribute of the ConversationRelay TwiML noun. The value starts with the voice ID, optionally followed by a hyphen and a model ID, and then a hyphen and an underscore-separated string with the values for speed, stability, and similarity, in that order. This may sound confusing, but you’ll see the final string at the end of this section.

Voice

You can find a list of voices to use here: Available ElevenLabs Voices.

For this tutorial we’ll be using Amelia – a young British English woman's voice that's expressive and enthusiastic. However, you can choose whichever voice you’d like for your service from the linked list above. Once you’ve found a voice, take note of its voice ID. Amelia’s voice ID is ZF6FPAbjXT4488VcRRnw.

Audio Model

The default audio model used for the voices is the Flash 2.5 model. It’s ElevenLabs' fastest speech model, offering high-quality speech with ultra-low latency.

For this tutorial, I'll be using the Turbo 2.5 model, which is similar to Flash 2.5 but trades a little latency for higher-quality voice generation.

To add this non-default model to your voice settings, add a hyphen followed by the model ID after the voice ID. The model ID for the Turbo 2.5 model is turbo_v2_5.

Speed

Adjusting the speed setting allows you to control how fast or slow the generated speech sounds. The default speed is set to 1.0. The lowest speed you can set is 0.7 and the highest speed you can set is 1.2.

I tend to find that the speech in automated calls often feels too slow, so for this tutorial I'll set the speed to 1.2.

Stability

The stability setting controls how consistent the voice is and how much randomness goes into each generation. Lowering the stability makes the voice more emotional and dramatic, while increasing it makes the voice more monotone and serious.

For this tutorial, I’ll be using the default stability value of 1.0.

Similarity

The similarity setting determines how closely the AI mimics the original voice when replicating it. I’ll also be using the default value of 1.0 for this setting.

Text Normalization

You can enable text normalization using the elevenlabsTextNormalization TwiML attribute. This setting is off by default, but when you set the attribute to "on", ElevenLabs takes the extra step of normalizing the text for you. Although this increases latency a bit, I'll set the attribute to "on" to ensure the quality of pronunciations.

Applying the Voice Settings

Now that we’ve chosen the settings, let's adjust the code to add our customized ElevenLabs voice. To recap, the following are the values we will be choosing for our voice settings:

  • Voice ID: ZF6FPAbjXT4488VcRRnw (Amelia)
  • Model: turbo_v2_5
  • Speed: 1.2
  • Stability: 1.0
  • Similarity: 1.0

To apply these settings, append a hyphen and the model ID to the voice ID, then another hyphen and an underscore-separated string with the values for speed, stability, and similarity, in that order. The resulting string is the value of the voice attribute:

voice="ZF6FPAbjXT4488VcRRnw-turbo_v2_5-1.2_1.0_1.0"
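If you expect to experiment with several voices, assembling the string by hand gets error-prone. The sketch below builds it from individual settings. buildVoice and formatSetting are hypothetical helpers, not part of the demo or of any Twilio library; the speed range comes from this tutorial, while the 0.0 to 1.0 range for stability and similarity and the turbo_v2_5 model ID are assumptions.

```javascript
// Format follows the tutorial: <voiceId>[-<modelId>]-<speed>_<stability>_<similarity>
function formatSetting(value) {
  // Keep one decimal place for whole numbers so 1 is written as "1.0".
  return Number.isInteger(value) ? value.toFixed(1) : String(value);
}

function buildVoice({ voiceId, modelId, speed = 1.0, stability = 1.0, similarity = 1.0 }) {
  if (speed < 0.7 || speed > 1.2) {
    throw new RangeError("speed must be between 0.7 and 1.2");
  }
  // Assumption: stability and similarity are limited to 0.0 through 1.0.
  for (const [name, value] of Object.entries({ stability, similarity })) {
    if (value < 0 || value > 1) {
      throw new RangeError(`${name} must be between 0.0 and 1.0`);
    }
  }
  const settings = [speed, stability, similarity].map(formatSetting).join("_");
  // modelId is optional; when omitted, the default model is used.
  return [voiceId, modelId, settings].filter(Boolean).join("-");
}

const modelId = "turbo_v2_5"; // assumed model ID for Turbo 2.5
console.log(buildVoice({ voiceId: "ZF6FPAbjXT4488VcRRnw", modelId, speed: 1.2 }));
```

Validating the speed before building the string surfaces an out-of-range value when the server starts rather than in the middle of a call.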

To turn on text normalization, set the elevenlabsTextNormalization TwiML attribute to "on":

elevenlabsTextNormalization="on"

Finally, you’ll need to set the ttsProvider attribute to "ElevenLabs":

ttsProvider="ElevenLabs"

Add the above attributes to the ConversationRelay noun in your code (line 30) so that it looks like the following:

<ConversationRelay url="${WS_URL}" ttsProvider="ElevenLabs" voice="ZF6FPAbjXT4488VcRRnw-turbo_v2_5-1.2_1.0_1.0" elevenlabsTextNormalization="on" welcomeGreeting="${WELCOME_GREETING}"/>
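Because the TwiML is produced with a template string, a greeting that contains a double quote or an ampersand would yield invalid XML. One possible way to guard against that is sketched below; buildTwiml and escapeAttr are hypothetical helpers, and the URL, voice string, and greeting are placeholder values rather than the demo's configuration.

```javascript
// Escape characters that are not allowed inside an XML attribute value.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the TwiML document from a plain object of ConversationRelay attributes.
function buildTwiml(attributes) {
  const attrs = Object.entries(attributes)
    .map(([name, value]) => `${name}="${escapeAttr(value)}"`)
    .join(" ");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <ConversationRelay ${attrs} />`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}

console.log(
  buildTwiml({
    url: "wss://example.ngrok.app/ws", // placeholder WebSocket URL
    ttsProvider: "ElevenLabs",
    voice: "ZF6FPAbjXT4488VcRRnw-turbo_v2_5-1.2_1.0_1.0", // assumed model ID
    elevenlabsTextNormalization: "on",
    welcomeGreeting: 'Hi! Ask me "anything".',
  })
);
```

The attribute names match the ones used in this tutorial, so the output can be compared directly with the line you edited in server.js.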

Save the file and your ConversationRelay service should be ready to test!

Connect your Application to Twilio

Your ngrok tunnel should already be running so navigate back to where it's running on your terminal and copy the Forwarding address (this time with the https://).

You’ll now hook up your application to your Twilio number using this forwarding URL; this URL will forward all requests to your Node.js application.

Navigate to the Active Numbers section of your Twilio Console. You can head there by clicking Phone Numbers > Manage > Active numbers from the left tab on your Console.

Now, click the Twilio number you’d like to use for your ConversationRelay service. In the Voice Configuration section, select Webhook in the “A call comes in” dropdown, then enter the forwarding URL given by ngrok followed by /twiml in the adjacent textbox (the screenshot below shows how the URL should look). Lastly, ensure the HTTP dropdown has HTTP GET selected.

Voice configuration section of a Twilio number showing the forwarding URL in the webhook URL textbox.
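A stray space or a missing https:// in the webhook field is an easy mistake to make. As a quick sanity check, the value can be assembled with Node's built-in URL class, as in the sketch below. buildWebhookUrl is a hypothetical helper and the forwarding URL is a placeholder.

```javascript
// Build the webhook URL for the Twilio Console from the ngrok forwarding URL.
function buildWebhookUrl(forwardingUrl) {
  // new URL() throws a TypeError when the base is not a valid absolute URL,
  // for example when the https:// prefix is missing.
  const url = new URL("/twiml", forwardingUrl.trim());
  if (url.protocol !== "https:") {
    throw new Error("expected an https forwarding URL");
  }
  return url.toString();
}

console.log(buildWebhookUrl(" https://example.ngrok.app/ "));
```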

Once added, scroll down and click Save configuration.

Run and Test the Application

Navigate back to your terminal and leave the ngrok tunnel running. Open a new tab (ensure you’re in the cr-demo directory) and run the following command to install the needed dependencies for your application (you can find a list of the packages in the package.json file):

npm install

Your application is now ready to run! Execute the following command to start your service:

node server.js

Now call your Twilio number to hear how your chosen ElevenLabs voice sounds! Feel free to adjust the settings on the voice attribute to refine the voice, then restart your node server to hear the updated voice.

Conclusion

By integrating ElevenLabs with Twilio’s ConversationRelay, you can create rich, human-like voices that bring your voice applications to life. Now that you’re ready to explore and experiment with your voice application, take it a step further with our other integrations with ConversationRelay:

Dhruv Patel is a Developer on Twilio’s Developer Voices team. You can find Dhruv working in a coffee shop with a glass of cold brew, or reach him at dhrpatel [at] twilio.com.