Voice Bot Integration with Twilio Video

June 27, 2025
Written by
Reviewed by
Paul Kamp
Twilion

Imagine a customer finishing a support session in a Video Room – and instead of the session simply ending, a voice bot silently joins the conversation to collect spoken feedback and trigger next steps. In this setup, the voice bot isn’t a physical user or visible participant. It’s a Twilio Phone Number making an outbound voice API call that joins the Video Room via SIP. 

Think of this as a “ghost leg”: a virtual participant connected behind the scenes, designed to listen using Twilio’s speech-to-text <Gather> verb. The bot can capture a keyword, such as “done” or “unsatisfied,” and based on that input, trigger actions like ending the Video Room, sending a notification, or routing data to your CRM.

In this blog post, you’ll learn how to use a Twilio Phone Number to dial a voice bot into a Video Room. We’ll walk through using Twilio’s Programmable Video and Programmable Voice APIs with their built-in SIP support, TwiML for call instructions, and Twilio’s serverless Functions to host our logic.

Prerequisites

In order to follow along, you’ll need:

Once you have the prerequisites completed and a running Video application, we’re ready to begin the tutorial. Let’s get started!

Step 1: Create Two TwiML Bins

We’ll start by creating two TwiML responses using TwiML Bins. To create a TwiML Bin, navigate to TwiML Bins in your Twilio Console, and hit the blue Plus (+) or Create button to continue.

TwiML Bin 1 - Connect to the video room

This TwiML connects the voice bot (the incoming voice call) to the Video Room with the unique room name “my-video-room”:

<Response>
  <Connect>
    <Room>my-video-room</Room>
  </Connect>
</Response>
Associate TwiML Bin 1 with the Twilio Phone Number you are using to make the Outbound voice API call.
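
If the room name changes from session to session, the same TwiML could be generated rather than hard-coded. The following is only a minimal sketch and not part of the original setup: `escapeXml` and `buildConnectRoomTwiml` are hypothetical helpers, and serving the result from your own endpoint instead of a TwiML Bin is an assumption.

```javascript
// Hypothetical helper: escape characters that are special in XML text nodes.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: build the same <Connect><Room> TwiML as TwiML Bin 1,
// but for an arbitrary room name.
function buildConnectRoomTwiml(roomName) {
  return [
    '<Response>',
    '  <Connect>',
    `    <Room>${escapeXml(roomName)}</Room>`,
    '  </Connect>',
    '</Response>',
  ].join('\n');
}

console.log(buildConnectRoomTwiml('my-video-room'));
```

Escaping the room name keeps the generated document well-formed even if a name contains characters such as `&` or `<`.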

TwiML Bin 2 - Collect feedback via the Voice Bot

This TwiML prompts the user for spoken feedback, pauses while they respond, and sends the transcribed result to a Twilio Function:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Gather input="speech" timeout="5" action="Your_Function_URL_From_The_Next_Step">
    <Say>Please give us feedback on your experience and then say "disconnect" to end the session.</Say>
    <Pause length="20"/>
  </Gather>
  <Say>We did not receive your response. Goodbye.</Say>
  <Hangup/>
</Response>
Be sure to change Your_Function_URL_From_The_Next_Step to the … after you write the Function in the next step. You’ll find it in the pulldown menu on the right of the Function edit screen, which offers options to copy the URL, delete, or rename the collect-feedback Function.

Step 2: Create a Twilio Function to act on input

Nice work! You now have two TwiML Bins handling the logic for incoming phone calls and collecting feedback. Next, we’ll build a Function that handles our “hang up” logic.

Create a new Twilio Function (you can find a longer tutorial here, as well) that listens for the <Gather> input and ends the Video Room if the trigger word is detected:

exports.handler = async function(context, event, callback) {
  // Twilio REST client provided by the Functions runtime
  const client = context.getTwilioClient();
  // Transcribed speech from <Gather>; empty string if nothing was captured
  const speechResult = event.SpeechResult || '';

  if (speechResult.toLowerCase().includes('disconnect')) {
    try {
      // Setting the Room status to 'completed' ends the Video Room
      await client.video.rooms('my-video-room')
        .update({status: 'completed'});
      return callback(null, 'Room ended successfully.');
    } catch (error) {
      return callback(error);
    }
  }
  return callback(null, 'Keyword not detected.');
};
Set the URL for this Function as the action in TwiML Bin 2.
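
The keyword check can also be pulled out into a small pure function, which makes it easier to test and to extend with other trigger words such as the “unsatisfied” example mentioned in the introduction. This is a sketch under stated assumptions: `KEYWORD_ACTIONS`, `pickAction`, and the action names are hypothetical, and only “disconnect” is actually handled by the Function in this tutorial.

```javascript
// Hypothetical mapping from spoken keywords to action names. Only
// "disconnect" comes from this tutorial; the other entry is illustrative.
const KEYWORD_ACTIONS = [
  { keyword: 'disconnect', action: 'end_room' },
  { keyword: 'unsatisfied', action: 'notify_support' },
];

// Return the action for the first keyword found in the speech result,
// or null when no keyword matches (including missing input).
function pickAction(speechResult) {
  const text = (speechResult || '').toLowerCase();
  const match = KEYWORD_ACTIONS.find(({ keyword }) => text.includes(keyword));
  return match ? match.action : null;
}

console.log(pickAction('It was great, please Disconnect.')); // end_room
console.log(pickAction('I am unsatisfied with the call'));   // notify_support
console.log(pickAction(undefined));                          // null
```

Inside the Function, the handler could call `pickAction(event.SpeechResult)` and complete the Room only when the result is `'end_room'`.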

Step 3: Make an API call to initiate the Voice Bot

Use a curl command (or, if you prefer, any backend logic) to make an API call that dials your Twilio number and kicks off the interaction.

curl -X POST https://api.twilio.com/2010-04-01/Accounts/ACXXXXXXX/Calls.json \
--data-urlencode "To=YOUR_TWILIO_PHONE_NUMBER" \
--data-urlencode "From=YOUR_PHONE_NUMBER" \
--data-urlencode "Url=https://handler.twilio.com/twiml/COLLECT_FEEDBACK_VIA_VOICE_BOT_TWIML_BIN" \
-u ACXXXXXXX:your_auth_token

This call triggers TwiML Bin 2, which initiates the feedback prompt.
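
If you would rather place the call from backend JavaScript, the curl command above translates to a form-encoded POST with HTTP Basic authentication. The sketch below only builds the request, so you can inspect it before sending; `buildCallRequest` is a hypothetical helper, every value is a placeholder, and it assumes a Node.js version with a global `fetch` (18 or later) for the final step.

```javascript
// Hypothetical helper: build the URL and fetch() options that mirror the
// curl command above. Nothing is sent until fetch() is called.
function buildCallRequest({ accountSid, authToken, to, from, twimlUrl }) {
  // Same form fields as the --data-urlencode flags in the curl command
  const body = new URLSearchParams({ To: to, From: from, Url: twimlUrl });
  // Equivalent of curl's -u flag: HTTP Basic auth with SID and auth token
  const credentials = Buffer.from(`${accountSid}:${authToken}`).toString('base64');
  return {
    url: `https://api.twilio.com/2010-04-01/Accounts/${accountSid}/Calls.json`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Basic ${credentials}`,
        'Content-Type': 'application/x-www-form-urlencoded',
      },
      body: body.toString(),
    },
  };
}

// Placeholder values only; substitute your own numbers and TwiML Bin URL.
const request = buildCallRequest({
  accountSid: 'ACXXXXXXX',
  authToken: 'your_auth_token',
  to: '+15550001111',
  from: '+15550002222',
  twimlUrl: 'https://handler.twilio.com/twiml/COLLECT_FEEDBACK_VIA_VOICE_BOT_TWIML_BIN',
});
console.log(request.url);
// To place the call: await fetch(request.url, request.options);
```

Keeping request construction separate from sending makes it straightforward to log or unit test the payload without dialing a real number.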

Step 4: Flow in action

  1. A user is in a Twilio Video Room with the room name “my-video-room”.
  2. An outbound Voice API call is made to a Twilio number that has TwiML Bin 1 (Connect to the video room) associated with it, connecting the voice bot to this Room.
  3. Once the bot is in the Room, TwiML Bin 2 (Collect feedback via the Voice Bot) executes, asking the user to give feedback and then say “disconnect” to end the Room.
  4. If the word “disconnect” is spoken, the Twilio Function is triggered and the Video Room is closed.

Try it now. You should be able to connect to the Video Room, call your bot into the Room (the “ghost leg”), provide feedback, and then ask the bot to close the Room by saying “disconnect”. Pretty neat, right?

Conclusion

With Twilio’s Programmable APIs, combining Video and Voice unlocks powerful opportunities to create intelligent, automated, and deeply engaging customer interactions. In this tutorial, I walked through how a voice call can be used to trigger a bot to join a Video Room, collect participant feedback using Twilio’s <Gather> TwiML verb, and then take action – such as ending the Video Room – based on that input.

This approach demonstrates how you can build more responsive and streamlined user journeys without relying on complex front-end workflows or additional manual intervention. By leveraging a Voice Bot to manage feedback and drive actions in real time, developers can deliver video experiences that adapt dynamically to customer input.

Now that you’ve seen one way to enhance your callers’ experience by inviting a survey-taking bot into a Room, see what may be next: my colleague Paul developed an application that has an LLM power a video avatar, so you can video call an AI Agent.

Khushbu Shaikh is a dedicated Technical Lead, Principal Technical Account Manager, serving as an invaluable asset to the Personalized Support team. With a wealth of experience, Khushbu not only excels in managing multiple customer accounts and driving impactful solutions, but also plays a key leadership role in guiding and supporting her team. For any inquiries or assistance, Khushbu can be reached at kshaikh [at] twilio.com.