How to Build an Interactive Voice Response System with Java Spring Boot

December 01, 2023
Written by
Diane Phan
Twilion

In this article, you will learn how to use Twilio's Programmable Voice tools to build an interactive voice response (IVR) system with speech recognition using Java, Spring Boot, and Maven.

Prerequisites

  • IntelliJ IDEA Community Edition for convenient and fast Java project development work. The community edition is sufficient for this tutorial.
  • Java Development Kit (JDK) version 15 or newer. The Twilio Helper Library works on all versions from Java 8 up to the latest, but the code in this post uses text blocks (multiline strings) from Java 15 and switch expressions from Java 14.
  • ngrok, a handy utility to connect the development version of the Java application running on your system to a public URL that Twilio can connect to.
  • A Twilio account. If you are new to Twilio, click here to create a free account now.
  • A phone capable of making voice calls to test the project (or you can use the Twilio Dev Phone).

Set up the Project Directory

Follow the tutorial on how to start a Java Spring Boot application as a base for this project.

You can name the project "phonetree" and create the directory structure src/main/java/com/twilio/phonetree.

Create a subfolder named "ivr" on the same level as the PhonetreeApplication.java file.

Add the Twilio Dependency to the Project

Open the pom.xml file and add the following to the list of dependencies:

                <dependency>
                        <groupId>com.twilio.sdk</groupId>
                        <artifactId>twilio</artifactId>
                        <version>9.14.0</version>
                </dependency>

We always recommend using the latest version of the helper libraries. At the time of writing this is 9.14.0. Newer versions are released frequently and you can always check MvnRepository for updates.

Create an Interactive Voice Response App

Navigate to the src/main/java/com/twilio/phonetree/ivr subfolder and create a file named IVREndpoints.java. 

Start off the file with the required import statements and create a class to hold all of the endpoints for a functioning IVR:

package com.twilio.phonetree.ivr;

import com.twilio.twiml.VoiceResponse;
import com.twilio.twiml.voice.Gather;
import com.twilio.twiml.voice.Say;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class IVREndpoints {

    @RequestMapping(value = "/welcome")
    public VoiceResponse welcome() {
        return new VoiceResponse.Builder()
                .gather(new Gather.Builder()
                        .action("/menu")
                        .inputs(Gather.Input.SPEECH)
                        .say(amySay("""
                            Hello, you're through to the Party Cookies store.
                            What are you calling for today?
                            Say "collection" or "delivery".
                        """))
                        .build())
                .build();
    }
}

This class is annotated with @RestController, indicating that it handles incoming HTTP requests and that each method's return value is written directly to the response body. Spring serializes return values to JSON by default; this project later registers a converter so that the VoiceResponse objects are sent as TwiML instead.

The first endpoint that the caller hits is /welcome, which returns a VoiceResponse object. This object contains the instructions that tell Twilio how to interact with the caller.

A new VoiceResponse.Builder() is created to build the Twilio XML response.

The gather method is used to collect user input.

The action("/menu") parameter specifies the URL to which Twilio should send the user's input.

The inputs(Gather.Input.SPEECH) indicates that the input should be collected through the caller's actual speech.

The say() method tells Twilio to read the message aloud to the caller. Add the amySay helper method, along with a getVoiceResponse helper, inside the IVREndpoints class as follows:

    private VoiceResponse getVoiceResponse(String message) {
        return new VoiceResponse.Builder()
                .say(amySay(message)).build();
    }

    private Say amySay(String message){
        return new Say.Builder(message)
                .voice(Say.Voice.POLLY_AMY)
                .language(Say.Language.EN_GB)
                .build();
    }

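The amySay() helper currently uses Polly.Amy, one of the Amazon Polly voices that Twilio supports, with British English as the language. You can change the voice and language to any other values that the TwiML <Say> verb supports.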

The code above will generate the following TwiML.

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Gather action="/menu" input="speech">
    <Say language="en-GB" voice="Polly.Amy">
    Hello, you're through to the Party Cookies store.
    What are you calling for today?
    Say "collection" or "delivery".
    </Say>
  </Gather>
</Response>
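The indentation of the text inside <Say> comes from the Java text block in welcome(). As a minimal JDK-only sketch of how text blocks strip indentation (TextBlockDemo is a made-up class name, not part of the tutorial project): the position of the closing delimiter decides how many leading spaces survive in the string.

```java
// Hypothetical demo class; not part of the tutorial project.
final class TextBlockDemo {

    // The closing delimiter sits four columns to the left of the text,
    // so four spaces are kept at the start of every line.
    static String indented() {
        return """
                Hello, you're through to the Party Cookies store.
                Say "collection" or "delivery".
            """;
    }

    // The closing delimiter is aligned with the text,
    // so no leading spaces survive.
    static String flush() {
        return """
                Hello, you're through to the Party Cookies store.
                Say "collection" or "delivery".
                """;
    }

    public static void main(String[] args) {
        System.out.println(indented().startsWith("    Hello")); // true
        System.out.println(flush().startsWith("Hello"));        // true
    }
}
```

Both variants end with a newline because the closing delimiter is on its own line.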

Build Out the IVR Menu System

Once the gather action has finished its speech recognition, Twilio makes another webhook request to /menu. In order to process the caller's response, a switch statement is used to handle each different possible case. Each case represents a different option that the caller can say out loud, and the corresponding method is called to handle that option. If none of the recognized options is found, it defaults to a welcome() method.

The case labels in this switch are lowercase, so the toLowerCase() function is applied to the gatheredSpeech parameter before matching.

Copy and paste the following code to add the menu endpoint to the IVREndpoints class:

    @RequestMapping(value = "/menu")
    public VoiceResponse menu(@RequestParam("SpeechResult") String gatheredSpeech) {

        return switch(gatheredSpeech.toLowerCase()){
            case "delivery"   -> getDelivery();
            case "collection" -> getCollection();
            case "sparkles"   -> getSecretSparkles();
            default           -> welcome();
        };
    }
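Lowercasing handles capitalization, but a transcription could also arrive with surrounding whitespace or trailing punctuation (an assumption worth checking against your own calls). A minimal JDK-only sketch of a stricter normalization step might look like this. SpeechNormalizer and its methods are hypothetical names, and route() returns plain strings where the tutorial returns VoiceResponse objects.

```java
import java.util.Locale;

// Hypothetical helper; not part of the tutorial's code.
final class SpeechNormalizer {

    // Lowercases the input, drops every character that is not a letter,
    // a digit, or a space, and trims the result.
    static String normalize(String speechResult) {
        if (speechResult == null) {
            return "";
        }
        return speechResult
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd} ]", "")
                .trim();
    }

    // Mirrors the shape of the menu switch, with strings standing in for TwiML.
    static String route(String speechResult) {
        return switch (normalize(speechResult)) {
            case "delivery"   -> "delivery";
            case "collection" -> "collection";
            case "sparkles"   -> "sparkles";
            default           -> "welcome";
        };
    }

    public static void main(String[] args) {
        System.out.println(route("Delivery."));     // delivery
        System.out.println(route("  COLLECTION ")); // collection
        System.out.println(route("pizza"));         // welcome
    }
}
```

In the menu() method, the same idea would amount to switching on a normalized copy of gatheredSpeech instead of gatheredSpeech.toLowerCase().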

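The @RequestMapping annotation is used in Spring to map HTTP requests to specific methods or controllers. In this case, it indicates that this method should handle requests to the "/menu" endpoint. Because no HTTP method is specified in the annotation, the endpoint accepts any method, including the GET and POST requests that Twilio can send. The @RequestParam("SpeechResult") annotation binds the SpeechResult parameter of the request to the gatheredSpeech argument.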
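By default, Twilio sends webhook parameters as form-encoded fields, and Spring looks up SpeechResult among them by name. The following JDK-only sketch shows roughly what that lookup involves. FormBodyDemo is a hypothetical class, the sample body is made up, and a real request carries more parameters than the two shown.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; Spring performs this parsing for you.
final class FormBodyDemo {

    // Splits an application/x-www-form-urlencoded body into decoded key/value pairs.
    static Map<String, String> parse(String body) {
        Map<String, String> params = new LinkedHashMap<>();
        if (body == null || body.isEmpty()) {
            return params;
        }
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(key, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    public static void main(String[] args) {
        // Made-up sample body for illustration only.
        String body = "SpeechResult=Delivery.&Confidence=0.93";
        System.out.println(parse(body).get("SpeechResult")); // Delivery.
    }
}
```

Note that @RequestParam is required by default, so a request to /menu without a SpeechResult parameter is rejected by Spring with a 400 response before menu() runs.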

Allow the Caller to Respond to the IVR System

Currently the application prompts the caller to say either "collection" or "delivery" to move forward in the interactive voice response system.  

Write the following code to define the functions to handle the respective cases:

    private VoiceResponse getDelivery() {
        String message = """
                The kitchen is baking as quickly as possible for the holiday season.
                Your cookies will be delivered within 2 hours, with a dash of magic that will blow your mind.
                In the meantime, prepare your taste buds.
                The kitchen appreciates your patience.
                """;
        return getVoiceResponse(message);
    }

    private VoiceResponse getCollection() {
        String message = """
                Congratulations, you're about to experience cookie perfection!
                I've got your batch ready and waiting for pickup.
                Just a heads up, after one bite, you might question every cookie you've ever had before.
                Swing by whenever you're ready to upgrade your taste buds.
                """;
        return getVoiceResponse(message);
    }

However, we can spice up the IVR system by including an option to say a secret code to prompt another case.

Copy and paste the following code below to handle the case if the caller responds with "sparkles":

    private VoiceResponse getSecretSparkles() {
        String message = """
                Oh, you've heard whispers about the legendary secret holiday menu, have you?
                Well, you're in luck because today is your lucky day!
                Buckle up for a taste adventure that transcends ordinary holidays.
                But fair warning, once you experience it, the regular holiday fare might feel a bit lackluster.
                Ask and you shall receive, my friend. Prepare to be dazzled by our exclusive holiday magic!
                """;
        return getVoiceResponse(message);
    }

Use a TwiML Message Converter

By default, Spring Web serializes returned Java objects to JSON. However, the endpoints in this project return VoiceResponse objects, which Spring does not know how to write as TwiML. As described in this article on how to return custom types in HTTP responses using Spring Web, a custom message converter can call the .toXml() method to produce the TwiML that Twilio expects.

Create a file named TwiMLMessageConverter.java in the ivr subfolder and paste in the following code snippet:

package com.twilio.phonetree.ivr;

import com.twilio.twiml.TwiML;
import org.springframework.http.HttpInputMessage;
import org.springframework.http.HttpOutputMessage;
import org.springframework.http.MediaType;
import org.springframework.http.converter.AbstractHttpMessageConverter;
import org.springframework.http.converter.HttpMessageNotReadableException;
import org.springframework.http.converter.HttpMessageNotWritableException;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.nio.charset.StandardCharsets;

@Component
public class TwiMLMessageConverter extends AbstractHttpMessageConverter<TwiML> {

    public TwiMLMessageConverter() {
        super(MediaType.APPLICATION_XML, MediaType.ALL);
    }

    @Override
    protected boolean supports(Class<?> clazz) {
        return TwiML.class.isAssignableFrom(clazz);
    }

    @Override
    protected boolean canRead(MediaType mediaType) {
        return false; // we don't ever read TwiML
    }

    @Override
    protected TwiML readInternal(Class<? extends TwiML> clazz, HttpInputMessage inputMessage) throws IOException, HttpMessageNotReadableException {
        return null;
    }

    @Override
    protected void writeInternal(TwiML twiML, HttpOutputMessage outputMessage) throws IOException, HttpMessageNotWritableException {
        outputMessage.getBody().write(twiML.toXml().getBytes(StandardCharsets.UTF_8));
    }
}

Compile and Run the Application

If you want to double check your code matches ours, view the full code in this GitHub repository.

In your IDE, navigate to the PhonetreeApplication.java file and click on the green play button next to the public class definition and select the Run option.  You can also run it in the terminal with ./mvnw spring-boot:run.
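Before involving a phone, you may want to check the endpoints locally. As a rough sketch, the JDK's built-in HTTP client can send the kind of form-encoded POST that a voice webhook delivers to /menu. LocalIvrCheck is a hypothetical class, and it assumes the app is listening on http://localhost:8080.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical smoke test; assumes the app is running on localhost:8080.
final class LocalIvrCheck {

    // Builds a form-encoded POST to /menu carrying a SpeechResult parameter.
    static HttpRequest menuRequest(String baseUrl, String speech) {
        String form = "SpeechResult=" + URLEncoder.encode(speech, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/menu"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        try {
            HttpResponse<String> response = client.send(
                    menuRequest("http://localhost:8080", "delivery"),
                    HttpResponse.BodyHandlers.ofString());
            // If the app is up, this prints the status code and the TwiML body.
            System.out.println(response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            System.out.println("Could not reach the app: " + e.getMessage());
        }
    }
}
```

A body that starts with an XML declaration and a <Response> element suggests that the message converter is being picked up.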

With the app running on http://localhost:8080, expose it on a public URL through ngrok using the command ngrok http 8080.

ngrok is a great tool because it allows you to create a temporary public domain that forwards HTTP requests to your local port 8080. If you do not have ngrok installed, follow the instructions in this article to set up ngrok.

ngrok terminal output with the "Forwarding" URLs

Your ngrok terminal will now look like the picture above. As you can see, there are URLs in the “Forwarding” section. These are public URLs that ngrok uses to forward requests to the Spring Boot application.

Configure Twilio Service

Go to the Twilio Console and navigate to the Phone Numbers section in order to configure the webhook.

Add the ngrok URL in the text field under the A call comes in section. Make sure the URL is in a "https://xxxx.ngrok.app/welcome" format as seen below:

active numbers dashboard on the twilio console
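If you assemble the webhook URL in code or in a script, a small JDK-only sketch like the following can guard against a missing path or scheme. WebhookUrl is a hypothetical helper, "xxxx" is the article's placeholder rather than a real subdomain, and the https check simply follows the URL format shown above.

```java
import java.net.URI;

// Hypothetical helper for assembling the webhook URL.
final class WebhookUrl {

    // Resolves the /welcome path against the ngrok forwarding URL
    // and rejects anything that is not https.
    static URI welcomeUrl(String forwardingUrl) {
        URI base = URI.create(forwardingUrl);
        if (!"https".equalsIgnoreCase(base.getScheme())) {
            throw new IllegalArgumentException(
                    "Expected an https forwarding URL: " + forwardingUrl);
        }
        return base.resolve("/welcome");
    }

    public static void main(String[] args) {
        System.out.println(welcomeUrl("https://xxxx.ngrok.app"));
    }
}
```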

Test out the Interactive Voice Response App

Grab your phone and dial your Twilio phone number to test out the Party Cookies hotline. It's time to satisfy your customers' sweet tooth by selling desserts to them!

gif of hallmark channel plates of cookies

What's Next for Interactive Voice Response Applications in Java?

Congratulations on building your own Party Cookies hotline!

Now that you have an IVR up and running, check out this article on how you can implement best practices for your call center.  

If you are looking for a customizable product to use at scale and build faster, you can build with Flex. You can also build the IVR system with Gradle and Java Servlets instead.

For those looking to build faster with a team, consider building with Twilio Studio, which requires no coding experience.

Diane Phan is a developer on the Twilio Voices team. She loves to help programmers tackle difficult challenges that might prevent them from bringing their projects to life. She can be reached at dphan [at] twilio.com or LinkedIn.