Build Interactive Voicemail for Sports Fans with Twilio, MongoDB, Angular, and node.js (Part One)

April 17, 2014

Target Field: Home of Joe Mauer and the Minnesota Twins

In the summer of last year, we noticed a really cool, telephone-powered, interactive advertising experience created for Nike and basketball star LeBron James. In the short commercial spot, notable people like Spike Lee, Warren Buffett, and Dr. Dre are heard leaving voicemail messages for the NBA finals MVP, congratulating him on the championship. If you haven’t already, you can check out the video below, and a quick write-up on the spot as the “Creativity Pick of the Day” in Advertising Age:

The ad ends with a live phone number that fans could call to leave a message for LeBron. We thought this was a really neat way to allow fans to interact with a celebrity or brand – and with Twilio, it’s actually pretty easy to do.

In this tutorial, we’re going to show how you can build an interactive voicemail (and text message) box to support a marketing campaign around a brand or celebrity. Fans can call or text a number you set up to communicate personally with their favorite sports stars or consumer brands. This mode of interaction can be more meaningful than a YouTube comment or a Facebook “like” – users view text messages and phone calls as means of personal communication. With a tool like voicemail or texting, you can provide your audience with a way to have that personal interaction with the subject of your marketing campaign.

With baseball season in full swing (see what I did there?), I thought a great demonstration of this technique would be setting up a voicemail and text message number for my personal favorite sports star: 3x batting champion, 6x All-Star, and 2009 AL MVP Joe Mauer of the Minnesota Twins. The application includes both a public-facing web page that displays all the messages sent for Joe and an admin UI for approving messages; this is the Internet after all, and we want to make sure all the messages displayed are at least PG-13.

You can view the live site (responsive layout works in mobile browsers also) at http://www.leavejoeamessage.com. Complete source code for the application is available on GitHub.

site screenshot

Ingredients for This Tutorial

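The technologies and tools used for this application are Twilio, node.js with Express, MongoDB (through mongoose), and AngularJS, with Heroku and MongoHQ for hosting.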

We’ll break up this tutorial into three pieces. In this piece, we’ll explore how we use Twilio to accept incoming text messages and record voicemail messages. In the next piece, we’ll look at how we created an admin and public-facing UI to display text messages and play back Twilio call recordings. In the third and final piece, we’ll explore how we could use the human-powered transcription APIs of Rev.com to add (close to) 100% accurate, human-powered transcriptions for the messages left for Joe.

Let’s get started by understanding the basic architecture of this application.

How To Build A Voicemail Box in the Cloud

This application has two primary interactions where a user can leave a message – one that happens via voice call, and another that happens via text message. When a user sends a text message to Joe, it is saved in a database pending review, to be displayed on the website later. When a user calls in via voice, we record their spoken message. When the recording is complete, Twilio sends our node.js app a URL to the recorded message, and we save it to our MongoDB database.

For both text and voice messages, we have an admin UI to review them and a public UI to display them, but we’ll explain that in the next few parts of this tutorial series.

Here’s a basic diagram of how this interaction looks for a text message:

text message flow

For a voicemail message, it’s a bit more complex, but not by much:

voicemail flow

Twilio communicates events, such as an incoming call or text, to your web application via HTTP requests. Twilio expects HTTP responses to these requests to contain XML written in a dialect we call TwiML. This type of HTTP callback is called a webhook – if you’re not sure how Twilio uses webhooks in association with your programmable phone number, or are new to using Twilio with node.js, you might want to check out this introductory tutorial. Let’s start by looking at our webhook code in node.js, written using a customized version of the popular Express web framework.

Building Twilio Webhooks in Express

In the Twilio admin interface, we can configure a URL for each of our Twilio phone numbers that will be sent an HTTP POST request on an incoming call or text. To handle these HTTP requests, we’ll use Express, probably the most popular web framework in the node.js community. To learn more about how Express works, check out their official documentation and guide.

In our application, there are three routes that handle incoming requests from Twilio – they are configured like so:

app.post('/voice', webhook, controllers.twilio.voice);
app.post('/recording', webhook, controllers.twilio.recording);
app.post('/sms', webhook, controllers.twilio.sms);

You’ll notice that all of these routes have a second argument after the route definition – this is a special Connect middleware that is provided as a part of the Twilio node.js module. This middleware helps Express understand how to render TwiML objects (like the ones we will be building in our handlers), as well as validate that the incoming HTTP requests actually came from Twilio. This request validation is very important to do, especially if a request from Twilio will create or update data, or if a response will contain sensitive information. For more information on webhook security, check out our security documentation.

The middleware configuration is found in the same file as all the request handlers:

exports.webhook = twilio.webhook({
    // Only validate requests in production
    validate: process.env.NODE_ENV === 'production',

    // Manually configure the host and protocol used for webhook config - this
    // is the URL our Twilio number will hit in production
    host:'my-awesome-app-name.herokuapp.com',
    protocol:'https'
});

You’ll notice that request validation is only enabled if the NODE_ENV system environment variable has been set to “production”, as is customary for many node.js apps to help differentiate between development and production deployment.
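To see why the host and protocol have to be configured, it helps to know roughly how the signature check works. As described in Twilio's security documentation, the X-Twilio-Signature header is an HMAC-SHA1 of the full request URL followed by each POST parameter name and value (sorted by name), keyed with your auth token and Base64-encoded. The sketch below re-implements that check with node's built-in crypto module. It is only an illustration: the function names, sample URL, and token are made up, and in the app itself the twilio.webhook middleware does this work for us.

```javascript
var crypto = require('crypto');

// Hypothetical helper: compute the signature Twilio would send for a request
function expectedSignature(authToken, url, params) {
    // Append each POST parameter name and value to the URL, sorted by name
    var data = Object.keys(params).sort().reduce(function(acc, key) {
        return acc + key + params[key];
    }, url);

    return crypto.createHmac('sha1', authToken)
        .update(Buffer.from(data, 'utf8'))
        .digest('base64');
}

// Hypothetical helper: compare against the X-Twilio-Signature header value
function isValidRequest(authToken, headerSignature, url, params) {
    var expected = Buffer.from(expectedSignature(authToken, url, params));
    var actual = Buffer.from(String(headerSignature));
    // timingSafeEqual throws on a length mismatch, so check the length first
    return expected.length === actual.length &&
        crypto.timingSafeEqual(expected, actual);
}

// Made-up values for illustration only
var token = 'not-a-real-auth-token';
var url = 'https://my-awesome-app-name.herokuapp.com/sms';
var params = { Body: 'Go Twins!', MessageSid: 'SM123' };
var signature = expectedSignature(token, url, params);

console.log(isValidRequest(token, signature, url, params)); // true
console.log(isValidRequest(token, signature, url.replace('https', 'http'), params)); // false
```

Because the URL is part of the signed data, a request that arrives at a different host or protocol than the one Twilio signed will fail validation, even if everything else matches.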

Handling Text Messages and Saving to MongoDB

To handle incoming text messages, we just need a single route:

exports.sms = function(request, response) {
    // Create a new saved message object from the Twilio data
    var msg = new Message({
        sid: request.param('MessageSid'),
        type:'text',
        textMessage:request.param('Body'),
        fromCity:request.param('FromCity'),
        fromState:request.param('FromState'),
        fromCountry:request.param('FromCountry')
    });

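    // Save it to our MongoDB. Note that err is not checked here; a production
    // handler should log a failed save rather than thanking the user regardless.
    msg.save(function(err, model) {
        // Reply to the sender with a confirmation text
        var twiml = new twilio.TwimlResponse()
            .message('Thanks for sending Joe a text!  Your message will appear anonymously on leavejoeamessage.com once we confirm the contents are PG rated.');
        response.send(twiml);
    });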
};

In this route, we are creating a new object, called a Message, and saving it to our MongoDB data store. This Message object is a mongoose model object – mongoose is a wrapper for MongoDB that makes it easier to implement persistent business objects by providing data validation, class/instance methods, and querying.

Our connection to MongoDB, which is required for mongoose to initialize, is configured in our main application JavaScript file, app.js in our project root directory. Any MongoDB connection URL can be passed to “connect”. For both development and production, I’ve really enjoyed using the hosted MongoDB instances provided by MongoHQ. They provide instant MongoDB databases that are easy to integrate into your node app, and friendly graphical tools you can use in the browser to view and manually edit your data. You can read more about how to set up a MongoHQ database here.

Back in our webhook code, we can see that the properties for the Message object are being read in from the POST data sent to our application by Twilio. We store the message’s unique ID (SID), the body of the text message, and the city, state, and country associated with the sender’s number:

var msg = new Message({
    sid: request.param('MessageSid'),
    type:'text',
    textMessage:request.param('Body'),
    fromCity:request.param('FromCity'),
    fromState:request.param('FromState'),
    fromCountry:request.param('FromCountry')
});
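The mapping from Twilio's POST parameters to a message document can also be written as a small pure function, which makes it easy to test without a database or a live request. This is a sketch rather than code from the repository: the name toMessageFields is hypothetical, and it takes a plain object of parameters instead of the Express request.

```javascript
// Hypothetical helper: turn Twilio's POST parameters for an incoming text
// into the fields our Message model expects
function toMessageFields(params) {
    return {
        sid: params.MessageSid,
        type: 'text',
        textMessage: params.Body,
        fromCity: params.FromCity,
        fromState: params.FromState,
        fromCountry: params.FromCountry
    };
}

// Sample parameters, shaped like what Twilio sends for an incoming text
var fields = toMessageFields({
    MessageSid: 'SM123',
    Body: 'Go Twins!',
    FromCity: 'SAINT PAUL',
    FromState: 'MN',
    FromCountry: 'US'
});

console.log(fields.textMessage); // Go Twins!
```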

We then use the Model API provided by mongoose to save the Message. Once the Message has been saved, we create a TwiML response object, and then render it as XML back to Twilio using the Express response object. These TwiML instructions tell Twilio to send back a text message to the user, indicating that their message has been received:

var twiml = new twilio.TwimlResponse();
twiml.message('Thanks for sending Joe a text!  Your message will appear anonymously on leavejoeamessage.com once we confirm the contents are PG rated.');
response.send(twiml);

Handling Incoming Calls and Recording Messages

To record a voicemail message, we will create two routes – let’s begin by examining the first route that will be used by Twilio when our number receives an incoming call:

exports.voice = function(request, response) {
    var twiml = new twilio.TwimlResponse();

    twiml.say('Hi there! Thanks for calling to wish Joe good luck this season. Please leave your message after the beep - you can end the message by pressing any button. Get ready!')
        .record({
            maxLength:120,
            action:'/recording'
        });

    response.send(twiml);
};

To initiate the voicemail recording, we begin with a brief text-to-speech (TTS) message that invites a user to leave a message. Once again, this is accomplished with the twilio node.js module’s TwiML building helper, a TwimlResponse object. It’s nothing fancy – just an object that allows you to easily build an XML string in JavaScript.

A TwimlResponse object provides a chainable interface, allowing you to add multiple TwiML nodes to a response with a single statement. After the TTS message is configured in a <Say> node, we immediately use the <Record> tag to tell Twilio to initiate a new recording from the user. We limit the recording to two minutes and tell Twilio that once the recording is finished, it should send a POST to /recording with the recording URL.
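For reference, the response built by the voice route should come out as a small XML document with a <Say> node followed by a <Record> node. The sketch below assembles that document by hand, without the twilio module, just to show the shape of the TwiML. The helper names are hypothetical, and the exact formatting produced by TwimlResponse may differ.

```javascript
// Escape characters that have special meaning in XML text and attributes
function escapeXml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Hypothetical helper: hand-build the TwiML for "say a greeting, then record"
function voicemailTwiml(greeting, maxLength, action) {
    return '<?xml version="1.0" encoding="UTF-8"?>' +
        '<Response>' +
        '<Say>' + escapeXml(greeting) + '</Say>' +
        '<Record maxLength="' + escapeXml(maxLength) + '"' +
        ' action="' + escapeXml(action) + '"/>' +
        '</Response>';
}

var xml = voicemailTwiml('Please leave your message after the beep.', 120, '/recording');
console.log(xml);
```

The maxLength attribute is expressed in seconds, which is why a value of 120 gives callers two minutes.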

Twilio will handle initiating the recording, and saving the recorded audio file (even making it available in a variety of formats, like mp3 and wav). Once the recording is finished, Twilio will continue the call flow by sending an HTTP request to the URL specified in the action attribute of the <Record> tag. In this app, that URL maps to our /recording route:

exports.recording = function(request, response) {
    // Create a new saved message object from the Twilio data
    var msg = new Message({
        sid: request.param('CallSid'),
        type:'call',
        recordingUrl: request.param('RecordingUrl'),
        recordingDuration: Number(request.param('RecordingDuration')),
        fromCity:request.param('FromCity'),
        fromState:request.param('FromState'),
        fromCountry:request.param('FromCountry')
    });

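    // Save it to our MongoDB. As in the text message route, err is not checked
    // here; a production handler should log a failed save.
    msg.save(function(err, model) {
        // Thank the caller, then end the call
        var twiml = new twilio.TwimlResponse()
            .say('Thanks for leaving Joe a message - your message will appear on the web site once we confirm it is PG rated.  Goodbye!', {
                voice:'alice'
            })
            .hangup();
        response.send(twiml);
    });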
};

The HTTP request that Twilio sends us in this step will contain POST parameters with information about the recording, notably its public URL (accessed with the code request.param('RecordingUrl') in the snippet above). Once we have this URL, we are ready to save our voicemail message to the database:

var msg = new Message({
    sid: request.param('CallSid'),
    type:'call',
    recordingUrl: request.param('RecordingUrl'),
    recordingDuration: Number(request.param('RecordingDuration')),
    fromCity:request.param('FromCity'),
    fromState:request.param('FromState'),
    fromCountry:request.param('FromCountry')
});
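The RecordingUrl we save points at the recording resource itself. Per Twilio's recording documentation, appending .mp3 or .wav to that URL requests the audio in the corresponding format, which is handy when a browser needs a specific file type for playback. A minimal sketch, assuming a hypothetical helper name and a made-up recording URL:

```javascript
// Hypothetical helper: build a format-specific URL for a saved recording
function recordingAudioUrl(recordingUrl, format) {
    if (format !== 'mp3' && format !== 'wav') {
        throw new Error('Unsupported recording format: ' + format);
    }
    return recordingUrl + '.' + format;
}

// Made-up URL in the general shape Twilio uses for recordings
var saved = 'https://api.twilio.com/2010-04-01/Accounts/ACxxx/Recordings/RExxx';
console.log(recordingAudioUrl(saved, 'mp3'));
```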

Deploying Our Application

This application is set up to use two primary cloud services for hosting and persistence. We’ll use Heroku as our PaaS to run the node.js app, and MongoHQ to host and manage our MongoDB database.

Setting Up MongoHQ

To create a MongoHQ database, you will first need to sign up for MongoHQ. Once you’re signed up, you can create a database of the size you would like – there’s a free sandbox database that should do fine for development and testing. If you need help getting set up, they have a short and helpful getting started video here.

Once you’ve created a database, you’ll need the connection string to provide our node app as a system environment variable. The connection string can be found in the “Admin” section of your new database:

MongoHQ interface

In this connection string, you’ll need to substitute a valid database username and password for the <user> and <password> placeholders. If you don’t have a database user yet, create one: also in the Admin section, there’s a tab labeled “Users”. Use the form provided to add a new user, and make a note of the username and password!

MongoHQ interface

Now, you should have the info you need for the connection string, which we’ll set as an environment variable when we deploy our node app to Heroku. It should look something like this, based on the values in the last screen capture:

mongodb://foobar:awesomeness@oceanic.mongohq.com:10076/mauer
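If the username or password contains characters such as @, : or /, they need to be percent-encoded before going into the connection string. The sketch below fills in the <user> and <password> placeholders. The function name and the exact template text are assumptions for illustration, since the string shown in your MongoHQ admin page may be laid out differently.

```javascript
// Hypothetical helper: substitute credentials into a connection string template
function buildMongoUrl(template, user, password) {
    return template
        .replace('<user>', encodeURIComponent(user))
        .replace('<password>', encodeURIComponent(password));
}

var template = 'mongodb://<user>:<password>@oceanic.mongohq.com:10076/mauer';
console.log(buildMongoUrl(template, 'foobar', 'awesomeness'));
// mongodb://foobar:awesomeness@oceanic.mongohq.com:10076/mauer
```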

Setting Up Heroku

To deploy this application to Heroku, the steps would be very similar to their deployment guide for node.js. The GitHub repository for leavejoeamessage.com includes a Procfile that will start a node.js web app process once your app is deployed.

However, for our app to work, we need our sensitive database connection string and Twilio account credentials to be added as system environment variables. Using the Heroku Toolbelt command line utility, this is pretty simple. Open up a terminal window, and execute the following commands:

heroku config:add TWILIO_ACCOUNT_SID="your twilio account sid"

heroku config:add TWILIO_AUTH_TOKEN="your twilio auth token"

heroku config:add MONGOHQ_URL="your MongoDB connection string"

Now our application is configured to start accepting incoming calls and texts from Twilio!
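Since the app cannot do anything useful without these three values, it can help to fail fast at startup when one is missing. This is a sketch of one way to do that, not something the repository is confirmed to contain; missingConfig is a hypothetical name.

```javascript
// The environment variables set with `heroku config:add` above
var REQUIRED = ['TWILIO_ACCOUNT_SID', 'TWILIO_AUTH_TOKEN', 'MONGOHQ_URL'];

// Hypothetical helper: list required variables that are unset or empty
function missingConfig(env) {
    return REQUIRED.filter(function(name) {
        return !env[name];
    });
}

var missing = missingConfig(process.env);
if (missing.length > 0) {
    console.error('Missing configuration: ' + missing.join(', '));
    // In a real app you would likely stop here, e.g. with process.exit(1)
}
```

Passing the environment in as an argument, rather than reading process.env inside the function, keeps the check easy to exercise with a plain object.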

Wiring Up Our Phone Number

Now that we have our back end set up, all we need to do is wire up the Twilio number we want our fans to call or text. Go to your available phone numbers and select (or buy) a number you want fans to call. Next, click on that number and set up both the Voice and Messaging callbacks to point to the domain where our app is running:

Note that in our application, Twilio request validation is enabled. We need to make sure that the host name and protocol specified here in our webhooks match the values configured in our code.

Now when you hit save, all incoming calls and texts to this number should be saved to our database! As the calls and texts come in, you can view them in the MongoHQ admin interface for the “messages” collection:

That’s It (For Now)!

In this tutorial, we have walked through the code necessary to drive a Twilio-powered answering machine, with text and voice messages saved to a MongoDB database. In our next installment of this series, we’ll take a look at how we work with the message data we’ve saved to provide a responsive front end, powered by AngularJS, which allows us to moderate messages left by users, as well as display and play back the audio on recorded messages. After that, we’ll show how we can use Rev.com’s transcription APIs to provide (close to) 100% accurate human-powered transcriptions to include alongside our voicemail messages.

If you’d like to argue a ruling on the field and lobby for an instant replay, feel free to ping me on Twitter or in the comments with any questions!