Hi, my name is Chris Ismael, Developer Evangelist at Mashape. In this tutorial, we will build a phone service that translates messages into your preferred language over SMS. The service will also call you back and read out the translated message with a native-language pronunciation. Here’s a quick video demo:
We will do this using node.js, and a couple of APIs:
- Twilio APIs for SMS and voice
- APIs from Mashape to let us detect language and translate messages. We will also use a text-to-voice API to read out the message in a native language pronunciation.
You can find the source code here on GitHub. Try out the service by sending an SMS message such as “Take me to the bar ES” to +1 (415) 992-9984. The trailing “ES” is the target language, Spanish. You can try other target languages such as “DE”, “IT”, etc. You can also translate a non-English message, like “Hola mundo en”.
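To make the message format concrete, here is a minimal sketch of how the trailing language code could be separated from the text. The function name `splitTargetLanguage` and the two-letter rule are assumptions for illustration, not necessarily how the published source does it.

```javascript
// Hypothetical helper: treats a trailing two-letter word as the target
// language code, e.g. "Take me to the bar ES" -> text + "es".
// Caveat: a message that happens to end in a two-letter word ("Let it be")
// would be misread, so a real service might check the code against a list.
function splitTargetLanguage(smsText) {
  const trimmed = smsText.trim();
  const match = /^([\s\S]*\S)\s+([A-Za-z]{2})$/.exec(trimmed);
  if (!match) {
    return { text: trimmed, target: null };
  }
  return { text: match[1], target: match[2].toLowerCase() };
}

console.log(splitTargetLanguage('Take me to the bar ES'));
// { text: 'Take me to the bar', target: 'es' }
```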
Let’s get started!
How do we interact with Twilio?
Twilio makes it easy for our applications to send and receive messages and calls. There’s a lot more you can do with the Twilio API and I encourage you to check out this page for an overview. Our app only needs to do three things: receive an SMS, send an SMS, and call the sender back. Here’s a diagram of the process flow from our phones, to Twilio, to our app, and back.
- Receiving an SMS in Twilio – To do this, we need to register for a Twilio account in order to a) get a Twilio number where our SMS messages will be sent, and b) associate that number with the URL of our app, which Twilio will call whenever the number receives an SMS. Here’s what mine looks like after setting it up. (Note that I blurred the app name to reduce the chance of abuse. You can also set up authentication, but I’ll let you check that out in your spare time.)
Here’s the snippet of node.js code that receives the POST request from Twilio as a result of sending an SMS to the number above. We are simply getting the SMS message, the sender’s phone number, and the number the message was sent to (which is my number above).
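As a rough sketch of that receiving step (not the original snippet), the code below reads the three fields from the form-encoded body Twilio posts to our URL. `Body`, `From` and `To` are the parameter names Twilio uses; the function names and the handler shape are assumptions made for illustration.

```javascript
// Twilio posts inbound SMS details as application/x-www-form-urlencoded.
// Body = message text, From = sender, To = the Twilio number that got the SMS.
function parseIncomingSms(rawBody) {
  const params = new URLSearchParams(rawBody);
  return {
    message: params.get('Body') || '',
    from: params.get('From') || '',
    to: params.get('To') || ''
  };
}

// Hypothetical handler for a request object from Node's built-in
// http.createServer(): buffers the POST body, then passes the parsed
// fields to the onSms callback.
function handleSmsRequest(req, onSms) {
  let raw = '';
  req.on('data', function (chunk) { raw += chunk; });
  req.on('end', function () { onSms(parseIncomingSms(raw)); });
}

const exampleSms = parseIncomingSms(
  'Body=Take+me+to+the+bar+ES&From=%2B15550001111&To=%2B14159929984'
);
console.log(exampleSms.message); // "Take me to the bar ES"
```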
- Sending an SMS from Twilio – To get Twilio to do stuff, we need to respond to it with an XML document (they call it TwiML) that contains instructions such as sending an SMS, calling a phone, playing audio, and more. When a user sends an SMS to our Twilio number, Twilio expects TwiML back so it knows what to do next. Here’s a function that sends TwiML back to Twilio to send an SMS to the message sender:
- Calling a phone from Twilio – To call a phone from Twilio, we can use their REST endpoint as described in the code below. One thing to note is that aside from the ‘From’ and ‘To’ fields (phone numbers), the ‘Url’ parameter must point to a URL that returns TwiML with instructions on what Twilio should do when the call is picked up. (We will explore this in the “Playing audio when call is picked up” item below.)
- Playing audio when call is picked up – In the code below, we send TwiML back to Twilio that either reads the text with Twilio’s default voice or plays an mp3 from a URL. This is what the person we’re calling hears when they pick up. Also note that the code below introduces a Mashape API that does text-to-voice for us:
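The two TwiML responses described above can be sketched as plain string builders. This is an illustration under assumptions, not the code from the repository: the function names are made up, and it uses the `<Message>` verb for the SMS reply (older Twilio documentation used `<Sms>`), with `<Say>` and `<Play>` for the voice call.

```javascript
// Escape characters that would otherwise break the XML document.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

const XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>';

// TwiML that tells Twilio to reply to the sender with an SMS.
function smsReplyTwiml(message) {
  return XML_HEADER +
    '<Response><Message>' + escapeXml(message) + '</Message></Response>';
}

// TwiML for the call: play the mp3 if we have a URL, otherwise fall back
// to Twilio's default voice reading the text.
function voiceTwiml(fallbackText, mp3Url) {
  const verb = mp3Url
    ? '<Play>' + escapeXml(mp3Url) + '</Play>'
    : '<Say>' + escapeXml(fallbackText) + '</Say>';
  return XML_HEADER + '<Response>' + verb + '</Response>';
}

console.log(smsReplyTwiml('Llévame al bar'));
console.log(voiceTwiml('Llévame al bar', 'https://example.com/audio.mp3'));
```

Whatever web framework serves these strings, the response should be sent with an XML content type such as `text/xml` so Twilio parses it as TwiML.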
What is Mashape and how do I call their APIs?
Mashape is a Cloud API Marketplace where you can discover interesting and cool APIs for your apps. If you developed your own API, you can add it to Mashape too so that more developers can discover it. You can find a list of 40+ machine learning APIs here! For this project, we used the following APIs:
- Sprawk Language detection – figures out the language of a message
- MyMemory Translation Memory – can translate from one language to another
- Text To Voice – a wrapper to Google Translate that saves the mp3 audio to a public URL in Dropbox
Each of the links above will take you to a Mashape documentation page. Among other things, each page provides a test console that lets developers try the API before using it in their apps.
You need a Mashape account to use the test console and, most importantly, to use the API in your app. Go on and register for a Mashape account; we’ll wait for you below.
Once you have signed up, you will get a Mashape key that you can use for all the APIs in Mashape. (Do note that some APIs require a paid subscription.)
You’ve already seen how to call a Mashape API in the “Playing audio when call is picked up” step above. If you’re wondering where the url and X-Mashape-Authorization values came from, you can inspect the curl request from the Mashape test console, as below:
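For reference, here is a small sketch of how those two values fit into a node.js request. The endpoint URL and key are placeholders, and `mashapeRequestOptions` is a hypothetical helper; the only detail taken from the tutorial is that the key travels in the `X-Mashape-Authorization` header.

```javascript
// Builds an options object suitable for https.request(options, callback)
// from Node's built-in https module.
function mashapeRequestOptions(apiUrl, mashapeKey) {
  const url = new URL(apiUrl);
  return {
    method: 'GET',
    hostname: url.hostname,
    path: url.pathname + url.search,
    headers: { 'X-Mashape-Authorization': mashapeKey }
  };
}

// Placeholder endpoint and key: copy the real ones from the curl request
// shown in the test console for the API you are calling.
const detectOptions = mashapeRequestOptions(
  'https://api.example.com/detect?text=Hola%20mundo',
  'YOUR_MASHAPE_KEY'
);
console.log(detectOptions.path); // "/detect?text=Hola%20mundo"
```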
As you’ve probably noticed, there are also code snippets for Java, PHP, Python, Obj-C, and Ruby. Go ahead and check them out.
Piecing them all together
So far, we’ve described the major components of our service. Hopefully this has given you a good understanding of how each part fits. We won’t go into the nitty-gritty of explaining each part line by line. Instead, you can check out the full source code on GitHub here.
The important thing to remember here is the flow from the user, to Twilio, to our node.js app, and back:
- User sends an SMS message to our Twilio number.
- Twilio calls our node.js app, passing along the SMS message and the sender’s number. (Lines 15–19 in index.js)
- Our node.js app calls Mashape’s language detection API to determine the language of the message. This tells us which language to translate from; the language code in the message tells us which language to translate to. The actual translation is done by the translation API. (Lines 43 and 64 in index.js)
- We send TwiML to Twilio to send the translated message to the user by SMS. (Line 91 in index.js)
- We also make a REST call to Twilio that initiates the call to the user. When the user picks up the call, we call the text-to-voice Mashape API to give us a public URL for the mp3 audio of the translated message. We then send TwiML back to Twilio that plays this mp3 to the user. (Lines 112 and 156 in index.js)
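The REST call in the last step can be sketched as follows. This is a minimal illustration rather than the code in index.js: `buildCallRequest` is a hypothetical helper, the credentials and numbers are placeholders, and it assumes Twilio’s Calls endpoint with HTTP Basic authentication (Account SID and Auth Token).

```javascript
// Builds the pieces of a POST to Twilio's Calls resource. The 'Url' field
// points at the route in our app that returns the TwiML to run on pickup.
function buildCallRequest(accountSid, authToken, from, to, twimlUrl) {
  const body = new URLSearchParams({ From: from, To: to, Url: twimlUrl }).toString();
  const auth = Buffer.from(accountSid + ':' + authToken).toString('base64');
  return {
    options: {
      method: 'POST',
      hostname: 'api.twilio.com',
      path: '/2010-04-01/Accounts/' + encodeURIComponent(accountSid) + '/Calls.json',
      headers: {
        'Authorization': 'Basic ' + auth,
        'Content-Type': 'application/x-www-form-urlencoded',
        'Content-Length': Buffer.byteLength(body)
      }
    },
    body: body
  };
}

// Placeholder values for illustration only.
const callRequest = buildCallRequest(
  'ACXXXXXXXX', 'your_auth_token',
  '+14159929984', '+15550001111',
  'https://example.com/voice'
);
console.log(callRequest.options.path); // "/2010-04-01/Accounts/ACXXXXXXXX/Calls.json"
```

To actually place the call, pass `callRequest.options` to `https.request` from Node’s https module and write `callRequest.body` to the request before ending it.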
I hope this tutorial was useful. Please send me a note at firstname.lastname@example.org if you have questions, comments, or feedback! Happy coding!