In this tutorial you’ll leverage Twilio Programmable Voice to receive phone calls at your Twilio phone number, and transcribe any voice messages left by the caller. This guide can be used as a foundation to build your own voicemail system.
To get started with this tutorial, you’ll need the following:
- A free Twilio account (sign up with this link and get $10 in free credit when you upgrade to a paid account).
- A Twilio phone number.
In this section you are going to set up a brand new Flask project. To keep things nicely organized, open a terminal or command prompt, find a suitable place and create a new directory where the project you are about to create will live:
mkdir python-flask-transcription
cd python-flask-transcription
Create a virtual environment
Following Python best practices, you are now going to create a virtual environment, where you are going to install the Python dependencies needed for this project.
If you are using a Unix or Mac OS system, open a terminal and enter the following commands to create and activate your virtual environment:
python3 -m venv venv
source venv/bin/activate
If you are following the tutorial on Windows, enter the following commands in a command prompt window:
python -m venv venv
venv\Scripts\activate
Now you are ready to install the Python dependencies used by this project:
pip install flask twilio pyngrok python-dotenv
The four Python packages that are needed by this project are:
- The Flask framework, to create the web application that will receive incoming call notifications from Twilio.
- The Twilio Python Helper library, to generate TwiML responses and to work with the Twilio REST API.
- Pyngrok, to make the Flask application temporarily accessible on the Internet for testing via the ngrok utility.
- The python-dotenv package, to read a configuration file.
Set up a development Flask server
Make sure that you are currently in the virtual environment of your project’s directory in the terminal or command prompt. Since we will be using Flask throughout the project, we need to set up the development server. Add a .flaskenv file (make sure you have the leading dot) to your project with the following lines:
FLASK_APP=app.py
FLASK_ENV=development
These incredibly helpful lines will save you time when it comes to testing and debugging your project.
FLASK_APP tells the Flask framework where our application is located.
FLASK_ENV configures Flask to run in debug mode.
Run the command flask run in your terminal to start the Flask framework.
After you run the command flask run, the console shows that the service is running privately on your computer’s port 5000 and is waiting for incoming connections there. You will also notice that debugging mode is active. When in this mode, the Flask server will automatically restart to incorporate any further changes you make to the source code.
However, since you don't have an app.py file yet, nothing will happen. Still, this is a good indicator that everything is installed properly.
Feel free to have Flask running in the background as you explore the code. We will be testing the entire project at the end.
Authenticate against Twilio Services
We need to safely store some important credentials that will be used to authenticate against the Twilio services.
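Create a file named .env in your working directory and paste the following text, replacing the placeholders with your own values:
TWILIO_ACCOUNT_SID=<your account SID>
TWILIO_AUTH_TOKEN=<your auth token>
Look for the values of the TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN variables on the Twilio Console and add them to the .env file.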
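When it is created without arguments, the Twilio client used later in this tutorial reads these two variables from the environment, so a typo in the .env file only shows up as an authentication error at call time. The following is a minimal sketch of an early sanity check. The helper name missing_credentials is hypothetical, and in a real script you would call load_dotenv() first so that the .env values are present in the environment.

```python
import os

# Names of the variables the tutorial stores in the .env file.
REQUIRED_VARS = ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing from environment:", ", ".join(missing))
    else:
        print("Twilio credentials found.")
```

Running the check before starting the Flask server gives a clearer message than a failed API request would.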
Start an ngrok tunnel
The problem with the Flask web server is that it is local, which means that it cannot be accessed over the Internet. Twilio needs to send web requests to this server, so during development, a trick is necessary to make the local server available on the Internet.
On a second terminal window, activate the virtual environment and then run the following command:
ngrok http 5000
The ngrok screen shows, among other details, a temporary forwarding URL that maps to port 5000 on your computer.
While ngrok is running, you can access the application from anywhere in the world using the temporary forwarding URL shown in the output of the command. All web requests that arrive into the ngrok URL will be forwarded to the Flask application by ngrok.
Record an incoming call
Twilio uses the concept of webhooks to handle any incoming calls to your Twilio phone number.
Create a file named app.py and paste the following code:
from dotenv import load_dotenv
from flask import Flask, request
from twilio.twiml.voice_response import VoiceResponse

# Load the variables stored in the .env file into the environment.
load_dotenv()

app = Flask(__name__)


@app.route("/record", methods=["POST"])
def record():
    """Answer an incoming call with TwiML that records a voicemail."""
    response = VoiceResponse()
    if "RecordingSid" not in request.form:
        # First request for this call: greet the caller and start recording.
        response.say("Hello, please leave your message after the tone.")
        response.record(transcribe=True)
    else:
        # Twilio requests this URL again once the recording ends; end the call.
        print("Hanging up...")
        response.hangup()
    return str(response)


if __name__ == "__main__":
    app.run()
The record() function defines a response object using the Twilio library's VoiceResponse() helper class. The code references a few TwiML verbs, such as say, record, and hangup, in order to control the call flow.
You'll learn more about the record verb in the next section.
The TwiML <Record> verb
Before I dive into the TwiML <Record> verb, it’s important to mention that recording phone calls or voice messages has a variety of legal considerations and you must ensure that you’re adhering to local, state, and federal laws when recording anything.
The code above first creates a new variable called response that holds a reference to a new TwiML Voice Response object.
TwiML, which stands for Twilio Markup Language, is XML that has special tags defined by Twilio. You can use TwiML to tell Twilio how to handle an incoming phone call or SMS. Instead of writing XML, you can also write TwiML programmatically, which is what you’re doing in this function.
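To make the link between the helper calls and the underlying XML more concrete, here is a sketch that builds roughly equivalent TwiML by hand using only the Python standard library. The function name build_record_twiml is hypothetical, and the real helper library's output may differ in details such as the XML declaration or attribute formatting.

```python
import xml.etree.ElementTree as ET


def build_record_twiml(prompt):
    """Build a TwiML document that speaks a prompt, then records the caller.

    This mirrors the say() and record(transcribe=True) calls in the
    tutorial's route, but writes the XML elements directly.
    """
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say")
    say.text = prompt
    # transcribe="true" asks Twilio to transcribe the recorded message.
    ET.SubElement(response, "Record", {"transcribe": "true"})
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_record_twiml("Hello, please leave your message after the tone."))
```

Writing the XML by hand quickly becomes error-prone, which is one reason to prefer the helper class in the actual application.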
The <Record> verb will create an audio recording of anything the caller says after the call connects, and it can be modified with a number of different attributes. The attributes most relevant for this tutorial are transcribe and transcribeCallback.
transcribe is an optional attribute that, when included and set to true, will tell Twilio to create a speech-to-text transcription of any message left by the caller, with the caveat that the message has to be between 2 and 120 seconds in length. This means that some very short messages and very long messages will not be transcribed, though the actual audio recordings of the message will not be impacted.
The content of the transcription will be stored by Twilio for you, and can be accessed via the transcription API.
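The duration limits mentioned above can be expressed as a small check, which might be handy if you want to tell callers or log entries apart based on whether a transcription is expected. This is only a sketch: the function name is hypothetical, and treating both boundaries as inclusive is an assumption rather than something this tutorial confirms.

```python
# Duration window, in seconds, within which a message is transcribed.
MIN_TRANSCRIBE_SECONDS = 2
MAX_TRANSCRIBE_SECONDS = 120


def is_transcribable(duration_seconds):
    """Return True if a recording of this length falls inside the window.

    Assumption: both ends of the 2 to 120 second range are inclusive.
    """
    return MIN_TRANSCRIBE_SECONDS <= duration_seconds <= MAX_TRANSCRIBE_SECONDS


if __name__ == "__main__":
    for seconds in (1, 30, 150):
        print(seconds, is_transcribable(seconds))
```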
Alternatively, you can provide a transcription callback to the <Record> verb that will execute when the transcription is finished. In this callback, you can access the contents of the transcription and perform an action on it, like save it to a database or print it to a webpage.
If you use the transcribeCallback attribute, you don’t also need to set the transcribe attribute to true.
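As an illustration of what such a callback might do with the data it receives, the sketch below pulls the relevant fields out of the POST parameters that Twilio sends. The function name summarize_transcription is hypothetical, and the function takes any dict-like object so that it can be tried without a running web server.

```python
def summarize_transcription(form):
    """Turn a transcription callback payload into a one-line summary.

    `form` is any mapping of the POST parameters, for example
    Flask's request.form inside a view function.
    """
    status = form.get("TranscriptionStatus", "unknown")
    sid = form.get("TranscriptionSid", "")
    text = form.get("TranscriptionText", "")
    if status != "completed":
        # Anything other than "completed" means there is no usable text.
        return f"Transcription {sid or '(no sid)'} not available (status: {status})"
    return f"Transcription {sid}: {text}"


if __name__ == "__main__":
    sample = {
        "TranscriptionStatus": "completed",
        "TranscriptionSid": "TR123",  # placeholder value for illustration
        "TranscriptionText": "Hi, call me back.",
    }
    print(summarize_transcription(sample))
```

In a Flask application you could call this from a POST route and pass that route's public URL to response.record() through its transcribe_callback argument. This tutorial takes a different approach and fetches the transcription through the REST API.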
This brings you to your next step: creating the transcription callback function.
Add the transcription callback function
Create a new file named transcribe.py and paste the following code:
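from dotenv import load_dotenv
from twilio.rest import Client

# Load TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN from the .env file.
load_dotenv()


def message():
    """Print the text of the first transcription returned and return its sid."""
    # Client() reads the credentials from the environment.
    client = Client()
    # list() returns a Python list, even when limit=1.
    transcriptions = client.transcriptions.list(limit=1)
    if not transcriptions:
        print("No transcriptions found yet.")
        return ""
    transcription = transcriptions[0]
    sid = transcription.sid
    t = client.transcriptions(sid).fetch()
    print(t.transcription_text)
    return str(sid)


if __name__ == '__main__':
    message()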
In this file, you create a Twilio client object in order to fetch the transcription of the phone call. The client requests the list of transcriptions from the Twilio REST API, limited to a single entry, and the most recent transcription is stored in the transcription variable. Then you read the sid of that transcription in order to fetch the text of the voicemail.
Configure the webhook for your Twilio phone number
Make sure the Flask server and ngrok are still running. You will need to assign the ngrok URL to the Twilio phone number before testing out the app in the next step.
Go to the Twilio Console and find the phone number you’re using for this tutorial in the list to open the configuration page for that number.
Scroll down until you see a section titled Voice & Fax.
Make the following adjustments to the information shown in this section:
- For Accept Incoming, select Voice Calls
- For Configure With, select Webhooks, TwiML Bins, Functions, Studio, or Proxy
- For A Call Comes In, select Webhook
On the same line as A Call Comes In, paste the temporary ngrok URL with "/record" appended at the end. Remember to leave the request method set to "HTTP POST".
After making these changes, click the Save button.
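A stray or missing slash when pasting the URL is an easy mistake to make. The sketch below shows one way to assemble the webhook address from the forwarding URL. The function name is hypothetical, and the domain in the example is a placeholder, not a real ngrok address.

```python
def webhook_url(forwarding_url, path="/record"):
    """Join the ngrok forwarding URL and the Flask route path.

    Trailing and leading slashes are normalized so the result
    always contains exactly one slash between the two parts.
    """
    return f"{forwarding_url.rstrip('/')}/{path.lstrip('/')}"


if __name__ == "__main__":
    # Placeholder domain; use the forwarding URL printed by ngrok instead.
    print(webhook_url("https://abc123.example.com/"))
```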
Test your app
Call your Twilio phone number from your personal phone. You’ll hear a beep after which you can speak into the phone and say a few words. Make sure you speak for at least a few seconds to ensure that there is enough content for the transcription to be triggered. After leaving your message, hang up the call.
On a third terminal window, activate the virtual environment and then run the transcribe.py script with the following command:
python transcribe.py
Wait a second to see your transcribed message show up on the terminal.
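Because the transcription may not be ready the moment the call ends, you might prefer to poll for it instead of rerunning the script by hand. The following sketch retries a fetch until the transcription reports a final status. The function name wait_for_transcription and its parameters are hypothetical, and it assumes the fetched object exposes status and transcription_text attributes, as the Twilio transcription resource does.

```python
import time


def wait_for_transcription(fetch, attempts=5, delay=2.0, sleep=time.sleep):
    """Call fetch() until the transcription is finished or attempts run out.

    `fetch` is any zero-argument callable returning an object with
    `status` and `transcription_text` attributes. Returns the text on
    success, or None if the transcription failed or never completed.
    """
    for attempt in range(attempts):
        transcription = fetch()
        if transcription.status == "completed":
            return transcription.transcription_text
        if transcription.status == "failed":
            return None
        # Still in progress: pause before the next attempt, if any remain.
        if attempt < attempts - 1:
            sleep(delay)
    return None
```

With the client and sid from transcribe.py, a call could look like wait_for_transcription(lambda: client.transcriptions(sid).fetch()). The sleep parameter exists only so that the wait can be replaced in tests.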
Congratulations, now that you’ve learned how to record transcriptions, what will you do next?
Diane Phan is a Developer Network editor on the Developer Voices team. She loves to help programmers tackle difficult challenges that might prevent them from bringing their projects to life. She can be reached at dphan [at] twilio.com or LinkedIn.