Image recognition can seem like a pretty daunting technical challenge. Scraping images to use as training data for a machine learning model stresses me out. That’s where Clarifai comes in. This API is great for implementing image recognition so you can focus on the core functionality of what you are building.
Let’s build a Flask application in Python with Twilio MMS to receive picture messages over a phone number and respond with relevant keywords from Clarifai’s image recognition API.
Setting up your environment
Before moving on, make sure your Python environment is set up. Getting everything working correctly, especially virtual environments, is important for isolating your dependencies if you have multiple projects running on the same machine.
You can also run through this guide to make sure you’re good to go before moving on.
Now that your environment is set up, you’re going to need to install the libraries we’ll use for this app:
- Flask for our web framework
- Twilio’s Python library to interact with the Twilio API
- Clarifai’s Python library to interact with the Clarifai API for image recognition
First, navigate to the directory where you want this code to live and run the following command in your terminal with your virtual environment activated to install these dependencies:
pip install flask==0.12.2 twilio==6.4.2 clarifai==2.0.29
Image recognition with Clarifai
Let’s start by writing a module to interact with the Clarifai API. Before being able to use the Clarifai API, you’ll have to make an account. Once you have an account, you’ll need to create an application so you have an API key to use. You can name your application whatever you want.
Once you have an application, you need to set the following environment variable so the Clarifai Python module can use your API key to authenticate:
export CLARIFAI_API_KEY=*your API key*
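A quick check that the variable is actually visible to Python can save some confusion later. Here is a minimal sketch using only the standard library. The helper name require_api_key is hypothetical and is not part of the Clarifai library.

```python
import os


def require_api_key(name="CLARIFAI_API_KEY", environ=None):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the Clarifai library.
    """
    # Allow a custom mapping to be passed in so the check is easy to test.
    environ = os.environ if environ is None else environ
    key = environ.get(name, "").strip()
    if not key:
        raise RuntimeError(
            "{} is not set; export it before running your code".format(name)
        )
    return key
```

Calling require_api_key() at the top of a script would fail fast with a readable message if the export step was skipped in the current terminal session.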
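Now you can start writing some code. Create a file called tags.py and enter the following code:
from clarifai.rest import ClarifaiApp

# ClarifaiApp uses the CLARIFAI_API_KEY environment variable to authenticate.
app = ClarifaiApp()


def get_relevant_tags(image_url):
    # Ask Clarifai to tag the image at the given URL.
    response_data = app.tag_urls([image_url])

    # Collect the name of every concept returned for the image.
    tag_urls = []
    for concept in response_data['outputs'][0]['data']['concepts']:
        tag_urls.append(concept['name'])

    return tag_urls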
What we’re doing here is defining a function that hits the Clarifai API with an image URL and returns all of the tags or “concepts” associated with that image. Try running this code with an image of your choice from the Internet and seeing what happens. I’m going to use this picture from a show my old band played in Philly back in 2012.
Append the line print('\n'.join(get_relevant_tags('*image_url*'))) to your code and run it:
python tags.py
You should see some results printed to your terminal. Interestingly, Clarifai was able to figure out that my picture showed a group of people at a concert.
You can delete that line you just added before moving on.
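If the full list of tags feels noisy, one option is to keep only the ones Clarifai is most sure about. The sketch below assumes each concept carries a confidence score between 0 and 1 under a value key, so check the response you actually get back before relying on it. The sample dictionary is hand-written for illustration, not real API output, and filter_tags is a hypothetical helper.

```python
def filter_tags(response_data, min_confidence=0.9):
    """Return concept names whose confidence is at least min_confidence."""
    concepts = response_data['outputs'][0]['data']['concepts']
    # Concepts without a 'value' key are treated as zero confidence.
    return [c['name'] for c in concepts if c.get('value', 0) >= min_confidence]


# Hand-written sample shaped like a tagging response (not real output).
sample_response = {
    'outputs': [
        {'data': {'concepts': [
            {'name': 'people', 'value': 0.99},
            {'name': 'music', 'value': 0.97},
            {'name': 'outdoors', 'value': 0.42},
        ]}}
    ]
}

print('\n'.join(filter_tags(sample_response)))
```

With the sample above, only people and music clear the default threshold of 0.9. Lowering min_confidence lets more tags through.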
Setting up your Twilio account
Before being able to respond to picture messages, you’ll need a Twilio phone number. You can buy a phone number here.
Your Flask app will need to be visible from the internet in order for Twilio to send requests to it. We will use ngrok for this, which you’ll need to install if you don’t have it. In your terminal run the following command:
ngrok http 5000
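This provides us with a publicly accessible URL to the Flask app. In the Twilio console, configure your phone number so that incoming messages send an HTTP POST request to your ngrok URL with the /sms route appended.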
You are now ready to send a text message to your new Twilio number.
Building the Flask app
Now that you have a Twilio number and are able to grab relevant tags and keywords associated with an image, you want to allow users to text a phone number with their own pictures for your code to analyze.
Let’s create our Flask app. Open a new file called app.py and add the following code:
from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse

from tags import get_relevant_tags

app = Flask(__name__)


@app.route('/sms', methods=['POST'])
def sms_reply():
    # Create a MessagingResponse object to generate TwiML.
    resp = MessagingResponse()

    # See if the number of images in the text message is greater than zero.
    if request.form['NumMedia'] != '0':
        # Grab the image URL from the request body.
        image_url = request.form['MediaUrl0']
        relevant_tags = get_relevant_tags(image_url)
        # Reply with one tag per line.
        resp.message('\n'.join(relevant_tags))
    else:
        resp.message('Please send an image.')

    return str(resp)


if __name__ == '__main__':
    app.run()
We only need one route on this app: /sms to handle incoming text messages.
Run your code with the following terminal command:
python app.py
Now take a selfie (or any picture) and send it to your Twilio phone number to see if it will recognize what’s inside!
How does this all work?
With this app running on port 5000, sitting behind our public ngrok URL, Twilio can see your application. Upon receiving a text message:
- Twilio will send a POST request to the /sms route.
- Our sms_reply function will be called.
- The URL to the image in the text message is passed to our get_relevant_tags function.
- A request to the Clarifai API will be made, receiving a response with keywords associated with our image.
- The /sms route responds to Twilio’s request, telling Twilio to send a message back with the tags we received from the Clarifai API.
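To see that flow without a phone, Flask, or network access, here is a rough standard-library sketch of the same steps. It is not the app itself: handle_sms and fake_get_relevant_tags are hypothetical stand-ins, the media URL is a made-up placeholder, and the XML is built by hand to mimic the general shape of the TwiML that MessagingResponse generates.

```python
from urllib.parse import parse_qs
from xml.etree import ElementTree as ET


def fake_get_relevant_tags(image_url):
    """Stand-in for get_relevant_tags so the sketch runs offline."""
    return ['people', 'music', 'concert']


def handle_sms(raw_body, tagger=fake_get_relevant_tags):
    """Turn a form-encoded webhook body into a TwiML-style reply string."""
    # Webhook parameters arrive form-encoded; keep the first value of each.
    form = {key: values[0] for key, values in parse_qs(raw_body).items()}

    if form.get('NumMedia', '0') != '0':
        # Pass the first image URL to the tagger and reply with one tag per line.
        reply = '\n'.join(tagger(form['MediaUrl0']))
    else:
        reply = 'Please send an image.'

    # Build <Response><Message>...</Message></Response> by hand.
    response = ET.Element('Response')
    ET.SubElement(response, 'Message').text = reply
    return ET.tostring(response, encoding='unicode')


# Simulated request bodies: one with a picture attached, one without.
print(handle_sms('NumMedia=1&MediaUrl0=https%3A%2F%2Fexample.com%2Fshow.jpg'))
print(handle_sms('NumMedia=0'))
```

Passing the tagger in as an argument keeps the Clarifai call swappable, which makes the request handling easy to try out on its own.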
For more Clarifai related fun, check out this post on how to hack your gift giving. If you’re interested in trying your hand at image recognition on your own with OpenCV, you can check out this blog post written by Megan Speir.
Feel free to reach out if you have any questions or comments or just want to show off the cool stuff you’ve built.