Image recognition in Python with the Clarifai API and Twilio MMS

June 30, 2017
Written by
Sam Agnew
Twilion


Image recognition can seem like a pretty daunting technical challenge. Scraping images to use as training data for a machine learning model stresses me out. That’s where Clarifai comes in. This API is great for implementing image recognition so you can focus on the core functionality of what you are building.

Let’s build a Flask application in Python with Twilio MMS to receive picture messages over a phone number and respond with relevant keywords from Clarifai’s image recognition API.

Setting up your environment

Before moving on, make sure to have your Python environment set up. Getting everything working correctly, especially with respect to virtual environments, is important for isolating your dependencies if you have multiple projects running on the same machine.

You can also run through this guide to make sure you’re good to go before moving on.

Installing dependencies

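Now that your environment is set up, you’re going to need to install the libraries we’ll use for this app: Flask for the web server, the Twilio Python helper library for generating TwiML, and the Clarifai Python client for image recognition.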

First, navigate to the directory where you want this code to live and run the following command in your terminal with your virtual environment activated to install these dependencies:

pip install flask==0.12.2 twilio==6.4.2 clarifai==2.0.29

Image recognition with Clarifai

Let’s start by writing a module to interact with the Clarifai API. Before being able to use the Clarifai API, you’ll have to make an account. Once you have an account, you’ll need to create an application so you have an API key to use. You can name your application whatever you want.

ClarifaiAPIKey.gif

Once you have an application, you need to set the following environment variable so the Clarifai Python module can use your API key to authenticate:

export CLARIFAI_API_KEY=*your API key*
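If the variable is missing, the failure can surface later in a less obvious place. Here is a minimal sketch of a check you could run first. The helper name require_api_key is made up for this example and is not part of the Clarifai library; it only reads the environment and does not contact the API.

```python
import os


def require_api_key(environ=os.environ):
    """Return the Clarifai API key, or raise a readable error if it is unset.

    `require_api_key` is a hypothetical helper for this tutorial, not a
    Clarifai function. It only inspects the environment mapping it is given.
    """
    key = environ.get('CLARIFAI_API_KEY', '').strip()
    if not key:
        raise RuntimeError(
            'CLARIFAI_API_KEY is not set; export it before running.')
    return key
```

Passing the environment in as an argument keeps the check easy to try with a plain dictionary instead of your real shell environment.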

Now you can start writing some code. Create a file called tags.py and enter the following code:

from clarifai.rest import ClarifaiApp


app = ClarifaiApp()


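def get_relevant_tags(image_url):
    # Ask Clarifai to tag the image at this URL.
    response_data = app.tag_urls([image_url])

    # Collect the name of each concept returned for the first (and only) image.
    tags = []
    for concept in response_data['outputs'][0]['data']['concepts']:
        tags.append(concept['name'])

    return tags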

What we’re doing here is defining a function that hits the Clarifai API with an image URL and returns all of the tags or “concepts” associated with that image. Try running this code with an image of your choice from the Internet and seeing what happens. I’m going to use this picture from a show my old band played in Philly back in 2012.

394057_328648487179283_294009412_n.jpg

Append the line print('\n'.join(get_relevant_tags('*image_url*'))) to your code and run it:

python tags.py

You should see some results printing to your terminal. Here’s some of what I got. It’s interesting that Clarifai was basically able to figure out that this picture was of a group of people taken at a concert:

Screen Shot 2017-06-28 at 4.49.09 PM.png

You can delete that line you just added before moving on.
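If you would rather keep only the tags Clarifai is most sure about, the parsing step can be sketched on its own, without calling the API. The example below assumes each concept also carries a numeric confidence under a 'value' key; that field, the extract_tags helper, and the sample data are assumptions made for illustration, and the dictionary is hand-written rather than real API output.

```python
def extract_tags(response_data, min_value=0.0):
    """Return concept names from a Clarifai-style response dictionary.

    Assumes each concept has a 'name' and (optionally) a 'value' confidence
    score between 0 and 1. Concepts scoring below `min_value` are skipped.
    """
    concepts = response_data['outputs'][0]['data'].get('concepts', [])
    return [c['name'] for c in concepts if c.get('value', 0.0) >= min_value]


# Hand-written sample in the same shape tags.py reads; not real API output.
sample_response = {
    'outputs': [
        {'data': {'concepts': [
            {'name': 'people', 'value': 0.99},
            {'name': 'music', 'value': 0.97},
            {'name': 'festival', 'value': 0.85},
        ]}}
    ]
}

print(extract_tags(sample_response, min_value=0.9))
```

With the sample above and a threshold of 0.9, this prints ['people', 'music'].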

Setting up your Twilio account

Before being able to respond to picture messages, you’ll need a Twilio phone number. You can buy a phone number here.

Your Flask app will need to be visible from the internet in order for Twilio to send requests to it. We will use ngrok for this, which you’ll need to install if you don’t have it. In your terminal run the following command:

ngrok http 5000

This provides us with a publicly accessible URL to the Flask app. Configure your phone number so that incoming messages send an HTTP POST request to your ngrok URL followed by /sms, as seen in this image:

Screen Shot 2016-07-14 at 10.54.53 AM.png

You are now ready to send a text message to your new Twilio number.

Building the Flask app

Now that you have a Twilio number and are able to grab relevant tags and keywords associated with an image, you want to allow users to text a phone number with their own pictures for your code to analyze.

Let’s create our Flask app. Open a new file called app.py and add the following code:

from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse

from tags import get_relevant_tags


app = Flask(__name__)


@app.route('/sms', methods=['POST'])
def sms_reply():
    # Create a MessagingResponse object to generate TwiML.
    resp = MessagingResponse()

    # See if the number of images in the text message is greater than zero.
    if request.form['NumMedia'] != '0':

        # Grab the image URL from the request body.
        image_url = request.form['MediaUrl0']
        relevant_tags = get_relevant_tags(image_url)
        resp.message('\n'.join(relevant_tags))
    else:
        resp.message('Please send an image.')

    return str(resp)


if __name__ == '__main__':
    app.run()

We only need one route on this app: /sms to handle incoming text messages.
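The decision this route makes can be tried out without Flask, Twilio, or a phone. The sketch below pulls the branching into a plain function. The names build_reply_text and fake_tagger are hypothetical stand-ins for this example, and the stub returns canned tags instead of calling Clarifai.

```python
def build_reply_text(form, tagger):
    """Decide what to text back, given the form fields Twilio posts.

    `form` is any dict-like object with 'NumMedia' and 'MediaUrl0' keys.
    `tagger` is a callable that takes an image URL and returns a list of tags.
    """
    if form.get('NumMedia', '0') != '0':
        # At least one image was attached; tag the first one.
        return '\n'.join(tagger(form['MediaUrl0']))
    return 'Please send an image.'


def fake_tagger(image_url):
    # Stub standing in for get_relevant_tags so no network call is made.
    return ['people', 'music']


with_image = {'NumMedia': '1', 'MediaUrl0': 'https://example.com/photo.jpg'}
print(build_reply_text(with_image, fake_tagger))
print(build_reply_text({'NumMedia': '0'}, fake_tagger))
```

Note that NumMedia arrives as a string, which is why the comparison is against '0' rather than the number 0.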

Run your code with the following terminal command:

python app.py

Now take a selfie (or any picture) and send it to your Twilio phone number to see if it will recognize what’s inside!

How does this all work?

With this app running on port 5000, sitting behind our public ngrok URL, Twilio can see your application. Upon receiving a text message:

  1. Twilio will send a POST request to /sms.
  2. The sms_reply function will be called.
  3. The URL of the image in the text message is passed to our tags module.
  4. A request to the Clarifai API will be made, receiving a response with keywords associated with our image.
  5. Your /sms route responds to Twilio’s request telling Twilio to send a message back with the tags we received from the Clarifai API.
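The last step is easier to picture once you see what the route hands back. MessagingResponse generates TwiML, which is XML. The sketch below builds roughly the same shape using only the standard library, so you can look at it without the Twilio helper installed. The twiml_message function is a hypothetical illustration, not part of the twilio package, and the real library's output may differ in details such as an XML declaration.

```python
import xml.etree.ElementTree as ET


def twiml_message(body):
    """Build a minimal <Response><Message> document as a string.

    A stand-in for what MessagingResponse produces; special characters
    in `body` are escaped by ElementTree.
    """
    response = ET.Element('Response')
    message = ET.SubElement(response, 'Message')
    message.text = body
    return ET.tostring(response, encoding='unicode')


print(twiml_message('people\nmusic'))
```

Twilio reads the Message element in this reply and sends its text back to the phone that texted the picture.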

For more Clarifai-related fun, check out this post on how to hack your gift giving. If you’re interested in trying your hand at image recognition on your own with OpenCV, you can check out this blog post written by Megan Speir.

Feel free to reach out if you have any questions or comments or just want to show off the cool stuff you’ve built.