
Build a Google Analytics Slack Bot with Python


Google Analytics is an incredibly powerful tool. Every member of your team can see everything from the sources of your web traffic to the demographics that frequent your site. There’s just one problem:

Nobody is willing to go to the Google Analytics site and look!

If these features aren’t used they may as well not exist. So, to give teammates easier access you can make a custom Slackbot to display Google Analytics.

What you’ll need to build the Analytics bot

  • Make sure that your starterbot.py file looks just like this one. In this tutorial most of the changes we make will be to its handle_command() function.
  • A Google Analytics account connected to your site and permission on it to manage users. This is free until you exceed 10 million hits a month.
  • The numpy, statsmodels, matplotlib, slackclient, oauth2client, and Google API client dependencies. The apiclient module imported later ships with google-api-python-client, so it needs no separate install:

pip3 install numpy
pip3 install statsmodels
pip3 install matplotlib
pip3 install slackclient
pip3 install oauth2client
pip3 install --upgrade google-api-python-client

Display pageviews for the week

We are going to start with the bot from this prerequisite tutorial, so make sure that you followed this tutorial correctly before moving on. The bot should be able to respond to commands.

To connect the bot to Google Analytics we need to import some libraries at the top of starterbot.py. The rest of the code changes in this tutorial will take place in this file.

from apiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials
from googleapiclient.errors import HttpError

Next, define the credentials needed for our Google Analytics account. Our bot only needs to read the information, so let’s set the scope to read-only. We need to create a service account, which can be done by following the first section of Google’s Reporting API instructions. Make sure to save the Key File; we’ll be using it later.

When you finish, note the “email” given to your bot’s service account.

This is important because we have to give this account read access to the Google Analytics data under Google Analytics → Admin → User Management.

Next, get the View_ID through Google’s Account Explorer.

Now, using the Key File and View_ID, define the following constants:

SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
KEY_FILE_LOCATION = 'exampleKey.json'
VIEW_ID = 'example'

We can use the variables defined above to initialize an analytics object.

def initialize_analyticsreporting():
  credentials = ServiceAccountCredentials.from_json_keyfile_name(
      KEY_FILE_LOCATION, SCOPES)
  # The Reporting API v4 is published under the service name 'analyticsreporting'
  analytics = build('analyticsreporting', 'v4', credentials=credentials)
  return analytics

We can then use this analytics object in a new function, count(), to return the number of pageviews on your site over the last seven days.

def count():
  analytics = initialize_analyticsreporting()
  response = analytics.reports().batchGet(
      body={
        'reportRequests': [
        {
          'viewId': VIEW_ID,
          'dateRanges': [{'startDate': '7daysAgo', 'endDate': 'today'}],
          'metrics': [{'expression': 'ga:pageviews'}]
        }]
      }
  ).execute()
  answer = response['reports'][0]['data']['totals'][0]['values'][0]
  return answer

This function makes an API call that follows the Google Analytics Reporting API and stores the result in response. From this batch report, the first value under ‘totals’ is returned.

To see what the original batch report looks like you can add

print(response)

between the lines where response is set and answer is returned.

You can also see the original batch report through Google’s API Explorer. I already set up the request to match the one used in count(), so all you have to do is add your VIEW_ID, authorize access to the Google Account associated with Google Analytics, and click “execute.” This explorer can also be a great way to look into Google Analytics features beyond this tutorial.
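For orientation, here is a rough sketch of what a trimmed batch report looks like and how the lookup in count() walks through it. The sample dictionary is illustrative only: the numbers are made up, some fields are left out, and extract_total() is a hypothetical helper name rather than part of the bot.

```python
# Illustrative response only: the numbers are made up, and fields such as
# minimums and maximums are omitted for brevity.
sample_response = {
    'reports': [{
        'columnHeader': {
            'metricHeader': {
                'metricHeaderEntries': [
                    {'name': 'ga:pageviews', 'type': 'INTEGER'}
                ]
            }
        },
        'data': {
            'rows': [{'metrics': [{'values': ['1234']}]}],
            'totals': [{'values': ['1234']}],
            'rowCount': 1
        }
    }]
}


def extract_total(response):
    """Return the first total of the first report (hypothetical helper)."""
    return response['reports'][0]['data']['totals'][0]['values'][0]


print(extract_total(sample_response))
```

Note that the API returns metric values as strings, so convert them with int() or float() before doing any arithmetic.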

Finally, we call count() as a command by putting a new conditional in the handle_command() function.

def handle_command(command, channel):
    """
        Executes bot command if the command is known
    """
    # Default response is help text for the user
    default_response = "Not sure what you mean. Try *{}*.".format(EXAMPLE_COMMAND)

    # Finds and executes the given command, filling in response
    response = None
    # This is where you start to implement more commands!
    if command.startswith(EXAMPLE_COMMAND):
        response = "Sure...write some more code then I can do that!"
    if command.startswith("count"):
        response = '`{} pageviews!`'.format(count())
    # Sends the response back to the channel
    slack_client.api_call(
        "chat.postMessage",
        channel=channel,
        text=response or default_response
    )

Now your bot can handle “@BOTNAME count”

You can test this by running starterbot.py the same way that you ran it in the prerequisite tutorial:

python starterbot.py

When it is running, message it in Slack “@BOTNAME count”

Display any metric for the week

This shows us the proof of concept that our bot is connected to Google Analytics, but we’ve barely scratched the surface of what we can do. So let’s change the command to display any metric for the week.

First change count() to handle different metrics:

def count(metric):
    analytics = initialize_analyticsreporting()
    response = analytics.reports().batchGet(
        body={
            'reportRequests': [
            {
                'viewId': VIEW_ID,
                'dateRanges': [{'startDate': "7daysAgo", 'endDate': "today"}],
                'metrics': [{'expression': 'ga:{}'.format(metric)}]
            }]
        }
    ).execute()
    answer = response['reports'][0]['data']['totals'][0]['values'][0]
    return answer

Next we’ll change our condition in handle_command() to correctly call our new report. This is done by recognizing the metric in the command somewhere in the count condition before you call count().

def handle_command(command, channel):
    """
        Executes bot command if the command is known
    """
    # Default response is help text for the user
    default_response = "Not sure what you mean. Try *{}*.".format(EXAMPLE_COMMAND)

    # Finds and executes the given command, filling in response
    response = None
    # This is where you start to implement more commands!
    if command.startswith(EXAMPLE_COMMAND):
        response = "Sure...write some more code then I can do that!"
    elif command.startswith("count"): 
        metric = command.split()[1]
        response = '`{} pageviews!`'.format(count())
    # Sends the response back to the channel
    slack_client.api_call(
        "chat.postMessage",
        channel=channel,
        text=response or default_response
    )

Change the response line that calls the report

def handle_command(command, channel):
    """
        Executes bot command if the command is known
    """
    # Default response is help text for the user
    default_response = "Not sure what you mean. Try *{}*.".format(EXAMPLE_COMMAND)

    # Finds and executes the given command, filling in response
    response = None
    # This is where you start to implement more commands!
    if command.startswith(EXAMPLE_COMMAND):
        response = "Sure...write some more code then I can do that!"
    elif command.startswith("count"): 
        metric = command.split()[1]
        response = '`{} {}!`'.format(count(metric), metric)
    # Sends the response back to the channel
    slack_client.api_call(
        "chat.postMessage",
        channel=channel,
        text=response or default_response
    )

Now your bot can handle “@BOTNAME count [metric]”

We just went from access to one metric to access to over one hundred, greatly expanding the capabilities of the bot. You can find a full list of metrics offered by Google Analytics in the documentation. It can be overwhelming at first, so I found this reference on commonly used metrics a helpful starting point.
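One rough edge in the count condition is that command.split()[1] raises an IndexError when someone types just “count.” A minimal sketch of a guard, assuming a hypothetical helper called parse_metric() that is not part of the original bot:

```python
def parse_metric(command):
    """Return the word after 'count', or None if the user left it out.

    Hypothetical helper: the name and the None convention are assumptions.
    """
    words = command.split()
    if len(words) < 2:
        return None
    return words[1]


print(parse_metric('count sessions'))
print(parse_metric('count'))
```

In the count condition you could then reply with a hint such as “Count what?” whenever parse_metric() returns None, instead of letting the bot crash.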

Display any metric for any time

The next step is to allow the command to display any metric for any time. After all, without the ability to look at data for previous weeks the command is significantly limited.

To do this, all we have to do is change count() to handle different dates.

def count(metric, command):
    start_date = "7daysAgo"
    end_date = "today"
    # Look for whole words so that "to" is not found inside e.g. "visitors"
    words = command.split()
    if 'from' in words:
        pos = words.index('from')
        start_date = words[pos+1]
    if 'to' in words:
        pos = words.index('to')
        end_date = words[pos+1]
    analytics = initialize_analyticsreporting()
    response = analytics.reports().batchGet(
        body={
            'reportRequests': [
            {
                'viewId': VIEW_ID,
                'dateRanges': [{'startDate': start_date, 'endDate': end_date}],
                'metrics': [{'expression': 'ga:{}'.format(metric)}]
            }]
        }
    ).execute()
    answer = response['reports'][0]['data']['totals'][0]['values'][0]
    return answer

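Now your bot can handle “@BOTNAME count [metric]” with the optional specification of date (“from ___ to ____”). Remember to pass the command along when you call count() in handle_command(), that is, count(metric, command).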
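The date handling can also be pulled out into its own function, which makes it easy to try without calling Google at all. This is a sketch under assumptions: parse_date_range() and its default arguments are hypothetical names, and it matches whole words so that a metric such as “visitors” (which contains the letters “to”) does not trip the lookup.

```python
def parse_date_range(command, default_start='7daysAgo', default_end='today'):
    """Pull optional 'from X' and 'to Y' dates out of a command string.

    Hypothetical helper; falls back to the defaults when a keyword is
    missing or is the last word of the command.
    """
    words = command.split()
    start_date, end_date = default_start, default_end
    if 'from' in words:
        pos = words.index('from')
        if pos + 1 < len(words):
            start_date = words[pos + 1]
    if 'to' in words:
        pos = words.index('to')
        if pos + 1 < len(words):
            end_date = words[pos + 1]
    return start_date, end_date


print(parse_date_range('count pageviews from 30daysAgo to yesterday'))
print(parse_date_range('count visitors'))
```

The values are passed to the API unchanged, so they still need to be in one of the formats Google accepts (today, yesterday, NdaysAgo, or YYYY-MM-DD).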

Display a graph

At this point the bot can handle single metric displays with reasonable flexibility. But returning a single number – regardless of how cool – isn’t going to captivate your coworkers for long. Let’s make a graph!

First we need to import another library, matplotlib, a plotting library. This is done at the top of the program along with the other libraries we imported.

import matplotlib

Importing this library gave me trouble when I tried to run the bot on a virtual server, because by default matplotlib is configured to work with a graphical user interface. The solution is to use the Agg backend, which you select by calling matplotlib.use('Agg') before importing the rest of the matplotlib modules. The graph code later on also uses numpy and textwrap, so import them here too.

matplotlib.use('Agg')
import matplotlib.pylab as pl
import matplotlib.lines as ln
import matplotlib.pyplot as plt
import numpy as np
import textwrap

To graph with matplotlib we need lists for the X and Y coordinates, so let’s create a new count function to do just that.

The first part will be similar to our old count function:

def count_xy(metric, dimension, command):
    start_date = "7daysAgo"
    end_date = "today"
    # Look for whole words so that "to" is not found inside e.g. "visitors"
    words = command.split()
    if 'from' in words:
        pos = words.index('from')
        start_date = words[pos+1]
    if 'to' in words:
        pos = words.index('to')
        end_date = words[pos+1]
    analytics = initialize_analyticsreporting()
    response = analytics.reports().batchGet(
        body={
            'reportRequests': [
            {
                'viewId': VIEW_ID,
                'dateRanges': [{'startDate': start_date, 'endDate': end_date}],
                'metrics': [{'expression': 'ga:{}'.format(metric)}],
                'dimensions': [{'name':'ga:{}'.format(dimension)}]

            }]
        }
    ).execute()

Unlike in count(), we want to take the row of data rather than the total

    answer = response['reports'][0]['data']['rows']

The next step matters if you want your x-axis values arranged properly. It looks at the first dimension value returned by the Google Analytics query to see what type of dimension is being used. If the value isn’t made of digits, the rows get sorted from highest metric value to lowest. This way graphs of pageviews per page will show the most viewed pages first, while pageviews by date retains its order.

    if not answer[0]['dimensions'][0].isdigit():
        answer = sorted(answer, key=lambda x: float(x['metrics'][0]['values'][0]), reverse=True)

Lastly, add the values from answer to yArray and xArray, and return both:

    yArray=[]
    for step in range(0, len(answer)):
        yArray.append(float(answer[step]['metrics'][0]['values'][0]))

    xArray=[]
    for step in range(0, len(answer)):
        xArray.append(answer[step]['dimensions'][0])

    return xArray, yArray
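To see the sorting and the list-building in isolation, here is a small sketch that runs on made-up rows instead of a live query. The sample data and the function name rows_to_xy() are assumptions for illustration; the logic mirrors the tail end of count_xy().

```python
# Made-up rows in the shape returned under response['reports'][0]['data']['rows']
sample_rows = [
    {'dimensions': ['Home'], 'metrics': [{'values': ['120']}]},
    {'dimensions': ['Blog'], 'metrics': [{'values': ['340']}]},
    {'dimensions': ['About'], 'metrics': [{'values': ['45']}]},
]


def rows_to_xy(rows):
    """Turn report rows into parallel x and y lists (hypothetical helper)."""
    # Non-numeric dimensions (page titles, sources) are ranked by metric value;
    # numeric ones (dates, days) keep the order the API returned them in.
    if rows and not rows[0]['dimensions'][0].isdigit():
        rows = sorted(rows,
                      key=lambda row: float(row['metrics'][0]['values'][0]),
                      reverse=True)
    x_values = [row['dimensions'][0] for row in rows]
    y_values = [float(row['metrics'][0]['values'][0]) for row in rows]
    return x_values, y_values


print(rows_to_xy(sample_rows))
```

Unlike the snippet in count_xy(), this version also returns two empty lists when the query comes back with no rows, rather than raising an IndexError.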

Now let’s add a ‘graph’ condition in handle_command() to handle the new command.

def handle_command(command, channel):
    """
        Executes bot command if the command is known
    """
    # Default response is help text for the user
    default_response = "Not sure what you mean. Try *{}*.".format(EXAMPLE_COMMAND)

    # Finds and executes the given command, filling in response
    response = None
    # This is where you start to implement more commands!
    if command.startswith(EXAMPLE_COMMAND):
        response = "Sure...write some more code then I can do that!"
    elif command.startswith("count"): 
        metric = command.split()[1]
        response = '`{} {}!`'.format(count(metric, command), metric)
    elif command.split()[0] == 'graph':
        pass  # placeholder: the graph logic built in the next steps goes here
    # Sends the response back to the channel
    slack_client.api_call(
        "chat.postMessage",
        channel=channel,
        text=response or default_response
    )

In this condition we will get the data from count_xy(), build a graph from it using the matplotlib library, and format the graph.

The first step is to parse the command, similar to what we did in the count condition. But rather than just assigning the second word to “metric,” we have to pick out several words whose positions we don’t know in advance.

So in the ‘graph’ condition, first check if the command contains anything beyond the word “graph.” We’ll add an alternate response that requests more information if it doesn’t.

        if len(command.split())>1:

Then we assign the metric, the same way that we did in count()

            metric = command.split()[1]

To accommodate for multiple command formats we can find the dimension through a keyword rather than its word number. This way it doesn’t matter if the command is “@BOTNAME graph pageview by day” or “@BOTNAME graph pageview and do it by day,” the dimension will be correctly identified.

To do this, make a list of the words in the command by splitting on whitespace, find the index of the keyword (in this case “by”), and set the word that comes after “by” as the dimension.

            words = command.split()
            if 'by' in words and len(words)>3:
                pos = words.index('by')
                dimension = words[pos+1]
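The keyword lookup is easy to check on its own. The sketch below assumes a hypothetical helper named find_dimension(); it returns None when “by” is missing or is the last word, which the graph condition could use to ask the user for more information.

```python
def find_dimension(command, keyword='by'):
    """Return the word that follows the keyword, or None (hypothetical helper)."""
    words = command.split()
    if keyword in words:
        pos = words.index(keyword)
        if pos + 1 < len(words):
            return words[pos + 1]
    return None


print(find_dimension('graph pageviews by day'))
print(find_dimension('graph pageviews and do it by day'))
print(find_dimension('graph pageviews'))
```

Because the check runs against the list of words rather than the raw string, a word that merely contains the letters “by” is not mistaken for the keyword.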

Call count_xy() to get the data.

                x, y=count_xy(metric, dimension, command)

And check if the values for the dimensions are not digits.

                if not x[0].isdigit():

If the dimension isn’t displayed in digits, such as pageTitle or source, matplotlib won’t name the ticks the way that we want, so we can do it manually by setting the x-axis ticks to be named after the values in our x array. We will put this in the not digit condition, because matplotlib can handle the ticks for digit arrays automatically.

First we create an array for the xticks, and fill it with up to the first seven x values:

                    xtick_names = x[:7]

Then we set the x-labels to wrap instead of overlapping each other and set our other arrays to be the same size as xticks; if there is a different number of ticks than values we will get an error when we try to graph.

                    my_xticks = [textwrap.fill(text,15) for text in xtick_names]
                    x = np.arange(len(my_xticks))
                    plt.xticks(x, my_xticks, rotation=45)
                    y = np.array(y[:len(my_xticks)])
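The trimming and wrapping can be tried without matplotlib at all. This is a sketch under assumptions: build_ticks() and its limit and width parameters are hypothetical names, chosen to match the seven ticks and 15-character wrap used in the bot.

```python
import textwrap


def build_ticks(x_values, y_values, limit=7, width=15):
    """Return tick positions, wrapped labels, and matching y values.

    Hypothetical helper: keeps at most `limit` points and wraps each
    label so no line is longer than `width` characters (for words that fit).
    """
    labels = [textwrap.fill(text, width) for text in x_values[:limit]]
    positions = list(range(len(labels)))
    heights = list(y_values[:len(labels)])
    return positions, labels, heights


positions, labels, heights = build_ticks(
    ['Home', 'A long page title to wrap', 'Blog'], [3.0, 2.0, 1.0])
print(positions)
print(labels)
print(heights)
```

All three lists always come back the same length, which is the property matplotlib needs when the ticks and the data are plotted together.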

Next we do some stylistic changes. The matplotlib documentation is a good place to start to learn how to customize your graphs. This is how I did it.

                # Back at the level of the isdigit check, so every graph gets plotted
                pl.plot(x, y, "r-") # plots the graph with red lines
                plt.ylim(bottom=0) # sets the minimum y value to 0
                pl.grid(True, linestyle='-.') # makes a dash-dot grid in the background
                plt.xlabel(dimension.capitalize()) # capitalizes the first letter of the x-axis label
                plt.ylabel(metric.capitalize()) # capitalizes the first letter of the y-axis label
                plt.title(metric.capitalize()+' by '+dimension.capitalize()) # sets the title as [Metric] by [Dimension]
                plt.tight_layout() # adjusts the padding so the labels fit inside the figure

We send the graph by saving it as a .png file and using Slack’s API command to upload files.

                pl.savefig("graph.png") # saves the graph as a png file
                with open('graph.png', 'rb') as graph_file: # closes the file once the upload finishes
                    slack_client.api_call('files.upload', channels=channel, filename='graph.png', file=graph_file) # uploads the png file to slack
                pl.close()

I also added two else statements to make the command easier to use.

            else: #run if the command has no "by" keyword or is three words or fewer
                response='`What should {} be graphed by?`'.format(metric)
        else: #run if the command is just the word "graph"
            response='`Graph what?`'

Now your bot can handle “@BOTNAME graph [metric] by [dimension]” with the optional specification of date (“from ___ to ____”)

You can find a full list of dimensions offered by Google Analytics on the same page that we found metrics. It also includes links to an explanation of what each keyword means. I found this page very helpful for understanding the difference between metrics and dimensions.

Display list of possible commands

So at this point your bot can handle a variety of commands. Let’s create another conditional for handle_command() in addition to ‘count’ and ‘graph’ to help users remember them.

    elif command.split()[0] == 'help':
        response = '`Count ____ (from ____ to ____)` \n`Graph ____[Metric] by ____[Dimension] (from ____ to ____)` \n`(Dates: today / yesterday / NdaysAgo / YYYY-MM-DD)` \n`(Metrics: pageviews / adsenserevenue / <https://developers.google.com/analytics/devguides/reporting/core/dimsmets|more...>)` \n`(Dimensions: day / source / author / <https://developers.google.com/analytics/devguides/reporting/core/dimsmets|more...>)`'

So now “@BOTNAME help” will return the list of optional commands

We’re done! Check out the GitHub repo to see how the entire program should look when it’s all put together.

Next steps: usability and more commands for the Google Analytics Python Slack bot

The great thing about a custom bot is that there is tons of potential for more commands. To increase user-friendliness I recommend integrating a basic spell check on metrics or dimensions. This can be performed by listing potential options in a text file and running get_close_matches() from the difflib library:

import difflib
with open('metricList.txt') as inputfile:
    metricList = inputfile.read().split(', ')
    metric = command.split()[1]
    metric = difflib.get_close_matches(metric, metricList, n=1)
    metric = ''.join(metric)

I copied my text from metricList.txt and dimensionList.txt. They aren’t comprehensive, but do cover the terms that my team uses and then some.
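If you want to try the spell check before wiring it into the bot, here is a self-contained sketch that uses an in-memory list instead of a text file. The short KNOWN_METRICS list and the correct_metric() name are assumptions for illustration; difflib.get_close_matches() is the same standard-library call used above.

```python
import difflib

# Assumed, abbreviated list; the real bot would load its terms from metricList.txt
KNOWN_METRICS = ['pageviews', 'sessions', 'users', 'bounces']


def correct_metric(word, known=KNOWN_METRICS):
    """Return the closest known metric, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(word, known, n=1)
    return matches[0] if matches else None


print(correct_metric('pageveiws'))
print(correct_metric('zzzz'))
```

Returning None for an unrecognized word lets the bot answer with a hint rather than sending an empty metric name to Google. Keep in mind that the comparison is case-sensitive, so the list should use the same capitalization your team types.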

Commands for statistical analysis are another good next step. The analysis can be done using statsmodels and count_xy(), making your bot even better at showing its users the key information that they need.

Other commands can explore options beyond Google Analytics integration. I recently added reminders to my bot by using the os, time, and datetime libraries; storing the reminder, its channel, and its time to send in a .txt file; and then comparing it to the current time periodically.

Have fun!

Greg Schwartz is the Director of Technology at STEMY, a student-led nonprofit dedicated to spreading STEM education. He also manages the RedEye website and does school work in his free time. You can contact him at greg@stemy.org
