How to Build A Boba Tea Shop Finder with Python, Google Maps and GeoJSON

September 22, 2017
Written by
Lesley Cordero
Contributor
Opinions expressed by Twilio contributors are their own


If you plant me anywhere in Manhattan, I can confidently tell you where the nearest bubble tea place is located. This may be because I have a lot of them memorized, but for the times my memory betrays me, I luckily have the boba map on my data blog. In this tutorial, we’ll use a combination of Python, the Google Maps API, and geojsonio to create what can only be described as the most important tool in the world: a boba map.

Environment & Dependencies

We have to set our environment up before we start coding. This guide was written in Python 3.6. If you haven’t already, download Python and Pip. Next, from the command line in your project directory, install the packages we’ll use throughout this tutorial:

pip3 install googlemaps==2.4.6
pip3 install geocoder==1.22.4
pip3 install geojsonio==0.0.3
pip3 install pandas==0.20.1
pip3 install geopandas==0.2.1
pip3 install Shapely==1.5.17.post1

We’ll use the Google Maps API, so make sure to generate an API key. Since we’ll be working with Python throughout, using the Jupyter Notebook is the best way to get the most out of this tutorial. Once you have your notebook up and running, you can download all the data for this post from GitHub. Make sure you have the data in the same directory as your notebook and then we’re good to go!

For this task, we’re going to take an object-oriented programming approach. We’ll create a class called BubbleTea to take care of the processing and methods we’ll need for our map. To accomplish this, we’ll begin by using the googlemaps API module to initialize our authentication and pandas, a nice data analytics library, to read in the CSV.

import pandas as pd 
import googlemaps

class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='[your-own-key]')

    def __init__(self, filename):
        self.boba = pd.read_csv(filename)

In the code sample above, the googlemaps client is initialized as a class attribute, before the constructor, since the API key typically doesn’t change between instances. In the constructor, however, we need the filename of the boba places as a parameter so we can use pandas to read it in as a DataFrame.
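
Hard-coding the key in the class body works for a quick notebook, but it is easy to leak when the notebook is shared. One possible alternative, sketched below, reads the key from an environment variable. The variable name GOOGLE_MAPS_API_KEY and the helper load_api_key are assumptions made for illustration; neither is part of the googlemaps library.

```python
import os


def load_api_key(env=None, name="GOOGLE_MAPS_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing.

    `name` is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key


# Demonstration with a stand-in mapping instead of the real environment.
print(load_api_key({"GOOGLE_MAPS_API_KEY": "example-key"}))
```

The returned value could then be passed along as googlemaps.Client(key=load_api_key()).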

Just so that we know what we’re working with let’s take a look at the file containing bubble tea places:

import pandas as pd
pd.read_csv("./boba.csv").head()

   Name                  Address
0  Boba Guys             11 Waverly Pl New York, NY 10002
1  Bubble Tea & Crepes   251 5th Ave, New York, NY 10016
2  Bubbly Tea            55B Bayard St New York, NY 10013
3  Cafe East             2920 Broadway, New York, NY 10027
4  Coco Bubble Tea       129 E 45th St New York, NY 10017

As you can see, it’s just a simple DataFrame containing two columns, one with the name of the bubble tea place and another one with its address.
To visualize each bubble tea place as a point on a map we have to convert the addresses into coordinates. Eventually, we’ll use these coordinates to create shapely Point geospatial objects.

Let’s review how these coordinates are obtained. Because we don’t have the latitude or the longitude we’ll use the geocoder and googlemaps modules to request the coordinates. Below you can see the API request with geocoder.google(). As a parameter, we provide the address which will be used to create the geospatial object. For this example I’ve used the address of a building at Columbia University.

import googlemaps
import geocoder

gmaps = googlemaps.Client(key='your-key')

geocoder.google("2920 Broadway, New York, NY 10027")

Which displays the following output:

<[OK] Google - Geocode [Alfred Lerner Hall, 2920 Broadway, New York, NY 10027, USA]>

This geospatial object has multiple attributes you can utilize. For the purpose of this tutorial, we’ll be using the lat and lng attributes.

geocoder.google("2920 Broadway, New York, NY 10027").lat
geocoder.google("2920 Broadway, New York, NY 10027").lng

Outputs:

40.8069421
-73.9639939

Let’s use the code we’ve reviewed above to add three columns to our boba DataFrame: Lat, Longitude, and Coordinates. The calc_coords method below creates the Lat and Longitude columns and then uses them to create Point geospatial objects with shapely, a library that lets us manipulate geometric objects.

import pandas as pd 
import geocoder 
import googlemaps
from shapely.geometry import Point
from geopandas import GeoDataFrame
from geojsonio import display


class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initializes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    # new code here
    def calc_coords(self): 
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]
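
As written, calc_coords geocodes every address twice, once for the latitude and once for the longitude, which doubles the number of API requests. A minimal sketch of one way to avoid that is to look each address up once and reuse the result. Here fake_geocode is a hypothetical stand-in for geocoder.google so the example runs without a key or network access, and geocode_once is a name invented for this sketch.

```python
from collections import namedtuple

# Stand-in for the object returned by geocoder.google(); only lat/lng are modeled.
Result = namedtuple("Result", ["lat", "lng"])

calls = []  # records every lookup so we can count requests


def fake_geocode(address):
    """Hypothetical replacement for geocoder.google(address)."""
    calls.append(address)
    return Result(lat=40.8069421, lng=-73.9639939)


def geocode_once(addresses, geocode=fake_geocode):
    """Return (lats, lngs), calling `geocode` a single time per address."""
    results = [geocode(address) for address in addresses]
    lats = [r.lat for r in results]
    lngs = [r.lng for r in results]
    return lats, lngs


lats, lngs = geocode_once(["2920 Broadway, New York, NY 10027"])
print(lats, lngs, len(calls))
```

In the class, the two lists could be assigned to self.boba['Lat'] and self.boba['Longitude'] in place of the two apply chains.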

The final step for this project is to visualize the geospatial data using geojsonio. But to use geojsonio, we need to convert the DataFrame above into geojson format. At first glance you may be worried since our original data was in a CSV format. Never fear, however, for we can convert this with a few lines of code. More specifically we’ll create three get methods for our visualize function to work.
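
To make the target format concrete, here is a rough sketch of a GeoJSON FeatureCollection built by hand with the standard library, using the Columbia coordinates from earlier. The helper to_feature_collection is hypothetical; the class itself relies on the GeoDataFrame's to_json method, whose output has the same overall shape. Note that GeoJSON positions are ordered longitude first, then latitude.

```python
import json


def to_feature_collection(places):
    """Build a GeoJSON FeatureCollection from (name, lat, lng) tuples."""
    features = []
    for name, lat, lng in places:
        features.append({
            "type": "Feature",
            "properties": {"Name": name},
            # GeoJSON positions are [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lng, lat]},
        })
    return {"type": "FeatureCollection", "features": features}


collection = to_feature_collection([("Cafe East", 40.8069421, -73.9639939)])
print(json.dumps(collection, indent=2))
```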

The function get_geo returns the coordinates as a list:

import pandas as pd 
import geocoder 
import googlemaps
from shapely.geometry import Point
from geopandas import GeoDataFrame
from geojsonio import display


class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initializes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)


    def calc_coords(self): 
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

    # new code below
    def get_geo(self):
        return(list(self.boba['Coordinates']))

The get_names() function returns the Name column as a pandas Series:

import pandas as pd 
import geocoder 
import googlemaps
from shapely.geometry import Point
from geopandas import GeoDataFrame
from geojsonio import display


class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initializes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)


    def calc_coords(self): 
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]


    def get_geo(self):
        return(list(self.boba['Coordinates']))


    # new code below
    def get_names(self):
        return(self.boba['Name'])

And finally, get_gdf combines all the data into a GeoDataFrame and returns it. This is where we use the two previous functions: the first argument takes the names as a Series, and the geometry parameter takes the list of Point objects.

from geopandas import GeoDataFrame
import pandas as pd 
import geocoder 
import googlemaps
from shapely.geometry import Point
from geojsonio import display


class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initializes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    def calc_coords(self): 
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

    def get_geo(self):
        return(list(self.boba['Coordinates']))

    def get_names(self):
        return(self.boba['Name'])


    # new code below
    def get_gdf(self):
        crs = {'init': 'epsg:4326'}
        return(GeoDataFrame(self.get_names(), crs=crs, geometry=self.get_geo()))
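
The crs value 'epsg:4326' identifies WGS 84, the coordinate system in which positions are given as latitude and longitude in degrees. Because Point takes its arguments as (x, y), longitude comes first. A small sanity check along these lines could flag out-of-range values before the GeoDataFrame is built; the function name is_valid_wgs84 is an assumption made for this sketch.

```python
def is_valid_wgs84(lng, lat):
    """Return True if (lng, lat) falls within valid WGS 84 degree ranges."""
    return -180.0 <= lng <= 180.0 and -90.0 <= lat <= 90.0


# Columbia University coordinates from earlier in the tutorial.
print(is_valid_wgs84(-73.9639939, 40.8069421))
# A latitude outside [-90, 90] is rejected.
print(is_valid_wgs84(40.8069421, -173.9639939))
```

A range check like this will not notice a latitude and longitude that were merely swapped when both still fall in range, so it is a first filter rather than a full validation.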

Great! Now let’s use geojsonio for some boba fun! With all our helper functions implemented, we can use them to deploy our visualization with geojsonio’s display function.

from geopandas import GeoDataFrame
import pandas as pd 
import geocoder 
import googlemaps
from shapely.geometry import Point
from geojsonio import display


class BubbleTea(object):

    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initializes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    def calc_coords(self): 
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

    def get_geo(self):
        return(list(self.boba['Coordinates']))

    def get_names(self):
        return(self.boba['Name'])

    def get_gdf(self):
        crs = {'init': 'epsg:4326'}
        return(GeoDataFrame(self.get_names(), crs=crs, geometry=self.get_geo()))


    # new code here
    def visualize(self):
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]
        updated = self.get_gdf()
        display(updated.to_json())

And we’ve done it, our BubbleTea object is finished and ready to be used. Continuing on in the same Jupyter notebook, we’ll use the code we just created to build the map. We initialize the class with our boba file in a new Jupyter notebook cell. Remember, this only runs what’s in the constructor, so as of now we only have a pandas DataFrame; the GeoDataFrame has not yet been created.

boba = BubbleTea("./boba.csv")

Next we call the calc_coords method. Recall that this function makes API calls to Google Maps for the latitude and longitude and then takes these two columns to convert to a shapely Point object.

Because of the many Google Maps API calls, expect this to take a while. In the meantime, spend some time reading this awesome post.

boba.calc_coords()
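
Some lookups may fail, for example when an address is misspelled or the request quota runs out. Below is a hedged sketch of how failed rows could be set aside before plotting. It assumes a failed lookup leaves the latitude or longitude as None (or NaN once stored in a DataFrame), which is worth confirming against your geocoder version; has_coords, split_geocoded, and the sample rows are hypothetical.

```python
import math


def has_coords(lat, lng):
    """True when both values are real numbers (not None and not NaN)."""
    for value in (lat, lng):
        if value is None or (isinstance(value, float) and math.isnan(value)):
            return False
    return True


def split_geocoded(rows):
    """Separate (name, lat, lng) rows into usable and failed lists."""
    good = [row for row in rows if has_coords(row[1], row[2])]
    failed = [row for row in rows if not has_coords(row[1], row[2])]
    return good, failed


rows = [
    ("Cafe East", 40.8069421, -73.9639939),
    ("Made-up Tea House", None, None),  # hypothetical failed lookup
]
good, failed = split_geocoded(rows)
print(len(good), "geocoded;", len(failed), "failed")
```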

The longest part is over! Now we’re ready for our awesome boba map:

boba.visualize()

This gets us a beautiful interactive map we can then use for whatever purpose we like. I chose to include it in a GitHub Gist, but geojsonio has a lot of different ways of sharing your content, so feel free to choose whatever fits your needs.

And that’s a wrap! If you’d like to learn more about geospatial analysis, check out the following resources:
GeoJSON 
OpenStreetMap 
CartoDB

If you liked what you did here, follow me @lesleyclovesyou on Twitter for more content, data science ramblings, and most importantly, retweets of super cute puppies.