If you plant me anywhere in Manhattan, I can confidently tell you where the nearest bubble tea place is located. This may be because I have a lot of them memorized, but for the times my memory betrays me, luckily I have the boba map on my data blog. In this tutorial, we’ll use a combination of Python, the Google Maps API, and geojsonio to create what can only be described as the most important tool in the world: a boba map.
Environment & Dependencies
We have to set our environment up before we start coding. This guide was written in Python 3.6. If you haven’t already, download Python and Pip. Next, you’ll need to install several packages that we’ll use throughout this tutorial on the command line in our project directory:
pip3 install googlemaps==2.4.6
pip3 install geocoder==1.22.4
pip3 install geojsonio==0.0.3
pip3 install pandas==0.20.1
pip3 install geopandas==0.2.1
pip3 install Shapely==1.5.17.post1
We’ll use the Google Maps API, so make sure to generate an API key. Since we’ll be working with Python throughout, using the Jupyter Notebook is the best way to get the most out of this tutorial. Once you have your notebook up and running, you can download all the data for this post from GitHub. Make sure you have the data in the same directory as your notebook and then we’re good to go!
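Rather than pasting the key straight into the notebook, one option is to read it from an environment variable. This is only a minimal sketch: the variable name GOOGLE_MAPS_API_KEY and the helper load_api_key are assumptions made for illustration and are not part of the googlemaps package.

```python
import os


def load_api_key(var_name="GOOGLE_MAPS_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {} environment variable before running the notebook.".format(var_name)
        )
    return key
```

The returned string could then be passed as the key argument wherever this tutorial writes a placeholder key.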
For this task, we’re going to take an object-oriented programming approach. We’ll create a class called BubbleTea to take care of the processing and methods we’ll need for our map. To accomplish this, we’ll begin by using the googlemaps API module to initialize our authentication and pandas, a nice data analytics library, to read in the CSV.
import pandas as pd
import googlemaps
class BubbleTea(object):
    # authentication initialized
    gmaps = googlemaps.Client(key='[your-own-key]')

    def __init__(self, filename):
        self.boba = pd.read_csv(filename)
In the code sample above, the googlemaps initialization comes before the constructor since this API key shouldn’t necessarily change. In the constructor, however, we need the filename of the boba places as a parameter so we can use pandas to read it in as a DataFrame.
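To illustrate why the placement matters: anything assigned in the class body is created once and shared by every instance, while anything assigned in __init__ belongs to a single instance. Here is a minimal sketch of that difference. FakeClient and Shop are hypothetical stand-ins so the example runs without an API key.

```python
class FakeClient(object):
    """Hypothetical stand-in for googlemaps.Client, used only for illustration."""

    def __init__(self, key):
        self.key = key


class Shop(object):
    # Class attribute: built once, when the class body is executed.
    client = FakeClient(key="your-key")

    def __init__(self, filename):
        # Instance attribute: set separately for each object.
        self.filename = filename


first = Shop("boba.csv")
second = Shop("coffee.csv")

shares_client = first.client is second.client      # same object on both instances
own_filenames = first.filename != second.filename  # each instance keeps its own value
```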
Just so that we know what we’re working with let’s take a look at the file containing bubble tea places:
import pandas as pd
pd.read_csv("./boba.csv").head()
 | Name | Address |
---|---|---|
0 | Boba Guys | 11 Waverly Pl New York, NY 10002 |
1 | Bubble Tea & Crepes | 251 5th Ave, New York, NY 10016 |
2 | Bubbly Tea | 55B Bayard St New York, NY 10013 |
3 | Cafe East | 2920 Broadway, New York, NY 10027 |
4 | Coco Bubble Tea | 129 E 45th St New York, NY 10017 |
As you can see, it’s just a simple DataFrame containing two columns, one with the name of the bubble tea place and another one with its address.
To visualize each bubble tea place as a point on a map we have to convert the addresses into coordinates. Eventually, we’ll use these coordinates to create shapely Point geospatial objects.
Let’s review how these coordinates are obtained. Because we don’t have the latitude or the longitude, we’ll use the geocoder and googlemaps modules to request the coordinates. Below you can see the API request with geocoder.google(). As a parameter, we provide the address, which will be used to create the geospatial object. For this example I’ve used the address of a building at Columbia University.
import googlemaps
import geocoder
gmaps = googlemaps.Client(key='your-key')
geocoder.google("2920 Broadway, New York, NY 10027")
Which displays the following output:
<[OK] Google - Geocode [Alfred Lerner Hall, 2920 Broadway, New York, NY 10027, USA]>
This geospatial object has multiple attributes you can utilize. For the purpose of this tutorial, we’ll be using the lat and lng attributes.
geocoder.google("2920 Broadway, New York, NY 10027").lat
geocoder.google("2920 Broadway, New York, NY 10027").lng
Outputs:
40.8069421
-73.9639939
Let’s use the code we’ve reviewed above to add three columns to our boba DataFrame: Lat, Longitude, and Coordinates. The calc_coords method below creates the latitude and longitude columns and then uses them to create a Point geospatial object for each row with shapely, a library that lets us manipulate geometric objects.
import pandas as pd
import geocoder
import googlemaps
from shapely.geometry import Point
from geopandas import GeoDataFrame
from geojsonio import display
class BubbleTea(object):
    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initializes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    # new code here
    def calc_coords(self):
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]
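Note that calc_coords as written geocodes every address twice, once for the latitude and once for the longitude. One possible variation is to geocode each address a single time and read both attributes from the same result. The sketch below assumes a geocode callable that returns an object with lat and lng attributes, as geocoder.google does in this tutorial. FakeResult, fake_geocode, and lookup_coords are hypothetical names so the example runs offline.

```python
from collections import namedtuple

# Hypothetical stand-in for a geocoder result; only lat and lng are modelled.
FakeResult = namedtuple("FakeResult", ["lat", "lng"])


def lookup_coords(addresses, geocode):
    """Geocode each address once and return (longitude, latitude) pairs."""
    pairs = []
    for address in addresses:
        result = geocode(address)  # a single request per address
        pairs.append((result.lng, result.lat))
    return pairs


calls = []


def fake_geocode(address):
    """Record the call and return fixed coordinates instead of hitting an API."""
    calls.append(address)
    return FakeResult(lat=40.8069421, lng=-73.9639939)


coords = lookup_coords(["2920 Broadway, New York, NY 10027"], fake_geocode)
```

The pairs come back in (longitude, latitude) order, which matches the order calc_coords passes to Point.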
The final step for this project is to visualize the geospatial data using geojsonio. But to use geojsonio, we need to convert the DataFrame above into geojson format. At first glance you may be worried since our original data was in a CSV format. Never fear, however, for we can convert this with a few lines of code. More specifically, we’ll create three get methods that our visualize function needs in order to work.
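For orientation, here is roughly what the target format looks like. The sketch builds a GeoJSON FeatureCollection by hand using only the standard library. The tutorial itself relies on the GeoDataFrame's to_json() for this step, so make_feature_collection is purely an illustrative helper and not part of any library. Note that GeoJSON positions list longitude first, then latitude.

```python
import json


def make_feature_collection(places):
    """Build a GeoJSON FeatureCollection from (name, lat, lng) tuples."""
    features = []
    for name, lat, lng in places:
        features.append({
            "type": "Feature",
            "properties": {"Name": name},
            # GeoJSON positions are [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lng, lat]},
        })
    return {"type": "FeatureCollection", "features": features}


collection = make_feature_collection([("Cafe East", 40.8069421, -73.9639939)])
geojson_text = json.dumps(collection)
```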
The function get_geo returns the coordinates as a list:
import pandas as pd
import geocoder
import googlemaps
from shapely.geometry import Point
from geopandas import GeoDataFrame
from geojsonio import display
class BubbleTea(object):
    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initializes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    def calc_coords(self):
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

    # new code below
    def get_geo(self):
        return list(self.boba['Coordinates'])
The get_names() function returns the Name column as a Series.
import pandas as pd
import geocoder
import googlemaps
from shapely.geometry import Point
from geopandas import GeoDataFrame
from geojsonio import display
class BubbleTea(object):
    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initializes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    def calc_coords(self):
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

    def get_geo(self):
        return list(self.boba['Coordinates'])

    # new code below
    def get_names(self):
        return self.boba['Name']
And finally, get_gdf converts all the data into a GeoDataFrame and returns it. This is where we use the two previous methods, since the first parameter (the data) takes the names as a Series and the geometry parameter requires a list.
from geopandas import GeoDataFrame
import pandas as pd
import geocoder
import googlemaps
from shapely.geometry import Point
from geojsonio import display
class BubbleTea(object):
    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initializes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    def calc_coords(self):
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

    def get_geo(self):
        return list(self.boba['Coordinates'])

    def get_names(self):
        return self.boba['Name']

    # new code below
    def get_gdf(self):
        crs = {'init': 'epsg:4326'}
        return GeoDataFrame(self.get_names(), crs=crs, geometry=self.get_geo())
Great! Now let’s use geojsonio for some boba fun! With all our helper methods implemented, we can use them to deploy our visualization with geojsonio’s display function.
from geopandas import GeoDataFrame
import pandas as pd
import geocoder
import googlemaps
from shapely.geometry import Point
from geojsonio import display
class BubbleTea(object):
    # authentication initialized
    gmaps = googlemaps.Client(key='your-key')

    # filename: file with list of bubble tea places and addresses
    def __init__(self, filename):
        # initializes csv with list of bubble tea places to dataframe
        self.boba = pd.read_csv(filename)

    def calc_coords(self):
        self.boba['Lat'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lat)
        self.boba['Longitude'] = self.boba['Address'].apply(geocoder.google).apply(lambda x: x.lng)
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]

    def get_geo(self):
        return list(self.boba['Coordinates'])

    def get_names(self):
        return self.boba['Name']

    def get_gdf(self):
        crs = {'init': 'epsg:4326'}
        return GeoDataFrame(self.get_names(), crs=crs, geometry=self.get_geo())

    # new code here
    def visualize(self):
        self.boba['Coordinates'] = [Point(xy) for xy in zip(self.boba.Longitude, self.boba.Lat)]
        updated = self.get_gdf()
        display(updated.to_json())
And we’ve done it: our BubbleTea object is finished and ready to be used. Continuing in the same Jupyter notebook, we’ll use the code we just created to build the map. We initialize the class with our boba file in a new notebook cell. Remember, this only runs what’s in the constructor, so for now we only have a pandas DataFrame; the GeoDataFrame has not yet been created.
boba = BubbleTea("./boba.csv")
Next we call the calc_coords method. Recall that this method makes API calls to Google Maps for the latitude and longitude and then combines these two columns into shapely Point objects.
Because of the many Google Maps API calls, expect this to take a while. In the meantime, spend some time reading this awesome post.
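If you expect to rerun the notebook, one way to avoid repeating the slow step is to cache results on disk. The sketch below is an illustration under stated assumptions rather than part of the original class: cached_geocode and the cache file are hypothetical, and the real lookup function is passed in as the geocode parameter, which is assumed to return an object with lat and lng attributes.

```python
import json
import os


def cached_geocode(address, geocode, cache_path):
    """Return [lat, lng] for an address, consulting a JSON cache file first.

    `geocode` is any callable returning an object with lat and lng attributes.
    """
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as handle:
            cache = json.load(handle)

    if address not in cache:
        result = geocode(address)  # only reached on a cache miss
        cache[address] = [result.lat, result.lng]
        with open(cache_path, "w") as handle:
            json.dump(cache, handle)

    return cache[address]
```

Re-reading the file on every call keeps the sketch short. For a list of a few dozen shops that overhead is small, but a longer list would probably want the cache loaded once up front.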
boba.calc_coords()
The longest part is over! Now we’re ready for our awesome boba map:
boba.visualize()
This gets us a beautiful interactive map we can then use for whatever purpose we like. I chose to include it in a GitHub Gist, but geojsonio has a lot of different ways of sharing your content, so feel free to choose whatever fits your needs.
And that’s a wrap! If you’d like to learn more about geospatial analysis, check out the following resources:
GeoJSON
OpenStreetMap
CartoDB
If you liked what you did here, follow me @lesleyclovesyou on Twitter for more content, data science ramblings, and most importantly, retweets of super cute puppies.