Data Science and Linear Algebra Fundamentals with Python, SciPy, & NumPy
Math is relevant to software engineering, but it is often overshadowed by all of the exciting tools and technologies. In the field of data science, however, being familiar with linear algebra and statistics is very important to statistical analysis and prediction. In this tutorial, we’ll use SciPy and NumPy to learn some of the fundamentals of linear algebra and statistics.
We’ll be using Python to show how different statistical concepts can be applied computationally. Specifically, we’ll work with NumPy, a scientific computing module for Python.
This guide was written in Python 3.6. If you haven’t already, download Python and Pip. Once you have Python and Pip installed, clone this repo using Git as follows:
git clone email@example.com:lesley2958/dod-math.git
The Git repository contains all of the data you’ll need for this tutorial.
Next, install the NumPy, SciPy, and ...
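As a small taste of the linear algebra this tutorial works toward, here is a minimal sketch that solves a system of two linear equations with NumPy. The matrix and vector values are made up for illustration and do not come from the tutorial's repository.

```python
import numpy as np

# Made-up system of equations (illustration only):
#   2x + 1y = 3
#   1x + 3y = 5
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Solve A @ x = b for the unknown vector x
x = np.linalg.solve(A, b)

# Sanity check: plugging x back in should reproduce b
residual = A @ x - b

print("solution:", x)
print("residual is ~0:", np.allclose(residual, 0.0))
```

Using `np.linalg.solve` rather than computing the inverse of `A` by hand is the usual choice, since it is both faster and numerically more stable.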
Build a Bot Powered Slack Game with Python
One of my all-time favorite Facebook groups is “DogSpotting.” For those of you unfamiliar with this revolutionary group, it’s a Facebook group dedicated to posting pictures of random dogs you see as you go along your regular day. There are tons of “spotting” rules, but any way you slice it, this group is awesome.
Using this model for inspiration, I built a Slack bot for a college student group I was involved in once upon a time. We named it ADI Spotting and dedicated an entire Slack channel to posting “spottings” of whenever we’d see each other on campus, outside of our own events and meetings. In this tutorial, I will walk you through the steps to create this bot on your own Slack organization.
Python Environment Setup
But before we even get started, we have to set our environment up. This guide was written in Python 3 ...
Embedding Maps with Python & Plotly
Data Visualization is an art form. Whether it be a simple line graph or complex objects like wordclouds or sunbursts, there are countless tools across different programming languages and platforms. The field of geospatial analysis is no exception. In this tutorial we’ll build a map visualization of the United States Electoral College using Python’s plotly module and a Jupyter Notebook.
Python Visualization Environment Setup
This guide was written in Python 3.6. If you haven’t already, download Python and Pip. Next, you’ll need to install the plotly module that we’ll use throughout this tutorial. You can do this by running the following in the terminal or command prompt on your operating system:
pip3 install plotly==2.0.9
pip3 install jupyter==1.0.0
Since we’ll be working with Python interactively, using the Jupyter Notebook is the best way to get the most out ...
Making Sentiment Analysis Easy With Scikit-Learn
Sentiment analysis uses computational tools to determine the emotional tone behind words. Python has a bunch of handy libraries for statistics and machine learning, so in this post we’ll use Scikit-learn to learn how to add sentiment analysis to our applications.
Sentiment analysis isn’t a new concept. There are thousands of labeled datasets out there, with labels varying from simple positive and negative to more complex systems that determine how positive or negative a given text is.
For this post, we’ll use a dataset of tweets that are already labeled as positive or negative. Using this data, we’ll build a model that categorizes any tweet as either positive or negative with Scikit-learn.
Scikit-learn is a Python module with built-in machine learning algorithms. In this tutorial, we’ll specifically use the Logistic Regression model, which is a linear model commonly used for classifying binary data ...
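To make the idea concrete, here is a minimal sketch of binary text classification with Scikit-learn's `LogisticRegression`. The six "tweets" and their labels are invented for illustration and are not the pre-labeled dataset the post uses. Turning text into word counts with `CountVectorizer` is one common choice assumed here, not necessarily the post's own approach.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Invented training data (illustration only): 1 = positive, 0 = negative
train_texts = [
    "I love this so much",
    "what a great day",
    "this made me really happy",
    "I hate this so much",
    "what an awful day",
    "this made me really sad",
]
train_labels = [1, 1, 1, 0, 0, 0]

# Convert each text into a vector of word counts
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# Fit a logistic regression classifier on the count vectors
model = LogisticRegression()
model.fit(X_train, train_labels)

# Classify unseen text using the same vocabulary
new_texts = ["such a great and happy day", "this is awful"]
X_new = vectorizer.transform(new_texts)
predictions = model.predict(X_new)           # one 0/1 label per text
probabilities = model.predict_proba(X_new)   # one [P(0), P(1)] row per text

for text, label in zip(new_texts, predictions):
    print(text, "->", "positive" if label == 1 else "negative")
```

With so little training data the predictions are not reliable. The point is only the shape of the workflow: vectorize, fit, then predict.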
Basic Statistics in Python with NumPy and Jupyter Notebook
While not all data science relies on statistics, a lot of the exciting topics like machine learning or analysis rely on statistical concepts. In this tutorial, we’ll learn how to calculate introductory statistics in Python.
What is Statistics?
Statistics is a discipline that uses data to support claims about populations. These “populations” are what we refer to as “distributions.” Most statistical analysis is based on probability, which is why these pieces are usually presented together. More often than not, you’ll see courses labeled “Intro to Probability and Statistics” rather than separate intro to probability and intro to statistics courses. This is because probability is the study of random events, or the study of how likely it is that some event will happen.
Let’s use Python to show how different statistical concepts can be applied computationally. We’ll work with NumPy, a scientific computing module in ...
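As a rough sketch of the kind of introductory statistics NumPy can compute, the example below finds the mean, median, and standard deviation of a small sample. The data values are made up for illustration.

```python
import numpy as np

# Made-up sample of eight observations (illustration only)
data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

mean = np.mean(data)               # arithmetic average
median = np.median(data)           # middle value of the sorted data
std_population = np.std(data)      # population standard deviation (divides by n)
std_sample = np.std(data, ddof=1)  # sample standard deviation (divides by n - 1)

print("mean:", mean)
print("median:", median)
print("population std:", std_population)
print("sample std:", std_sample)
```

Note that `np.std` divides by `n` by default. Passing `ddof=1` gives the sample standard deviation, which is usually what you want when the data is a sample drawn from a larger population.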
How to Build A Boba Tea Shop Finder with Python, Google Maps and GeoJSON
If you plant me anywhere in Manhattan, I can confidently tell you where the nearest bubble tea place is located. This may be because I have a lot of them memorized, but for the times my memory betrays me, luckily I have the boba map on my data blog. In this tutorial, we’ll use a combination of Python, the Google Maps API, and geojsonio to create what can only be described as the most important tool in the world: a boba map.
Environment & Dependencies
We have to set our environment up before we start coding. This guide was written in Python 3.6. If you haven’t already, download Python and Pip. Next, you’ll need to install several packages that we’ll use throughout this tutorial on the command line in our project directory:
pip3 install googlemaps==2.4.6 pip3 install geocoder==1.22.4 pip3 ...
Analyzing Messy Data Sentiment with Python and nltk
Sentiment analysis uses computational tools to determine the emotional tone behind words. This approach can be important because it allows you to gain an understanding of the attitudes, opinions, and emotions of the people in your data.
At a higher level, sentiment analysis involves natural language processing and artificial intelligence by taking the text element, transforming it into a format that a machine can read, and using statistics to determine the actual sentiment.
In this tutorial, we’ll use the natural language processing module, nltk, to determine the sentiment of tweets from Twitter.
Sentiment analysis on text
Sentiment analysis isn’t a new concept. There are thousands of labeled datasets out there, with labels varying from simple positive and negative to more complex systems that determine how positive or negative a given text is. Because there’s so much ambiguity within how textual data is labeled, there’s no one way ...
What’s in your Pocket? Visualizing your Reading List with Python
I’m going to give you a little bit of a spoiler alert: I’ve read the equivalent of about 14 books this past year. Now, I’m not a cover-to-cover novel reading person; I consume most of my content in the form of articles and tutorials. So while I’m feverishly reading all the time, I never have a sense of how much I’m actually reading. After all, it’s not like I’m exactly keeping track of how many articles I’m reading.
But I could! What I didn’t mention is that my reading flow is almost completely through Pocket. For those of you who don’t know, Pocket is a convenient way to save content (whether that be in the form of articles, video, etc) for later use. This is especially important to me because it gives me an easy way of viewing content while ...
Getting Started on Geospatial Analysis with Python, GeoJSON and GeoPandas
As a native New Yorker, I would be a mess without Google Maps every single time I go anywhere outside the city. We take products like Google Maps for granted, but they’re an important convenience. Products like Google or Apple Maps are built on foundations of geospatial technology. At the center of these technologies are locations, their interactions and roles in a greater ecosystem of location services.
This field is referred to as geospatial analysis. Geospatial analysis applies statistical analysis to data that has geographical or geometrical components. In this tutorial, we’ll use Python to learn the basics of acquiring geospatial data, handling it, and visualizing it. More specifically, we’ll do some interactive visualizations of the United States!
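To give a sense of what handling geospatial data looks like, here is a minimal sketch that builds a GeoJSON `FeatureCollection` of points using only Python's standard `json` module, then computes a bounding box. The place list, the approximate coordinates, and the `to_feature` helper are illustrative assumptions, not part of the tutorial. GeoJSON stores coordinates as `[longitude, latitude]`, in that order.

```python
import json

# Illustrative places with approximate coordinates: (name, longitude, latitude)
places = [
    ("New York City", -74.0060, 40.7128),
    ("Washington, D.C.", -77.0369, 38.9072),
]

def to_feature(name, lon, lat):
    """Hypothetical helper: wrap one point in a GeoJSON Feature."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name},
    }

collection = {
    "type": "FeatureCollection",
    "features": [to_feature(*place) for place in places],
}

# Serialize to GeoJSON text, then parse it back as if read from a file
geojson_text = json.dumps(collection, indent=2)
loaded = json.loads(geojson_text)

# Bounding box as (min lon, min lat, max lon, max lat)
lons = [f["geometry"]["coordinates"][0] for f in loaded["features"]]
lats = [f["geometry"]["coordinates"][1] for f in loaded["features"]]
bbox = (min(lons), min(lats), max(lons), max(lats))

print(geojson_text)
print("bounding box:", bbox)
```

Libraries such as GeoPandas read this same structure into a table of geometries, so understanding the raw format makes their output easier to interpret.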