Generating music with Python and Neural Networks using Magenta for TensorFlow

December 18, 2018
Written by
Sam Agnew
Twilion

Machine Learning is all the rage these days, and with open source frameworks like TensorFlow, developers have access to a range of APIs for using machine learning in their projects. Magenta, a Python library built by the TensorFlow team, makes it easier to work with music and image data in particular.

Since I started learning how to code, one of the things that has always fascinated me is the concept of computers artificially creating music. I even published a paper about it in an undergraduate research journal during my freshman year of college.

Let's walk through the basics of setting up Magenta and programmatically generating some simple melodies in MIDI file format.

Installing Magenta

First we need to install Magenta, which can be done using pip. Make sure you create a virtual environment before installing. I am using Python 3.6.5, but Magenta is compatible with both Python 2 and 3.

Run the following command to install Magenta in your virtual environment. It's a pretty big library with a good number of dependencies, so the install might take a bit of time:

pip install magenta==0.4.0

Alternatively, if you want to install Magenta globally you can use the following shell commands to run an install script created by the Magenta team to simplify things:

curl https://raw.githubusercontent.com/tensorflow/magenta/master/magenta/tools/magenta-install.sh > /tmp/magenta-install.sh
bash /tmp/magenta-install.sh

This will give you access to both the Magenta and TensorFlow Python modules for development, as well as scripts to work with all of the models that Magenta has available. For this post, we're going to be using the Melody recurrent neural network model.
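
Before moving on, you can sanity-check the install by importing Magenta from the Python interpreter in your activated virtual environment. Assuming the install succeeded, this should print the version you just installed:

import magenta

# If the install worked, this prints the installed version
# (0.4.0 for the pinned pip install above).
print(magenta.__version__)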

Generating a basic melody in MIDI format

Rather than training our own model, let's use one of the pre-trained melody models provided by the TensorFlow team.

First, download this file, which is a .mag bundle file for a recurrent neural network that has been trained on thousands of MIDI files. We're going to use this as a starting point to generate some melodies. Save it to the current directory you are working in.
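
Optionally, you can verify the download from Python before using it. Here's a minimal sketch, assuming the bundle is saved as basic_rnn.mag in your working directory and that the read_bundle_file helper is available from magenta.music.sequence_generator_bundle in this version of Magenta:

from magenta.music import sequence_generator_bundle

# Attempt to parse the downloaded .mag bundle; an exception here means
# the file is missing or didn't download correctly.
bundle = sequence_generator_bundle.read_bundle_file('basic_rnn.mag')
print('Bundle loaded')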

When generating a melody, we have to provide a priming melody. This can be a MIDI file passed with the --primer_midi flag, or a string representation of a Python list (the format Magenta uses for melodies) passed with the --primer_melody flag. Let's create some melodies using middle C as the starting note, which in this format would be "[60]".
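
Each value in that list is a MIDI pitch number (60 is middle C), so a longer primer is just a longer list. As a quick illustration, here's a hypothetical four-note primer and the string you would paste into the --primer_melody flag:

# Hypothetical primer: each entry is a MIDI pitch number (60 = middle C).
primer_pitches = [60, 62, 64, 65]  # C4, D4, E4, F4
print(primer_pitches)  # prints [60, 62, 64, 65] -- wrap it in quotes for --primer_melody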

Each melody will be 8 measures long, which corresponds to the --num_steps flag. This flag sets the length of the generated tune in 16th-note steps.
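
The 128 used below comes from a quick bit of arithmetic: in 4/4 time, each measure holds 16 sixteenth-note steps.

measures = 8
steps_per_measure = 16  # sixteenth-note steps per bar in 4/4 time
print(measures * steps_per_measure)  # 128, the value we pass to --num_steps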

With your virtual environment activated, run the following command, making sure to replace /path/to/basic_rnn.mag with an actual path to the .mag file you just downloaded:

melody_rnn_generate \
--config=basic_rnn \
--bundle_file=/path/to/basic_rnn.mag \
--output_dir=/tmp/melody_rnn/generated \
--num_outputs=10 \
--num_steps=128 \
--primer_melody="[60]"

This should output 10 MIDI files in the directory /tmp/melody_rnn/generated, or whichever directory you specify with the --output_dir flag. It will take some time to execute, so be patient!

Navigate to the output directory, and try playing the MIDI files to see what kind of music you just created! If you are on Mac OS X, the GarageBand program can play MIDI files.
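
If you'd rather inspect a result programmatically than listen to it, here's a minimal sketch using Magenta's MIDI utilities; the filename below is just a placeholder for one of the files melody_rnn_generate wrote:

from magenta.music import midi_io

# Convert one of the generated MIDI files into a NoteSequence and print
# the pitch and timing of each note.
sequence = midi_io.midi_file_to_sequence_proto('/tmp/melody_rnn/generated/your_generated_file.mid')
for note in sequence.notes:
    print(note.pitch, note.start_time, note.end_time)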

Here's an example of a melody that was generated when I ran this code:

Using different models for more structured compositions

Those melodies are cool for the novelty of a machine composing music, but to me they still sound mostly like a bunch of random notes. The Magenta team provides two other pre-trained models we can use to generate melodies with more structure.

The previous model worked by generating notes one by one, keeping track of only the most recent note. That's why a lot of the melodies sound all over the place. Among other things, the Lookback RNN keeps track of the most recent two bars, so it is able to add more repetition to the music.

Download the Lookback RNN model, and save it to the same directory you saved basic_rnn.mag.

Let's generate some melodies using the Lookback RNN, remembering to replace /path/to/lookback_rnn.mag with an actual path to the .mag file you downloaded:

melody_rnn_generate \
--config=lookback_rnn \
--bundle_file=/path/to/lookback_rnn.mag \
--output_dir=/tmp/melody_rnn/generated \
--num_outputs=10 \
--num_steps=128 \
--primer_melody="[60]"

You will likely notice that the melodies you generate with this one have a lot more repetition. Here's one of the ones I got:

Now let's try out the Attention RNN. Instead of just looking back at the last two measures, this one is designed to give more long-term structure to generated compositions. You can read about the algorithm in this blog post.

Again, download the model, save it to the same directory, and then run the following:

melody_rnn_generate \
--config=attention_rnn \
--bundle_file=/path/to/attention_rnn.mag \
--output_dir=/tmp/melody_rnn/generated \
--num_outputs=10 \
--num_steps=256 \
--primer_melody="[60]"

For this example, we are generating melodies that are twice as long: 256 steps, or 16 measures. One of the melodies I generated even seems to have a structure that could be repeated. To me, this one sounds like it could be turned into a long-form song, complete with different sections that flow into each other.


Using a MIDI file as a priming melody

In the previous examples we have only used a single note, middle C, as the priming melody. It's much more interesting to start from an existing, human-written melody. Let's use a MIDI file with a more complex melody to create a musical collaboration between man and machine.

I'm a guitarist and would like to hear a computer shred. So for this example I'm going to use the guitar solo from Omens of Love by the Japanese fusion band T-Square. We'll use the first four measures of this solo, which provide a nice melodic start, and see if we can generate four more measures to complement it.

Download this MIDI file containing a 4-bar melody, and save it to a directory of your choice.
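
Before generating, it can be helpful to confirm what's in the primer. This sketch, which assumes the same Magenta MIDI utilities used earlier, loads the downloaded file and prints some basic stats; remember that the four primer bars count toward --num_steps, so 128 steps leaves about four more bars for the model to fill in:

from magenta.music import midi_io

# Load the primer MIDI and print its note count and duration.
primer = midi_io.midi_file_to_sequence_proto('/path/to/OmensOfLove.mid')
print(len(primer.notes), 'notes spanning', primer.total_time, 'seconds')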

Now use whichever model you want from the previous sections to create a computational jam session! I am going to use the Attention RNN because I liked some of the results I got before:

melody_rnn_generate \
--config=attention_rnn \
--bundle_file=/path/to/attention_rnn.mag \
--output_dir=/tmp/melody_rnn/generated \
--num_outputs=10 \
--num_steps=128 \
--primer_midi=/path/to/OmensOfLove.mid

You might have to generate a ton of output melodies to get something that sounds human, but out of the 10 I generated, this one works really nicely!

Awesome! What else can I do?

We were able to generate some simple melodies with pre-trained neural network models, and that's awesome! We're already off to a great start when it comes to using machines to create music.

It's a lot of fun to feed different MIDI files to a neural network model and see what comes out. Magenta offers a whole variety of models to work with, and in this post we've only covered the first steps of working with the Melody RNN model.

Keep an eye out for future Twilio blog posts on working with music data using Magenta, including how to train your own models.

Feel free to reach out with any questions or to show off any cool artificial creativity projects you build or find out about: