Working with Files Asynchronously in Python using aiofiles and asyncio

May 13, 2021
Written by
Sam Agnew
Twilion

Asynchronous code has become a mainstay of Python development. With asyncio becoming part of the standard library and many third party packages providing features compatible with it, this paradigm is not going away anytime soon.

If you're writing asynchronous code, it's important to make sure all parts of your code are working together so one aspect of it isn't slowing everything else down. File I/O can be a common blocker on this front, so let's walk through how to use the aiofiles library to work with files asynchronously.

Starting with the basics, this is all the code you need to read the contents of a file asynchronously (within an async function):

async with aiofiles.open('filename', mode='r') as f:
    contents = await f.read()
print(contents)

Let's move on and dig deeper.

What is non-blocking code?

You may hear terms like "asynchronous", "non-blocking" or "concurrent" and be a little confused as to what they all mean. According to this much more detailed tutorial, two of the primary properties are:

  • Asynchronous routines are able to “pause” while waiting on their ultimate result to let other routines run in the meantime.
  • Asynchronous code, through the mechanism above, facilitates concurrent execution. To put it differently, asynchronous code gives the look and feel of concurrency.

So asynchronous code is code that can pause while waiting for a result, in order to let other code run in the meantime. It doesn't "block" other code from running, so we can call it "non-blocking" code.
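
To see that pausing in action, here is a minimal sketch using only asyncio. The worker function and its delays are made up for illustration, with asyncio.sleep standing in for any slow operation:

```python
import asyncio


async def worker(name, delay, log):
    log.append(f'{name} start')
    # Pause here; the event loop is free to run other coroutines meanwhile.
    await asyncio.sleep(delay)
    log.append(f'{name} done')


async def demo():
    log = []
    # Run both workers concurrently and wait for both to finish.
    await asyncio.gather(
        worker('slow', 0.2, log),
        worker('fast', 0.05, log),
    )
    return log


log = asyncio.run(demo())
print(log)
# ['slow start', 'fast start', 'fast done', 'slow done']
```

The "fast" worker starts and finishes while the "slow" one is still waiting, which is exactly the behavior described above.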

The asyncio library provides a variety of tools for Python developers to do this, and aiofiles provides even more specific functionality for working with files.
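
For a rough idea of why a dedicated library helps: the aiofiles documentation describes it as delegating file operations to a separate thread pool, so the event loop isn't stuck waiting on the disk. The sketch below imitates that idea with only the standard library. Note that read_text and read_text_blocking are hypothetical helpers written for this illustration, not part of aiofiles, and the real library does considerably more:

```python
import asyncio
import tempfile
from pathlib import Path


def read_text_blocking(path):
    # Ordinary synchronous read; this blocks whichever thread runs it.
    with open(path, mode='r') as f:
        return f.read()


async def read_text(path):
    # Hypothetical helper: hand the blocking read to the default thread
    # pool so the event loop can keep running other coroutines.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, read_text_blocking, path)


async def demo():
    # Create a throwaway file so the example is self-contained.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / 'example.txt'
        path.write_text('hello from a worker thread')
        return await read_text(path)


contents = asyncio.run(demo())
print(contents)
```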

Setting Up

Make sure to have your Python environment set up before we get started. Follow this guide up through the virtualenv section if you need some help. Getting everything working correctly, especially with respect to virtual environments, is important for isolating your dependencies if you have multiple projects running on the same machine. You will need Python 3.7 or higher in order to run the code in this post.

Now that your environment is set up, you’re going to need to install a third-party library. We’re going to use aiofiles, so install it with the following command after activating your virtual environment:

pip install aiofiles==0.6.0

For the examples in the rest of this post, we'll be using JSON files of Pokemon API data corresponding to the original 150 Pokemon. You can download a folder with all of those here. With this you should be ready to move on and write some code.

Reading from a file with aiofiles

Let's begin by simply opening a file corresponding to a particular Pokemon, parsing its JSON into a dictionary, and printing out its name:

import aiofiles
import asyncio
import json


async def main():
    async with aiofiles.open('articuno.json', mode='r') as f:
        contents = await f.read()
    pokemon = json.loads(contents)
    print(pokemon['name'])

asyncio.run(main())

When running this code, you should see "articuno" printed to the terminal. You can also iterate through the file asynchronously, line by line (this code will print out all 9271 lines of articuno.json):

import aiofiles
import asyncio

async def main():
    async with aiofiles.open('articuno.json', mode='r') as f:
        async for line in f:
            print(line, end='')  # each line already ends with a newline

asyncio.run(main())

Writing to a file with aiofiles

Writing to a file is also similar to standard Python file I/O. Let's say we wanted to create files containing a list of all moves that each Pokemon can learn. For a simple example, here's what we would do for the Pokemon Ditto, who can only learn the move "transform":

import aiofiles
import asyncio

async def main():
    async with aiofiles.open('ditto_moves.txt', mode='w') as f:
        await f.write('transform')

asyncio.run(main())

Let's try this with a Pokemon that has more than one move, like Rhydon:

import aiofiles
import asyncio
import json


async def main():
    # Read the contents of the json file.
    async with aiofiles.open('rhydon.json', mode='r') as f:
        contents = await f.read()

    # Load it into a dictionary and create a list of moves.
    pokemon = json.loads(contents)
    name = pokemon['name']
    moves = [move['move']['name'] for move in pokemon['moves']]

    # Open a new file to write the list of moves into.
    async with aiofiles.open(f'{name}_moves.txt', mode='w') as f:
        await f.write('\n'.join(moves))


asyncio.run(main())

If you open up rhydon_moves.txt you should see a file with 112 lines that starts something like this.

A text file containing the list of moves that Rhydon can learn
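
If the nested lookups in that list comprehension look opaque, it may help to try them on a tiny input first. The sample below is a made-up, heavily trimmed stand-in for one of the JSON files (the name and moves are invented for illustration), keeping only the keys the code above actually reads. extract_moves is a hypothetical helper, not something from the original script:

```python
import json

# Made-up sample with the same shape the code above relies on:
# a top-level "name" and a "moves" list of {"move": {"name": ...}} entries.
sample = '''
{
  "name": "examplemon",
  "moves": [
    {"move": {"name": "tackle"}},
    {"move": {"name": "growl"}}
  ]
}
'''


def extract_moves(raw_json):
    # Parse the JSON text and pull out the name of every move.
    pokemon = json.loads(raw_json)
    moves = [entry['move']['name'] for entry in pokemon['moves']]
    return pokemon['name'], moves


name, moves = extract_moves(sample)
print(f'{name}_moves.txt')  # the filename the script would write to
print('\n'.join(moves))     # the text that would be written
```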

Using asyncio to go through many files asynchronously

Now let's get a little more complicated and do this for all 150 Pokemon that we have JSON files for. Our code will have to read from every file, parse the JSON, and rewrite each Pokemon's moves to a new file:

import aiofiles
import asyncio
import json
from pathlib import Path


directory = 'directory/your/files/are/in'


async def main():
    pathlist = Path(directory).glob('*.json')

    # Iterate through all json files in the directory.
    for path in pathlist:
        # Read the contents of the json file.
        async with aiofiles.open(f'{directory}/{path.name}', mode='r') as f:
            contents = await f.read()

        # Load it into a dictionary and create a list of moves.
        pokemon = json.loads(contents)
        name = pokemon['name']
        moves = [move['move']['name'] for move in pokemon['moves']]

        # Open a new file to write the list of moves into.
        async with aiofiles.open(f'{directory}/{name}_moves.txt', mode='w') as f:
            await f.write('\n'.join(moves))


asyncio.run(main())

After running this code, you should see the directory of Pokemon files populated with .txt files alongside the .json ones, containing move lists corresponding to each Pokemon.

The output of an ls command, displaying json files and txt files side by side

If you need to perform some asynchronous actions and want to end with data corresponding to those asynchronous tasks, such as a list with each Pokemon's moves after having written the files, you can use asyncio.ensure_future and asyncio.gather.
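
As a minimal sketch of that pattern, here it is with a stand-in coroutine instead of real file work. fake_write_moves and its delays are invented for illustration. This version uses asyncio.create_task, which is available from Python 3.7 and which the asyncio documentation describes as the preferred way to create tasks from coroutines; asyncio.ensure_future works here as well:

```python
import asyncio


async def fake_write_moves(name, delay):
    # Stand-in for reading a JSON file and writing a moves file.
    await asyncio.sleep(delay)
    return {'name': name, 'moves': []}


async def demo():
    tasks = [
        asyncio.create_task(fake_write_moves('first', 0.1)),
        asyncio.create_task(fake_write_moves('second', 0.01)),
    ]
    # gather returns results in the order the tasks were passed in,
    # even though 'second' finishes before 'first'.
    return await asyncio.gather(*tasks)


results = asyncio.run(demo())
print([result['name'] for result in results])
# ['first', 'second']
```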

You can break out the portion of your code that handles each file into its own async function, then wrap each call to that function in a future and append it to a list of tasks. Here's an example of what that function and your new main function would look like:

async def write_pokemon_moves(filename):
    # Read the contents of the json file.
    async with aiofiles.open(f'{directory}/{filename}', mode='r') as f:
        contents = await f.read()

    # Load it into a dictionary and create a list of moves.
    pokemon = json.loads(contents)
    name = pokemon['name']
    moves = [move['move']['name'] for move in pokemon['moves']]

    # Open a new file to write the list of moves into.
    async with aiofiles.open(f'{directory}/{name}_moves.txt', mode='w') as f:
        await f.write('\n'.join(moves))
    return {'name': name, 'moves': moves}


async def main():
    pathlist = Path(directory).glob('*.json')

    # A list to be populated with async tasks.
    tasks = []

    # Iterate through all json files in the directory.
    for path in pathlist:
        tasks.append(asyncio.ensure_future(write_pokemon_moves(path.name)))

    # Will contain a list of dictionaries containing Pokemons' names and moves
    moves_list = await asyncio.gather(*tasks)

This is a common way to utilize asynchronous code in Python, and is often used for things like making HTTP requests.
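
One thing to keep in mind if you adapt this to a much larger directory: every task is scheduled at once. A common way to cap how many run at the same time is an asyncio.Semaphore. The sketch below is only an illustration under assumptions: process is a stand-in for write_pokemon_moves, the limit of 3 is arbitrary, and the state dictionary exists purely to show that the cap holds:

```python
import asyncio

MAX_CONCURRENT = 3  # assumed limit; tune for your own workload


async def process(item, semaphore, state):
    # Only MAX_CONCURRENT coroutines can be inside this block at once.
    async with semaphore:
        state['active'] += 1
        state['peak'] = max(state['peak'], state['active'])
        await asyncio.sleep(0.01)  # stand-in for reading and writing a file
        state['active'] -= 1
    return item * 2


async def demo():
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    state = {'active': 0, 'peak': 0}
    tasks = [
        asyncio.create_task(process(i, semaphore, state))
        for i in range(10)
    ]
    results = await asyncio.gather(*tasks)
    return results, state['peak']


results, peak = asyncio.run(demo())
print(results)
print(peak)
```

All ten results still come back in order, but no more than three of the stand-in jobs are ever in flight together.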

So what do I use this for?

The examples in this post using Pokemon API data were just an excuse to show the functionality of the aiofiles module, and how you would write code to navigate through a directory of files for reading and writing. Hopefully, you can adapt these code samples to the specific problems you're trying to solve so file I/O doesn't become a blocker in your asynchronous code.

We have only scratched the surface of what you can do with aiofiles and asyncio, but I hope that this has made starting your journey into the world of asynchronous Python a little easier.

I’m looking forward to seeing what you build. Feel free to reach out and share your experiences or ask any questions.