Automated Headless Browser scripting in Node.js with Playwright

April 18, 2020
Written by
Sam Agnew
Twilion


Sometimes the data you need is available online, but not through a public API. Web scraping can be useful in these situations, but only if this data can be accessed statically on a web page. Fortunately for developers everywhere, most things that you can do manually in the browser can be done using Playwright, a Node library that provides a high-level API for automating various browsers. It was built by the same team that made Puppeteer.

Let's walk through how to use Playwright to interact with web pages programmatically. In this example we'll use the Native Land Digital tool, an awesome project built to teach people more about their local indigenous history. In this case, an API does exist, but it only takes location data in the form of geo-coordinates rather than a more user-friendly address. We'll write code to programmatically type an address and figure out which Native land corresponds to that location.

Setting up dependencies

Before moving on, you will need to make sure you have an up-to-date version of Node.js and npm installed.

Navigate to the directory where you want this code to live and run the following command in your terminal to create a package for this project:

npm init --yes

The --yes argument runs through all of the prompts that you would otherwise have to fill out or skip. Now that we have a package.json for our project, run the following command in your terminal to install Playwright:

npm install playwright@0.13.0

Note: When you install Playwright, it will come with binaries for a few different browsers, so there might be a decent amount of network traffic during installation.

Launching Playwright and taking a screenshot of a page

Let's get started by just launching Playwright, navigating to a web page, and taking a screenshot of the page. Let's use Chromium for the examples in this tutorial.

Here's some "Hello World" code for taking a screenshot of a page. Create a file called index.js and add this code to it:

const playwright = require('playwright');

(async () => {
  const browser = await playwright.chromium.launch();
  const page = await browser.newPage();
  await page.goto('https://native-land.ca/');
  await page.screenshot({ path: 'example.png' });

  await browser.close();
})();

Run the code with node index.js in your terminal from the same directory that the code is saved in. After a couple of seconds, open the example.png image that it saved. It should look something like this:

Disclaimer screenshot

Depending on how fast the code runs, your screenshot might look different if the disclaimer modal element didn't have enough time to finish popping up. When I run my code multiple times, the screenshot is slightly different each time, so don't worry if yours doesn't look just like this.

You can use the page.waitFor() function to have your script wait a number of milliseconds before running other commands. This is necessary in some cases to make your scripts interact with pages in a way that is more "human", especially on pages that have animations which could take time to finish. You want to be sure that you're not going to try interacting with things on the page that don't exist yet. For that purpose, there's also a page.waitForSelector() function that takes a specific CSS selector and waits for an element fitting that description to appear. We will use this later.
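As a rough sketch of the difference between the two approaches, the helper below waits for an element to show up before clicking it, instead of sleeping for a fixed number of milliseconds. The name clickWhenReady and the default timeout are made up for illustration and are not part of Playwright. The sketch assumes only that page.waitForSelector() accepts a timeout option and that page.click() takes a CSS selector.

```javascript
// Hypothetical helper (not a Playwright API): wait for an element, then click it.
// `page` is a Playwright Page object; `selector` is any CSS selector.
async function clickWhenReady(page, selector, timeoutMs = 5000) {
  // Resolves once a matching element shows up; rejects if timeoutMs elapses first.
  await page.waitForSelector(selector, { timeout: timeoutMs });
  await page.click(selector);
}

module.exports = { clickWhenReady };
```

Inside a script you would call it as await clickWhenReady(page, 'some-selector'). A fixed delay can still be the simpler choice when an animation has to finish before the click is safe.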

Before moving on, we're going to have to get rid of this modal element to work with the rest of the page and get useful information. Let's take a closer look to come up with a good CSS selector to access the modal button directly in code. If you right-click on an element in your browser of choice, you should see an option that says something along the lines of "Inspect Element".

Inspect element

You should see the HTML for this element: an element with the modal-footer class that has a button element as its direct child.

Footer element

Interacting with elements on a web page

Playwright provides the methods page.click to click a DOM element and page.type to type text. To click on the modal button, we'll use the CSS selector .modal-footer > button, which uses the child combinator to get the button we're looking for.

Change your code to click on this button and then take another screenshot:


const playwright = require('playwright');

(async () => {
  const browser = await playwright.chromium.launch();
  const page = await browser.newPage();

  const MODAL_BUTTON_SELECTOR = '.modal-footer > button';

  await page.goto('https://native-land.ca/');
  await page.waitFor(200);
  await page.click(MODAL_BUTTON_SELECTOR);

  await page.screenshot({path: 'example.png'});

  await browser.close();
})();

Run this code and the new screenshot should look something like this:

Modal cleared screenshot

Now we see the modal element is gone and the rest of the page is loading. We'll need to type in the text field, so let's find a CSS selector for that.

The text box is an input field that has a placeholder attribute, so we can use the attribute selector to get 'input[placeholder=Search]'. We can also use the page.waitForSelector() function mentioned earlier to wait until the dropdown of location suggestions appears after typing a location we want to look up.

Looking at the HTML
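The unquoted form input[placeholder=Search] works because Search is a single word. If the value contained spaces or punctuation, it would need quotes. As a small sketch, the hypothetical attrSelector helper below builds the quoted form; the helper name and the second example value are assumptions for illustration and do not come from the page or from Playwright.

```javascript
// Hypothetical helper: build a CSS attribute selector with a quoted value.
function attrSelector(tag, attribute, value) {
  // Escape backslashes and double quotes so the value stays inside the quotes.
  const escaped = String(value).replace(/["\\]/g, '\\$&');
  return `${tag}[${attribute}="${escaped}"]`;
}

console.log(attrSelector('input', 'placeholder', 'Search'));
// input[placeholder="Search"]

// A made-up placeholder containing spaces, which requires the quoted form.
console.log(attrSelector('input', 'placeholder', 'Search for a place'));
// input[placeholder="Search for a place"]
```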

Let's write some new code to type in this text field, and wait for location suggestions to appear using the selector li.active > a before taking another screenshot:


const playwright = require('playwright');

(async () => {
  const browser = await playwright.chromium.launch();
  const page = await browser.newPage();

  const MODAL_BUTTON_SELECTOR = '.modal-footer > button';
  const SEARCH_SELECTOR = 'input[placeholder=Search]';
  const LOCATION_SELECTOR = 'li.active > a';

  await page.goto('https://native-land.ca/');
  await page.waitFor(200);
  await page.click(MODAL_BUTTON_SELECTOR);
  await page.waitFor(300);

  await page.click(SEARCH_SELECTOR);
  await page.keyboard.type('Philadelphia');
  await page.waitForSelector(LOCATION_SELECTOR);

  await page.screenshot({path: 'example.png'});

  await browser.close();
})();

Notice we're also waiting for 300 milliseconds before typing in the search box, just to give the page a little bit of time to load. If this doesn't work for you, try messing with the numbers a bit. This latest screenshot should look like this:

Search bar screenshot

Now we have location suggestions appearing and just need to click on that first one to get a result.

Reading data from the page

With Playwright, you can also evaluate HTML elements to read the innerText and innerHTML from them. This is how we're going to access the data itself.

After typing a location and loading the results, inspect the element that the text appears in so we can come up with a good CSS selector for them.

 

This one is pretty straightforward. It's a div with the results-tab class, so .results-tab should work. This element exists immediately as the page is loaded, however, so if we want to write code that waits for the actual results to appear, we'll have to use .results-tab > p to refer to the child nodes in this tab.

We can use page.$, which takes a CSS selector, to get a handle to the element we want. We can then call the evaluate method on this ElementHandle object to run code that directly accesses what's inside of the element.

Modify your code in index.js one last time:


const playwright = require('playwright');

(async () => {
  const browser = await playwright.chromium.launch();
  const page = await browser.newPage();

  const MODAL_BUTTON_SELECTOR = '.modal-footer > button';
  const SEARCH_SELECTOR = 'input[placeholder=Search]';
  const LOCATION_SELECTOR = 'li.active > a';
  const RESULTS_SELECTOR = '.results-tab';

  await page.goto('https://native-land.ca/');
  await page.waitFor(200);
  await page.click(MODAL_BUTTON_SELECTOR);
  await page.waitFor(300);

  await page.click(SEARCH_SELECTOR);
  await page.keyboard.type('Philadelphia');
  await page.waitForSelector(LOCATION_SELECTOR);

  await page.click(LOCATION_SELECTOR);
  await page.waitForSelector(`${RESULTS_SELECTOR} > p`);

  const results = await page.$(RESULTS_SELECTOR);
  const text = await results.evaluate(element => element.innerText);
  console.log(text);

  await page.screenshot({path: 'example.png'});

  await browser.close();
})();

Run this code again and you should see the location text printing to your console, and your final screenshot should look like this:

Final screenshot

In this case, we can see that the city of Philadelphia exists on Lenape land.
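If you want to look up more than one address, the same steps can be wrapped in a function. The sketch below mirrors the script above step for step, including its fixed waits, and adds a null check in case page.$ finds no match. The name lookupNativeLand is made up for this example, and the selectors depend on the site's markup at the time of writing.

```javascript
// Selectors taken from the script above; they depend on the site's markup.
const MODAL_BUTTON_SELECTOR = '.modal-footer > button';
const SEARCH_SELECTOR = 'input[placeholder=Search]';
const LOCATION_SELECTOR = 'li.active > a';
const RESULTS_SELECTOR = '.results-tab';

// Hypothetical wrapper: look up one address using an existing Playwright page.
async function lookupNativeLand(page, address) {
  // Load the site and dismiss the disclaimer modal.
  await page.goto('https://native-land.ca/');
  await page.waitFor(200);
  await page.click(MODAL_BUTTON_SELECTOR);
  await page.waitFor(300);

  // Type the address and pick the first suggestion.
  await page.click(SEARCH_SELECTOR);
  await page.keyboard.type(address);
  await page.waitForSelector(LOCATION_SELECTOR);
  await page.click(LOCATION_SELECTOR);
  await page.waitForSelector(`${RESULTS_SELECTOR} > p`);

  // page.$ resolves to null when nothing matches the selector.
  const results = await page.$(RESULTS_SELECTOR);
  if (results === null) return null;
  const text = await results.evaluate(element => element.innerText);
  return text.trim();
}

module.exports = { lookupNativeLand };
```

With a page created as in the earlier examples, console.log(await lookupNativeLand(page, 'Philadelphia')) would print the same text as the script above.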

Getting it right and moving forward

Sometimes these scripts can be tricky to get working correctly, so another useful trick for debugging is to run Playwright in non-headless mode to see exactly what's going on. You can do this when you first launch a browser by passing { headless: false } as an optional parameter. With this option, Playwright can even work for testing Chrome Extensions.
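One way to switch between the two modes without editing the script is to read an environment variable. This is only a sketch: SHOW_BROWSER is a made-up variable name, launchOptions is a hypothetical helper, and it assumes your Playwright version supports the slowMo launch option, which delays each operation by the given number of milliseconds.

```javascript
// Hypothetical helper: choose launch options from an environment variable.
// SHOW_BROWSER is a made-up name for this sketch, not something Playwright reads.
function launchOptions(env = process.env) {
  const showBrowser = env.SHOW_BROWSER === '1';
  return showBrowser
    ? { headless: false, slowMo: 100 } // visible window, each operation slowed by 100 ms
    : { headless: true };
}

module.exports = { launchOptions };
```

You would then launch with playwright.chromium.launch(launchOptions()) and run SHOW_BROWSER=1 node index.js when you want to watch the browser work.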

Another cool thing about Playwright compared to other libraries is that it works with different browsers and not just Chromium. Try going through the code samples in this post and changing const browser = await playwright.chromium.launch(); to const browser = await playwright.firefox.launch(); to use Firefox instead of Chromium and see how it affects the behavior of your code.
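To avoid editing the launch line each time, you could pick the browser by name. The sketch below assumes the playwright module exposes chromium, firefox, and webkit properties; pickBrowserType is a hypothetical helper written for this example, not part of the library.

```javascript
// Browser engines assumed to be exposed by the playwright module.
const SUPPORTED_BROWSERS = ['chromium', 'firefox', 'webkit'];

// Hypothetical helper: return the browser type matching a name.
// `playwright` is the object returned by require('playwright').
function pickBrowserType(playwright, name = 'chromium') {
  if (!SUPPORTED_BROWSERS.includes(name)) {
    throw new Error(
      `Unknown browser "${name}"; expected one of ${SUPPORTED_BROWSERS.join(', ')}`
    );
  }
  return playwright[name];
}

module.exports = { pickBrowserType };
```

For example, const browser = await pickBrowserType(playwright, process.argv[2]).launch(); would let you run node index.js firefox, and fall back to Chromium when no name is given.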

Now that you have the ability to programmatically manipulate web pages and access data on them in a way that you can't with static web scraping tools, I’m looking forward to seeing what you build! Feel free to reach out and share your experiences or ask any questions.