End-to-end Testing with Python and Playwright

In this post I describe how to start adding end-to-end tests to a web application using Python and Microsoft’s open source Playwright framework. I show how to get started with Playwright, add an end-to-end test for an existing project hosted on GitHub, and automate running it with GitHub Actions.

What is Playwright?

Playwright is an open source testing platform released in 2020 on GitHub by Microsoft that enables reliable end-to-end (E2E) testing. It is in the same category as the very popular and stable tools Selenium and Cypress.

It has gained traction and that’s not surprising because it:

is very fast;
has great documentation;
is easy to install (it takes just two commands); and
supports a variety of programming languages including Python, JavaScript/TypeScript, Java and .NET. Based on the number of stars on the respective GitHub repositories, the Node.js version is by far the most widely adopted.

You can choose the language you’re most familiar with and start using it: they all share the same underlying implementation and API, and support all core features of Playwright.

It supports all modern rendering engines including Chromium, WebKit, and Firefox. It is cross-platform and can be used in headless or headed mode on Linux, macOS, and Windows. It can also emulate mobile browsers, including Chrome for Android and Mobile Safari.

Playwright provides an inspector GUI tool and a code generator. We'll see how to use both of them over the next sections.

As with similar tools, Playwright can also be used to automate any other web-based task that needs automation.

Playwright with Python

If Python is your language of choice, using Playwright gives a significant advantage over similar tools.

I’ve used both Cypress and Selenium in a number of projects in the past, but since learning about Playwright at the 2022 Python Web Conference, I was eager to start using this tool, which works well with both synchronous and asynchronous code.

Why end-to-end testing is important, and what problems it solves

End-to-end testing is a software testing process that involves testing software from start to finish as it performs a task. This essentially involves all the functions an app provides for actual users, which is important because it helps validate the entire usage flow of an application to ensure that every part of it works as expected.

During E2E testing, you might want to perform tests on different devices and even consider specifying characteristics such as the browser language and location, provided that the application is using these to serve different data.

E2E is not meant to replace unit testing and integration testing. It works in parallel and has the potential to expose problems that the other tests cannot find because they function in a more isolated way.

A high-level overview for introducing E2E on our project might look something like this:

Understand and analyze the application requirements and document user flows.
Set up the E2E testing tool.
Develop the test cases.
Run the E2E tests.
Decide when they need to be run and automate them. Will they be run after each commit is made? When a PR is opened? When a PR is merged? Where are they going to be run: GitHub Actions, GitLab, CircleCI?

One thing that teams that introduce E2E tests have to monitor is how long it takes for the E2E tests to be run and whether this will affect deployments. If E2E tests have to pass before a deployment is made — which is usually the case — and E2E takes too long to complete, a team has to consider how to speed it up. This can be facilitated by the fact that tools such as Playwright have the ability to run in parallel and utilize more resources, if set up correctly.

Problems E2E may catch on a web application

Provided that there are automated E2E tests in place, they might catch all sorts of errors that other tests cannot:

A component breaking due to an error in JavaScript/CSS (an element has been moved or renamed, or an attribute such as the URL has been changed).
Issues communicating with a service or microservice. E2E gives us the chance to see where the problem is.
Version update issues. These are very frequent for libraries that are unpinned (the version isn’t explicitly set).
Errors that stem from the way the software is packaged or run, such as with Kubernetes, Docker Compose, or Supervisord. This saves the time of having to deploy and manually check everything on each of these environments.

Getting started with Python Playwright

Let’s start by installing Python Playwright using pip:

pip install pytest-playwright

This will install pytest-playwright and the Playwright command line utility. Playwright recommends the official pytest plugin to write and run the tests. This works well with pytest and provides context isolation; however, it uses the sync version of Playwright. If we want to benefit from the async version, we can use Python Playwright without the pytest plugin, in which case we run pip install playwright instead.
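For comparison, here is a minimal sketch of what the async flavor could look like. The function names fetch_title and main are made up for illustration, and the target URL is assumed to be the demo instance used elsewhere in this post.

```python
import asyncio


async def fetch_title(url: str) -> str:
    """Open url in headless Chromium and return the page title."""
    # Imported inside the function so that this module can still be imported
    # on machines where Playwright is not installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto(url)
        title = await page.title()
        await browser.close()
    return title


def main() -> None:
    # Assumed target: the demo instance used elsewhere in this post.
    print(asyncio.run(fetch_title("https://demo.mediacms.io")))
```

Calling main() needs the browsers to be installed first, which is the second command described below.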

We can see a list of options with playwright --help.

Now the second command that we need to run brings the browsers needed by Playwright: playwright install

This fetches all supported browsers. If we are not interested in all of them, we can specify a single one. For example, use playwright install chromium to install only Chromium, which is the browser our tests launch.

Time for our first test. Create first_test.py:

import os
from playwright.sync_api import Playwright, sync_playwright, expect

# Get the URL from an environment variable, falling back to the demo instance
URL = os.environ.get("URL", "https://demo.mediacms.io")

def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()
    # Open a new page and go to our URL
    page = context.new_page()
    page.goto(URL)
    expect(page).to_have_title("MediaCMS")

    # Find and click the Register link
    register_link = page.get_by_role("link", name="Register")
    register_link.click()

    # Expect the URL to be the signup page
    expect(page).to_have_url("https://demo.mediacms.io/accounts/signup/")

    # Clean up
    context.close()
    browser.close()

with sync_playwright() as playwright:
    run(playwright)

We can either write the URL of the web application we are testing inside the code, or set it as an environment variable and retrieve it. In our case the URL will be https://demo.mediacms.io

Set the environment variable:

export URL=https://demo.mediacms.io

Run test with:

python first_test.py

A browser window opens and loads the page. Once it has loaded, we perform our assertions and, if everything is as expected, the browser closes.

This is a very small test, but should be pretty straightforward and shows what we can achieve with the tool. Take a moment to congratulate yourself for making it up to this point!

Suppose we change the signup URL to `/register`. When we run this script, we see the following error:

AssertionError: Page URL expected to be 'https://demo.mediacms.io/accounts/signup/'
Actual value: https://demo.mediacms.io/register/
Call log:
PageAssertions.to_have_url with timeout 5000ms
waiting for locator(":root")

Another thing to consider here is headless=False. While developing our tests, we need to be able to see the real action happening (as if someone opens the browser and executes actions), but this won’t be needed when the tests are finalized and are running automatically. In that case, we are only interested in whether they have succeeded or failed, without actually seeing the browser window. This is possible by setting headless=True.
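One way to avoid editing the script each time is to read the mode from the environment. The sketch below assumes a hypothetical HEADLESS environment variable (it is not something Playwright itself reads) and defaults to headless, so automated runs need no extra configuration.

```python
import os
from typing import Mapping


def headless_from_env(env: Mapping[str, str] = os.environ) -> bool:
    """Return False only when HEADLESS is explicitly set to a falsy value."""
    # HEADLESS is a made-up variable name for this example.
    value = env.get("HEADLESS", "1").strip().lower()
    return value not in {"0", "false", "no"}


# Possible usage inside run():
#   browser = playwright.chromium.launch(headless=headless_from_env())
```

With this in place, running HEADLESS=0 python first_test.py would show the browser window, while a plain python first_test.py would not.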

Using Playwright Inspector

We are going to use Playwright’s GUI tool to inspect our web application and gradually add more tests that perform more useful interactions than checking for a single URL or text.

The GUI can be opened by specifying the URL:

playwright open https://demo.mediacms.io

This opens a browser window that loads our URL, plus a separate inspector window. In the inspector we click the “record” option and then start interacting with the page. The inspector displays the code generated for each action we perform, so we can experiment with the behavior that interests us and see the resulting code.

The URL I visited is a demo instance of an open source CMS, MediaCMS. A really useful E2E test we can add here would be to test user registration.

Users register by clicking the Register link and then providing some basic information. The code has constraints to check that the provided username or email is not in use by another account. At the end of the signup, the user should be redirected to the index page and see a welcome message on the header.

While there are unit tests to check this at the code level, having the ability to mimic the action of a user registering and uploading a file gives us the chance to find deficiencies across the whole application stack.
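Because the application rejects a username or email that is already in use, a registration test needs fresh values on every run. A minimal sketch of one way to generate them follows; the helper name unique_credentials, the e2e prefix and the example.com domain are all assumptions made for illustration.

```python
import uuid
from typing import Dict


def unique_credentials(prefix: str = "e2e") -> Dict[str, str]:
    """Build a username, email and password that are unlikely to collide."""
    suffix = uuid.uuid4().hex[:8]
    username = f"{prefix}_{suffix}"
    return {
        "username": username,
        # example.com is a reserved domain, so no real mailbox is targeted.
        "email": f"{username}@example.com",
        "password": uuid.uuid4().hex,
    }
```

The returned values can then be typed into the signup form with whatever locators the inspector suggests for its fields.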

The test at this point can give the web application a few seconds to perform the above actions and will then check whether the different encoding versions were produced and are ready to stream or download.

*How the inspector GUI looks. Our interactions on the browser are added automatically for us as soon as we perform them.*

Running the test with Pytest

We can start copying code that is produced by the inspector GUI and place it in different files in the E2E tests folder. In this case, I will replace the URL with http://localhost, which is the local instance of the software I am testing.

This project already has a tests/ folder with other types of tests (unit tests and integration tests), so we can create an e2e folder inside it and place our first test there: tests/e2e/test_login_register.py

import os
import re
from playwright.sync_api import Page, expect

# Get the URL from an environment variable, defaulting to the local instance
URL = os.environ.get("URL", "http://localhost/")

def test_register_link(page: Page):
    page.goto(URL)
    expect(page).to_have_title(re.compile("MediaCMS"))
    # Create a locator for the Register link
    register_link = page.get_by_role("link", name="REGISTER")
    # Expect the href attribute to be strictly equal to the given value
    expect(register_link).to_have_attribute("href", "/accounts/signup/")
    # Click the Register link
    register_link.click()
    # Expect the URL to contain "signup"
    expect(page).to_have_url(re.compile(".*signup"))

def test_login_link(page: Page):
    page.goto(URL)
    expect(page).to_have_title(re.compile("MediaCMS"))
    # Same checks as above, this time for the Sign in link
    login_link = page.get_by_role("link", name="SIGN IN")
    expect(login_link).to_have_attribute("href", "/accounts/login/")
    login_link.click()
    expect(page).to_have_url(re.compile(".*login"))

Let’s run it. It succeeds, as expected. Notice that pytest has run the tests in headless mode, which is the default. From here we can build a structure that makes it easy to add new tests and increase the test coverage.

root@0a6aca179fb8:/home/mediacms.io/mediacms# pytest tests/e2e/test_login_register.py
======================================================================================= test session starts =======================================================================================
platform linux -- Python 3.8.6, pytest-7.2.0, pluggy-1.0.0
...
collected 2 items
tests/e2e/test_login_register.py .. [100%]

======================================================================================== 2 passed in 5.46s ========================================================================================

Consider the following structure for this project:
…
tests/e2e/users/test_login_workflow.py
tests/e2e/users/test_register_workflow.py
…

This looks much more scalable and maintainable than having everything in a single file!
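Once tests are spread over several files, each of them needs the application URL. A small shared helper can keep that in one place. The sketch below is only one possible approach, and the function name app_url and the idea of a shared helper module are assumptions, not part of the project.

```python
import os
from urllib.parse import urljoin

# Same convention as the tests above: URL comes from the environment.
BASE_URL = os.environ.get("URL", "http://localhost/")


def app_url(path: str = "", base: str = BASE_URL) -> str:
    """Join a path onto the base URL, tolerating missing or extra slashes."""
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))
```

A test could then call page.goto(app_url("accounts/signup/")) instead of building the string by hand.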

Useful Python Playwright APIs

Without going into detail, let’s quickly look at the APIs that will give us the most mileage.

Locators

Locators find elements on the page at any moment.

page.get_by_role("button", name="Sign in")

There are lots of options here: get_by_role, get_by_text, get_by_label and, of course, CSS or XPath locators.

page.locator("css=button")
page.locator("xpath=//button")

Something very useful is that you can chain methods that create a locator for more specific results.

product = page.get_by_role("listitem").filter(has_text="Product 2")
product.get_by_role("button", name="Add to cart").click()

Navigations

Navigations that are caused by interactions on the application are handled by Playwright. We’ve already seen how to go to a page. Once it finishes loading, it is ready to perform actions or assertions.

page.goto("http://localhost")
page.get_by_text("Contact").click()

Assertions

Assertions allow us to check whether the expected behavior of our application is met.

There are lots of options here too, with the to_have family (to_have_url, to_have_value, to_have_text, to_have_css, to_have_js_property and more) being what we will need most of the time.

TIP: Assertions are very handy because, when something does not meet our expectations, they make the test fail quickly by raising an AssertionError. Consider this example:

page.goto("http://localhost")
expect(page).to_have_url("http://localhost/url_not_existing/")

At this point our test checks whether the URL is now http://localhost/url_not_existing/ and, if that’s not the case, fails with an AssertionError after five seconds, which is the default timeout for assertions.

However, for this snippet,

page.goto("http://localhost")
page.locator("text=Text that does not exist").click()

it will take 30 seconds to complete and then the test will fail with Timeout 30000ms exceeded.

We can set a custom timeout on specific click events and not wait for the whole 30 second timeout:

page.locator("text=Text that does not exist").click(timeout=1000) # 1sec

This will make the script fail with a TimeoutError in one second.

Timeout configuration seems to be more limited in Python Playwright; the JavaScript/TypeScript version has more timeout options available.
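Even so, the Python API does let us change the defaults for a whole page rather than for each call. A minimal sketch follows, where apply_timeouts is a hypothetical helper name and the millisecond values are arbitrary examples.

```python
def apply_timeouts(page, action_ms: float = 5_000, navigation_ms: float = 15_000) -> None:
    """Lower the default 30 second timeouts for one page."""
    # Applies to actions such as click() and fill() on this page.
    page.set_default_timeout(action_ms)
    # Applies to navigations such as goto() and takes priority for them.
    page.set_default_navigation_timeout(navigation_ms)


# Assertions take their own timeout argument, for example:
#   expect(page).to_have_url("http://localhost/", timeout=10_000)
```

Calling apply_timeouts(page) at the start of a test would make a missing element fail after five seconds instead of thirty.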

Automating writing of E2E tests with codegen

Most (if not all) of the testing tools include an automatic code generator. This is impressive, especially the first time you encounter it!

Playwright provides its own tool. Start by typing the command playwright codegen URL

playwright codegen https://demo.mediacms.io

This opens the browser window and the inspector window for that URL so that you can track actions.

Codegen attempts to generate resilient text-based selectors for us. Some of the output might not be ideal; for example, it could be very generic. In our case, we want to explicitly specify an element based on its class or position on the page rather than simply based on its text. We might want to make these explicit selections to ensure that even small future changes on the UI will be caught by the tests.

The generated output might also be inconsistent with the way we normally write selectors, or it might be non-intuitive, as in this example:

page.locator("#app-header div:has-text(\"Upload media\")").nth(2).click()

Still, codegen can act as a baseline on which we may start adding more and more tests.

Specifying more options

By default the browser (headless or not) starts on a specific viewport, but this is something we might want to consider changing.

For example, to start with a 1600x1200 viewport, we can run:

playwright open http://localhost --viewport-size=1600,1200

We can emulate a specific device with the --device option:

# "list" is not a valid device name, but the resulting error shows the available devices
playwright open http://localhost --device list

Below is an example of emulating a Desktop Firefox device:

playwright open http://localhost --device="Desktop Firefox"

Other options include:

emulating timezone, e.g., --timezone="Europe/London"
emulating language, e.g., --lang="fr-FR"
emulating location, e.g., --geolocation="41.890221,12.492348"

These are all very convenient in case our application supports these features and we want to be able to automate these tests.
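The same settings can be applied from test code by passing them to browser.new_context(). The sketch below only builds the keyword arguments; emulation_options is a hypothetical helper name, and the default values simply mirror the command line examples above.

```python
from typing import Any, Dict


def emulation_options(
    locale: str = "fr-FR",
    timezone_id: str = "Europe/London",
    latitude: float = 41.890221,
    longitude: float = 12.492348,
) -> Dict[str, Any]:
    """Keyword arguments for browser.new_context() that emulate a user setup."""
    return {
        "locale": locale,
        "timezone_id": timezone_id,
        "geolocation": {"latitude": latitude, "longitude": longitude},
        # Without this permission the page cannot read the emulated location.
        "permissions": ["geolocation"],
        "viewport": {"width": 1600, "height": 1200},
    }


# Possible usage with the sync API:
#   context = browser.new_context(**emulation_options())
#   page = context.new_page()
```

Device emulation works in a similar way: playwright.devices["Desktop Firefox"] is a dictionary of options that can be passed to new_context() with the same ** syntax.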

Lastly, there are options to save cookies and localStorage at the end of a session and reuse them later, which minimizes the number of times we have to log in to the application.

playwright codegen --save-storage=auth.json http://localhost
playwright codegen --load-storage=auth.json http://localhost
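In test code, the saved file can be reused through the storage_state argument of browser.new_context(). The sketch below assumes the auth.json file name from the commands above; storage_state_kwargs is a hypothetical helper that falls back to a fresh session when the file does not exist yet.

```python
import os
from typing import Any, Dict


def storage_state_kwargs(path: str = "auth.json") -> Dict[str, Any]:
    """Reuse a saved session if one exists, otherwise start a clean context."""
    return {"storage_state": path} if os.path.exists(path) else {}


# Possible usage with the sync API:
#   context = browser.new_context(**storage_state_kwargs())
#   ... log in through the UI if the session was not restored ...
#   context.storage_state(path="auth.json")  # save cookies and localStorage
```

Since auth.json contains session cookies, it is usually best kept out of version control.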

Connecting with GitHub Actions

After taking the time to analyze requirements, document the user flows and design E2E tests for them, it’s time to automate the run through GitHub Actions.

A baseline GitHub Action for a Python application could be this:

name: CI
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install pytest-playwright==0.3.0
      - name: Ensure browsers are installed
        run: python -m playwright install --with-deps
      - name: Run your tests
        run: pytest

Save this as .github/workflows/ci.yml. Once pushed to your GitHub repo, it will run the full suite of tests through pytest, which will include whatever E2E tests we have added in the tests/e2e folder.

This is a very minimal example, as it is just installing dependencies and running pytest; however, it can be used as a baseline. In case of a complete web application stack (e.g., a Django backend with React frontend, PostgreSQL as database, Redis and Celery workers) all of the services will need to be included and started before pytest is run.
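When the stack is started inside the same job, the tests should not begin until the application actually answers requests. A minimal sketch of a polling helper using only the standard library follows; wait_for_app is a hypothetical name, and the timeout and interval values are arbitrary.

```python
import time
import urllib.error
import urllib.request


def wait_for_app(url: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll url until it responds without a server error, or until timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError as err:
            # A 4xx answer still means the web server is up and responding.
            if err.code < 500:
                return True
        except OSError:
            # Connection refused, DNS failure or timeout: not ready yet.
            pass
        time.sleep(interval)
    return False
```

A small script that exits with a non-zero status when wait_for_app("http://localhost/") returns False could be run as a workflow step just before pytest.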

Instead of pinning the dependency inside the workflow file, we will probably prefer to add pytest-playwright to a dependency file such as requirements-dev.txt, which separates the dependencies needed in the development/testing environment from those needed in production.

# requirements-dev.txt
pytest-playwright==0.3.0

If our Python project is using Poetry or another package management tool, the requirement to install package pytest-playwright can be specified accordingly.

If we are using Docker for our Python project, we might want to:

have a separate Dockerfile for dev, e.g., Dockerfile-dev (we don’t want to add all these extra dependencies to our production Docker images); and
place the installation of the two dependencies early in the Dockerfile so that the layer is cached and not rebuilt on every Docker build.

RUN pip install -r requirements-dev.txt
RUN playwright install --with-deps chromium

The time needed to install Playwright and download the browsers and their dependencies adds a lot to the time the E2E tests take to run. This can be very problematic when a CI/CD pipeline requires the E2E tests to succeed before a production deployment takes place.

Consider building and reusing a Docker image that does not need to download and install all these dependencies every time, or if possible use one of the official Playwright images, which are built on Ubuntu Linux 18.04/20.04/22.04.

As of this writing, Playwright does not work in some versions of Debian Linux due to unresolved issues with install-deps, and if you try to run the command playwright install you get a message that Debian Linux is not supported.

Conclusion

Playwright is a great tool for reliably running E2E tests and if you are a Python engineer, you should definitely have a look at some point!
