pytest, pytest-xdist and Allure: friends or foes?
Wednesday 7 February 2024
This post is accompanied by a GitHub repo which contains all code examples.
Introduction
Last week, I discovered a broken import in our codebase, which had been introduced the previous day. Mistakes happen and the fix was trivial, so no big deal.
What puzzled me, though, was that our test results on that day looked fine (we have thousands of tests running every day, so it’s hard to notice a missing test). I decided to investigate further and discovered that combining pytest, pytest-xdist and Allure has some consequences I hadn’t thought of.
No tests are run when there is a collection error…
When a pytest session starts, pytest first collects the tests in the test folder (or in the current directory, ., if no test folder is specified). If errors occur during collection, for instance an import that can’t be resolved, pytest stops there and no tests are run.
For instance, the tests directory in the example repo contains two test modules: one runs fine, but the other contains a broken import statement. If we run the test suite, we get the following:
> docker run --rm pytest-xdist-allure-error
============================= test session starts ==============================
platform linux -- Python 3.12.1, pytest-8.0.0, pluggy-1.4.0
rootdir: /home
plugins: xdist-3.5.0, allure-pytest-2.13.2
collected 1 item / 1 error
==================================== ERRORS ====================================
___________________ ERROR collecting tests/test_breaking.py ____________________
ImportError while importing test module '/home/tests/test_breaking.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.12/importlib/__init__.py:90: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
/Users/jean/Code/pytest-xdist-allure-error/tests/test_breaking.py:1: in <module>
???
E ModuleNotFoundError: No module named 'idontexist'
=========================== short test summary info ============================
ERROR tests/test_breaking.py
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
=============================== 1 error in 0.03s ===============================
Return code: 2
Notice the line Interrupted: 1 error during collection, and the fact that no tests are run.
… unless you use pytest-xdist…
This behavior is actually nice because we probably don’t want to go further in the test suite if we have import errors.
However, when running tests in parallel with pytest-xdist, this is not the default behavior. We can observe that by setting the PYTEST_XDIST_AUTO_NUM_WORKERS environment variable to a number greater than 0:
> docker run --rm -e PYTEST_XDIST_AUTO_NUM_WORKERS=2 pytest-xdist-allure-error
============================= test session starts ==============================
platform linux -- Python 3.12.1, pytest-8.0.0, pluggy-1.4.0
rootdir: /home
plugins: xdist-3.5.0, allure-pytest-2.13.2
created: 2/2 workers
2 workers [1 item]
. [100%]
==================================== ERRORS ====================================
___________________ ERROR collecting tests/test_breaking.py ____________________
ImportError while importing test module '/home/tests/test_breaking.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.12/importlib/__init__.py:90: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
/Users/jean/Code/pytest-xdist-allure-error/tests/test_breaking.py:1: in <module>
???
E ModuleNotFoundError: No module named 'idontexist'
=========================== short test summary info ============================
ERROR tests/test_breaking.py
========================== 1 passed, 1 error in 0.21s ==========================
Return code: 1
Now we still get the collection error, but it doesn’t actually prevent the working test from running, as the last line of the report tells us (1 passed, 1 error in 0.21s).
This actually makes sense from pytest-xdist’s point of view: one of the xdist processes encounters an error, but that doesn’t prevent the other processes from running, so the other test runs and passes.
In that case, only the return code of the command tells us that something went wrong. If we don’t check it and we don’t look at the pytest output directly (more on that later), we might not see the issue.
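Since the return code is the only machine-readable signal here, a CI wrapper can at least refuse to ignore it. The following is a minimal sketch and not part of the example repo: the PytestExit enum is a local mirror of the exit codes documented for pytest.ExitCode, and describe and run_pytest are hypothetical helper names.

```python
import subprocess
import sys
from enum import IntEnum


class PytestExit(IntEnum):
    """Local mirror of pytest's documented exit codes (pytest.ExitCode)."""

    OK = 0
    TESTS_FAILED = 1
    INTERRUPTED = 2
    INTERNAL_ERROR = 3
    USAGE_ERROR = 4
    NO_TESTS_COLLECTED = 5


def describe(returncode: int) -> str:
    """Turn a pytest return code into a short human-readable verdict."""
    try:
        code = PytestExit(returncode)
    except ValueError:
        return f"unknown return code {returncode}"
    if code is PytestExit.OK:
        return "all collected tests passed"
    if code is PytestExit.TESTS_FAILED:
        # In the xdist run above, the collection error ended with this code,
        # so it cannot be told apart from an ordinary test failure.
        return "some tests failed or errored"
    if code is PytestExit.INTERRUPTED:
        return "run interrupted (for example by a collection error)"
    return code.name.lower().replace("_", " ")


def run_pytest(args: list[str]) -> int:
    """Run pytest in a subprocess and return its exit code unchanged."""
    completed = subprocess.run([sys.executable, "-m", "pytest", *args])
    print(f"pytest: {describe(completed.returncode)}", file=sys.stderr)
    return completed.returncode
```

A CI entry point could then end with sys.exit(run_pytest(sys.argv[1:])), so that a non-zero code fails the job even when nobody reads the log.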
… unless you use the “-x” option
One exception to that is the -x option (short for --exitfirst): it stops the test session as soon as there is a failure, so it will stop the test session when the collection error is encountered:
> docker run --rm -e PYTEST_XDIST_AUTO_NUM_WORKERS=2 -e PYTEST_ADDITIONAL_ARGS="-x" pytest-xdist-allure-error
============================= test session starts ==============================
platform linux -- Python 3.12.1, pytest-8.0.0, pluggy-1.4.0
rootdir: /home
plugins: xdist-3.5.0, allure-pytest-2.13.2
created: 2/2 workers
==================================== ERRORS ====================================
___________________ ERROR collecting tests/test_breaking.py ____________________
ImportError while importing test module '/home/tests/test_breaking.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.12/importlib/__init__.py:90: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
/Users/jean/Code/pytest-xdist-allure-error/tests/test_breaking.py:1: in <module>
???
E ModuleNotFoundError: No module named 'idontexist'
=========================== short test summary info ============================
ERROR tests/test_breaking.py
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!! xdist.dsession.Interrupted: stopping after 1 failures !!!!!!!!!!!!!
=============================== 1 error in 0.22s ===============================
Return code: 2
This is nice, but deciding to add -x to all your pytest calls shouldn’t be done lightly. Read one more time what the option does: it stops the test session as soon as there is a failure. If you have a realistic test suite with a few thousand tests, it is likely that a few of them fail every day. Do you really want to stop your test execution as soon as the first failure is encountered, potentially leaving hundreds of tests not running?
So far, the only real solution we have is reading the pytest logs to see if there was an error. But wait, I promised I would tell you more about not looking at pytest logs (who does that?!).
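One possible middle ground, sketched here under assumptions rather than taken from the example repo, is to run a separate collection-only pass before the parallel run. It relies on pytest --collect-only exiting with a non-zero code when collection fails; the helper names and the default tests path are hypothetical.

```python
import subprocess
import sys


def collect_only_command(test_path: str = "tests") -> list[str]:
    """Build a pytest invocation that collects tests without running them."""
    # No xdist options here: this check is meant to run in a single process.
    return [sys.executable, "-m", "pytest", "--collect-only", "-q", test_path]


def collection_succeeded(returncode: int) -> bool:
    """Treat any non-zero exit code as a broken (or empty) collection."""
    return returncode == 0


def preflight(test_path: str = "tests") -> bool:
    """Run the collection-only pass and echo pytest's output if it fails."""
    result = subprocess.run(
        collect_only_command(test_path), capture_output=True, text=True
    )
    ok = collection_succeeded(result.returncode)
    if not ok:
        print(result.stdout, file=sys.stderr)
    return ok
```

If preflight() returns False, the job can stop before the parallel run starts, without having to pass -x to the real test session.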
Introducing Allure
When would one not look at the pytest logs, where the error is directly visible? Well, this can be the case if you use a tool like Allure to view your test results. In that case, you rely on it to tell you whether tests failed.
Well, can you imagine what happens if you run tests with pytest-xdist and don’t pass the -x option? Let’s find out:
> docker run --rm -it -p 9090:9090 -e PYTEST_XDIST_AUTO_NUM_WORKERS=2 -e ALLURE_REPORT=1 pytest-xdist-allure-error
============================= test session starts =============================
platform linux -- Python 3.12.1, pytest-8.0.0, pluggy-1.4.0
rootdir: /home
plugins: xdist-3.5.0, allure-pytest-2.13.2
2 workers [1 item] error
. [100%]
=================================== ERRORS ====================================
___________________ ERROR collecting tests/test_breaking.py ___________________
ImportError while importing test module '/home/tests/test_breaking.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.12/importlib/__init__.py:90: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
/Users/jean/Code/pytest-xdist-allure-error/tests/test_breaking.py:1: in <module>
???
E ModuleNotFoundError: No module named 'idontexist'
=========================== short test summary info ===========================
ERROR tests/test_breaking.py
========================= 1 passed, 1 error in 0.21s ==========================
Return code: 1
Generating report to temp directory...
Report successfully generated to /tmp/6845943324082651755/allure-report
Starting web server...
2024-01-31 16:03:14.550:INFO::main: Logging initialized @941ms to org.eclipse.jetty.util.log.StdErrLog
Can not open browser because this capability is not supported on your platform. You can use the link below to open the report manually.
Server started at <http://172.17.0.2:9090/>. Press <Ctrl+C> to exit
Opening up localhost:9090, we see that the Allure report tells us that everything is fine, although it wasn’t.
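To make the gap concrete, here is a small sketch that compares what the Allure results contain with what pytest itself reported. It assumes the usual allure-pytest layout, where each executed test produces a *-result.json file with a status field in the results directory; a module that fails at collection produces no test item and therefore no such file. The function names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def allure_status_counts(results_dir: str) -> Counter:
    """Count test statuses found in the Allure result files of a directory."""
    counts: Counter = Counter()
    # Assumption: one "<uuid>-result.json" file per executed test.
    for path in Path(results_dir).glob("*-result.json"):
        with path.open(encoding="utf-8") as handle:
            counts[json.load(handle).get("status", "unknown")] += 1
    return counts


def report_hides_a_problem(counts: Counter, pytest_returncode: int) -> bool:
    """True when every recorded test passed but pytest still exited non-zero."""
    only_passed = set(counts) <= {"passed"}
    return only_passed and pytest_returncode != 0
```

For the run above, the results would hold a single passed test while pytest returned 1, so report_hides_a_problem would return True: a hint that the green report should not be trusted on its own.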
Conclusion
Ideally, I shouldn’t have encountered this issue because:
- most of the time, when our test collection is broken, it’s because of static issues that could easily be caught by an IDE or a static analysis tool, like mypy
- alternatively, the issue should get caught during the review
- if it is not, we should regularly read the pytest logs to see if there are issues
- we should have a test suite that is reliable enough so that we can use the -x option and know that any failure is a “real failure”
- etc.
Yet all the integration test suites I have worked on had thousands of tests, most of them relying on a not entirely deterministic environment (the network, to name one), with log files that are tens of megabytes in size and hard to parse. Automated static analysis, being on the optional tooling side, is often easy to neglect. “By the book”, those are all bad practices, but in my experience this is what a real-life project looks like, so let’s try to keep our eyes open.