Regression Tests
Last updated on 2026-02-17
Overview
Questions
- How can we detect changes in program outputs?
- How can snapshots make this easier?
Objectives
- Explain what regression tests are and when they’re useful
- Write a manual regression test (save output and compare later)
- Use Snaptol snapshots to simplify output/array regression testing
- Use tolerances (rtol/atol) to handle numerical outputs safely
1) Introduction
In short, a regression test asks “this test used to produce X, does it still produce X?”. This can help us detect unexpected or unwanted changes in the output of a program.
They are particularly useful:

- when beginning to add tests to an existing project,
- when adding unit tests to every part of a project is not feasible,
- for quickly achieving good test coverage,
- even when we do not know whether the current output is correct, since a regression test checks for change rather than correctness.
These types of tests are not a substitute for unit tests, but rather complement them.
2) Manual example
Let’s make a regression test in a test.py file. It is
going to utilise a “very complex” processing function to simulate the
processing of data,
PYTHON
# test.py
def very_complex_processing(data: list):
    return [x ** 2 - 10 * x + 42 for x in data]
Let’s write the basic structure for a test with example input data, but for now we will simply print the output,
PYTHON
# test.py continued
def test_something():
    input_data = [i for i in range(8)]
    processed_data = very_complex_processing(input_data)
    print(processed_data)
Let’s run pytest with reduced verbosity (-q) and with output capturing disabled (-s) so that the print statement from the test is shown,
$ pytest -qs test.py
[42, 33, 26, 21, 18, 17, 18, 21]
.
1 passed in 0.00s
We get a list of output numbers that simulate the result of a complex
function in our project. Let’s save this data at the top of our
test.py file so that we can assert that it is
always equal to the output of the processing function,
PYTHON
# test.py
SNAPSHOT_DATA = [42, 33, 26, 21, 18, 17, 18, 21]

def very_complex_processing(data: list):
    return [x ** 2 - 10 * x + 42 for x in data]

def test_something():
    input_data = [i for i in range(8)]
    processed_data = very_complex_processing(input_data)
    assert SNAPSHOT_DATA == processed_data
We call the saved version of the data a “snapshot”.
We can now be assured that any development of the code that
erroneously alters the output of the function will cause the test to
fail. For example, suppose we slightly altered the
very_complex_processing function,
PYTHON
def very_complex_processing(data: list):
    return [3 * x ** 2 - 10 * x + 42 for x in data]
    #       ^^^^ small change
Then, running the test causes it to fail,
$ pytest -q test.py
F
__________________________________ FAILURES _________________________________
_______________________________ test_something ______________________________
    def test_something():
        input_data = [i for i in range(8)]
        processed_data = very_complex_processing(input_data)
>       assert SNAPSHOT_DATA == processed_data
E assert [42, 33, 26, 21, 18, 17, ...] == [42, 35, 34, 39, 50, 67, ...]
E At index 1 diff: 33 != 35
test.py:12: AssertionError
1 failed in 0.03s
If the change was intentional, then we could print the output again
and update SNAPSHOT_DATA. Otherwise, we would want to
investigate the cause of the change and fix it.
3) Snaptol
So far, performing a regression test manually has been a bit tedious. Storing the output data at the top of our test file:

- adds clutter,
- is laborious,
- is prone to errors.
We could move the data to a separate file, but once again we would have to handle its contents manually.
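To make the "separate file" idea concrete, here is a minimal sketch of what handling a snapshot file by hand might look like. The helper name `check_against_snapshot` and the file name are invented for illustration; they are not part of any library.

```python
import json
import os

def very_complex_processing(data: list):
    return [x ** 2 - 10 * x + 42 for x in data]

# Hypothetical snapshot file name, chosen just for this sketch.
SNAPSHOT_PATH = "test_something.snapshot.json"

def check_against_snapshot(result, path=SNAPSHOT_PATH):
    """Compare result to a saved JSON snapshot, creating the file on first use."""
    if not os.path.exists(path):
        with open(path, "w") as f:
            json.dump(result, f)
        return True  # nothing to compare against yet
    with open(path) as f:
        snapshot = json.load(f)
    return snapshot == result

# First call creates the snapshot; the second compares against it.
data = very_complex_processing(list(range(8)))
assert check_against_snapshot(data)  # creates the file
assert check_against_snapshot(data)  # compares and matches
```

Even this small sketch has to decide when to create the file, when to update it, and how to serialise the data, which is exactly the bookkeeping a dedicated tool takes off our hands.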
There are tools that can handle this for us; one widely known tool is Syrupy. Here we will use a newer tool called Snaptol.
Let’s use the original very_complex_processing function,
and introduce the snaptolshot fixture,
PYTHON
# test.py
def very_complex_processing(data: list):
    return [x ** 2 - 10 * x + 42 for x in data]

def test_something(snaptolshot):
    input_data = [i for i in range(8)]
    processed_data = very_complex_processing(input_data)
    assert snaptolshot == processed_data
Notice that we have replaced the SNAPSHOT_DATA variable
with snaptolshot, which is an object provided by Snaptol
that can handle the snapshot file management, amongst other smart
features, for us.
When we run the test for the first time, we will be met with a
FileNotFoundError,
$ pytest -q test.py
F
================================== FAILURES =================================
_______________________________ test_something ______________________________
    def test_something(snaptolshot):
        input_data = [i for i in range(8)]
        processed_data = very_complex_processing(input_data)
>       assert snaptolshot == processed_data
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
test.py:10:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.../snapshot.py:167: FileNotFoundError
========================== short test summary info ==========================
FAILED test.py::test_something - FileNotFoundError: Snapshot file not found.
1 failed in 0.03s
This is because we have not yet created the snapshot file. Let’s run
snaptol in update mode so that it knows to create the
snapshot file for us. This is similar to the print, copy and paste step
in the manual approach above,
$ pytest -q test.py --snaptol-update
.
1 passed in 0.00s
This tells us that the test performed successfully, and, because we
were in update mode, an associated snapshot file was created with the
name format <test_file>.<test_name>.json in a
dedicated directory,
$ tree
.
├── __snapshots__
│   └── test.test_something.json
└── test.py
The contents of the JSON file are the same data as in the manual example.
As the data is saved in JSON format, almost any Python object can be used in a snapshot test – not just integers and lists.
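The claim that JSON covers much more than integers and lists can be checked directly with the standard library: dictionaries, strings, floats, booleans, and None all survive a round trip, with the well-known caveat that tuples come back as lists.

```python
import json

# A snapshot-like structure mixing several JSON-serialisable Python types.
snapshot = {"name": "run-42", "values": [1, 2.5, None, True], "meta": {"seed": 7}}

# Round-trip through JSON: serialise, then parse back.
restored = json.loads(json.dumps(snapshot))
assert restored == snapshot  # dicts, lists, numbers, None, and bool all round-trip

# Caveat: tuples are serialised as JSON arrays and come back as lists,
# so a snapshot comparison of tuple-containing data must account for this.
assert json.loads(json.dumps((1, 2))) == [1, 2]
```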
Just as previously, if we alter the function then the test will fail.
We can similarly update the snapshot file with the new output with the
--snaptol-update flag as above.
Note: --snaptol-update will only update
snapshot files for tests that failed in the previous run of
pytest. This is because the expected workflow is 1) run
pytest, 2) observe a test failure, 3) if happy with the
change then run the update, --snaptol-update. This stops
the unnecessary rewrite of snapshot files in tests that pass – which is
particularly important when we allow for tolerance as explained in the
next section.
Floating point numbers
Consider a simulation code that uses algorithms that depend on convergence – perhaps a complicated equation that does not have an exact answer but can be approximated numerically within a given tolerance. This, along with the common use of controlled randomised initial conditions, can lead to results that differ slightly between runs.
In the example below, we use the estimate_pi function
from the “Floating Point Data” module. It relies on the use of
randomised input and as a result the determined value will vary slightly
between runs.
PYTHON
# test_tol.py
import random

def estimate_pi(iterations):
    num_inside = 0
    for _ in range(iterations):
        x = random.random()
        y = random.random()
        if x**2 + y**2 < 1:
            num_inside += 1
    return 4 * num_inside / iterations

def test_something(snaptolshot):
    result = estimate_pi(10000000)
    print(result)
    snaptolshot.assert_allclose(result, rtol=1e-03, atol=0.0)
Notice that here we use a method of the snaptolshot
object called assert_allclose. This is a wrapper around the
numpy.testing.assert_allclose function, as discussed in the
“Floating Point Data” module, and allows us to specify tolerances for
the comparison rather than asserting an exact equality.
Let’s run the test as before, but create the snapshot file straight away by running in update mode (here --snaptol-update-all, since there is no previously failed run for --snaptol-update to act on),
$ pytest -qs test_tol.py --snaptol-update-all
3.1423884
.
1 passed in 0.30s
Even with ten million data points, the approximation of pi, 3.1423884, isn’t great!
Note: remember that the exact result of a regression test is not the important part; what matters is whether our code reproduces that result in future runs. In this case we allow a given tolerance to account for the randomness.
In the test above, we supplied rtol and
atol arguments to the function in the assertion. These are
used to control the tolerance of the comparison between the snapshot and
the actual output. This means on future runs of the test, the computed
value will not be required to exactly match the snapshot, but rather
within the given tolerance. Remember,
- rtol is the relative tolerance, useful for handling large numbers (e.g. magnitude much greater than 1),
- atol is the absolute tolerance, useful for numbers “near zero” (e.g. magnitude much less than 1).
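The document says snaptolshot.assert_allclose wraps numpy.testing.assert_allclose, so the tolerance arithmetic can be illustrated with the numpy function directly, using the two pi estimates printed above. The comparison passes when |actual - desired| <= atol + rtol * |desired|.

```python
import numpy as np

saved = 3.1423884  # value stored in the snapshot (first run above)
new = 3.1408724    # value computed on a later run

# |new - saved| ≈ 1.5e-3, which is within rtol * |saved| ≈ 3.1e-3, so this passes.
np.testing.assert_allclose(new, saved, rtol=1e-03, atol=0.0)

# A tighter relative tolerance (allowing only ~3.1e-4) raises AssertionError.
try:
    np.testing.assert_allclose(new, saved, rtol=1e-04, atol=0.0)
except AssertionError:
    print("rtol=1e-04 is too tight for this run-to-run variation")
```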
If we run the test again, we see the printed output is different to that saved to file, but the test still passes,
$ pytest -qs test_tol.py
3.1408724
.
1 passed in 0.24s
Exercises
Create your own regression test
Add the below code to a new file and add your own code to the ... sections.

On the first run, capture the output of your implemented very_complex_processing function and store it appropriately.

After, ensure the test compares the stored data to the result, and passes successfully. Avoid using floats for now.
Implement a regression test with Snaptol
Using the estimate_pi function above, implement a regression test using the snaptolshot object.

Ensure that you use the assert_allclose method to compare the result to the snapshot carefully.

On the first run, ensure that it fails due to a FileNotFoundError.

Run it in update mode to save the snapshot, and ensure it passes successfully on future runs.
PYTHON
import random

def estimate_pi(iterations):
    num_inside = 0
    for _ in range(iterations):
        x = random.random()
        y = random.random()
        if x**2 + y**2 < 1:
            num_inside += 1
    return 4 * num_inside / iterations

def test_something(snaptolshot):
    result = estimate_pi(10000000)
    snaptolshot.assert_allclose(result, rtol=1e-03, atol=0.0)
More complex regression tests
Create two separate tests that both utilise the estimate_pi function as a fixture.

Using different tolerances for each test, assert that the first passes successfully, and assert that the second raises an AssertionError.

Hints: 1) remember to look back at the “Testing for Exceptions” and “Fixtures” modules, 2) the error in the pi calculation algorithm is \(\frac{1}{\sqrt{N}}\) where \(N\) is the number of points used.
PYTHON
import random
import pytest

@pytest.fixture
def estimate_pi():
    iterations = 10000000
    num_inside = 0
    for _ in range(iterations):
        x = random.random()
        y = random.random()
        if x**2 + y**2 < 1:
            num_inside += 1
    return 4 * num_inside / iterations

def test_pi_passes(snaptolshot, estimate_pi):
    # Passes due to loose tolerance.
    snaptolshot.assert_allclose(estimate_pi, rtol=1e-03, atol=0.0)

def test_pi_fails(snaptolshot, estimate_pi):
    # Fails due to tight tolerance.
    with pytest.raises(AssertionError):
        snaptolshot.assert_allclose(estimate_pi, rtol=1e-04, atol=0.0)
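The hint's \(\frac{1}{\sqrt{N}}\) estimate also explains why these particular tolerances behave as they do; a quick sanity check of the arithmetic:

```python
import math

# Monte Carlo estimates of pi have statistical error of order 1/sqrt(N).
N = 10_000_000
expected_error = 1 / math.sqrt(N)  # ≈ 3.2e-04

# rtol=1e-03 comfortably exceeds the typical run-to-run variation, so that
# test should pass; rtol=1e-04 is tighter than the error, so that test
# should usually fail (raising the AssertionError the exercise expects).
assert 1e-04 < expected_error < 1e-03
```

Because the failure is only statistically likely rather than guaranteed, a test pinned at rtol=1e-04 may occasionally pass by chance; a noticeably tighter tolerance makes the expected AssertionError more reliable.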
- Regression testing ensures that the output of a function remains consistent between test runs.
- The pytest plugin snaptol can be used to simplify this process and cater for floating point outputs that may need tolerances in assertion checks.