Floating Point Data
Last updated on 2026-02-17
Overview
Questions
- What are the best practices when working with floating point data?
- How do you compare objects in libraries like numpy?
Objectives
- Learn how to test floating point data with tolerances.
- Learn how to compare objects in libraries like numpy.
Floating Point Data
Real numbers are encountered very frequently in research, but it’s
quite likely that they won’t be ‘nice’ numbers like 2.0 or 0.0. Instead,
the outcome of our code might be something like
2.34958124890e-31, and we may only be confident in that
answer to a certain precision.
Computers typically represent real numbers using a ‘floating point’ representation, which stores only a fixed number of significant digits (roughly 15 to 17 decimal digits for a 64-bit float). Rounding errors in floating point arithmetic can therefore introduce noise in the last few digits of a result. The size of this noise can be affected by:
- Choice of algorithm.
- Precise order of operations.
- Inherent randomness in the calculation.
We could therefore test our code using
assert result == 2.34958124890e-31, but it’s possible that
this test could erroneously fail in future for reasons outside our
control. This lesson will teach best practices for handling this type of
data.
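As a small illustration of how the precise order of operations can change the last digit of a result, the sketch below adds the same three numbers in two different orders. The values are chosen purely for illustration and are not taken from any particular piece of research code.

```python
# Adding the same three numbers in a different order changes the last digit.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right)                   # 0.6000000000000001
print(right_to_left)                   # 0.6
print(left_to_right == right_to_left)  # False

# The two results differ by far less than any precision we are likely to care about.
difference = abs(left_to_right - right_to_left)
print(difference < 1e-12)              # True
```

An exact equality test would treat these two results as different, even though they agree to about 16 significant digits.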
Libraries like NumPy, SciPy, and Pandas are commonly used to interact with large quantities of floating point numbers. NumPy provides special functions to assist with testing.
Relative and Absolute Tolerances
Rather than testing that a floating point number is exactly equal to another, it is preferable to test that it is within a certain tolerance. In most cases, it is best to use a relative tolerance:
PYTHON
from math import fabs

def test_float_rtol():
    expected = 7.31926e12  # Reference solution
    actual = my_function()
    rtol = 1e-3
    # Use fabs to ensure a positive result!
    assert fabs((actual - expected) / expected) < rtol
In some situations, such as testing a number is close to zero without caring about exactly how large it is, it is preferable to test within an absolute tolerance:
PYTHON
from math import fabs

def test_float_atol():
    expected = 0.0  # Reference solution
    actual = my_function()
    atol = 1e-5
    # Use fabs to ensure a positive result!
    assert fabs(actual - expected) < atol
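To see why the choice between the two matters, here is a minimal sketch that applies both checks to a very large value and to a value near zero. The helper names `within_rtol` and `within_atol` and the sample values are hypothetical, introduced only for this illustration.

```python
from math import fabs

def within_rtol(actual, expected, rtol):
    """Relative check, as in test_float_rtol (hypothetical helper)."""
    return fabs((actual - expected) / expected) < rtol

def within_atol(actual, expected, atol):
    """Absolute check, as in test_float_atol (hypothetical helper)."""
    return fabs(actual - expected) < atol

# A large value that is off by 1e6: tiny relative to 7.3e12, huge in absolute terms.
big_expected = 7.31926e12
big_actual = big_expected + 1.0e6
print(within_rtol(big_actual, big_expected, rtol=1e-3))  # True
print(within_atol(big_actual, big_expected, atol=1e-5))  # False

# A value that should be zero: only the absolute check is usable here.
print(within_atol(3.0e-9, 0.0, atol=1e-5))               # True
try:
    within_rtol(3.0e-9, 0.0, rtol=1e-3)
    relative_check_ran = True
except ZeroDivisionError:
    # Dividing by an expected value of exactly 0.0 is not possible.
    relative_check_ran = False
print(relative_check_ran)                                # False
```

The relative check scales with the size of the expected value, which is why it suits most cases, but it breaks down when the expected value is exactly zero.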
Let’s practice with a function that estimates the value of pi (very inefficiently!).
Testing with tolerances
- Write this function to a file estimate_pi.py:
PYTHON
import random

def estimate_pi(iterations):
    """
    Estimate pi by counting the number of random points
    inside a quarter circle of radius 1
    """
    num_inside = 0
    for _ in range(iterations):
        x = random.random()
        y = random.random()
        if x**2 + y**2 < 1:
            num_inside += 1
    return 4 * num_inside / iterations
- Add a file test_estimate_pi.py, and include a test for this function using both absolute and relative tolerances.
- Find an appropriate number of iterations so that the test finishes quickly, but keep in mind that both atol and rtol will need to be modified accordingly!
PYTHON
import random
from math import fabs
from estimate_pi import estimate_pi

def test_estimate_pi():
    random.seed(0)
    expected = 3.141592654
    actual = estimate_pi(iterations=10000)
    # Test absolute tolerance
    atol = 1e-2
    assert fabs(actual - expected) < atol
    # Test relative tolerance
    rtol = 5e-3
    assert fabs((actual - expected) / expected) < rtol
In this case the absolute and relative tolerances should be similar, as the expected result is close in magnitude to 1.0, but in principle they could be very different!
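One way to reason about how the tolerance should change with the number of iterations is to estimate the statistical error of the method. The sketch below assumes the random points are independent and uniformly distributed, so that the count of points inside the quarter circle follows a binomial distribution. The helper name `pi_estimate_std_error` is hypothetical and is not part of the exercise.

```python
from math import pi, sqrt

def pi_estimate_std_error(iterations):
    """Approximate standard error of the Monte Carlo estimate of pi (hypothetical helper)."""
    p = pi / 4  # Probability that a random point lands inside the quarter circle
    # Standard deviation of 4 * (fraction of points inside) for a binomial count
    return 4 * sqrt(p * (1 - p) / iterations)

for iterations in (100, 10_000, 1_000_000):
    print(iterations, pi_estimate_std_error(iterations))
```

Under these assumptions the error is roughly 0.16, 0.016 and 0.0016 for the three iteration counts, so it shrinks only with the square root of the number of iterations: 100 times more work buys one extra decimal digit.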
The standard library function math.isclose can be used to simplify these checks. Both rel_tol and abs_tol may be provided, and it will return True if either of the conditions is satisfied.
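A minimal sketch of how math.isclose behaves with each tolerance is shown below. The numbers are illustrative only. Note that if neither argument is given, rel_tol defaults to 1e-9 and abs_tol defaults to 0.0.

```python
from math import isclose

expected = 7.31926e12
actual = 7.31930e12  # Illustrative value, differs from expected by 4e7

# Relative tolerance: passes, since 4e7 is well within 0.1% of 7.3e12
close_with_rtol = isclose(actual, expected, rel_tol=1e-3)

# Default tolerance (rel_tol=1e-9): fails, since the values differ in the 6th digit
close_with_default = isclose(actual, expected)

# Comparing against zero: a relative tolerance alone can never succeed...
zero_with_rtol = isclose(1e-7, 0.0, rel_tol=1e-3)

# ...but providing abs_tol as well is enough, as only one condition must hold
zero_with_both = isclose(1e-7, 0.0, rel_tol=1e-3, abs_tol=1e-5)

print(close_with_rtol, close_with_default, zero_with_rtol, zero_with_both)
# True False False True
```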
Using math.isclose
- Adapt the test you wrote in the previous challenge to make use of the math.isclose function.
NumPy
NumPy is a common library used in research. Instead of the usual
assert a == b, NumPy has its own testing functions that are
more suitable for comparing NumPy arrays. These functions are the ones
you are most likely to use:
- numpy.testing.assert_array_equal is used to compare two NumPy arrays for equality – best used for integer data.
- numpy.testing.assert_allclose is used to compare two NumPy arrays with a tolerance for floating point numbers.
Here are some examples of how to use these functions:
PYTHON
import numpy as np

def test_numpy_arrays():
    """Test that numpy arrays are equal"""
    # Create two numpy arrays
    array1 = np.array([1, 2, 3])
    array2 = np.array([1, 2, 3])
    # Check that the arrays are equal
    np.testing.assert_array_equal(array1, array2)

# Note that np.testing.assert_array_equal even works with multidimensional numpy arrays!
def test_2d_numpy_arrays():
    """Test that 2d numpy arrays are equal"""
    # Create two 2d numpy arrays
    array1 = np.array([[1, 2], [3, 4]])
    array2 = np.array([[1, 2], [3, 4]])
    # Check that the nested arrays are equal
    np.testing.assert_array_equal(array1, array2)

def test_numpy_arrays_with_tolerance():
    """Test that numpy arrays are equal with tolerance"""
    # Create two numpy arrays
    array1 = np.array([1.0, 2.0, 3.0])
    array2 = np.array([1.00009, 2.0005, 3.0001])
    # Check that the arrays are equal with tolerance
    np.testing.assert_allclose(array1, array2, atol=1e-3)
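For contrast, the sketch below shows what happens when arrays are compared with a plain == and what a failing assert_allclose looks like. The array values are illustrative.

```python
import numpy as np

array1 = np.array([1.0, 2.0, 3.0])
array2 = np.array([1.0, 2.0, 3.0])

# == compares element by element and returns an array of booleans
print(array1 == array2)

# Reducing that array to a single True/False is ambiguous, so NumPy raises ValueError.
# This is what happens inside `assert array1 == array2`.
try:
    bool(array1 == array2)
    ambiguous = False
except ValueError:
    ambiguous = True

# assert_allclose raises AssertionError if any element is outside the tolerance
try:
    np.testing.assert_allclose(array1, np.array([1.0, 2.0, 3.1]), atol=1e-3)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True

print(ambiguous, mismatch_detected)
```

When a NumPy testing function fails, the error message reports which elements differ and by how much, which is usually more helpful than a bare assertion failure.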
The NumPy testing functions can be used on anything NumPy considers to be ‘array-like’. This includes lists, tuples, and even individual floating point numbers if you choose. They can also be used for other objects in the scientific Python ecosystem, such as Pandas Series/DataFrames.
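As a brief sketch of what ‘array-like’ means in practice, the calls below pass lists, tuples, and a single float directly to the NumPy testing functions. The values are illustrative.

```python
import numpy as np

# A list compared with a tuple
np.testing.assert_array_equal([1, 2, 3], (1, 2, 3))

# A list compared with a NumPy array, using the default relative tolerance
np.testing.assert_allclose([0.1 + 0.2, 1.0], np.array([0.3, 1.0]))

# Individual floating point numbers also work
np.testing.assert_allclose(0.1 + 0.2, 0.3)

all_passed = True  # Only reached if none of the checks above raised
```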
The Pandas library also provides its own testing functions:
- pandas.testing.assert_frame_equal
- pandas.testing.assert_series_equal
These functions can also take rtol and atol
arguments, so can fulfill the role of both
numpy.testing.assert_array_equal and
numpy.testing.assert_allclose.
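A minimal sketch of both Pandas functions is shown below. It assumes a version of Pandas recent enough to accept the rtol and atol arguments, and the column names and values are made up for illustration.

```python
import pandas as pd

# Series comparison with an explicit absolute tolerance
series1 = pd.Series([1.0, 2.0, 3.0])
series2 = pd.Series([1.00009, 2.0005, 3.0001])
pd.testing.assert_series_equal(series1, series2, atol=1e-3)

# DataFrame comparison: float columns are compared with a tolerance by default
frame1 = pd.DataFrame({"count": [1, 2], "value": [0.1 + 0.2, 1.0]})
frame2 = pd.DataFrame({"count": [1, 2], "value": [0.3, 1.0]})
pd.testing.assert_frame_equal(frame1, frame2)

# Requesting an exact comparison makes the same check fail,
# because 0.1 + 0.2 is not stored as exactly 0.3
try:
    pd.testing.assert_frame_equal(frame1, frame2, check_exact=True)
    exact_match = True
except AssertionError:
    exact_match = False

print(exact_match)
```

Unlike the NumPy functions, these also check properties such as the index, column names, and dtypes, so a failure is not always caused by the numerical values.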
Checking if NumPy arrays are equal
In statistics/stats.py add this function to calculate
the cumulative sum of a NumPy array:
PYTHON
import numpy as np

def calculate_cumulative_sum(array: np.ndarray) -> np.ndarray:
    """Calculate the cumulative sum of a numpy array"""
    # don't use the built-in numpy function
    result = np.zeros(array.shape)
    result[0] = array[0]
    for i in range(1, len(array)):
        result[i] = result[i-1] + array[i]
    return result
Then write a test for this function by comparing NumPy arrays.
PYTHON
import numpy as np
from stats import calculate_cumulative_sum

def test_calculate_cumulative_sum():
    """Test calculate_cumulative_sum function"""
    array = np.array([1, 2, 3, 4, 5])
    expected_result = np.array([1, 3, 6, 10, 15])
    np.testing.assert_array_equal(calculate_cumulative_sum(array), expected_result)
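The test above uses whole numbers, for which exact comparison is safe. As a sketch of how the same function might be tested with non-integer input, the example below switches to assert_allclose. The function is repeated here only so that the block is self-contained, and the input values are illustrative.

```python
import numpy as np

def calculate_cumulative_sum(array):
    """Copy of the exercise function, repeated so this sketch runs on its own"""
    result = np.zeros(array.shape)
    result[0] = array[0]
    for i in range(1, len(array)):
        result[i] = result[i-1] + array[i]
    return result

def test_calculate_cumulative_sum_floats():
    """Test calculate_cumulative_sum with non-integer values"""
    array = np.array([0.1, 0.2, 0.3])
    expected_result = np.array([0.1, 0.3, 0.6])
    # An exact comparison would fail here: 0.1 + 0.2 is stored as 0.30000000000000004
    np.testing.assert_allclose(calculate_cumulative_sum(array), expected_result)

test_calculate_cumulative_sum_floats()
```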
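Key Points
- When comparing floating point data, you should use relative/absolute tolerances instead of testing for equality.
- NumPy arrays should not be compared in an assert using the == operator, because it compares element by element and returns an array rather than a single True or False. Instead, use numpy.testing.assert_array_equal and numpy.testing.assert_allclose.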