@phargogh
Created December 12, 2022 22:47
Timing numpy array indexing performance with `False` vs `numpy.zeros`

Is it faster to return False as a mask?

While reading through the InVEST utils source code, I noticed that our nodata masking function returns a boolean array of zeros, rather than just False, when nodata is undefined. That made me wonder whether allocating this array is slower than simply returning False and letting numpy broadcasting distribute that single value across all of the comparison array values.
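To make the broadcasting idea concrete, here is a minimal sketch showing that a scalar False and an all-False array both select nothing, with one difference in the shape of the empty result. The array contents and the `other_mask` name are made up for illustration and are not taken from InVEST.

```python
import numpy

array = numpy.arange(12, dtype=numpy.float32).reshape(3, 4)
zeros_mask = numpy.zeros(array.shape, dtype=bool)

selected_by_array = array[zeros_mask]
selected_by_scalar = array[False]

# Neither index selects any elements.
print(selected_by_array.size, selected_by_scalar.size)

# The empty results do not have the same shape, which matters if
# downstream code relies on a flat result.
print(selected_by_array.shape)   # (0,)
print(selected_by_scalar.shape)  # (0, 3, 4)

# When combined with another mask, the scalar broadcasts across it.
other_mask = array > 5  # hypothetical second condition
combined_with_array = other_mask & ~zeros_mask
combined_with_scalar = other_mask & numpy.logical_not(False)
print(numpy.array_equal(combined_with_array, combined_with_scalar))
```

The sketch uses `numpy.logical_not` rather than `~` on the scalar because `~False` on a plain Python bool is integer inversion (it evaluates to -1), not logical negation. Callers that invert the returned mask would need to account for that.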

In my tests here (see test-nodata-timings.py), the runtime is dominated by random array generation rather than indexing, but there does appear to be a time savings as array sizes get larger.

This is confirmed with timeit (see function main_timeit()):

Indexing option: False
array size: 10*10: 1.9653402079999998s
array size: 10*100: 2.0178617080000003s
array size: 10*1000: 1.9825304170000004s
array size: 10*10000: 1.974148209s
array size: 10*100000: 2.922310917000001s
Indexing option: numpy.zeros(shape, dtype=bool)
array size: 10*10: 1.275730083000001s
array size: 10*100: 1.499919499999999s
array size: 10*1000: 2.1492175420000006s
array size: 10*10000: 8.273289708s
array size: 10*100000: 58.965569708000004s

When using False as the index, the runtime is roughly constant up to 10*10000 elements, and even at 10*100000 elements it is only about 1.5x what it was with an order of magnitude fewer elements. The numpy.zeros indexing is actually slightly faster for the two smallest arrays, but it is clearly slower beyond that: once the arrays are large, its runtime increases roughly linearly with the number of array elements that need to be created (about 8.3s at 10*10000 elements and about 59s at 10*100000).
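The timed statement in main_timeit() also allocates the target array with numpy.empty on every run. As a rough sketch of how the indexing cost could be separated from that allocation, the variant below builds the array once and times only the two indexing expressions. The function name `time_indexing`, the array sizes, and `number=1000` are assumptions chosen for illustration, not part of the original script.

```python
import timeit

import numpy


def time_indexing(n_elements, number=1000):
    """Return (scalar_seconds, zeros_seconds) for indexing a prebuilt array."""
    # The array is built once, outside the timed statements.
    array = numpy.empty(n_elements, dtype=numpy.float32)
    namespace = {"numpy": numpy, "array": array}

    # Index with the scalar False: no mask array is allocated.
    scalar_seconds = timeit.timeit(
        "array[False]", globals=namespace, number=number)

    # Index with a freshly allocated all-False mask on every run.
    zeros_seconds = timeit.timeit(
        "array[numpy.zeros(array.shape, dtype=bool)]",
        globals=namespace, number=number)
    return scalar_seconds, zeros_seconds


if __name__ == "__main__":
    for n_elements in (100, 10_000, 1_000_000):
        scalar_seconds, zeros_seconds = time_indexing(n_elements)
        print(f"{n_elements} elements: False={scalar_seconds:.4f}s, "
              f"numpy.zeros={zeros_seconds:.4f}s")
```

Absolute numbers from this sketch will differ from the timings above, since it runs far fewer iterations and depends on the machine.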

Note that the default iterblocks number of elements is 2^16 (65,536). This means that when nodata is undefined, we're allocating a whole lot of array elements that are very likely thrown away and could easily be replaced with False. Really this is a minor tweak (especially considering that each timing above is the total over 1,000,000 executions, the default for timeit.timeit), but it could add up on very large arrays.
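To put the totals in per-call terms, the arithmetic can be sketched as follows. The totals are copied from the timeit output above; the helper name `per_call_microseconds` is made up for this example.

```python
NUMBER = 1_000_000  # timeit.timeit's default number of executions

# Total runtimes in seconds, copied from the output above and keyed by
# the number of array elements (10*10 = 100, ..., 10*100000 = 1000000).
totals = {
    "False": {
        100: 1.9653402079999998,
        1_000: 2.0178617080000003,
        10_000: 1.9825304170000004,
        100_000: 1.974148209,
        1_000_000: 2.922310917000001,
    },
    "numpy.zeros": {
        100: 1.275730083000001,
        1_000: 1.499919499999999,
        10_000: 2.1492175420000006,
        100_000: 8.273289708,
        1_000_000: 58.965569708000004,
    },
}


def per_call_microseconds(total_seconds, number=NUMBER):
    """Convert a timeit total into microseconds per execution."""
    return total_seconds / number * 1e6


for option, timings in totals.items():
    for n_elements, total_seconds in timings.items():
        per_call = per_call_microseconds(total_seconds)
        print(f"{option:12s} {n_elements:>9d} elements: {per_call:.2f} us/call")
```

At the largest size this works out to roughly 59 microseconds per call for numpy.zeros versus about 3 microseconds for False.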

import contextlib
import time
import timeit

import numpy
import numpy.random


# Taken from https://github.com/natcap/invest/blob/8e3bc6d3a0011c21f6de275aa76476ed25f5e95a/src/natcap/invest/utils.py
def array_equals_nodata(array, nodata):
    """Check for the presence of ``nodata`` values in ``array``.

    The comparison supports ``numpy.nan`` and unset (``None``) nodata values.

    Args:
        array (numpy array): the array to mask for nodata values.
        nodata (number): the nodata value to check for. Supports ``numpy.nan``.

    Returns:
        A boolean numpy array with values of 1 where ``array`` is equal to
        ``nodata`` and 0 otherwise.
    """
    # If nodata is undefined, nothing matches nodata.
    if nodata is None:
        return numpy.zeros(array.shape, dtype=bool)

    # comparing an integer array against numpy.nan works correctly and is
    # faster than using numpy.isclose().
    if numpy.issubdtype(array.dtype, numpy.integer):
        return array == nodata
    return numpy.isclose(array, nodata, equal_nan=True)


def array_equals_nodata_allfalse(array, nodata):
    """Check for the presence of ``nodata`` values in ``array``.

    The comparison supports ``numpy.nan`` and unset (``None``) nodata values.

    Args:
        array (numpy array): the array to mask for nodata values.
        nodata (number): the nodata value to check for. Supports ``numpy.nan``.

    Returns:
        A boolean numpy array with values of 1 where ``array`` is equal to
        ``nodata`` and 0 otherwise, or the scalar ``False`` if ``nodata``
        is ``None``.
    """
    # If nodata is undefined, nothing matches nodata.
    # Return a scalar and let numpy broadcast it instead of allocating.
    if nodata is None:
        return False

    # comparing an integer array against numpy.nan works correctly and is
    # faster than using numpy.isclose().
    if numpy.issubdtype(array.dtype, numpy.integer):
        return array == nodata
    return numpy.isclose(array, nodata, equal_nan=True)


@contextlib.contextmanager
def my_timeit(message):
    """Print the wall-clock time spent inside the ``with`` block."""
    start_time = time.time()
    yield
    elapsed = round(time.time() - start_time, 4)
    print(f"{message}: {elapsed}s")


def main():
    """Time each masking approach, including random array generation."""
    n = 10000
    shape = (1000, 100)
    nodata = None

    # Baseline: no masking function is called.
    with my_timeit("Baseline"):
        count = 0
        for _ in range(n):
            array = numpy.random.randint(0, n, size=shape)
            mask = array[False]
            count += array[mask].size

    # Current behavior: allocate an all-False boolean array.
    with my_timeit("Current InVEST"):
        count = 0
        for _ in range(n):
            array = numpy.random.randint(0, n, size=shape)
            mask = array_equals_nodata(array, nodata)
            count += array[mask].size

    # Proposed behavior: return the scalar False.
    with my_timeit("Simplified None case"):
        count = 0
        for _ in range(n):
            array = numpy.random.randint(0, n, size=shape)
            mask = array_equals_nodata_allfalse(array, nodata)
            count += array[mask].size


def main_timeit():
    """Time indexing alone (plus numpy.empty allocation) with timeit."""
    for indexing_option in ("False", "numpy.zeros(shape, dtype=bool)"):
        print(f"Indexing option: {indexing_option}")
        for array_size_factor in (10, 100, 1000, 10000, 100000):
            runtime = timeit.timeit(
                f"numpy.empty(shape, dtype=numpy.float32)[{indexing_option}]",
                setup=f"import numpy; shape=(10*{array_size_factor})")
            print(f"array size: 10*{array_size_factor}: {runtime}s")


if __name__ == '__main__':
    main()
    main_timeit()