NumPy efficiency

Besides providing convenient functions for manipulating data in arrays of arbitrary dimensions, NumPy can be much more efficient than pure Python.

Let us import NumPy and define two utility functions:

import numpy as np

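def compute_time_in_second(timeit_result):
    """Return the time per loop, in seconds, from a captured %%timeit cell."""
    string = timeit_result.stdout
    print(string)
    time = unit = None
    for line in string.split("\n"):
        words = line.split(" ")
        if len(words) > 1:
            # A timing line starts with "<value> <unit>", e.g. "27.1 ms".
            time = float(words[0])
            unit = words[1]

    if time is None:
        raise ValueError("no timing line found in the captured output")

    # Convert to seconds; a unit of "s" needs no scaling.
    if unit == "ms":
        time *= 1e-3
    elif unit in ("us", "µs"):
        time *= 1e-6
    elif unit == "ns":
        time *= 1e-9
    return time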


def compare_times(string, timeit_python, timeit_numpy):
    time_python = compute_time_in_second(timeit_python)
    time_numpy = compute_time_in_second(timeit_numpy)

    print(string + ": ratio times (Python / NumPy): ", time_python / time_numpy)

n = 200000
python_r_1 = range(n)
python_r_2 = range(n)

numpy_a_1 = np.arange(n)
numpy_a_2 = np.arange(n)

We use the IPython cell magics %%timeit to benchmark the code in a cell and %%capture to store the output of %%timeit in a variable.

%%capture timeit_python
%%timeit
# Regular Python
[(x + y) for x, y in zip(python_r_1, python_r_2)]

print(timeit_python)

27.1 ms +- 2.06 ms per loop (mean +- std. dev. of 7 runs, 10 loops each)

%%capture timeit_numpy
%%timeit
# NumPy
numpy_a_1 + numpy_a_2

print(timeit_numpy)

77.8 us +- 3.01 us per loop (mean +- std. dev. of 7 runs, 10,000 loops each)

compare_times("Additions", timeit_python, timeit_numpy)

27.1 ms +- 2.06 ms per loop (mean +- std. dev. of 7 runs, 10 loops each)

77.8 us +- 3.01 us per loop (mean +- std. dev. of 7 runs, 10,000 loops each)

Additions: ratio times (Python / NumPy):  348.3290488431877
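
The ratio printed above can be checked by hand from the two timing lines. The following is a minimal sketch of that arithmetic; `UNIT_TO_SECONDS` and `to_seconds` are hypothetical names introduced here for illustration, and the set of units is an assumption based on the output shown in this section.

```python
# Seconds per unit, for the units that appear in timeit output (assumed set).
UNIT_TO_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6, "µs": 1e-6, "ns": 1e-9}


def to_seconds(line):
    """Convert the leading "<value> <unit>" of a timing line to seconds."""
    value, unit = line.split()[:2]
    return float(value) * UNIT_TO_SECONDS[unit]


# The two timing lines reported above, copied verbatim.
line_python = "27.1 ms +- 2.06 ms per loop (mean +- std. dev. of 7 runs, 10 loops each)"
line_numpy = "77.8 us +- 3.01 us per loop (mean +- std. dev. of 7 runs, 10,000 loops each)"

ratio = to_seconds(line_python) / to_seconds(line_numpy)
print("ratio times (Python / NumPy):", ratio)  # approximately 348.33
```

In other words, 27.1e-3 s divided by 77.8e-6 s gives a ratio of about 348.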

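In this benchmark, the NumPy addition is roughly 350 times faster than the list comprehension (the exact ratio depends on the machine). This shows that when you need to perform mathematical operations on a large number of values of the same type, it is much more efficient to use NumPy arrays.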
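
Outside IPython, where the %%timeit and %%capture magics are not available, a similar comparison can be sketched with the `timeit` module from the standard library. This is only an illustration under stated assumptions: the function names `add_python` and `add_numpy` and the values of `number` and `repeat` are choices made here, and the script reports the best of a few repeats rather than the mean and standard deviation that %%timeit prints, so the numbers will not match the ones above exactly.

```python
import timeit

import numpy as np

n = 200000
python_r_1 = range(n)
python_r_2 = range(n)

numpy_a_1 = np.arange(n)
numpy_a_2 = np.arange(n)


def add_python():
    # Regular Python: element-wise addition with a list comprehension
    return [x + y for x, y in zip(python_r_1, python_r_2)]


def add_numpy():
    # NumPy: one vectorized addition over the whole arrays
    return numpy_a_1 + numpy_a_2


number = 10  # calls per measurement (illustrative value)
repeat = 3  # number of measurements (illustrative value)

# timeit.repeat returns the total time of `number` calls for each repeat;
# keep the best measurement and convert it to seconds per call.
time_python = min(timeit.repeat(add_python, number=number, repeat=repeat)) / number
time_numpy = min(timeit.repeat(add_numpy, number=number, repeat=repeat)) / number

print(f"Python: {time_python:.3e} s per call")
print(f"NumPy:  {time_numpy:.3e} s per call")
print("Additions: ratio times (Python / NumPy): ", time_python / time_numpy)
```

Taking the minimum of the repeats is a common way to reduce the influence of other processes on the measurement; the ratio will vary with the machine and the NumPy version.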