You don’t even need to use CMake!
Most of the time, we can use libraries like NumPy (which is largely written in C) to speed up our calculations. That works when the problem can be expressed as operations on matrices or vectors, but sometimes loops are unavoidable. In those cases, it would be nice to move the bottleneck into a compiled language such as C++.
This is easy to do with pybind11, which lets us expose C++ functions and classes as importable Python objects. We don’t even need CMake: pybind11’s Pybind11Extension class and a slightly modified setup.py are enough. pybind11 can be compiled from source or installed using:
pip install pybind11
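The project directory should have setup.py at the top level, the C++ source files in src and the header files in include; these are the directory names that setup.py refers to later on.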
We are going to use an extremely inefficient method for calculating the nth number in the Fibonacci sequence (we will write fibinacci.h in a second). This is the function, written in fibinacci.cpp:
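#include "fibinacci.h"

// Naive recursive Fibonacci: deliberately slow, to make the speed-up visible.
unsigned int fibinacci(const unsigned int n){
    if (n < 2){
        return n;  // base cases: fib(0) = 0, fib(1) = 1
    }
    return fibinacci(n - 1) + fibinacci(n - 2);
}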
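To get a feel for how quickly the work grows, here is a minimal Python sketch that mirrors the recursion and counts how many calls it makes. The helper count_calls is hypothetical and not part of the project; it only illustrates the growth. The call count grows roughly like 1.618^n, which sits comfortably inside the O(2^n) bound.

```python
def count_calls(n):
    """Return (fib(n), number of calls made by the naive recursion)."""
    if n < 2:
        return n, 1  # a base case costs exactly one call
    a, calls_a = count_calls(n - 1)
    b, calls_b = count_calls(n - 2)
    # One call for this frame plus everything below it
    return a + b, calls_a + calls_b + 1


for n in (10, 20, 25):
    value, calls = count_calls(n)
    print(f"n={n}: value={value}, calls={calls}")
```

Each step up in n multiplies the number of calls by a roughly constant factor, which is why n = 40 is already slow in pure Python.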
This recursive algorithm is O(2^n), and will illustrate our speed gains nicely. Next we must write fibinacci.h, which will also include the code that tells pybind11 how to glue everything together into a Python module:
#include <pybind11/pybind11.h>

unsigned int fibinacci(const unsigned int n);

namespace py = pybind11;

PYBIND11_MODULE(pybind_11_example, mod) {
    mod.def("fibinacci_cpp", &fibinacci, "Recursive fibinacci algorithm.");
}
In the header file, we have our usual function declaration, followed by some more interesting code. The macro PYBIND11_MODULE defines a module that we can import from Python once everything has been compiled into a shared library (a .so file on Linux and macOS). We use mod.def to define a function in the module, with three arguments:
- The (Python) name of the function (fibinacci_cpp)
- A pointer to the (C++) function (&fibinacci)
- A brief description of the function
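One thing the binding does not change is the C++ return type: the function still computes with unsigned int, which wraps around on overflow, whereas Python integers grow as needed. The sketch below imitates that wraparound in pure Python. It assumes unsigned int is 32 bits wide, which is typical but not guaranteed, and fib_exact and fib_wrapped are hypothetical helper names.

```python
UINT_BITS = 32  # assumption: unsigned int is 32 bits on this platform
MASK = (1 << UINT_BITS) - 1


def fib_exact(n):
    """Fibonacci with Python's arbitrary-precision integers."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_wrapped(n):
    """Fibonacci with every addition reduced modulo 2**UINT_BITS."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, (a + b) & MASK
    return a


# Smallest n for which the wrapped result differs from the exact one
first_wrong = next(n for n in range(200) if fib_exact(n) != fib_wrapped(n))
print("first n with a wrapped result:", first_wrong)
```

Under that assumption, n = 40 as used in the benchmark is well inside the safe range, but inputs of 48 and above would silently return a wrapped value from the C++ version.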
We are already finished with our C++! Now onto setup.py:
from pathlib import Path

from pybind11.setup_helpers import Pybind11Extension, build_ext
from setuptools import setup

# Describe the C++ extension: module name, sources, headers and compiler flags
example_module = Pybind11Extension(
    'pybind_11_example',
    [str(fname) for fname in Path('src').glob('*.cpp')],
    include_dirs=['include'],
    extra_compile_args=['-O3']
)

setup(
    name='pybind_11_example',
    version='0.1',
    author='Joe Bloggs',
    author_email='joe_bloggs@email.com',
    description='Barebones pybind11+setup.py example project',
    ext_modules=[example_module],
    cmdclass={"build_ext": build_ext},
)
The only difference from a standard setup.py is in the last two arguments of setup.

The ext_modules argument lists modules which must be compiled separately, which is where the magic comes in: pybind11 comes with an automatic extension builder, accessed using pybind11.setup_helpers.Pybind11Extension. Here we provide the name of the module, the source files (globbing the src directory means we’ll catch any new files), the include directory, and extra compiler flags (-O3 tells the compiler to optimise aggressively for speed).

The cmdclass argument overrides the usual build_ext setuptools class so that it does some things automatically (like finding the highest C++ standard that the compiler supports).
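Globbing returns files in whatever order the file system yields them, so it can be worth sorting the result to keep builds reproducible. The following is a minimal sketch of that idea; collect_sources is a hypothetical helper rather than anything pybind11 provides, and it is demonstrated on a temporary directory instead of a real project.

```python
import tempfile
from pathlib import Path


def collect_sources(src_dir, pattern="*.cpp"):
    """Return matching source files as strings, sorted for a stable order."""
    return sorted(str(path) for path in Path(src_dir).glob(pattern))


# Demonstrate on a throwaway directory rather than a real project
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.cpp", "a.cpp", "notes.txt"):
        (Path(tmp) / name).touch()
    names = [Path(p).name for p in collect_sources(tmp)]
    print(names)  # only the .cpp files, in alphabetical order
```

In setup.py, the list comprehension passed to Pybind11Extension could be swapped for collect_sources('src') under the same assumptions.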
We can now compile and install our module:
pip install -e . -vvv
Finally, we write main.py so that it times both the C++ and Python versions of our function:
import time

import pybind_11_example as pbe


def fibinacci_py(x):
    if x < 2:
        return x
    return fibinacci_py(x - 1) + fibinacci_py(x - 2)


n = 40

print('Python:')
start_time = time.perf_counter_ns()
print('Answer:', fibinacci_py(n))
print('Time:', (time.perf_counter_ns() - start_time) / 1e9, 's')
print()

print('C++:')
start_time = time.perf_counter_ns()
print('Answer:', pbe.fibinacci_cpp(n))
print('Time:', (time.perf_counter_ns() - start_time) / 1e9, 's')
Running main.py prints:

Python:
Answer: 102334155
Time: 21.634114871 s

C++:
Answer: 102334155
Time: 0.274699966 s
Our C++ function is about 80 times faster. Not bad!
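A single timed run can be noisy, so one option is to repeat each measurement and keep the fastest run. Below is a minimal sketch of that approach; best_time is a hypothetical helper, and the example times only the pure-Python function with a small n so that it runs without the compiled module. It also recomputes the ratio from the two timings printed above.

```python
import time


def fibinacci_py(x):
    """Pure-Python version, same as in main.py."""
    if x < 2:
        return x
    return fibinacci_py(x - 1) + fibinacci_py(x - 2)


def best_time(func, arg, repeats=3):
    """Call func(arg) several times and return the fastest run in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(arg)
        best = min(best, time.perf_counter() - start)
    return best


# Small n keeps this quick; use n = 40 to mirror the benchmark in main.py
py_seconds = best_time(fibinacci_py, 20)
print(f"Python, n=20: {py_seconds:.6f} s")

# Ratio of the two timings copied from the output of main.py
speedup = 21.634114871 / 0.274699966
print(f"Reported speed-up: {speedup:.1f}x")
```

Once the extension is installed, the same helper should work for the compiled version by passing pbe.fibinacci_cpp in place of fibinacci_py.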