Making your code pip installable

aka when to use a CustomBuildCommand or a CustomInstallCommand when building Python packages with setup.py

Bioinformatics software is complicated, and often a little bit messy. Recently I found myself wading through a Python package-building quagmire and thought I would share what I learnt about when to use a custom build command and when to use a custom install command. I have also included some information on how to copy executables into the bin directory of your package installation. Note: ChatGPT wrote the initial skeleton draft of this post, which I have corrected and edited.

Next time you need to create a pip installable package yourself, hopefully this can save you some time!

The difference between a custom build command and a custom install command in setup.py

It is often desirable to write custom build or installation processes when packaging Python software, in particular when your build, installation or package requires preprocessing, file generation or binary executables. If this is the case, you will need to override the default install and build commands in your setup.py. Overriding these commands allows you to hook into the build and install processes to run additional code.


The Role of setup.py

Your setup.py script provides the build and installation instructions, as well as the distribution information (like the version number), for your python package. The script is highly configurable, and core parameters allow you to specify which source code to package, any package dependencies required to install (this allows programs like pip to resolve and install dependencies in one go), and to add author and license information.

Less well known, but very useful, setup.py parameters are include_package_data, package_data, data_files, and cmdclass. The first three can be used to specify non-Python files, such as data and executables, that are required for the install. The last one, cmdclass, we will take a closer look at now, because this is how you can create custom build and install commands.

When you need a custom build class

A custom build class inherits from build_py and allows you to hook into the build process of your package. The build process is responsible for preparing your package for distribution, which might involve compiling source code, generating data files, or performing other preparatory tasks.

Use a custom build command when you need to generate files, compile assets, or perform any preparatory work before your package is packaged into a distribution format (like a .tar.gz or .whl file). For example, if your package build requires some data files to be generated or downloaded, or some code to be precompiled, you would do that in a custom build command. See an example below:

from setuptools.command.build_py import build_py

class CustomBuildCommand(build_py):
    """Custom command to generate files before the build process."""
    
    def run(self):
        # Custom build steps go here
        print("Running custom build steps...")
        self.generate_files()
        super().run()

    def generate_files(self):
        # Example: Generate some files needed for the package
        with open('generated_file.txt', 'w') as f:
            f.write("This file was generated during the build process.")
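
One detail worth checking is where the generated file ends up: package_data patterns are resolved relative to each package directory, so a file written to the project root may not be picked up. The sketch below is one way to write the file inside the package directory instead. It is a standalone variant of the method above, and the function signature and the my_package directory name in the usage note are assumptions for illustration.

```python
from pathlib import Path


def generate_files(package_dir):
    """Write the generated file inside the package directory and return its path."""
    target_dir = Path(package_dir)
    # Create the package directory if it does not exist yet
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "generated_file.txt"
    target.write_text("This file was generated during the build process.")
    return target
```

Inside CustomBuildCommand.run, this could be called as generate_files("my_package") before super().run(), where my_package is a hypothetical package directory sitting next to setup.py.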

When you need a custom install class

A custom install class inherits from the setuptools class install and allows you to hook into the installation process. This is the step that takes place after the package has been built and is being installed into the target environment. During installation, the package is copied into the appropriate site-packages directory, and any additional tasks needed to make the package usable in its environment are performed. Use a custom install command when you need to perform tasks during the installation of your package, such as copying executables or setting up environment-specific files. Bear in mind that setup.py is not executed when pip installs a prebuilt wheel, so these steps only run when the package is built from source. See an example below:

from setuptools.command.install import install
import os
import shutil
import sys

class CustomInstallCommand(install):
    """Custom install command to run post-installation tasks."""
    
    def run(self):
        # Run the standard installation first
        super().run()
        
        # Custom installation steps go here
        print("Running custom install steps...")
        self.copy_executables()

    def copy_executables(self):
        # Example: Copy an executable to the bin directory,
        # i.e. the directory that contains the running interpreter
        bin_dir = os.path.dirname(sys.executable)
        shutil.copy("my_executable", bin_dir)
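
Locating the bin directory deserves a little care: a string split of the interpreter path on "python" goes wrong as soon as "python" appears earlier in the path, for example in the name of a parent directory. Below is a minimal sketch comparing that split with two standard-library alternatives. The helper names and the example path are hypothetical.

```python
import os
import sys
import sysconfig


def bin_dir_by_split(executable):
    """Naive approach: cut the path at the first occurrence of 'python'."""
    return executable.split("python")[0]


def bin_dir_by_dirname(executable):
    """Take the directory that contains the interpreter."""
    return os.path.dirname(executable)


# Hypothetical interpreter path with "python" in a parent directory name
example = "/home/me/python_envs/venv/bin/python3"
print(bin_dir_by_split(example))    # /home/me/
print(bin_dir_by_dirname(example))  # /home/me/python_envs/venv/bin

# For the running interpreter, sysconfig reports the scripts directory directly
print(bin_dir_by_dirname(sys.executable))
print(sysconfig.get_path("scripts"))
```

Either os.path.dirname(sys.executable) or sysconfig.get_path("scripts") avoids the string matching altogether. Which one suits you best may depend on how your environment is laid out.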

Putting it all together in a setup.py script

Here’s what it would all look like put together:

import os
import sys
import shutil
from setuptools import setup, find_packages
from setuptools.command.build_py import build_py
from setuptools.command.install import install

class CustomBuildCommand(build_py):
    """Custom command to generate files before the build process."""
    
    def run(self):
        # Custom build steps go here
        print("Running custom build steps...")
        self.generate_files()
        # Call the standard build process
        super().run()

    def generate_files(self):
        # Example: Generate a file needed for the package
        with open('generated_file.txt', 'w') as f:
            f.write("This file was generated during the build process.")
        print("Generated file: generated_file.txt")

class CustomInstallCommand(install):
    """Custom install command to run post-installation tasks."""
    
    def run(self):
        # Run the standard installation first
        super().run()
        
        # Custom installation steps go here
        print("Running custom install steps...")
        self.copy_executables()

    def copy_executables(self):
        # Example: Copy an executable to the bin directory,
        # i.e. the directory that contains the running interpreter
        bin_dir = os.path.dirname(sys.executable)
        shutil.copy("my_executable", bin_dir)
        print(f"Copied my_executable to {bin_dir}")

# Define the setup
setup(
    name="my_package",
    version="0.1.0",
    description="A sample Python package with custom build and install commands.",
    author="Your Name",
    author_email="your.email@example.com",
    url="https://example.com/my_package",
    packages=find_packages(exclude=["test", "test.*"]),  # Automatically find all packages in the current directory, and ignore the test directory
    include_package_data=True,  # Include additional files specified in MANIFEST.in
    package_data={
        # Include generated files in the package
        # (patterns are resolved relative to each package directory)
        '': ['generated_file.txt'],
    },
    data_files=[
        # Include additional files in the installation
        ('bin', ['my_executable']),
    ],
    install_requires=[
        # List of dependencies
        'numpy',
        'requests',
    ],
    scripts=[
        # Scripts to be installed in the bin directory
        'scripts/my_script.py',
    ],
    cmdclass={
        # Custom commands
        'build_py': CustomBuildCommand,
        'install': CustomInstallCommand,
    },
)

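Once your setup.py is ready, you can build a distribution-ready wheel and upload it to PyPI to make it pip installable from the command line! The commands below assume the build and twine packages are installed, and upload to the TestPyPI repository first: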

python -m build
twine upload --repository testpypi dist/*
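
Before uploading, it can be worth checking that the wheel really contains what you expect. A wheel is an ordinary zip archive, so the standard-library zipfile module is enough for a quick look. The sketch below is one possible check; both helper names are hypothetical.

```python
import zipfile


def list_wheel_contents(wheel_path):
    """Return the names of all files stored in a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return wheel.namelist()


def wheel_contains(wheel_path, filename):
    """Check whether any archive member ends with the given file name."""
    return any(name.endswith(filename) for name in list_wheel_contents(wheel_path))
```

For example, wheel_contains("dist/my_package-0.1.0-py3-none-any.whl", "generated_file.txt") would report whether the generated file made it into the wheel. The wheel filename here is an assumption and will depend on your own build.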

I hope that helps, and happy packaging!
