In this blog post, I’ll show you guys how to make your own shiny container for your tool! Zero fuss(*) and in FOUR simple steps.
As an example, I will show how to make a singularity container for one of our public tools, ANARCI, the antibody numbering tool that everyone in OPIG and many external users are familiar with. If you are not, check the web app and the GitHub repo here and here.
(*) Provided you have your own Linux machine with sudo permissions; otherwise, you can’t do it, sorry. The same goes if you have a Mac or Windows, sorry again.
BUT, there are workarounds for these cases such as using the remote singularity builder here, for which you only need to sign up and create an account, and the use of Virtual Machines (VMs), as described here.
A brief preamble
If you already know what singularity and containerisation are, you can skip this section and move on to the actual fun!
In previous blog posts, other OPIGlets have already introduced singularity and illustrated the security hazards posed by Docker (another popular containerisation technology) in shared systems like HPC, which singularity prevents [here]. They also explained the idea of containerisation further and highlighted its advantages for scientific reproducibility, to avoid the irritating “It works on my machine” scenario. There is also a basic introduction to building your own singularity container [here] and a guide to running singularity in Windows with WSL2 [here].
In this blog post, I’ll go further into how to build containers using definition files and provide a recipe to equip our container with conda, to manage environment variables and dependencies, and illustrate the process by cloning and installing one of our tools available on GitHub.
STEP 1. Install singularity.
Instructions are here.
A word of caution: aim to install a SingularityCE version newer than 3.3 (CE = Community Edition). Earlier versions had bugs resulting in a mount error post-installation. If you find any other bugs, have a look at the issue list of the singularity repo here. On my laptop, I have SingularityCE v3.9.5 and Go v1.17.7.
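To check what you have installed against the version threshold mentioned above, something like the following sketch may help. The helper name version_gt is my own, and the parsing assumes that `singularity --version` prints the version number as the last field of its output, so treat it as a starting point and not a definitive check.

```shell
# Return success when version $1 is strictly newer than version $2.
# Relies on "sort -V" (version sort), available in GNU coreutils.
version_gt() {
    [ "$1" != "$2" ] && \
        [ "$(printf '%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

min_version="3.3"   # threshold taken from the word of caution above

if command -v singularity >/dev/null 2>&1; then
    # Assumption: the version number is the last whitespace-separated field.
    installed="$(singularity --version | awk '{print $NF}')"
    if version_gt "$installed" "$min_version"; then
        echo "singularity $installed looks recent enough"
    else
        echo "singularity $installed is not newer than $min_version; consider upgrading"
    fi
else
    echo "singularity was not found on your PATH"
fi
```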
STEP 2. Steal these lines.
Copy and paste the block below into a text file, for example, myfile.def
BootStrap: library
From: ubuntu:18.04
%environment
# set up all essential environment variables
export LC_ALL=C
export PATH=/miniconda3/bin:$PATH
export PYTHONPATH=/miniconda3/lib/python3.9/:$PYTHONPATH
# activate conda environment
source activate base;
conda activate;
%post
# update and install essential dependencies
apt-get -y update
apt-get install -y automake build-essential bzip2 wget git default-jre unzip
apt-get install -y muscle # ANARCI dependency; -y avoids a prompt that would stall the build
# download, install, and update miniconda3
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -f -p /miniconda3/
rm Miniconda3-latest-Linux-x86_64.sh
# install dependencies via conda
export PATH="/miniconda3/bin:$PATH"
conda install -y -c conda-forge pip numpy # general dependencies
conda install -y -c bioconda biopython hmmer # ANARCI dependencies
conda update -y --all
# clone and install your github code
mkdir mycode
git clone https://github.com/oxpig/ANARCI.git mycode/ANARCI
cd mycode/ANARCI
pip install .
%labels
Author Me
Version v0.0
MyLabel Hello World
What do these lines mean?
In singularity jargon, this is called a definition file: a recipe that singularity follows to build your singularity image file, aka .sif, a highly compressed file that lets you run your tool from the command line.
As you can see, the definition file tells singularity to build a system from an Ubuntu 18.04 base image pulled from its public library (a file repository) and to define essential system variables in the %environment section. The %post section lists the operations to run after pulling the base image: these configure and update the container’s inner system and install the tools you want along with their dependencies, conda and ANARCI in this example. Note that a container does not ship its own kernel; it shares the kernel of the host machine.
The definition file can be further customised; here I just show some of its basic sections for the purposes of this example. Have a look here for what else you can add to your definition file.
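As one example of such customisation, here is a sketch of three extra sections you could add: %runscript, %test and %help. The file name extra_sections.def is hypothetical, and I am assuming that ANARCI ends up on the PATH inside the container, as the recipe above intends.

```shell
# Write the extra sections to a separate (hypothetical) file so you can
# review them before pasting or appending them to myfile.def.
cat > extra_sections.def <<'EOF'
%runscript
    # executed by "singularity run mytool.sif ..." or "./mytool.sif ..."
    exec ANARCI "$@"

%test
    # executed at the end of the build; a non-zero exit status means failure
    command -v ANARCI

%help
    Container for ANARCI.
    Usage: singularity run mytool.sif -i <sequence or FASTA file>
EOF

# To use them: cat extra_sections.def >> myfile.def
```

With a %runscript in place, `singularity run mytool.sif -i <sequence>` should behave like calling ANARCI directly, so users do not need to know the name of the executable inside the container.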
STEP 3. Build your own sh*t
Run the following command in a terminal:
sudo singularity build mytool.sif myfile.def
Again, you need sudo rights to be able to build your output container file. But, as I mentioned above, you can try out the remote builder to get around it.
An important point: building the .sif file directly is probably not the best practice. The problem is that once it is built, you can’t modify anything inside it without modifying the definition file and rebuilding. This is irritating if you want to modify or debug a tool built into your container. As a workaround, you can create a sandbox instead:
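If you switch between a machine where you have sudo and one where you rely on the remote builder, a tiny helper like the sketch below can print the right command for each case. The function name build_command is made up for this post. The `--remote` flag of `singularity build` sends the build to the remote builder and, as far as I know, needs you to run `singularity remote login` with your account token first.

```shell
# Print the build command for a given mode: "sudo" or "remote".
build_command() {
    mode="$1"; image="$2"; def="$3"
    case "$mode" in
        sudo)   echo "sudo singularity build $image $def" ;;
        remote) echo "singularity build --remote $image $def" ;;
        *)      echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# Example: show the command to use when you have no sudo rights.
build_command remote mytool.sif myfile.def
```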
sudo singularity build --sandbox mytool myfile.def
This will output a folder, mytool, containing all the system files that go into your container, which you can actually see and modify at will. For instance, in shell mode (sudo singularity shell --writable mytool/), you can manually install and modify files or update dependencies with conda too. And once you’re happy with your modifications, you can create a .sif file, which is actually much lighter than a sandbox folder for sharing.
sudo singularity build mytool.sif mytool/
For more on this, look here and this previous blog post.
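Put together, the sandbox workflow can be sketched as a small script. The run helper and the DRY_RUN variable are my own additions: by default the script only prints the commands, and it executes them when you set DRY_RUN=0. The middle step is just an illustration of a change made inside the sandbox.

```shell
# Print each command; execute it only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# 1. Build a writable sandbox folder from the definition file.
run sudo singularity build --sandbox mytool/ myfile.def

# 2. Change something inside the sandbox (here: update conda packages).
run sudo singularity exec --writable mytool/ conda update -y --all

# 3. Freeze the sandbox into a compressed, shareable image.
run sudo singularity build mytool.sif mytool/
```

Keep in mind that manual changes made in a sandbox are not recorded in myfile.def, so it is worth copying them back into the definition file if you care about reproducibility.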
STEP 4. Have fun!
If you made it to this last step without stumbling across any errors, then you’re nearly there, because now you get to play with your tool!
For ANARCI, we’ll execute its command line tool to number an input sequence by using the exec command of singularity on your container:
singularity exec mytool.sif ANARCI -i EVQLQQSGAEVVRSGASVKLSCTASGFNIKDYYIHWVKQRPEKGLEWIGWIDPEIGDTEYVPKFQGKATMTADTSSNTAYLQLSSLTSEDTAVYYCNAGHDYDRGRFPYWGQGTLVTVSA
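ANARCI's -i option also accepts a FASTA file, so a variation on the command above is to put your sequences in a file and save the numbering to disk. This is only a sketch: the file names myseq.fasta and myseq_numbered.txt and the record name are made up, and it relies on singularity making your current directory visible inside the container, which is the usual default.

```shell
# Write the example sequence from above into a (hypothetical) FASTA file.
cat > myseq.fasta <<'EOF'
>example_sequence
EVQLQQSGAEVVRSGASVKLSCTASGFNIKDYYIHWVKQRPEKGLEWIGWIDPEIGDTEYVPKFQGKATMTADTSSNTAYLQLSSLTSEDTAVYYCNAGHDYDRGRFPYWGQGTLVTVSA
EOF

# Run ANARCI from the container only if both singularity and the image exist.
if command -v singularity >/dev/null 2>&1 && [ -f mytool.sif ]; then
    singularity exec mytool.sif ANARCI -i myseq.fasta > myseq_numbered.txt
    echo "numbering written to myseq_numbered.txt"
else
    echo "singularity or mytool.sif not found; skipping the ANARCI run" >&2
fi
```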
If everything goes well you should see an output like in the screenshot
If you got an output like this, then pat yourself on the back and congratulate yourself. Well done!
Now, you can just share mytool.sif with your friends and they should be able to run your tool with singularity too!
The bottom line.
Containerisation is an amazing technology that lowers the reproducibility barrier in scientific computing by providing a self-contained environment to pack your tools and dependencies to make them ready to use.
Here we showed how to use singularity to containerise ANARCI, using miniconda to equip our container with an environment manager to handle dependencies. Now, you can just modify the definition file to pull and install your favourite tool.
Certainly, this is just the tip of the tip of the iceberg. I will probably post more about this topic in the future. In the meantime, have fun and if you have any questions I’d love to hear from you.