Category Archives: Code

Visualising macromolecules and grids in Jupyter Notebooks with nglview

If you do most of your work in Jupyter notebooks, it can be convenient to have a quick visualisation tool to view the results of your latest computation from within the notebook, without having to flick between the notebook and your favourite molecule viewer.

I have recently started using NGLview, an IPython/Jupyter widget, to do this. It is based on the NGL viewer, an embeddable webapp for macromolecular visualisation. The nglview module documentation can be found here, and in addition to handling the usual formats for molecular structure (.pdb, .mol2, .sdf, .pqr, etc.) and density maps (.ccp4 and more), it supports visualising trajectories and even making movies.

Continue reading

Storing variables in Jupyter Notebooks using %store magic

We’ve all been there. You’ve just run an expensive computation in your Jupyter Notebook and are about to draw those conclusions which will prove that your theories were right all along (until you find the sixteen bugs in your code which render them invalid, but that’s an issue for a different time). Then at the critical moment, your flatmate begins streaming their Lord Of The Rings marathon in 4k and your already temperamental Wi-Fi severs your connection to the department servers in protest, crashing your Jupyter Notebook, leaving your hopes and dreams in tatters.

Continue reading

Editors for remote development

The ongoing COVID-19 situation has forced us all to dramatically rethink how we work, with many industries struggling to adjust their on-site procedures to ensure the safety of workers, and many more adapting to support much of their workforce in working from home. As a largely computational research group, we are incredibly fortunate in our ability to carry out most of our work remotely, and our department’s wonderful IT and administrative support staff have enabled a smooth transition to remote working.

Continue reading

GEMMI: A Python Cookbook

General MacroMolecular I/O, or GEMMI, is a C++11 header-only library for low-level crystallography.

Because it's header-only, it is certainly the easiest low-level crystallographic C++ library to access and use. However, GEMMI also comes with Python bindings via pybind11, making it arguably the easiest low-level crystallographic library to access and use in Python as well!

What follows is a cookbook of useful Python code that uses GEMMI to accomplish macromolecular crystallographic tasks.

Continue reading

Lightning-fast Python code

Scientific code is never fast enough. We need the results of that simulation before that pressing deadline, or that meeting with our advisor. Computational resources are scarce, and competition for a spot in the computing nodes (cough, cough) can be tiresome. We need to squeeze every ounce of performance. And we need to do it with as little effort as possible.

Continue reading

Considering Containers? – Go for Singularity

Docker is an excellent containerisation system ideally suited to production servers. It allows you to do one small thing but do it well. For example, you can break a large blog up into individually maintained containers for a web server, a database and (say) a WordPress instance. However, due to inherent security woes, Docker doesn’t play nicely with multi-tenanted machines, the kind which are the bread and butter for researchers and HPC users. That’s where Singularity steps in.

Continue reading

Molecular dynamics analysis in MDAnalysis

Any opportunity to use rigorously tested and supported analysis tools rather than in-house code is, in my opinion, an opportunity you owe it to yourself to explore.

My preferred tool for analyzing the output of molecular dynamics (MD) simulations is MDAnalysis, a Python library that provides robust and easy-to-use tools for analyzing most common files output by MD packages (including PDB, DCD, COR, and XTC file formats). But, of course, MDAnalysis can analyze any PDB file, not just one output from an MD simulation. There may be an opportunity in your workflow to incorporate MDAnalysis to save time or to provide more robust error handling than whatever in-house code you currently use.

Continue reading

Using SLURM a little bit more efficiently

Your research group slurmified their servers? You basically have two options now.

Either you install all your necessary things on one of the slurm nodes within an interactive session, e.g.:

srun -p funkyserver-debug --pty --nodes=1 --ntasks-per-node=1 -t 00:10:00 --wait=0 /bin/bash

and always specify this node by adding the ‘#SBATCH --nodelist=funkyserver.cpu.do.work’ line to your sbatch scripts, or you set up some template scripts that will help you install all your requirements on multiple nodes, so you can enjoy the benefits of the slurm system.

Here is how I did it; comments and suggestions welcome!

Step 1: Create an sbatch template file (e.g. sbatch_job_on_server.template_sh) on the submission node that does what you want. In the ‘#SBATCH --partition’ or ‘--nodelist’ lines, use a placeholder, e.g. ‘<server>’, instead of funkyserver.
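
The placeholder idea in Step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used in the post: the template text, the helper name render_sbatch_scripts, the output file names and the second server name are all hypothetical.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical template text. Only the '<server>' placeholder convention and
# the funkyserver names come from the post; everything else is illustrative.
TEMPLATE = """#!/bin/bash
#SBATCH --partition=<server>-debug
#SBATCH --nodelist=<server>.cpu.do.work
#SBATCH --time=00:10:00

echo "Running on $(hostname)"
"""


def render_sbatch_scripts(template: str, servers: list[str], out_dir: Path) -> list[Path]:
    """Write one sbatch script per server, replacing the '<server>' placeholder."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for server in servers:
        script = out_dir / f"sbatch_job_on_{server}.sh"
        # Plain string substitution is enough for a single fixed placeholder.
        script.write_text(template.replace("<server>", server))
        paths.append(script)
    return paths
```

Calling render_sbatch_scripts(TEMPLATE, ["funkyserver", "otherserver"], Path("rendered")) would write one script per server, each of which could then be submitted with sbatch.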

For example, for installing the same conda environment on all nodes that you want to work on:

Continue reading

Functional Programming in Python

Introduction

The difficulty of reasoning about the behaviour of stateful programs, especially in concurrent environments, has led to increased interest in a programming paradigm called functional programming. This style emphasises the connection between programs and mathematics, encouraging code that is easy to understand and, in some critical cases, even possible to prove properties of.
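
To make the contrast concrete, here is a minimal sketch (the names RunningTotal, add and sum_all are illustrative, not taken from the post) of the same running sum written in a stateful style and in a functional style.

```python
from functools import reduce


class RunningTotal:
    """Stateful version: each call changes hidden state on the object."""

    def __init__(self):
        self.total = 0

    def add(self, x):
        self.total += x  # mutation: the result depends on every earlier call
        return self.total


def add(total, x):
    """Pure version: the result depends only on the arguments."""
    return total + x


def sum_all(values):
    """Fold the pure function over the inputs, starting from 0."""
    return reduce(add, values, 0)
```

Calling the stateful add(1) twice gives two different answers, whereas the pure add(1, 1) always returns the same value, which is what makes it easier to reason about and safer to share between concurrent callers.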

Continue reading

Consistent plotting with ggplot

Unlike other OPIGlets (looking at you, Claire), I have neither the skill nor the patience to make good figures from scratch. And making good figures — as well as remaking, rescaling and adapting them — is incredibly important, because they play a huge role in the way we communicate our research. So how does an aesthetically impaired DPhil student do her plotting?

Continue reading