Monthly Archives: November 2024

Controlling PyMol from afar

Do you keep downloading .pdb and .sdf files and loading them into PyMol repeatedly?

If yes, then PyMol remote might be just for you. With PyMol remote, you can control a PyMol session running on your laptop from any other machine. For example, from a Jupyter Notebook running on your HPC cluster.

Continue reading →

Building CLI Applications with Typer

Remember the last time you had to build a command-line tool? If you’re like me, you probably started with argparse or click, wrote boilerplate code, and still ended up with something that felt clunky. That’s where typer comes in – it’s a game-changer that lets you build CLI apps with minimal code. Although there are several other options, typer stands out because it leverages Python’s type hints to do the heavy lifting. No more manual argument parsing! The following snippet shows how to use typer in its simplest form:

import typer

app = typer.Typer()

@app.command()
def hello(name: str):
    typer.echo(f"Hello {name}!")

if __name__ == "__main__":
    app()

And you will be able to execute it with just:

$ python hello.py Pedro
Hello Pedro!

In this simple example, we were only defining positional arguments, but having optional arguments is as easy as setting default values in the function signature.

Continue reading →

Cross referencing across LaTeX documents in one project

A common scenario we come across is that we have a main manuscript document and a supplementary information document, each of which have their own sections, tables and figures. The question then becomes – how do we effectively cross-reference between the documents without having to tediously count all the numbers ourselves every time we make a change and recompile the documents?

The answer: cross referencing!

Continue reading →

Using Jupyter on a remote slurm cluster

Suppose we need to do some interactive analysis in a Jupyter notebook, but our local machine lacks the power. We have access to a slurm cluster, but we can’t SSH from the head node to the worker node; we can only SSH from the worker node to the head node. Can we still interact with a Jupyter notebook running on the worker node? As it happens, the answer is “yes” – we just need to do some reverse SSH tunnelling.

Continue reading →

Big Compute doesnt want you to know this! Maximising GPU Usage with CUDA MPS

Accelerating Simulations with CUDA MPS: An OpenMM Implementation Guide

Introduction

High-performance molecular dynamics simulations often require running concurrent simulations on GPUs. However, traditional GPU resource allocation can lead to inefficient utilization when running multiple processes, with users often resorting to using multiple GPUs to achieve this. While parrallelsing across nodes can improve time to solution, many processes require coordination and hence communication which quickly becomes a bottleneck. This is exacerbated with more powerful hardware as internal node communication for a single simulation on a single GPU can also become a bottleneck. This problem has been addressed for CPU parrallelism with multiprocessing and multithreading but previously this was challenging to do this efficiently on GPUs.

NVIDIA’s Multi-Process Service (MPS) offers a solution by enabling efficient and easy sharing of GPU resources among multiple processes with just a few commands. In this blog post, we’ll explore how to implement CUDA MPS with Python multiprocessing and OpenMM to accelerate molecular dynamics simulations.

Continue reading →

Cream, Compression, and Complexity: Notes from a Coffee-Induced Rabbit Hole

I have recently stumbled upon this paper which, quite unexpectedly, sent me down a rabbit hole reading about compression, generalisation, algorithmic information theory and looking at gifs of milk mixing with coffee on the internet. Here are some half-processed takeaways from this weird journey.

Complexity of a cup of Coffee

First, check out this cool video.

Imagine a nice, simple, low-entropy cup of coffee.
Pour some milk and you'll see beautiful & complex structures arising, before the cup reaches a high-entropy homogenous status, back to simple.
Interesting things happen while entropy increases!

Here a funny experiment:
1/3 pic.twitter.com/PmAFs0Y7ty
— Michele Ginolfi (@micginolfi) May 1, 2022

Continue reading →

OP-uns

As a long-standing social sec of the OPIG research group, the key part of the job is to ensure that the event name includes some kind of pun around OPIG (and then maybe organise something). Notable ones include OPIGmas, the annual Christmas party that I am neglecting to organise at the moment, and O’Punting, our annual punting trip. For the next generation of social secs, I have proposed some new activities with even worse names.

Continue reading →

Maybe we should train on our test set?

One of the fundamental (pitfalls) of machine learning is to ensure that you don’t train on your test set, but what if I told you that you could?

Continue reading →

The “AI-ntibody” Competition: benchmarking in silico antibody screening/design

We recently contributed to a communication in Nature Biotechnology detailing an upcoming competition coordinated by Specifica to evaluate the relative performance of in vitro display and in silico methods at identifying target-specific antibody binders and performing downstream antibody candidate optimisation.

Following in the footsteps of tournaments such as the Critical Assessment of Structure Prediction (CASP), which have led to substantial breakthroughs in computational methods for biomolecular structure prediction, the AI-ntibody initiative seeks to establish a periodic benchmarking exercise for in silico antibody discovery/design methods. It should help to identify the most significant breakthroughs in the space and orient future methods’ development.

Continue reading →

Making Peace with Molecular Entropy

I first stumbled upon OPIG blogs through a post on ligand-binding thermodynamics, which refreshed my understanding of some thermodynamics concepts from undergrad, bringing me face-to-face with the concept that made most molecular physics students break out in cold sweats: Entropy. Entropy is that perplexing measure of disorder and randomness in a system. In the context of molecular dynamics simulations (MD), it calculates the conformational freedom and disorder within protein molecules which becomes particularly relevant when calculating binding free energies.

In MD, MM/GBSA and MM/PBSA are fancy terms for trying to predict how strongly molecules stick together and are the go-to methods for binding free energy calculations. MM/PBSA uses the Poisson–Boltzmann (PB) equation to account for solvent polarisation and ionic effects accurately but at a high computational cost. While MM/GBSA approximates PB, using the Generalised Born (GB) model, offering faster calculations suitable for large systems, though with reduced accuracy. Consider MM/PBSA as the careful accountant who considers every detail but takes forever, while MM/GBSA is its faster, slightly less accurate coworker who gets the job done when you’re in a hurry.

Like many before me, I made the classic error of ignoring entropy, assuming that entropy changes that were similar across systems being compared would have their terms cancel out and could be neglected. This would simplify calculations and ease computational constraints (in other words it was too complicated, and I had deadlines breathing down my neck). This worked fine… until it didn’t. The wake-up call came during a project studying metal-isocitrate complexes in IDH1. For context, IDH1 is a homodimer with a flexible ‘hinge’ region that becomes unstable without its corresponding subunit, giving rise to very high fluctuations. By ignoring entropy in this unstable system, I managed to generate binding free energy results that violated several laws of thermodynamics and would make Clausius roll in his grave.

Continue reading →

Oxford Protein Informatics Group

or "OPIG" to friends

Monthly Archives: November 2024

Controlling PyMol from afar

Building CLI Applications with Typer

Cross referencing across LaTeX documents in one project

Using Jupyter on a remote slurm cluster

Big Compute doesnt want you to know this! Maximising GPU Usage with CUDA MPS

Accelerating Simulations with CUDA MPS: An OpenMM Implementation Guide

Introduction

Cream, Compression, and Complexity: Notes from a Coffee-Induced Rabbit Hole

Complexity of a cup of Coffee

OP-uns

Maybe we should train on our test set?

The “AI-ntibody” Competition: benchmarking in silico antibody screening/design

Making Peace with Molecular Entropy