Category Archives: Molecular Dynamics

MDAnalysis: Work with dynamics trajectories of proteins

For a long time, crystallographers and subsequently the authors of AlphaFold2 had you believe that proteins are a static group of atoms written to a .pdb file. Turns out this was a HOAX. If you don’t want to miss out on the latest trend of working with dynamic structural ensembles of proteins, this blog post is exactly right for you. MDAnalysis is a Python package which, as the name says, was designed to analyse molecular dynamics simulations and lets you work with trajectories of protein structures easily.

Continue reading

Incorporating conformer ensembles for better molecular representation learning

Conformer ensemble of tryptophan from Seibert et al.

The spatial, or 3D, structure of a molecule is particularly relevant to modeling its activity in QSAR. 3D structural information affects molecular properties and chemical reactivity, so it is important to incorporate it in deep learning models built for molecules. A key aspect of the spatial structure of molecules is the flexible arrangement of their constituent atoms, known as conformation. At a given temperature, the probability of each possible conformation of a molecule is determined by its formation energy and follows a Boltzmann distribution [McQuarrie and Simon, 1997]; in other words, the Boltzmann distribution gives the probability of a conformation given its potential energy. Different conformations of a molecule can result in different properties and activity. Therefore, it is imperative to consider multiple conformers in molecular deep learning so that conformational flexibility is embedded in the model. The model should also be able to capture the Boltzmann distribution of the conformers’ potential energies.
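To make the Boltzmann weighting concrete, here is a minimal Python sketch that turns relative conformer energies into populations. The three energies, the temperature and the function name `boltzmann_weights` are illustrative assumptions, not values or code from the post.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 0.0019872041

def boltzmann_weights(energies_kcal, temperature=298.15):
    """Return normalised Boltzmann populations for conformer energies (kcal/mol)."""
    # Shift by the minimum energy for numerical stability; the weights are unchanged
    e_min = min(energies_kcal)
    factors = [math.exp(-(e - e_min) / (R_KCAL * temperature)) for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]

# Hypothetical relative energies of three conformers (kcal/mol)
energies = [0.0, 0.5, 2.0]
weights = boltzmann_weights(energies)
for e, w in zip(energies, weights):
    print(f"E = {e:.1f} kcal/mol -> population {w:.3f}")
```

A model that consumes a conformer ensemble could, for example, use such weights when pooling per-conformer representations, although that is only one possible design.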

Continue reading

Conference Summary: MGMS Adaptive Immune Receptors Meeting 2024

On 5th April 2024, over 60 researchers braved the train strikes and gusty weather to gather at Lady Margaret Hall in Oxford and engage in a day full of scientific talks, posters and discussions on the topic of adaptive immune receptor (AIR) analysis!

Continue reading

The stuff MDAnalysis didn’t implement: CPU Parallel HOLE conductance analysis

Some time ago, I needed to find a way to computationally estimate conductance values for every protein frame from several molecular dynamics (MD) trajectories.

In a previous post, I wrote about how to remove outliers from the resulting instant conductance timeseries. But I never described how I generated these timeseries.

In this post, I will show how you can parallelise the computation of instant conductance given an MD trajectory. I will touch on the difficulties of this process and explain why I had to implement a custom tool for it, even though MDAnalysis seems to have already implemented a routine of this sort. Finally, I will provide two Python scripts that you can easily adapt to run your parallel calculations, along with some important notes you don’t wanna skip.
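The general pattern of frame-wise parallelism can be sketched as follows: split the frame indices into contiguous blocks and hand each block to its own CPU worker. This is not one of the two scripts from the post. `conductance_for_frame` is a hypothetical placeholder that returns a dummy number, standing in for whatever per-frame analysis you actually run.

```python
from concurrent.futures import ProcessPoolExecutor

def split_frames(n_frames, n_workers):
    """Split range(n_frames) into n_workers contiguous, near-equal blocks."""
    base, extra = divmod(n_frames, n_workers)
    blocks, start = [], 0
    for i in range(n_workers):
        # The first `extra` blocks take one additional frame
        stop = start + base + (1 if i < extra else 0)
        blocks.append(range(start, stop))
        start = stop
    return blocks

def conductance_for_frame(frame_index):
    """Hypothetical placeholder for the real per-frame analysis."""
    return float(frame_index)  # dummy value so the sketch runs without a trajectory

def process_block(frames):
    # Each worker processes its own block of frames independently
    return [conductance_for_frame(i) for i in frames]

def run_parallel(n_frames, n_workers=4):
    blocks = split_frames(n_frames, n_workers)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(process_block, blocks)  # results come back in block order
        # Flatten the per-block lists back into a single timeseries
        return [value for block in results for value in block]
```

When calling `run_parallel` from a script, put the call under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.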

Violin plots of conductance distributions from 64 molecular dynamics trajectories with 1000 frames each.
Continue reading

Demystifying the thermodynamics of ligand binding

Chemoinformatics uses a curious jumble of terms from thermodynamics, wet-lab techniques and statistics, which is arguably at its most jarring in machine learning. Datasets often report pIC50, pEC50, pKi and pKD; in discussion sections a medchemist may talk casually of entropy; and in the world of molecular mechanics everything is internal energy. Herein I hope to address some common misconceptions and unify these concepts.
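As a small worked example of how two of these quantities connect, the sketch below converts a dissociation constant into pKD and into a standard binding free energy using ΔG° = RT ln(KD / 1 M). The 10 nM value is an arbitrary illustration and the function names are made up for this example.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)

def p_value(concentration_molar):
    """Negative log10 of a concentration in mol/L (the 'p' in pIC50, pKi, pKD)."""
    return -math.log10(concentration_molar)

def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy in kcal/mol from KD, with a 1 M standard state."""
    return R_KCAL * temperature * math.log(kd_molar)

kd = 10e-9  # hypothetical 10 nM binder
print(f"pKD = {p_value(kd):.2f}")
print(f"dG  = {binding_free_energy(kd):.2f} kcal/mol")
```

At room temperature each pKD unit corresponds to roughly 1.4 kcal/mol of binding free energy. This direct relation holds for equilibrium constants such as KD and Ki; IC50 and EC50 depend on assay conditions and cannot be converted this way without further assumptions.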

Continue reading

Be a computational chemist and you must be a jack of all trades

Being a jack of all trades brings to mind someone who has extensive multidisciplinary expertise and is equipped with many tools in their toolbox to solve different problems. A jack of all trades is a great succinct description for computational chemists in drug discovery.

Recently I had a great conversation with Dr. Arjun Narayanan, a Senior Research Scientist at Vertex Pharmaceuticals and a jack of all trades as a computational chemist. In this blog post, I’ll describe what he does as a computational chemist, the problems he solves, and the new tools he’s looking forward to adding to his toolbox.

Continue reading

Writing “vector trajectories” with cpptraj

The program cpptraj, written by Daniel Roe (https://github.com/Amber-MD/cpptraj) and distributed as open-source software with the AmberTools package (https://ambermd.org/AmberTools.php), is a powerful tool for analysis of molecular dynamics simulations. In addition to all of the expected analyses, like root mean square deviation and native contacts, cpptraj also includes a suite of vector algebra functions.

While this vector algebra functionality is fairly well known and easy to find in the documentation, I think it is less well known that cpptraj can write trajectories of the computed vectors. These trajectories can then be loaded into Visual Molecular Dynamics (VMD) alongside the analysed trajectory and played as a movie. This functionality is a valuable tool for debugging your vector calculations to make sure they are doing precisely what you intend. It may also prove useful for generating visualizations of vectors alongside molecular structures for publications.

The cpptraj script below reads in an Amber parameter file and coordinate file and then calculates the angle between two planes.

parm 7bbg_fixed.prmtop
trajin 7bbg_fixed.rst7
vector v1 mask :65@NH1 :65@NH2
vector v2 mask :65@NH1 :65@NE
vector v3 mask :64@CA :66@CA
vector v4 mask :66@CA :68@CA
vectormath vec1 v1 vec2 v2 crossproduct name n1
vectormath vec1 v3 vec2 v4 crossproduct name n2
vectormath vec1 n1 vec2 n2 dotangle out 7bbg_ref_plane_angle.dat

The first plane is defined by two vectors in the plane of the guanidino group of residue R65 (v1 and v2); the second plane is defined by two vectors between CA atoms of amino acids in the alpha helix containing R65 (v3 and v4). The first two vectormath calls compute the normal vectors to the planes, and the final vectormath line computes the angle between those normals. Taken together, these commands compute the angle between the plane of the arginine side chain and a plane passing through the CA atoms of the alpha helix. Let’s check that the vectors {v1, v2, v3, v4} are being computed correctly.

parm 7bbg_fixed.prmtop
trajin 7bbg_fixed.rst7
vector v1 mask :65@NH1 :65@NH2
vector v2 mask :65@NH1 :65@NE
vector v3 mask :64@CA :66@CA
vector v4 mask :66@CA :68@CA
run
writedata vectors.mol2 v1 v2 v3 v4 vectraj trajfmt mol2

The resulting vector trajectory vectors.mol2 can be loaded directly into VMD without a topology. Note that in this case we only analyzed a single frame, but you can run this same procedure on DCD files, too. This is what I get when I load the vectors into VMD alongside the structure:

The vectors are shown as red/pink line segments. The right structure is identical to the left but with the alpha helix cartoon model removed. The blue spheres indicate the locations of the CA atoms used to define the plane of the helix.
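As a complementary sanity check on the plane-angle script, the same arithmetic is easy to reproduce by hand: a cross product of two in-plane vectors gives each normal, and the normalised dot product of the normals gives the angle. The sketch below uses only the Python standard library. The four vectors are made-up placeholders, not coordinates from 7bbg, and the helper names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))

def angle_deg(a, b):
    """Angle between two vectors in degrees, from the normalised dot product."""
    cos_theta = dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))

def plane_angle(v1, v2, v3, v4):
    """Angle between the normal of plane (v1, v2) and the normal of plane (v3, v4)."""
    n1 = cross(v1, v2)
    n2 = cross(v3, v4)
    return angle_deg(n1, n2)

# Placeholder vectors: the first pair spans the xy-plane, the second the xz-plane
v1, v2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
v3, v4 = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
angle = plane_angle(v1, v2, v3, v4)
print(f"angle between plane normals: {angle:.1f} degrees")
```

Keep in mind that the sign of each normal depends on the order of the two vectors in the cross product, so an angle and its supplement (180 minus the angle) describe the same pair of planes.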

I hope this vector trajectory functionality will be helpful to a few people who like to neurotically check their analyses like I do. You can download the example prmtop and rst7 files below. Note that you should rename them to remove the extra “.txt” file extension before attempting to use them for anything.

The information in this blog post is adapted from an Amber Archive post from Daniel Roe, dated 30-Oct-2018: http://archive.ambermd.org/201811/0058.html

Files for the example:

Cleaning outliers in conductance timeseries from molecular dynamics

Have you ever had an annoying dataset that looks something like this?

Or, even worse, several of them?

In this blog post, I will introduce basic techniques you can implement in Python to identify and clean outliers. The objective will be to get something more eye-pleasing (and mostly less troublesome for further data analysis), like this:

Continue reading

Coarse-grained models of antibody solutions

Various coarse-grained (CG) models have become increasingly common in studies of antibody-antibody interactions in solution. These models appear poised to enter development pipelines in the near future to help predict and understand how antibody-antibody interactions influence the suitability of a given monoclonal antibody (mAb) for mass production and delivery as an antibody therapy. This blog post is a non-exhaustive summary of some of the highlights I found during a recent literature search.

Continue reading

Do you have cis peptide bonds in your simulation inputs?

People who run molecular simulations quickly become familiar with all of the things about a PDB file – missing residues, missing heavy atoms in residues, missing hydrogens, non-standard amino acids, multiple conformations, crystallization ligands, etc. – that might need to be fixed before setting up a simulation. This blog post is a reminder to check, after you have “fixed” your PDB, if you have accidentally introduced aberrant cis peptide bonds into your structure during rebuilding.
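One way to run such a check, sketched here under my own assumptions rather than taken from the post, is to compute the omega dihedral of each peptide bond from the atoms CA(i), C(i), N(i+1) and CA(i+1). Trans bonds sit near 180 degrees and cis bonds near 0 degrees. The 30 degree cutoff and the function names are illustrative choices, and the coordinates are toy values.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def is_cis(ca_i, c_i, n_next, ca_next, cutoff=30.0):
    """Flag a peptide bond as cis if |omega| is below the (assumed) cutoff."""
    return abs(dihedral_deg(ca_i, c_i, n_next, ca_next)) < cutoff

# Toy planar geometries: a zigzag (trans) and a U-shape (cis)
trans_bond = ((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
cis_bond = ((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
print(is_cis(*trans_bond), is_cis(*cis_bond))
```

Genuine cis peptide bonds do occur, most often before proline, so a flagged bond is best compared against the original structure rather than assumed to be an error.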

Continue reading