I recently had a problem where I wanted to provide an interactive visualization of multiple different protein–ligand complexes, requiring minimal setup by the user, allowing them to zoom in and out and change the visualization style, without just providing multiple PDB files or a PyMOL session.
Continue readingCategory Archives: Small Molecules
Comparing pose and affinity prediction methods for follow-up designs from fragments
In any task in the realm of virtual screening, there need to be many filters applied to a dataset of ligands to downselect the ‘best’ ones on a number of parameters to produce a manageable size. One popular filter is if a compound has a physical pose and good affinity as predicted by tools such as docking or energy minimisation. In my pipeline for downselecting elaborations of compounds proposed as fragment follow-ups, I calculate the pose and ΔΔG by energy minimizing the ligand with atom restraints to matching atoms in the fragment inspiration. I either use RDKit using its MMFF94 forcefield or PyRosetta using its ref2015 scorefunction, all made possible by the lovely tool Fragmenstein.
With RDKit as the minimizer the protein neighborhood around the ligand is fixed and placements take on average 21s whereas with PyRosetta placements, they take on average 238s (and I can run placements in parallel luckily). I would ideally like to use RDKit as the placement method since it is so fast and I would like to perform 500K within a few days but, I wanted to confirm that RDKit is ‘good enough’ compared to the slightly more rigorous tool PyRosetta (it allows residues to relax and samples more conformations with the longer runtime I think).
Fine-tune generated molecular poses with a force field
Some molecular pose generation methods benefit from an energy relaxation post-processing step.

Here is a quick way to do this using OpenMM via a short script I prepared:
Continue readingRSC Fragments 2024
I attended RSC Fragments 2024 (Hinxton, 4–5 March 2024), a conference dedicated to fragment-based drug discovery. The various talks were really good, because they gave overviews of projects involving teams across long stretches of time. As a result there were no slides discussing wet lab protocol optimisations and not a single Western blot was seen. The focus was primarily either illustrating a discovery platform or recounting a declassified campaign. The latter were interesting, although I’d admit I wish there had been more talk of organic chemistry —there was not a single moan/gloat about a yield. This top-down focus was nice as topics kept overlapping, namely:
- Target choice,
- covalents,
- molecular glues,
- whether to escape Flatland,
- thermodynamics, and
- cryptic pockets
Taking Equivariance in deep learning for a spin?
I recently went to Sheh Zaidi‘s brilliant introduction to Equivariance and Spherical Harmonics and I thought it would be useful to cement my understanding of it with a practical example. In this blog post I’m going to start with serotonin in two coordinate frames, and build a small equivariant neural network that featurises it.
Continue readingFinding and testing a reaction SMARTS pattern for any reaction
Have you ever needed to find a reaction SMARTS pattern for a certain reaction but don’t have it already written out? Do you have a reaction SMARTS pattern but need to test it on a set of reactants and products to make sure it transforms them correctly and doesn’t allow for odd reactants to work? I recently did and I spent some time developing functions that can:
- Generate a reaction SMARTS for a reaction given two reactants, a product, and a reaction name.
- Check the reaction SMARTS on a list of reactants and products that have the same reaction name.
Online tools for drawing and visualizing molecules
I recently came across a nice tool for depicting multiple molecules called CDK Depict (thanks to Ruben for sending it to me), so I decided to explore what other web-based molecule visualization and drawing tools are available.
Continue readingThe workings of Fragmenstein’s RDKit neighbour-aware minimisation
Fragmenstein is a Python module that combine hits or position a derivative following given templates by being very strict in obeying them. This is done by creating a “monster”, a compound that has the atomic positions of the templates, which then reanimated by very strict energy minimisation. This is done in two steps, first in RDKit with an extracted frozen neighbourhood and then in PyRosetta within a flexible protein. The mapping for both combinations and placements are complicated, but I will focus here on a particular step the minimisation, primarily in answer to an enquiry, namely how does the RDKit minimisation work.

Demystifying the thermodynamics of ligand binding
Chemoinformatics uses a curious jumble of terms from thermodynamics, wet-lab techniques and statistical terminology, which is at its most jarring, it could be argued, in machine learning. In some datasets one often sees pIC50, pEC50, pKi and pKD, in discussion sections a medchemist may talk casually of entropy, whereas in the world of molecular mechanics everything is internal energy. Herein I hope to address some common misconceptions and unify these concepts.
Continue readingConference feedback: AI in Chemistry 2023
Last month, a drift of OPIGlets attended the royal society of chemistry’s annual AI in chemistry conference. Co-organised by the group’s very own Garrett Morris and hosted in Churchill College, Cambridge, during a heatwave (!), the two days of conference featured aspects of artificial intelligence and deep machine learning methods to applications in chemistry. The programme included a mixture of keynote talks, panel discussion, oral presentations, flash presentations, posters and opportunities for open debate, networking and discussion amongst participants from academia and industry alike.
Continue reading