Category Archives: Small Molecules
Smaller and Smaller Fragments
Fragment-based drug discovery (FBDD) is based on the idea that using small (< 300 Da), highly soluble compounds to screen against a target will give higher hit rates and sample chemical space more efficiently compared to screens using larger, drug-like compounds.
Understanding Conformational Entropy in Small Molecules
While entropy is a major driving force in many chemical changes and is a key component of the free energy of a molecule, it can be challenging to calculate with standard quantum thermochemical methods. With proper consideration in flexible molecules, we can break down the total entropy into different components, including vibrational, translational, rotational and conformational entropy. The calculation of conformational entropy is the most time-consuming as we have to sample all thermally-accessible conformers. Here, we attempt to understand the components that contribute to the conformational entropy of a molecule, and develop a physically-motivated statistical model to rapidly predict the conformational entropies of small molecules.
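As a rough illustration of why every thermally-accessible conformer matters, the sketch below computes a textbook Gibbs (Boltzmann-weighted) estimate of conformational entropy from a list of relative conformer energies. This is not the statistical model developed in the post. The function names, the kcal/mol units and the 298.15 K default are assumptions made for the example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_populations(rel_energies_kcal, temperature=298.15):
    """Boltzmann populations for conformers with the given energies (kcal/mol)."""
    rt = R_KCAL * temperature
    e_min = min(rel_energies_kcal)  # shift so the lowest conformer has weight 1
    weights = [math.exp(-(e - e_min) / rt) for e in rel_energies_kcal]
    z = sum(weights)  # partition function over the sampled conformers
    return [w / z for w in weights]


def conformational_entropy(rel_energies_kcal, temperature=298.15):
    """Gibbs entropy -R * sum(p ln p) over conformers, in cal/(mol*K)."""
    pops = boltzmann_populations(rel_energies_kcal, temperature)
    s_kcal = -R_KCAL * sum(p * math.log(p) for p in pops if p > 0.0)
    return s_kcal * 1000.0  # convert kcal to cal


if __name__ == "__main__":
    # Hypothetical relative energies for four conformers, in kcal/mol.
    print(conformational_entropy([0.0, 0.3, 0.9, 2.5]))
```

For N equally populated conformers this reduces to R ln N, and conformers far above the minimum contribute almost nothing, so the estimate is only as good as the conformer sampling behind it.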
Drug Promiscuity vs Selectivity
In drug discovery, compound promiscuity and selectivity refer to the ability of a drug compound to bind to several different targets (promiscuous) or to only one main target (selective). An important distinction here is that promiscuity is defined as specific interactions with multiple biological targets (polypharmacology) rather than non-specific binding to a number of targets. At first glance, you might expect drugs to be designed to be as selective as possible, hitting only the one biological target necessary to treat the disease and thereby reducing the chance of side effects. This paradigm of single-target specificity has been challenged over the past two decades. Even among scientists in the drug discovery field, compound promiscuity is still a controversial topic. The field has paid increasing attention to polypharmacology, and studies have shown that many pharmaceutically relevant compounds, including approved drugs, derive their biological activity from polypharmacology [1-3].
ICML 2020: Chemistry / Biology papers
ICML is one of the largest machine learning conferences and, like many other conferences this year, is running virtually from 12th – 18th July.
The list of accepted papers can be found here, with 1,088 papers accepted out of 4,990 submissions (22% acceptance rate). Similar to my post on NeurIPS 2019 papers, I will highlight several of potential interest to the chem-/bio-informatics communities. As before, given the large number of papers, these were selected either by “accident” (i.e. I stumbled across them in one way or another) or through a basic search (e.g. Ctrl+f “molecule”).
ProCare: cavity similarity searching and its applications to fragment-based drug design
ProCare [1] is a package developed at the University of Strasbourg which is able to align and score the similarity of protein cavities. The aim is to find ligand binding sites between different proteins that are similar enough to bind the same ligand. The method used in ProCare is designed to look particularly at fragment (~⅓ size of a druglike ligand) binding sites. The aim is to predict potential fragment hits by comparing the cavities of the targets.
DeLinker – Deep Generative Models for 3D Linker Design
*** Disclaimer: This blog post represents some shameless self-promotion. ***
I am delighted to announce that our latest work, DeLinker, was recently published in the Journal of Chemical Information and Modeling (link).

Bayesian Optimization and Correlated Torsion Angles—in Small Molecules
Our collaborator, Prof. Geoff Hutchison from the University of Pittsburgh, recently took part in the Royal Society of Chemistry’s 2020 Twitter Poster Conference to highlight the great work carried out by one of my DPhil students, Lucian Leung Chan, on the application of Bayesian optimization to conformer generation.
State of the art in AI for drug discovery: more wet-lab please
The medchem community has received ML approaches for the drug discovery pipeline rather skeptically, especially those focused on the hit-to-lead optimization process. One of the main drivers for that is the way many ML publications benchmark their models: historic datasets are split into two parts, with the larger part used to train ML models and the smaller part used to test them. To standardize that validation process, computational chemists have constructed widely used benchmark datasets such as the DUD-E set, a common standard for protein-ligand binding classification tasks. Common criticism from medicinal chemists centers on the main problem associated with benchmark datasets: the absence of direct lab validation.
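The benchmarking setup described above can be sketched in a few lines. This is a minimal illustration of a random train/test split using only the standard library, not the protocol of any particular paper or of DUD-E. The function name, the 80/20 ratio and the fixed seed are assumptions made for the example.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=0):
    """Randomly split records into a larger training part and a smaller test part."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    # Hypothetical historic dataset: 100 compound identifiers.
    compounds = [f"cmpd_{i}" for i in range(100)]
    train, test = train_test_split(compounds)
    print(len(train), len(test))
```

Both parts come from the same historic data, so a good test score shows only that the model generalizes within that dataset. It says nothing about how newly proposed compounds would behave in the lab.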
AutoDock 4 and AutoDock Vina
A recently published paper from Nguyen et al. in JCIM pointed out that while AutoDock Vina is faster, AutoDock 4 tends to correlate better with experimental binding affinity [1].
[This post has been edited to provide more information about the cited paper, as well as providing additional citations.]
Nguyen et al. selected 800 protein-ligand complexes for 47 protein targets that had both an experimental PDB structure complexed with a ligand and an associated binding affinity value.