In drug discovery, compound promiscuity and selectivity refer to the ability of drug compounds to bind to several different targets (promiscuous) or only one main target (selective). An important distinction here is that promiscuity is defined as specific interactions with multiple biological targets (polypharmacology) rather than non-specific binding to many targets. At first glance, you might expect drugs to be designed to be as selective as possible, hitting only the one biological target necessary to treat the disease and thereby reducing the chance of side effects. This paradigm of single-target specificity has been challenged over the past two decades. Even among scientists in the drug discovery field, compound promiscuity is still a controversial topic. The field has increasingly paid attention to polypharmacology, and studies have shown that many pharmaceutically relevant compounds, including approved drugs, derive their biological activity from polypharmacology [1-3].
Category Archives: Small Molecules
ICML 2020: Chemistry / Biology papers
ICML is one of the largest machine learning conferences and, like many other conferences this year, is running virtually from 12th – 18th July.
The list of accepted papers can be found here, with 1,088 papers accepted out of 4,990 submissions (22% acceptance rate). Similar to my post on NeurIPS 2019 papers, I will highlight several of potential interest to the chem-/bio-informatics communities. As before, given the large number of papers, these were selected either by “accident” (i.e. I stumbled across them in one way or another) or through a basic search (e.g. Ctrl+f “molecule”).
ProCare: cavity similarity searching and its applications to fragment-based drug design
ProCare [1] is a package developed at the University of Strasbourg that aligns protein cavities and scores their similarity. The aim is to find ligand binding sites in different proteins that are similar enough to bind the same ligand. The method is designed to look particularly at fragment binding sites (a fragment is ~⅓ the size of a druglike ligand), so that potential fragment hits can be predicted by comparing the cavities of the targets.
DeLinker – Deep Generative Models for 3D Linker Design
*** Disclaimer: This blog post represents some shameless self-promotion. ***
I am delighted to announce that our most recent work, DeLinker, was recently published in the Journal of Chemical Information and Modeling (link).
Bayesian Optimization and Correlated Torsion Angles—in Small Molecules
Our collaborator, Prof. Geoff Hutchison from the University of Pittsburgh, recently took part in the Royal Society of Chemistry’s 2020 Twitter Poster Conference to highlight the great work carried out by one of my DPhil students, Lucian Leung Chan, on the application of Bayesian optimization to conformer generation.
State of the art in AI for drug discovery: more wet-lab please
The medchem community has received ML approaches for the drug discovery pipeline with some skepticism, especially those focused on the hit-to-lead optimization process. One of the main drivers is the way many ML publications benchmark their models: historic datasets are split into two parts, with the larger part used to train and the smaller to test ML models. To standardize that validation process, computational chemists have constructed widely used benchmark datasets such as the DUD-E set, which is commonly used as a standard for protein-ligand binding classification tasks. Common criticism from medicinal chemists centers on the main problem associated with benchmark datasets: the absence of direct lab validation.
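To make the benchmarking setup described above concrete, here is a minimal sketch of a random hold-out split in Python. The function name, the 80/20 ratio, and the placeholder compound IDs are all assumptions for illustration; they are not taken from DUD-E or from any particular publication.

```python
import random


def holdout_split(records, test_fraction=0.2, seed=0):
    """Shuffle records and split them into (train, test) lists.

    Hypothetical helper: the larger part is used for training,
    the smaller part for testing, as in the benchmarking described above.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)
    # A seeded generator makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder identifiers standing in for a historic dataset.
compounds = [f"compound_{i}" for i in range(10)]
train, test = holdout_split(compounds)
print(len(train), "training records,", len(test), "test records")
```

Note that a split like this only ever scores a model against data that already exists, which is exactly the limitation the criticism above points at.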
AutoDock 4 and AutoDock Vina
A recently published paper from Nguyen et al. in JCIM pointed out that while AutoDock Vina is faster, AutoDock 4 tends to have better correlation with experimental binding affinity [1].
[This post has been edited to provide more information about the cited paper, as well as providing additional citations.]
Nguyen et al. selected 800 protein-ligand complexes across 47 protein targets, each with an experimental PDB structure of the protein in complex with a ligand and an associated binding affinity value.
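As a rough illustration of what "correlation with experimental binding affinity" means in practice, the sketch below computes a Pearson correlation coefficient between predicted and measured values. Pearson's r is an assumption here, since this excerpt does not say which statistic the paper used, and the numbers are made up purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up values (kcal/mol) for five hypothetical complexes.
predicted = [-7.1, -8.3, -6.0, -9.4, -5.2]
experimental = [-6.8, -8.9, -6.5, -9.0, -5.9]
r = pearson_r(predicted, experimental)
print(f"Pearson r = {r:.2f}")
```

A value of r close to 1 would mean the docking scores rank and scale with the measured affinities; a value near 0 would mean they carry little information about them.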
de novo Small Molecule Design using Deep Learning
This is an interesting paper by Zhavoronkov et al. that was recently published in Nature Biotechnology as a brief communication: https://www.nature.com/articles/s41587-019-0224-x. The paper describes a new deep generative model called generative tensorial reinforcement learning (GENTRL), which enables optimization for synthetic feasibility, novelty, and biological activity. In this work, the authors designed, synthesized, and experimentally validated molecules targeting discoidin domain receptor 1 (DDR1) in less than two months. The code for GENTRL is available here: https://github.com/insilicomedicine/gentrl.
Reference: Zhavoronkov, A. et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nature Biotechnology 2019, 37, 1038-1040.
What are Hotspots in Structural Biology?
“Hotspot” is one of those extremely versatile words, similar to “model” and “buffer”, which can mean a variety of things depending on context. According to Merriam-Webster, a hotspot is “a place of more than usual interest, activity, or popularity”. This is the most general definition of the concept I could find in a quick search, and the one I find closest in spirit to the way hotspots are perceived in a structural biology context. What this blog post is definitely not about are hotspots as “areas of political, military, or civil unrest” (my experience with them has so far been mostly peaceful), or anything to do with geology, WiFi connections, or forest fires.
However, even within the context of structural biology and structure-based drug design, the word “hotspot” has multiple meanings. In this blog post, I will try to summarise the main ones I have come across, the (sometimes subtle) differences between them, and provide a few useful papers to serve as an entry point for interested readers.
NeurIPS 2019: Chemistry/Biology papers
NeurIPS is the largest machine learning conference (by number of participants), with over 8,000 in 2017. This year, the conference will be held in Vancouver, Canada from 8th-14th December.
Recently, the list of accepted papers was announced, with 1430 papers accepted. Here, I will highlight several of potential interest to the chem-/bio-informatics communities. Given the large number of papers, these were selected either by “accident” (i.e. I stumbled across them in one way or another) or through a basic search (e.g. Ctrl+f “molecule”).