Author Archives: An Goto
Benchmarks in De Novo Drug Design
I recently came across the review “De novo molecular drug design benchmarking” by Lauren L. Grant and Clarissa S. Sit, which highlights recently proposed benchmarking methods, including Fréchet ChemNet Distance [1], GuacaMol [2], and Molecular Sets (MOSES) [3], together with their current and potential future applications and the next steps for validating benchmarking methods [4].
From this review, I particularly want to note the issues with current benchmarking methods and the points we should be aware of when using them to benchmark our own de novo molecular design methods. Goal-directed models are de novo molecular design methods that optimize for a particular scoring function [2].
Seaborn 101
Seaborn is a Python data visualization library built on top of matplotlib (https://seaborn.pydata.org/). I would like to share some guidance and code to help you get started with drawing plots using this library! As an example, I will use the ‘flights’ dataset that ships with Seaborn (https://github.com/mwaskom/seaborn-data).
UMAP Visualization of SARS-CoV-2 Data in ChEMBL
de novo Small Molecule Design using Deep Learning
This is an interesting paper by Zhavoronkov et al. that was recently published in Nature Biotechnology as a brief communication: https://www.nature.com/articles/s41587-019-0224-x. The paper describes a new deep generative model called generative tensorial reinforcement learning (GENTRL), which enables optimization for synthetic feasibility, novelty, and biological activity. In this work, the authors designed, synthesized, and experimentally validated molecules targeting discoidin domain receptor 1 (DDR1) in less than two months. The code for GENTRL is available here: https://github.com/insilicomedicine/gentrl.
Reference: Zhavoronkov, A. et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nature Biotechnology 2019, 37, 1038-1040.
GitHub Link to Text Mining Tool
I have created a GitHub page to share some of the code that I used for text mining to extract HBV-related genetic information from PubMed Central. The code is easily adaptable for finding sentences that match your own keywords, so please take a look if you are interested: https://github.com/angoto/HBV_Code.
Note: The GitHub page is currently unavailable, but it will be accessible in due course.
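In the meantime, the general idea of keyword-based sentence search can be sketched in a few lines. This is only an illustration and not the code from the repository: the function names `split_sentences` and `find_sentences` are hypothetical, the sentence splitter is deliberately naive, and the example text in the usage note is made up.

```python
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_sentences(text, keywords, require_all=True):
    """Return sentences containing all (or any) of the keywords as whole words.

    Matching is case-insensitive and does no stemming, so 'mutation'
    will not match 'mutations'.
    """
    patterns = [
        re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        for keyword in keywords
    ]
    combine = all if require_all else any
    return [
        sentence
        for sentence in split_sentences(text)
        if combine(pattern.search(sentence) for pattern in patterns)
    ]
```

For example, `find_sentences(article_text, ["HBV", "mutation"])` keeps only the sentences that mention both terms, while passing `require_all=False` keeps sentences that mention either one. A real pipeline would likely need a proper sentence tokenizer and some handling of plurals and synonyms.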
A Month in Basel – Summer 2019
I had the opportunity to visit Basel, Switzerland, for a month from mid-July to mid-August. The first week began with the ISMB/ECCB 2019 conference, which was a five-day event. The average temperature was 35 °C, with the hottest day reaching 39 °C, which was rather too hot compared to British weather. This weather was perfect for trying out ‘floating’ in the Rhine. I missed the opportunity myself, but I would highly recommend it to anyone visiting Basel in the future.