The combination of Python and the cheminformatics toolkit RDKit has opened up so many ways to explore chemistry on a computer. Jupyter — named for the three languages, Julia, Python, and R — ties interactivity and visualization together, creating wonderful environments (Notebooks and JupyterLab) to carry out, share and reproduce research, including:
“data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.”
—https://jupyter.org
At this year’s annual RDKit UGM (User Group Meeting), Cédric Bouysset shared a tutorial explaining how to create a grid of molecules that you can interact with, using his “mols2grid”:
It’s easy to filter the molecules interactively using the “Search” box in the bottom right corner either by “Text” or “SMARTS” (options hidden by the magnifying glass icon); and to sort them by Pandas data frame fields. Tooltips can also be shown just by hovering your cursor over a molecule in the grid. It’s even possible to use callbacks to trigger actions, like showing the 3D conformer of a molecule using py3Dmol, or calling arbitrary JavaScript. By checking individual molecules, the selection can be accessed in subsequent cells, or copied to the clipboard (but not in Colab notebooks as Google blocks this).
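As a rough illustration of the workflow described above, here is a minimal sketch that builds a small table of molecules and hands it to `mols2grid.display`. The molecule names and SMILES are example data chosen for illustration, not taken from the tutorial, and `show_grid` is a hypothetical helper name. The sketch assumes pandas, RDKit and mols2grid are installed; those imports sit inside the helper so the data preparation can be inspected without them.

```python
# Example records: names and SMILES are illustrative, not from the tutorial.
records = [
    {"Name": "ethanol", "SMILES": "CCO"},
    {"Name": "benzene", "SMILES": "c1ccccc1"},
    {"Name": "acetic acid", "SMILES": "CC(=O)O"},
    {"Name": "aspirin", "SMILES": "CC(=O)Oc1ccccc1C(=O)O"},
]


def show_grid(rows):
    """Render rows as an interactive grid (hypothetical helper).

    Assumes pandas, RDKit and mols2grid are installed in the notebook kernel.
    """
    import pandas as pd
    import mols2grid

    df = pd.DataFrame(rows)
    return mols2grid.display(
        df,
        smiles_col="SMILES",         # column holding the structures
        subset=["img", "Name"],      # fields drawn in each grid cell
        tooltip=["Name", "SMILES"],  # fields shown when hovering
    )


# In a notebook, make this the last expression of a cell to render the grid:
# show_grid(records)
```

Any column of the data frame can be listed in `subset` or `tooltip`, which is what makes sorting and text search by those fields possible. How the checked molecules are read back in a later cell depends on the installed mols2grid version, so check its documentation for the exact call.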
mols2grid is easy to install using conda (or mamba, a faster drop-in replacement for conda that is included in CondaColab), and it is highly recommended.
# How to install mols2grid using conda
conda install -c conda-forge mols2grid

# How to install mols2grid using mamba
mamba install -c conda-forge mols2grid
Pat Walters shows how to use mols2grid to cluster chemical structures using RDKit’s implementation of the Butina clustering algorithm and pandas’ groupby function.
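To give a feel for what the clustering step does, here is a pure-Python sketch of the greedy idea behind Butina clustering, followed by a grouping step that plays the role of groupby. It is not RDKit’s implementation (in RDKit the clustering lives in `rdkit.ML.Cluster.Butina`) and it is not Pat Walters’ code. The function name `butina_clusters`, the molecule labels and the distance values are all made up for illustration; in practice the distances would typically be derived from fingerprint similarities.

```python
from collections import defaultdict


def butina_clusters(dist, cutoff):
    """Greedy Butina-style clustering on a symmetric distance matrix."""
    n = len(dist)
    # Neighbours of i: every other item within the distance cutoff.
    neighbours = [
        [j for j in range(n) if j != i and dist[i][j] <= cutoff]
        for i in range(n)
    ]
    # Visit candidate centroids from most to fewest neighbours.
    order = sorted(range(n), key=lambda i: len(neighbours[i]), reverse=True)
    assigned = set()
    clusters = []
    for i in order:
        if i in assigned:
            continue
        # The centroid comes first, then its still-unassigned neighbours.
        members = [i] + [j for j in neighbours[i] if j not in assigned]
        assigned.update(members)
        clusters.append(members)
    return clusters


# Toy labels and distances (invented for illustration only).
names = ["mol_a", "mol_b", "mol_c", "mol_d", "mol_e"]
dist = [
    [0.00, 0.20, 0.30, 0.90, 0.80],
    [0.20, 0.00, 0.25, 0.85, 0.90],
    [0.30, 0.25, 0.00, 0.90, 0.95],
    [0.90, 0.85, 0.90, 0.00, 0.10],
    [0.80, 0.90, 0.95, 0.10, 0.00],
]

clusters = butina_clusters(dist, cutoff=0.35)

# Group molecule names by cluster id, the job groupby does on a data frame.
by_cluster = defaultdict(list)
for cluster_id, members in enumerate(clusters):
    for index in members:
        by_cluster[cluster_id].append(names[index])

print(dict(by_cluster))
```

With the cluster id stored as a column in a data frame, the same grouping can be done with pandas, and one representative per cluster can then be passed to mols2grid for display.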
Also check out Justin Chavez’s “Building Web Applications From Python Scripts with Streamlit”, and Chanin Nantasenamat’s YouTube video, “How to build a web app for Drug Discovery in Python”: