Seaborn is a Python data visualization library built on matplotlib (https://seaborn.pydata.org/). I would like to share some guidance and code to get started with drawing plots using this library! I will use the ‘flights’ dataset from Seaborn (https://github.com/mwaskom/seaborn-data) as an example.
Category Archives: Python
Calculating symmetrised small molecule RMSDs using graph automorphisms in Python with GEMMI and NetworkX
When a ring flips, how do we calculate RMSD?
This surprisingly simple question leads to a very interesting problem! If we take a benzene molecule, say, and rotate it 180 degrees, then we have the exact same molecule, but if we have a data structure in which our atoms are labelled, and we apply the same transformation to the atomic positions, the numbering does not reflect that symmetry. If we were then naively to calculate the RMSD it would be huge, despite the fact that the molecule is, chemically speaking, identical.
How can we make our RMSD calculations reflect these symmetries?
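One way to see the idea in code is the minimal sketch below. It is not the GEMMI/NetworkX approach named in the title: as a simplifying assumption it brute-forces every atom relabelling that preserves the bond list (feasible only for tiny molecules), and it assumes both coordinate sets already share a frame, so no superposition is done. The names `rmsd`, `automorphisms` and `symmetric_rmsd`, the six-carbon ring and the 1.39 Å radius are all illustrative choices.

```python
import math
from itertools import permutations


def rmsd(a, b):
    """Plain RMSD between two equally ordered coordinate lists (no superposition)."""
    squared = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(squared / len(a))


def automorphisms(n_atoms, bonds):
    """Yield every atom relabelling that maps the bond set onto itself.

    Brute force over all permutations: an assumption made for brevity,
    only practical for very small molecules.
    """
    bond_set = {frozenset(bond) for bond in bonds}
    for perm in permutations(range(n_atoms)):
        if {frozenset((perm[i], perm[j])) for i, j in bonds} == bond_set:
            yield perm


def symmetric_rmsd(ref, coords, bonds):
    """Smallest RMSD over all symmetry-equivalent atom labellings."""
    return min(
        rmsd(ref, [coords[perm[i]] for i in range(len(ref))])
        for perm in automorphisms(len(ref), bonds)
    )


# Illustrative benzene-like ring: six carbons on a regular hexagon.
radius = 1.39
ref = [
    (radius * math.cos(math.radians(60 * i)), radius * math.sin(math.radians(60 * i)), 0.0)
    for i in range(6)
]
bonds = [(i, (i + 1) % 6) for i in range(6)]

# Rotate the ring by 180 degrees about its normal while keeping the labels.
flipped = [(-x, -y, z) for x, y, z in ref]

naive = rmsd(ref, flipped)                       # about 2.78: every atom crosses the ring
symmetric = symmetric_rmsd(ref, flipped, bonds)  # effectively zero
```

The naive value is large because each label now sits on the opposite side of the ring, while the symmetry-aware value vanishes because relabelling atom i as atom i+3 is one of the ring's 12 automorphisms.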
Tracking machine learning projects with Weights & Biases
Optimising machine learning models requires extensive comparison of architectures and hyperparameter combinations. There are many frameworks that make logging and visualising performance metrics across model runs easier. I recently started using Weights & Biases. In the following, I give a brief overview of some basic code snippets to add to your machine learning Python code to get started with this tool.
UMAP Visualization of SARS-CoV-2 Data in ChEMBL
Using Python in PyMOL
Decades later, we owe Warren DeLano and his commitment to open source a great debt. Warren wrote PyMOL, an amazingly powerful and popular molecular visualization tool, but it has many hidden talents.
Perhaps its greatest strength is the use of the open source language, Python, as its control language.
Improving your Python code quality using git pre-commit hooks
Intro
I recently completed an internship during which I spent a considerable amount of time doing software engineering. One of my main take-aways from this experience was that in industry, a lot more attention is paid to ensuring that code committed to a GitHub repo is clean and bug-free.
This is achieved through several means, such as code review (getting other people to read your code), test-driven development (making sure your code works as you add functionality) or pair programming (having two people work together on the same piece of code). Here, I will instead focus on a useful tool that is easy to integrate into your existing git workflow: pre-commit hooks.
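To make the idea concrete, here is a minimal sketch of a hand-written hook, which may well differ from the tooling the full post goes on to describe. It assumes the script is saved as `.git/hooks/pre-commit` and made executable; the helper names `staged_python_files`, `syntax_error_in` and `main` are hypothetical. It only checks that staged Python files compile, and for simplicity it reads the working-tree copy of each file rather than the staged content.

```python
#!/usr/bin/env python3
"""Hypothetical pre-commit hook: reject commits containing Python syntax errors."""
import subprocess
import sys


def staged_python_files():
    """Return paths of staged (added, copied or modified) Python files."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [path for path in result.stdout.splitlines() if path.endswith(".py")]


def syntax_error_in(source, filename="<staged>"):
    """Return a short message if the source fails to compile, else None."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        return f"{filename}:{err.lineno}: {err.msg}"
    return None


def main():
    """Check every staged Python file and return the hook's exit status."""
    problems = []
    for path in staged_python_files():
        # Simplification: this reads the working-tree file, not the staged blob.
        with open(path, encoding="utf-8") as handle:
            message = syntax_error_in(handle.read(), path)
        if message:
            problems.append(message)
    for message in problems:
        print(message, file=sys.stderr)
    # Git aborts the commit when the hook exits with a non-zero status.
    return 1 if problems else 0


# In the actual hook file, finish with: sys.exit(main())
```

Because git runs whatever executable sits at `.git/hooks/pre-commit` before recording a commit, returning a non-zero status from `main` is enough to block the commit until the reported files are fixed.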
From Jupyter to Slides using RISE
In preparation for remote teaching this year, I’ve spent the last few weeks converting the Doctoral Training Centre’s ‘Introduction to Computer Programming’ course into a series of Jupyter notebooks so that the course can be run entirely using Google Colaboratory.
PyMOL: colouring proteins by property
We all love pretty, colourful pictures of proteins. There is quite a variety of programs to produce publication-quality images of proteins, some of the most popular being VMD, PyMOL and Chimera. Each has advantages and disadvantages: for example, VMD is particularly good at dealing with molecular dynamics simulations (perhaps that’s why it is called “Visual Molecular Dynamics”?), and Chimera is able to produce breathtaking graphics with very little user input. In my work, however, I tend to use PyMOL: its Python interface is incredibly helpful for producing quick analyses.
Observed Antibody Space + miAIRR
Today is the day for another (potentially penultimate) blog post from me. Using this opportunity, I would like to introduce to you our recent update to the Observed Antibody Space (OAS) resource.
Visualising macromolecules and grids in Jupyter Notebooks with nglview
If you do most of your work in Jupyter notebooks, it can be convenient to have a quick visualisation tool to view the results of your latest computation from within the notebook, without having to flick between the notebook and your favourite molecule viewer.
I have recently started using NGLview, an IPython/Jupyter widget, to do this. It is based on the NGL viewer, an embeddable webapp for macromolecular visualisation. The nglview module documentation can be found here, and in addition to handling the usual formats for molecular structures (.pdb, .mol2, .sdf, .pqr, etc.) and density maps (.ccp4 and more), it supports visualising trajectories and even making movies.