We all love pretty, colourful pictures of proteins. There is quite a variety of programs to produce publication-quality images of proteins, some of the most popular being VMD, PyMOL and Chimera. Each has advantages and disadvantages — for example, VMD is particularly good to deal with molecular dynamics simulations (perhaps that’s why it is called “Visual Molecular Dynamics”?), and Chimera is able to produce breathtaking graphics with very little user input. In my work, however, I tend to peruse PyMOL: a Python interface is incredibly helpful to produce quick analyses.
Continue readingCategory Archives: Python
Observed Antibody Space + miAIRR
Today is the day for another (potentially penultimate) blog post from me. Using this opportunity, I would like to introduce to you our recent update to the Observed Antibody Space (OAS) resource.
Continue readingVisualising macromolecules and grids in Jupyter Notebooks with nglview
If you do most of your work in Jupyter notebooks, it can be convenient to have a quick visualisation tool to view the results of your latest computation from within the notebook, without having to flick between the notebook and your favourite molecule viewer.
I have recently started using NGLview, an IPython/Jupyter widget, to do this. It is based on the NGL viewer, an embeddable webapp for macromolecular visualisation. The nglvew module documentation can be found here, and in addition to handling the usual formats for molecular structure (.pdb, .mol2, .sdf, .pqr, etc.) and map density(.ccp4 and more), it supports visualising trajectories and even making movies.
Continue readingStoring variables in Jupyter Notebooks using %store magic
We’ve all been there. You’ve just run an expensive computation in your Jupyter Notebook and are about to draw those conclusions which will prove that your theories were right all along (until you find the sixteen bugs in your code which render them invalid, but that’s an issue for a different time). Then at the critical moment, your flatmate begins streaming their Lord Of The Rings marathon in 4k and your already temperamental Wi-Fi severs your connection to the department servers in protest, crashing your Jupyter Notebook, leaving your hopes and dreams in tatters.
Continue readingGEMMI: A Python Cookbook
General MacroMocelecular I/O, or GEMMI, is a C++ 11 header only library for low level crystalographic .
Because its header only it is certainly the easiest to access and use low level crystalographic C++ library, however GEMMI comes with python binding via Pybind11, making it arguably the easiest low level crystalographic library to access and use in python as well!
What follows is a cookbook of useful Python code that uses GEMMI to accomplish macromolecular crystalographic tasks.
Continue readingLightning-fast Python code
Scientific code is never fast enough. We need the results of that simulation before that pressing deadline, or that meeting with our advisor. Computational resources are scarce, and competition for a spot in the computing nodes (cough, cough) can be tiresome. We need to squeeze every ounce of performance. And we need to do it with as little effort as possible.
Continue readingMolecular dynamics analysis in MDAnalysis
Any opportunity to use rigorously tested and supported analysis tools rather than in-house code is, in my opinion, an opportunity you owe it to yourself to explore.
My preferred tool for analyzing the output of molecular dynamics (MD) simulations is MDAnalysis, a Python library that provides robust and easy-to-use tools for analyzing most common files output by MD packages (including PDB, DCD, COR, and XTC file formats). But, of course, MDAnalysis can analyze any PDB file, not just one output from an MD simulations. There may be an opportunity in your workflow to incorporate MDAnalysis to save time or to provide more robust error handling than whatever in-house code you currently use.
Functional Programming in Python
Introduction
The difficulty of reasoning about the behaviour of stateful programs, especially in concurrnent enviroments, has led to increased in intrest in a programming paradigm called functional programming. This style emphasises the connection between programs and mathematics, encouraging code that is easy to understand and, in some critical cases, even possible to prove properties of.
Continue readingHow to be a Bayesian – ft. a completely ridiculous example
Most of the stats we are exposed to in our formative years as statisticians are viewed through a frequentist lens. Bayesian methods are often viewed with scepticism, perhaps due in part to a lack of understanding over how to specify our prior distribution and perhaps due to uncertainty as to what we should do with the posterior once we’ve got it.
Continue readingGitHub Link to Text Mining Tool
I have created a GitHub page to share some of the codes that I used to conduct text mining to extract HBV-related genetic information from PubMed Central. This code is easily adaptable to search through sentences that satisfy your keyword search, so please take a look if you are interested: https://github.com/angoto/HBV_Code.
Note: GitHub page is currently unavailable online, but will be accessible in due course.