We’ve all been there. You’ve just run an expensive computation in your Jupyter Notebook and are about to draw those conclusions which will prove that your theories were right all along (until you find the sixteen bugs in your code which render them invalid, but that’s an issue for a different time). Then at the critical moment, your flatmate begins streaming their Lord Of The Rings marathon in 4k and your already temperamental Wi-Fi severs your connection to the department servers in protest, crashing your Jupyter Notebook, leaving your hopes and dreams in tatters.
Category Archives: Python
GEMMI: A Python Cookbook
General MacroMolecular I/O, or GEMMI, is a C++11 header-only library for low-level crystallography.
Because it is header-only, it is certainly the easiest low-level crystallographic C++ library to access and use. GEMMI also comes with Python bindings via pybind11, making it arguably the easiest low-level crystallographic library to access and use in Python as well!
What follows is a cookbook of useful Python code that uses GEMMI to accomplish macromolecular crystallographic tasks.
Lightning-fast Python code
Scientific code is never fast enough. We need the results of that simulation before that pressing deadline, or that meeting with our advisor. Computational resources are scarce, and competition for a spot in the computing nodes (cough, cough) can be tiresome. We need to squeeze every ounce of performance. And we need to do it with as little effort as possible.
Molecular dynamics analysis in MDAnalysis
Any opportunity to use rigorously tested and supported analysis tools rather than in-house code is, in my opinion, an opportunity you owe it to yourself to explore.
My preferred tool for analyzing the output of molecular dynamics (MD) simulations is MDAnalysis, a Python library that provides robust and easy-to-use tools for analyzing the most common file formats output by MD packages (including PDB, DCD, COR, and XTC). But, of course, MDAnalysis can analyze any PDB file, not just one output from an MD simulation. There may be an opportunity in your workflow to incorporate MDAnalysis to save time or to provide more robust error handling than whatever in-house code you currently use.
Functional Programming in Python
Introduction
The difficulty of reasoning about the behaviour of stateful programs, especially in concurrent environments, has led to increased interest in a programming paradigm called functional programming. This style emphasises the connection between programs and mathematics, encouraging code that is easy to understand and, in some critical cases, even possible to prove properties of.
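To make the contrast with stateful code concrete, here is a minimal sketch using only the Python standard library. The function names (`square`, `is_even`, `sum_of_even_squares`) are hypothetical and chosen purely for illustration; they do not come from the original post.

```python
from functools import reduce
from typing import Iterable


def square(x: int) -> int:
    """Pure function: the result depends only on the argument."""
    return x * x


def is_even(x: int) -> bool:
    """Pure predicate with no side effects."""
    return x % 2 == 0


def sum_of_even_squares(numbers: Iterable[int]) -> int:
    """Compose filter, map and reduce without mutating any state."""
    evens = filter(is_even, numbers)
    squares = map(square, evens)
    return reduce(lambda acc, x: acc + x, squares, 0)


def sum_of_even_squares_stateful(numbers: Iterable[int]) -> int:
    """Stateful equivalent: a running total is updated in place."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total


data = (1, 2, 3, 4)  # an immutable tuple, so the input cannot be changed
result = sum_of_even_squares(data)
```

Because `sum_of_even_squares` touches no shared state, calling it from several threads at once cannot change its answer, which is the property that makes this style easier to reason about.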
How to be a Bayesian – ft. a completely ridiculous example
Most of the stats we are exposed to in our formative years as statisticians are viewed through a frequentist lens. Bayesian methods are often viewed with scepticism, perhaps due in part to a lack of understanding over how to specify our prior distribution and perhaps due to uncertainty as to what we should do with the posterior once we’ve got it.
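As a rough illustration of specifying a prior and then using the posterior, here is a minimal sketch of a conjugate Beta-Binomial update in plain Python. The uniform Beta(1, 1) prior and the 7-successes, 3-failures data are made-up assumptions for illustration only, and the helper names `update_beta` and `beta_mean` are hypothetical.

```python
from typing import Tuple


def update_beta(alpha: float, beta: float,
                successes: int, failures: int) -> Tuple[float, float]:
    """Conjugate update: Beta prior + Binomial data gives a Beta posterior."""
    return alpha + successes, beta + failures


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumed prior: Beta(1, 1), i.e. uniform over the success probability.
prior = (1.0, 1.0)

# Made-up data: 7 successes and 3 failures.
posterior = update_beta(*prior, successes=7, failures=3)

# One simple thing to do with the posterior: report its mean.
posterior_mean = beta_mean(*posterior)
```

Here the posterior is Beta(8, 4), and its mean summarises the updated belief about the success probability. A different prior would shift that summary, which is exactly the choice the paragraph above describes as a source of scepticism.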
GitHub Link to Text Mining Tool
I have created a GitHub page to share some of the code that I used for text mining to extract HBV-related genetic information from PubMed Central. The code can easily be adapted to search for sentences that match your own keywords, so please take a look if you are interested: https://github.com/angoto/HBV_Code.
Note: GitHub page is currently unavailable online, but will be accessible in due course.
A Gentle Introduction to the GPyOpt Module
Manually tuning hyperparameters in a neural network is slow and boring. Using Bayesian Optimisation to do it for you is slightly less slow, and you can go and do other things whilst it’s running. Susan recently highlighted some of the resources available to get to grips with GPyOpt. Below is a copy of a Jupyter Notebook where we walk through a couple of simple examples and hopefully shed a little bit of light on how the algorithm works.
Three things to help you get started on Bayesian Optimisation
In this blog post I will share with you the materials that I found most useful when I started doing some Bayesian Optimisation in my research. Bear in mind, I am a Chemist by training, so I approached this topic from a non-mathematical background (my eyes have to be persuaded to look at mathematical equations). Out of all the materials I have come across, I found these to be the most accessible.
How to Iterate in PyMOL
Sometimes pointing-and-clicking just doesn’t cut it. With PyMOL’s built-in Python interpreter, repetitive actions are made simple.