Tracking machine learning projects with Weights & Biases

Optimising machine learning models requires extensive comparison of architectures and hyperparameter combinations. There are many frameworks that make logging and visualising performance metrics across model runs easier. I recently started using Weights & Biases. Below, I give a brief overview of some basic code snippets to add to your machine learning Python code to get started with this tool.

Continue reading →

Miniproteins – small but mighty!

Proteins come in all shapes and sizes, ranging from thousands of amino acids in length to fewer than 20. However, smaller size does not correlate with reduced importance. Miniproteins, which are commonly defined as being fewer than 100 amino acids long, are receiving increased attention for their potential roles as pharmaceuticals. A recent paper by David Baker’s group put miniproteins into the spotlight, as the study authors were able to design miniproteins that bind the SARS-CoV-2 spike protein with an affinity as strong as an antibody’s – but at a tiny fraction of the size (Cao et al., 2020). These miniproteins are much cheaper to manufacture than antibodies (as they can be expressed in bacteria) and can be highly stable (with melting temperatures of >90 °C possible, meaning they can easily be stored at room temperature). The most promising miniprotein developed by the Baker group (LCB1) is currently undergoing testing for use as a prophylactic nasal spray that provides protection against SARS-CoV-2 infection. These promising results – and the speed with which progress was made – bring the vast potential of miniproteins in healthcare to the fore.

Continue reading →

Making Pretty Pictures with PyMOL

There are few things I like more in our field than the opportunity to make a really nice image of a protein structure. Don’t judge me, but I’ve been known to spend the occasional evening in front of the TV with a cup of tea and PyMOL open in front of me! I’ve presented on the subject at a couple of our research group retreats, and have wanted to type it up into a blog post for a while – and this is the last opportunity I will have, since I will be leaving in just a few weeks’ time, after nearly eight years (!) as an OPIGlet. So, here goes – my tips and tricks for making pretty pictures with PyMOL!

Ray Tracing

set ray_trace_mode, number

I always ray trace my images to make them higher quality. It can take a while for large proteins, but it’s always worth it! My favourite setting is 1, but 3 can be fun to make things a bit more cartoon-ish.

You can also improve the quality of the image by increasing the ‘surface_quality’ and ‘cartoon_sampling’ settings.
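As a rough sketch, the same settings can also be applied from a Python script through PyMOL’s `cmd` module. The function name `render_pretty`, the specific values chosen for `surface_quality` and `cartoon_sampling`, and the image size and resolution are illustrative assumptions rather than recommended values.

```python
def render_pretty(cmd, out_path, mode=1, width=2400, height=1800, dpi=300):
    """Apply quality settings, ray trace the current scene and save it as a PNG.

    `cmd` is PyMOL's command module, obtained with `from pymol import cmd`.
    It is passed in as an argument so this sketch does not import PyMOL itself.
    """
    settings = {
        "ray_trace_mode": mode,   # 1 is the favourite above; 3 is more cartoon-ish
        "surface_quality": 2,     # assumed value; higher gives smoother surfaces
        "cartoon_sampling": 14,   # assumed value; higher gives smoother cartoons
    }
    for name, value in settings.items():
        cmd.set(name, value)

    # Ray trace at the requested pixel size, then write the traced image to disk.
    cmd.ray(width, height)
    cmd.png(out_path, dpi=dpi)
    return settings


# Usage inside a PyMOL session with a structure already loaded:
#   from pymol import cmd
#   render_pretty(cmd, "figure.png", mode=3)
```

Passing `cmd` in as an argument keeps the function easy to reuse from a script or from the PyMOL prompt, and larger values of the two quality settings will make ray tracing slower.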

Continue reading →

Fragment-to-Lead Successes in 2019

In this blogpost, I want to highlight the excellent work by Jahnke and collaborators. For the past 5 years, they have published an annual perspective covering fragment-to-lead success stories from the previous year. Very helpfully, their work includes a table detailing the hit fragment(s) and lead molecule, together with key experimental results and parameters.

Continue reading →

AIRR Community Meeting V – December 2020

We attended the virtual Adaptive Immune Receptor Repertoire (AIRR) Community Meeting in early December. The three-day conference is usually held every 18 months, and this edition covered a range of research talks, software demonstrations and poster presentations on the latest TCR and BCR (antibody) research. While we missed certain elements that were present at the last AIRR Community Meeting (namely focaccia), it was a really interesting meeting, with the technology all running very smoothly.

Given our current research on SARS-CoV-2 antibodies, we particularly enjoyed the work presented by Armita Nourmohammad from the University of Washington on “Dynamics of BCR in Covid”, based on the preprint on medRxiv. The research identified 34 significantly expanded rare clonal lineages shared among patients with SARS-CoV-2, which are potential candidates for a response to COVID-19. In particular, the analysis includes an assessment of whether an antibody sequence identified in different individuals (known as a shared or public sequence) is likely to be found due to inherent biases in antibody recombination. Shared antibody sequences that are calculated as unlikely to be shared through recombination biases alone are potentially a response to a shared exposure such as SARS-CoV-2, rather than randomly found in the antibody repertoire. In this way, Nourmohammad and colleagues found ‘rare’ antibodies that were observed in more individuals than would statistically be expected, and therefore might be worthy of further experimental analysis.

A theme common to a short talk and poster by Hadas Neuman (Bar-Ilan) and a poster by Kenneth Hoehn (Yale) was class-switching dynamics revealed by phylogenetic inference (from IgM to IgA in the human gut in the former, and IgE and IgG4 in a paediatric patient with peanut allergy in the latter). Kenneth Hoehn’s poster also looked at B-cell differentiation during HIV infection – this can all be read about in this preprint. The methods developed in the paper for discrete trait analysis of differentiation, isotype switching and B-cell migration are implemented in the new R package dowser (https://bitbucket.org/kleinstein/dowser), which is part of the Immcantation suite (http://immcantation.org).

It was also really nice to see evidence of the burgeoning use of single-cell sequencing for immune repertoire profiling, with posters by Igor Snapkov (UiO), Indu Khatri (Leiden University Medical Centre) and Nick Borcherding (Washington University in St. Louis) all using single-cell technologies, and a talk by Ivelin Georgiev on LIBRA-seq.

If you missed the conference and have had your interest piqued, some of the conference talks are available on the AIRRC YouTube channel.

We look forward to AIRRC6, Dec 7 – 11, 2021!

Sarah and Eve

OPIGmas 2020, Pandemic Edition

Not even a global pandemic can halt our annual celebrations. Festivus, move over. OPIGmas is here.

We were all lucky enough to have electricity, computers with webcams and microphones (Dan’s Dalek incantations notwithstanding), network connections and, somehow (for some of us), the time to gather together around our twenty-first century electronic hearths and celebrate: Zoom, Gather Town, Among Us, Skribbl.io, and Codenames.

The much-awaited Secret Santa often reveals how naughty or nice the sender is, and sometimes surprising details about the relationship between the sender and recipient (I’m looking in the general direction of Dominik and Brennan). The rules are simple: spend up to £10, and don’t buy anything the boss wouldn’t buy for someone… But despite the hypothesis that the longer someone had been in OPIG, the more ‘pointed’ the gift would be, exceptions could still be found.

Armed with her new Easy Learning “Times Tables Bumper Book”, the boss was anointed “CEO of ******* Everything”, with her new desk name plate. Without coordinating, the boss’ PA independently received a desk name plate as “Fixer of Everything”. Perfect, on both counts.

Continue reading →

An in vivo force sensor reveals varied mechanisms of co-translational force generation

This blog post comments on the results published by Fujiwara and co-workers in the 2020 Cell Reports article “Proteome-wide capture of co-translational protein dynamics in Bacillus subtilis using TnDR, a transposable protein-dynamics reporter.”

The study of mechanical force generation and its influence on biological systems has expanded in recent years. In the realm of nascent protein folding, we now know that both unstructured and folded nascent proteins generate forces on the order of piconewtons that propagate down the nascent chain. These forces can distort the functional site of the ribosome and may influence the rate of translation (PMIDs: 30824598, 29577725). It has also been shown that translational arrest can be relieved by mechanical force (PMID: 25908824). Much study has focused on so-called arrest peptides, short peptide sequences that interact so strongly with the ribosome exit tunnel that they can completely stall translation (e.g., SecM, MifM).

Continue reading →

Prediction of Parkinson subtypes at COXIC 2020

Last week I attended the COXIC seminar (a joint Oxford – Imperial seminar focused on networks and complex systems) organised by Florian Klimm from Imperial College London (and former OPIG member!). We had several interesting talks at the seminar. However, one of them caught my eye more than the rest. It was the talk by Dr Sanjukta Krishnagopal (UCL) titled “Predicting Parkinson’s Sub-types through Trajectory Clustering in Bipartite Networks”, which I will briefly summarise here. Hope you like it (at least) as much as I did!

This blogpost is based on these two articles:

Continue reading →

Curious About the Origins of Computerized Molecules? Free Webinar Dec 22…

After the stunning announcement at CASP14 that DeepMind’s AlphaFold 2 had successfully predicted the structures of proteins from their sequence alone, it’s hard to believe we began this journey by representing molecules with punched cards…

Image of a punched card, showing 80 columns and 12 rows, with particular rectangular holes representing the 1 bits of binary numbers. The upper right corner is cut at an angle, to facilitate feeding the card into a punched card reader. The column numbers are printed along the bottom. The words “IBM UNITED KINGDOM LIMITED” are printed along the very bottom. This card is line 12 from a Fortran program, “12 PIFRA=(A(JB,37)-A(JB,99))/A(JB,47) PUX 0430”. Image Credit: Pete Birkinshaw, Manchester, U.K. CC BY 2.0

Tales of carrying stacks of punched cards to the computer centre, with a line drawn diagonally on the side of the stack to help put them back in order should you trip and fall, seem like another universe, but this is what passed for the human-computer interface in much of the mid-20th century.

Continue reading →

Oxford Protein Informatics Group

or "OPIG" to friends
