On The Logic of GOing with Weisfeiler-Lehman

Recently, I was able to attend Martin Grohe’s talk on The Logic of Graph Neural Networks. Professor Grohe, of RWTH Aachen University, is a titan of the fields of logic and complexity theory. Even so, he is modest about his achievements, and I was tickled when it was pointed out to me that the theorem he refers to as “a little complex”, one of his crowning achievements, involves a four-hundred-page book of a proof.

The theorem relates to the Weisfeiler-Lehman (WL) algorithm, an algorithm for testing whether two graphs are equivalent (i.e. isomorphic). The algorithm has deep connections with combinatorics, complexity theory and first-order logic, a system of logic that is remarkably similar to the relations present in ontologies such as the Gene Ontology (GO), which is commonly used to compare and predict protein function. Kernelised methods and other WL-based metrics present a new and possibly logically “complete” way to compare the functions of proteins and infer their similarity.
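To make the idea concrete, here is a minimal sketch of the simplest (1-dimensional) variant of WL, usually called colour refinement. It is an illustration under stated assumptions, not the construction from the talk: graphs are assumed to be plain adjacency dictionaries, and the function name `wl_indistinguishable` and its `rounds` parameter are hypothetical. Note that this variant can only prove two graphs are different; a `True` result means it failed to tell them apart, not that they are isomorphic.

```python
from collections import Counter


def wl_indistinguishable(adj_a, adj_b, rounds=None):
    """Return False if 1-WL colour refinement tells the graphs apart, else True.

    adj_a, adj_b: dicts mapping each node to a list of its neighbours
    (an assumed input format, chosen for brevity).
    """
    # Refine both graphs together so that colours are comparable between them.
    union = {("a", u): [("a", v) for v in nbrs] for u, nbrs in adj_a.items()}
    union.update({("b", u): [("b", v) for v in nbrs] for u, nbrs in adj_b.items()})

    colours = {node: 0 for node in union}  # every node starts with the same colour
    rounds = len(union) if rounds is None else rounds

    for _ in range(rounds):
        # A node's signature is its own colour plus the multiset of neighbour colours.
        signatures = {
            node: (colours[node], tuple(sorted(colours[n] for n in union[node])))
            for node in union
        }
        # Compress each distinct signature to a small integer colour.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        new_colours = {node: palette[signatures[node]] for node in union}

        hist_a = Counter(c for (g, _), c in new_colours.items() if g == "a")
        hist_b = Counter(c for (g, _), c in new_colours.items() if g == "b")
        if hist_a != hist_b:
            return False  # colour histograms differ: the graphs are not isomorphic

        if len(set(new_colours.values())) == len(set(colours.values())):
            break  # the partition is stable, further rounds change nothing
        colours = new_colours

    return True


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
six_cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
```

A triangle and a three-node path are separated in the first round because their degree patterns differ. The six-cycle and the pair of triangles are a classic case where this simple variant fails: every node has two neighbours, so the colours never split.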

The Gene Ontology follows a simple set of rules, very similar to first-order logic. From the GO Database Description
COSTNET19 Conference

Last month, I attended the COSTNET19 Conference in Bilbao (Spain). This conference is organised by COSTNET, a COST Action that aims to foster international collaboration across Europe in the emerging field of statistical network data science. COSTNET facilitates interaction and collaboration between diverse groups of statistical network modellers, establishing a large, vibrant, interconnected and inclusive community of network scientists.

Why you should care about startups as a researcher

I was recently awarded the EIT Health Translational Fellowship, which funds DPhil projects with the goal of commercializing the research and addressing the funding gap between research and seed funding. In order to win, I had to deliver a short five-minute startup pitch in front of a panel of investors and scientific experts to convince them that my DPhil project has impact as well as commercial viability. Besides the £5000 prize, the fellowship included a week-long training course on how to improve your pitch, address pain points in your business strategy, and so on. I found the whole experience incredibly rewarding and the skills I picked up very important, even as a researcher. In summary, this is why I think you should care about the startup world as a researcher.

A new way of eating too much

Fresh off the pages of Therapeutic Advances in Endocrinology and Metabolism comes a warning no self-respecting sweet tooth should ignore.

“Liquorice is not just a candy,” write a team of ten from Chicago. “Life-threatening complications can occur with excess use.” Hold on to your teabags. Liquorice – the Marmite of sweets – is about to become a lot more sinister.

Two Tools for Systematically Compiling Ensembles of Protein Structures

In order to know how a protein works, we generally want to know its 3-dimensional structure. We can then either try to solve it ourselves (which requires considerable time, skill, and resources), or look for it in the Protein Data Bank, in case it has already been solved. The vast majority of structures in the Protein Data Bank (PDB) are solved through protein crystallography, and represent a “snapshot” of the conformational space available to our protein of interest.

AIRR community meeting

Hi everyone,

Today is the day for another blog post from me. Last month I attended an AIRR conference in Genoa, Italy (https://www.antibodysociety.org/airrc/meetings/communityiv/). It was the fourth AIRR conference, and it was nice to see lots of field-leading people participating. Compared to the last AIRR meeting almost two years ago, the agenda of the conference was dominated by machine learning and big data topics. In my short blog post, I will discuss two talks that covered these two exciting topics.

What is the hydrophobic-polar (HP) model?

Proteins are fascinating. They are ubiquitous in living organisms, carrying out all kinds of functions: from structural support to unbelievably powerful catalysis. And yet, despite their ubiquity, we are still bemused by their functioning, not to mention by how they came to be. As computational scientists, our research at OPIG is mostly about modelling proteins in different forms. We are a very heterogeneous group that leverages approaches at diverse scales: from modelling proteins as nodes in a complex interaction network, to full atomistic models that help us understand how they behave.

Exciting new studies in OAS

Hi everyone!

Today is the day for another blog post from me. Here, I would like to give you an update on new studies that were deposited in the Observed Antibody Space (OAS) resource and take a closer look at one of them. To date, we have curated 57 studies in OAS, where we provide raw nucleotide and numbered amino acid sequences for download. These amino acid sequences have been filtered using ANARCI parsing, which ensures that the sequences align to the respective species HMM profiles and do not have unusual indels or frameshifts. More than 660 million numbered amino acid sequences are deposited in OAS, where every sequence keeps a link to its corresponding nucleotide sequence. Recently we added two more studies to OAS: Sheng et al. (2017) and Setliff et al. (2018). We numbered roughly 2.8 and 46 million sequences from the Sheng et al. and Setliff et al. studies, respectively. In this blog post, I would like to talk more about the uniqueness of the Setliff et al. data.
