Slightly belated, these are our thoughts on the MABRA workshop at the University of Surrey, which five OPIGlets attended in January 2020.
Robust networks to study omics data
One of the challenges that biology-related sciences are facing is the exponential increase of data. Nowadays, thanks to all the sequencing techniques that are available, we are generating more data than we can study. We all love the genomic, epigenomic, transcriptomic, proteomic, … , glycomic, lipidomic, and metagenomic studies because of how rich they are. However, most of the time, the analysis of the results uses only a fraction of the generated data. For example, it is quite common to study the transcriptome of an organism in different environments and then just focus on identifying which 2 or 3 genes are upregulated. This type of analysis does not exploit the data to its full extent, and this is where network analysis makes its appearance!
Cooking Up a (Deep)STORM with a Little Cup of Super Resolution Microscopy
Recently, I attended the Quantitative BioImaging (QBI) Conference 2020, served right here in Oxford. Amongst the many methods on the menu were new recipes for spicing up your Cryo-EM images with a bit of CiNNamon, a peppering of Poisson point processes in the inhomogeneous spatial case, and many others. However, like many of today’s top-tier restaurants, most of the courses on offer were on the smaller side, nano-scale in fact, serving up the new field of Super Resolution Microscopy!
Finding The Gene Responsible for Huntington’s Disease – The Story of Nancy Wexler.
Huntington’s Disease – an inherited disorder that results in the loss of movement and speech, dementia and, ultimately, death. The earliest symptoms include lack of coordination and an unsteady gait; physical abilities worsen until the complete physiological breakdown of the patient’s body. Meanwhile, mental abilities also decline into dementia. Overall, Huntington’s disease results in the death of brain cells.
Effect of Debiasing Protein-Ligand binding data on Generalization
Virtual screening is a computational technique used in drug discovery to search libraries of small molecules in order to identify those structures that bind tightly and specifically to a given protein target. Many machine learning (ML) models have been proposed for virtual screening. However, it is not clear whether these models can truly predict molecular properties accurately across chemical space or simply overfit the training data. As chemical space contains clusters of molecules around scaffolds, memorising the properties of a few scaffolds can be sufficient to perform well, masking the fact that the model may not generalise beyond close analogues. Different debiasing algorithms have been introduced to address this problem. These algorithms systematically partition the data to reduce bias and provide a more accurate metric of model performance.
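To make the idea of partitioning by scaffold concrete, here is a minimal sketch of a grouped train/test split in which no scaffold appears on both sides. It is not one of the debiasing algorithms discussed in the post, just the simplest illustration of the principle. The function name `scaffold_grouped_split`, the scaffold labels and the molecule IDs are all made up for illustration, and the scaffold key for each molecule is assumed to have been computed beforehand with a cheminformatics toolkit.

```python
import random
from collections import defaultdict


def scaffold_grouped_split(records, test_fraction=0.2, seed=0):
    """Split (scaffold_key, molecule_id) pairs so that no scaffold is shared.

    Hypothetical helper: scaffold keys are assumed to be precomputed.
    """
    # Group molecule ids by their scaffold key.
    groups = defaultdict(list)
    for scaffold, mol_id in records:
        groups[scaffold].append(mol_id)

    # Shuffle whole scaffolds, not individual molecules.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)

    # Fill the test set scaffold by scaffold until it reaches the target size.
    target = test_fraction * len(records)
    train, test = [], []
    for key in keys:
        if len(test) < target:
            test.extend(groups[key])
        else:
            train.extend(groups[key])
    return train, test


# Toy data: three made-up scaffolds, five made-up molecules.
records = [("scaf_A", "m1"), ("scaf_A", "m2"), ("scaf_B", "m3"),
           ("scaf_C", "m4"), ("scaf_C", "m5")]
train_ids, test_ids = scaffold_grouped_split(records)
```

Because whole scaffolds are assigned to one side, a model cannot score well on the test set purely by memorising scaffolds it saw during training. The test set may end up slightly larger than `test_fraction` suggests, since scaffolds are never split.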
The address of a gene
Most scientists working in the biological sciences or an overlapping field have encountered various ways of identifying genes and proteins. There are many different types of identifiers. For example, searching for the PDB ID: 2IW3 (which represents elongation factor 3 in yeast strain S288C) on UniProt gives us a results column labeled “Gene names” that includes no less than six (!) ways to refer to the gene that produces this particular protein. This can be frustrating – it is easy to get into trouble when you think you have a consistent gene naming scheme when you do not, especially if you want to cross-reference gene lists.
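As a rough sketch of how one might guard against this when cross-referencing gene lists, the snippet below maps every known alias onto a single canonical identifier before comparing anything. The alias table, the identifiers (`GENE_0001`, `abcA`, ...) and the function names are entirely hypothetical; in practice the table would come from a curated resource rather than being typed in by hand.

```python
def build_alias_index(alias_table):
    """Map every alias (case-insensitively) to one canonical identifier."""
    index = {}
    for canonical, aliases in alias_table.items():
        for name in (canonical, *aliases):
            key = name.upper()
            # Refuse to guess if one alias points at two different genes.
            if key in index and index[key] != canonical:
                raise ValueError(f"ambiguous alias: {name}")
            index[key] = canonical
    return index


def normalise(gene_list, index):
    """Return the set of canonical ids and the names that were not resolved."""
    resolved, unknown = set(), []
    for name in gene_list:
        canonical = index.get(name.strip().upper())
        if canonical is None:
            unknown.append(name)
        else:
            resolved.add(canonical)
    return resolved, unknown


# Hypothetical alias table: canonical id -> other names for the same gene.
alias_table = {"GENE_0001": ["abcA", "ABC1"], "GENE_0002": ["xyzB"]}
index = build_alias_index(alias_table)

list_a, unknown_a = normalise(["abca", "xyzB"], index)
list_b, unknown_b = normalise(["ABC1", "mystery"], index)
shared = list_a & list_b
```

Comparing the normalised sets rather than the raw names means that `abca` and `ABC1` are recognised as the same gene, while names that cannot be resolved are reported instead of being silently dropped.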
Functional Programming in Python
Introduction
The difficulty of reasoning about the behaviour of stateful programs, especially in concurrent environments, has led to increased interest in a programming paradigm called functional programming. This style emphasises the connection between programs and mathematics, encouraging code that is easy to understand and, in some critical cases, even possible to prove properties of.
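As a small taste of the style, here is a minimal sketch using only the standard library: pure functions with no shared state, combined by composition and applied to immutable data. The helper `compose` and the toy functions are names chosen for this example, not part of any library.

```python
from functools import reduce


def compose(*functions):
    """Compose right to left, so compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions, lambda x: x)


def square(x):
    """Pure function: the result depends only on the argument."""
    return x * x


def increment(x):
    """Pure function: no state is read or modified."""
    return x + 1


# Build a new function out of existing ones instead of mutating state.
square_then_increment = compose(increment, square)

values = (1, 2, 3)  # a tuple, so the input cannot be changed in place
result = tuple(map(square_then_increment, values))
total = reduce(lambda acc, v: acc + v, result, 0)
```

Since nothing here reads or writes shared state, each call can be understood (and tested) in isolation, which is exactly the property that makes this style attractive in concurrent settings.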
AutoDock 4 and AutoDock Vina
A recently released publication from Nguyen et al. in JCIM pointed out that while AutoDock Vina is faster, AutoDock 4 tends to have better correlation with experimental binding affinity.[1]
[This post has been edited to provide more information about the cited paper, as well as providing additional citations.]
Nguyen et al. selected 800 protein-ligand complexes for 47 protein targets that had both experimental PDB structures complexed with a ligand and associated binding affinity values.
Green politics, the left, and Brexit
For my first non-technical blogpost, I thought I’d go in for something that we can all agree on and is entirely devoid of controversy: Brexit. Is that groaning I hear from the back of the room?
One of my uncles is a professor of sociology; he returned to the UK for the first time in 10 years over Christmas 2019, and naturally we had plenty to talk about. He had left with two kids when I was a lanky, goofy teenager and had returned with four to a lanky, goofy adult. Most interesting, though, were his views on green politics and their relationship with the traditional left-right spectrum.
Parallelising antigen-specific B-cell isolation with LIBRA-seq
Today is the day when I write a blog about an exciting research paper in the field of B-cell receptor (BCR) repertoire analysis. At OPIG, we (antibody people) are working hard to model and characterise antibody 3D configuration from its sequence. Significant progress has been made in modelling software development, so that we can predict antibody structures with high confidence. This task becomes considerably harder when we model the entirety of BCR repertoire sequences. Current methods of BCR repertoire sequencing operate primarily on the heavy chain only. This limits our capacity to generate refined 3D antibody models to just approximations of the shapes of the complementarity determining regions (CDRs).