Citizen Science in Video Games

What I really liked about visiting ISMB last year was the diversity of talks and subgroup meetings in all areas related to biology and computers. Last year I joined two talks about improving bioinformatics education, which were really interesting because I hadn’t thought about that before. This year I joined a special session on citizen science.

Citizen science is public participation in scientific research and can be done by almost everyone. I had heard about Foldit and Rosetta@Home but (unfortunately) never participated. Those two projects deal with protein folding (how does a protein reach its final functional 3D structure?), which is an important scientific problem but is computationally very expensive to study. While one of the projects is a screensaver that uses the free resources of personal computers, the other is a game where players can get high scores for folding protein fragments manually. Helping science in a playful way is cool by itself, but the project that was presented in one of the talks took this to the next level: a citizen science minigame was integrated into an action game for PCs and consoles.


Drug Promiscuity vs Selectivity

In drug discovery, compound promiscuity and selectivity refer to the ability of drug compounds to bind to several different targets (promiscuous) or only one main target (selective). An important distinction here is that promiscuity is defined as specific interactions with multiple biological targets (polypharmacology) rather than non-specific interactions with a number of targets. At first glance, you might expect drugs to be designed to be as selective as possible, hitting only the one biological target necessary to treat the disease and thereby reducing the chance of side effects. This paradigm of single-target specificity has been challenged over the past two decades. Even among scientists in the drug discovery field, compound promiscuity is still a controversial topic. The field has increasingly paid attention to polypharmacology, and studies have shown that many pharmaceutically relevant compounds, including approved drugs, derive their biological activity from it [1-3].


No labels, no problem! A quick introduction to Gaussian Mixture Models

Statistical Modelling Big Data Analytics™ is in vogue at the moment, and there’s nothing quite so fashionable as the neural network. Capable of capturing complex non-linear relationships and scalable to high-dimensional datasets, neural networks are here to stay.

For your garden-variety neural network, you need two things: a set of features, X, and a label, Y. But what do you do if labelling is prohibitively expensive or your expert labeller goes on holiday for 2 months and all you have in the meantime is a set of features? Happily, we can still learn something about the labels, even if we might not know what they are!
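As a rough illustration of learning about labels without having any, here is a minimal sketch of a one-dimensional Gaussian mixture fitted with expectation-maximisation. It is not taken from the post itself: the function names `normal_pdf` and `fit_gmm_1d`, the starting values, and the toy data are all assumptions chosen for illustration, and a real analysis would more likely use an established library.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, initial_means, iterations=50):
    """Fit a 1D Gaussian mixture by EM (hypothetical helper, illustration only).

    Assumes the starting means are close enough to the data that no point
    has zero density under every component.
    """
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: soft "labels" (responsibilities) of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from the soft labels.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances


# Toy unlabelled data drawn around two centres (made up for this sketch).
data = [-0.5, 0.0, 0.5, 4.5, 5.0, 5.5]
weights, means, variances = fit_gmm_1d(data, initial_means=[1.0, 4.0])
```

The responsibilities computed in the E-step play the role of the missing labels: each point gets a probability of belonging to each component rather than a hard class.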


K-Means clustering made simple

The 21st century is often referred to as the age of “Big Data” due to the unprecedented increase in the volumes of data being generated. As most of this data comes without labels, making sense of it is a non-trivial task. To gain insight from unlabelled data, unsupervised machine learning algorithms have been developed and continue to be refined. These algorithms determine underlying relationships within the data by grouping data points into cluster families. The resulting clusters not only highlight associations within the data, but they are also critical for creating predictive models for new data.
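The grouping idea described above can be sketched in a few lines of plain Python. This is a minimal version of the standard K-Means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points), not necessarily the exact procedure used in the full post; the names `kmeans` and `assign`, the seed, and the toy points are assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Minimal K-Means sketch: returns the list of final centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: group points by nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[assign(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return centroids


# Toy unlabelled data with two obvious groups (made up for this sketch).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids = sorted(kmeans(points, k=2))

# A new, unseen point is "predicted" by assigning it to the nearest centroid.
new_point_cluster = assign((9, 9), centroids)
```

Using the fitted centroids to place a new point is the simplest sense in which the clusters support prediction for new data.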


Real Space Correlation Coefficient

Introduction

In crystallography we are often faced with the question of how well a part of our model fits the data. Crystallography has well-developed probability models for the reflection amplitudes given the entire fitted model, but these do not provide a metric for “how much of the ligand is inside the blob”. This is because the reflection-based models are inherently global.
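For a local measure, the real space correlation coefficient is conventionally computed as the Pearson correlation between observed and model-calculated electron density, sampled at the grid points around the part of the model in question. The sketch below assumes that conventional definition and assumes the density values have already been extracted into two equal-length lists; the function name `rscc` and the toy numbers are hypothetical, and how the grid points are selected is left out entirely.

```python
import math


def rscc(rho_obs, rho_calc):
    """Pearson correlation between observed and calculated density samples.

    rho_obs and rho_calc are assumed to hold density values at the same grid
    points, e.g. those within some radius of a ligand's atoms.
    """
    if len(rho_obs) != len(rho_calc) or len(rho_obs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(rho_obs)
    mean_obs = sum(rho_obs) / n
    mean_calc = sum(rho_calc) / n
    cov = sum((o - mean_obs) * (c - mean_calc)
              for o, c in zip(rho_obs, rho_calc))
    var_obs = sum((o - mean_obs) ** 2 for o in rho_obs)
    var_calc = sum((c - mean_calc) ** 2 for c in rho_calc)
    if var_obs == 0 or var_calc == 0:
        raise ValueError("correlation is undefined for perfectly flat density")
    return cov / math.sqrt(var_obs * var_calc)


# Made-up density samples: the calculated map tracks the observed one closely.
observed = [0.1, 0.4, 0.9, 1.3, 0.8, 0.2]
calculated = [0.0, 0.5, 1.0, 1.2, 0.7, 0.3]
score = rscc(observed, calculated)
```

Because the correlation is invariant to scale and offset, a value near 1 only says that the shapes of the two densities agree over the chosen grid points, not that their absolute levels match.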


ICML 2020: Chemistry / Biology papers

ICML is one of the largest machine learning conferences and, like many other conferences this year, is running virtually from 12th – 18th July.

The list of accepted papers can be found here, with 1,088 papers accepted out of 4,990 submissions (22% acceptance rate). Similar to my post on NeurIPS 2019 papers, I will highlight several of potential interest to the chem-/bio-informatics communities. As before, given the large number of papers, these were selected either by “accident” (i.e. I stumbled across them in one way or another) or through a basic search (e.g. Ctrl+f “molecule”).


Uploading/downloading small files across systems

Sometimes you just want to quickly move a copy of a script, image or binary from, for example, your local (Linux) machine to another (Linux) machine. The usual tool would be scp, but this can get complicated when there are several layers of SSH, and sometimes it doesn’t work at all (as is the case for transfers between the Department of Statistics computers and the outside world).


ProCare: cavity similarity searching and its applications to fragment-based drug design

ProCare [1] is a package developed at the University of Strasbourg that aligns protein cavities and scores their similarity. The aim is to find ligand binding sites in different proteins that are similar enough to bind the same ligand. The method is designed to look particularly at fragment binding sites (a fragment being roughly ⅓ the size of a drug-like ligand), so that potential fragment hits can be predicted by comparing the cavities of the targets.


Journal Club: the Dynamics of Affinity Maturation

Last week at our group meeting I presented on a paper titled “T-cell Receptor Variable beta Domains Rigidify During Affinity Maturation” by Monica L. Fernández-Quintero, Clarissa A. Seidler and Klaus R. Liedl. The authors use metadynamics simulations of the same T-cell Receptor (TCR) at different stages of affinity maturation to study the conformational landscape of the complementarity-determining regions (CDRs), and how this might relate to an increase in affinity. Not only do they conclude that affinity maturation leads to rigidification of CDRs in solution, but they also present some evidence for the conformational selection model of biomolecular binding events in TCR-antigen interactions.
