Category Archives: Conferences

Prediction of Parkinson subtypes at COXIC 2020

Last week I attended the COXIC seminar (a joint Oxford – Imperial seminar focused on networks and complex systems), organised by Florian Klimm from Imperial College London (and former OPIG member!). There were several interesting talks at the seminar, but one of them caught my eye more than the rest: the talk by Dr Sanjukta Krishnagopal (UCL), titled “Predicting Parkinson’s Sub-types through Trajectory Clustering in Bipartite Networks”, of which I will give a quick overview. I hope you like it (at least) as much as I did!

This blogpost is based on these two articles:

  1. Sanjukta Krishnagopal, Rainer Von Coelln, Lisa Shulman, Michelle Girvan. “Identifying and predicting Parkinson’s disease subtypes through trajectory clustering via bipartite networks” PloS one (2020)
  2. Sanjukta Krishnagopal. “Multi-layer Trajectory Clustering Network Algorithm for Disease Subtyping” Biomedical Physics & Engineering Express (2020)
Continue reading

BioDataScience101: a fantastic initiative to learn bioinformatics and data science

Last Wednesday, I was fortunate enough to be invited as a guest lecturer to the 3rd BioDataScience101 workshop, an initiative spearheaded by Paolo Marcatili, Professor of Bioinformatics at the Technical University of Denmark (DTU). This session, on amino acid sequence analysis applied to both proteomics and antibody drug discovery, was designed and organised by OPIG’s very own Tobias Olsen.

Continue reading

NeurIPS 2020: Chemistry / Biology papers

Another blog post, another look at accepted papers for a major ML conference. NeurIPS joins the other major machine learning conferences (and others) in moving virtual this year, running from 6th – 12th December 2020. In a continuation of past posts (ICML 2020, NeurIPS 2019), I will highlight several papers of potential interest to the chem-/bio-informatics communities.

The list of accepted papers can be found here, with 1,903 papers accepted out of 9,467 submissions (20% acceptance rate).

In addition to the main conference, there are several workshops highly related to the type of research undertaken in OPIG: Machine Learning in Structural Biology and Machine Learning for Molecules.

The usual caveat: given the large number of papers, these were selected either by “accident” (i.e. I stumbled across them in one way or another) or through a basic search (e.g. Ctrl+f “molecule”). If you find any I have missed, please reach out and I will update accordingly.

Continue reading

Learning from Biased Datasets

Both the beauty and the downfall of learning-based methods lie in the fact that the data used for training will largely determine the quality of any model or system.

While there have been numerous algorithmic advances in recent years, the most successful applications of machine learning have been in areas where either (i) you can generate your own data in a fully understood environment (e.g. AlphaGo/AlphaZero), or (ii) data is so abundant that you’re essentially training on “everything” (e.g. GPT2/3, CNNs trained on ImageNet).

This covers only a narrow range of applications, with most data not falling into one of these two categories. Unfortunately, when this is true (and even sometimes when you are in one of those rare cases) your data is almost certainly biased – you just may or may not know it.

Continue reading

Prerecording Conference Talks and Posters using OBS Studio

Seemingly every conference due to take place this year has either been cancelled or will be run virtually due to the COVID-19 pandemic. Many organisers have decided that running entirely live virtual programmes causes more trouble than it’s worth (e.g. due to unforeseeable IT and internet issues disrupting the schedule), and so are asking their presenters to prerecord their talks, which are then broadcast “live” on the day.

I recently “presented” two virtual prerecorded talks at the ISMB conference using Open Broadcaster Software Studio (OBS Studio), a free open-source software package most commonly used by live-streamers on Twitch and YouTube. It is super simple to use and achieves a professional output, with video overlaying a presentation slide deck/poster PDF. This blog post is a “how-to” on getting started with OBS for conference talks/poster presentations.

Continue reading

Climate Change @ ISMB

Another special session I was listening to at ISMB 2020 was the Green stream. Several talks dealt with climate change and its relation to bioinformatics and computational biology. Two of them I found particularly interesting, one calculating the carbon footprint of ISMB itself and the other calculating the footprint of specific bioinformatics tools.

I believe most people have realised how important the issue of human-made climate change is, and I assume that everyone has heard that some aspects of our lives cause particularly high emissions compared to certain alternatives. For example, short-haul flights vs. train rides, mass production of meat vs. eating the food’s food (veggies), or coal plants vs. renewable energies, just to name some that are rather easy to change. Admittedly, I too had underestimated the urgency of the issue, and I found this plot quite convincing:

(Screenshot from Alex Bateman’s talk)

What can we as computational researchers do about it?

Continue reading

Citizen Science in Video Games

What I really liked about visiting ISMB last year was its diversity of talks and subgroup meetings in all areas related to biology and computers. Last year I attended two talks about improving bioinformatics education, which were really interesting because I hadn’t thought about that topic before. This year I joined a special session on citizen science.

Citizen science is public participation in scientific research and can be done by almost anyone. I had heard about Foldit and Rosetta@Home but (unfortunately) never participated. Those two projects deal with protein folding (how does a protein reach its final functional 3D structure?), which is an important scientific problem but is computationally very expensive to study. While one of the projects is a screensaver that uses the free resources of personal computers, the other is a game where players can get high scores for folding protein fragments manually. Helping science in a playful way is cool by itself, but the project that was presented in one of the talks took this to the next level: a citizen science minigame was integrated into an action game for PCs and consoles.

Continue reading

ICML 2020: Chemistry / Biology papers

ICML is one of the largest machine learning conferences and, like many other conferences this year, is running virtually from 12th – 18th July.

The list of accepted papers can be found here, with 1,088 papers accepted out of 4,990 submissions (22% acceptance rate). Similar to my post on NeurIPS 2019 papers, I will highlight several of potential interest to the chem-/bio-informatics communities. As before, given the large number of papers, these were selected either by “accident” (i.e. I stumbled across them in one way or another) or through a basic search (e.g. Ctrl+f “molecule”).

Continue reading

Bayesian Optimization and Correlated Torsion Angles—in Small Molecules

Our collaborator, Prof. Geoff Hutchison from the University of Pittsburgh, recently took part in the Royal Society of Chemistry’s 2020 Twitter Poster Conference, to highlight the great work carried out by one of my DPhil students, Lucian Leung Chan, on the application of Bayesian optimization to conformer generation: