I recently went to Sheh Zaidi's brilliant introduction to Equivariance and Spherical Harmonics, and I thought it would be useful to cement my understanding of it with a practical example. In this blog post I'm going to start with serotonin in two coordinate frames, and build a small equivariant neural network that featurises it.
Understanding positional encoding in Transformers
Transformers are a very popular architecture in machine learning. While they were first introduced in natural language processing, they have been applied to many fields such as protein folding and design.
Transformers were first introduced in the excellent paper Attention is all you need by Vaswani et al. The paper describes the key elements, including multi-headed attention, and how they come together to create a sequence-to-sequence model for language translation. The key advance in Attention is all you need is the replacement of all recurrent layers with pure attention + fully connected blocks. Attention is very efficient to compute and allows for fast comparisons over long distances within a sequence.
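As a refresher, the scaled dot-product attention underlying those blocks fits in a few lines; the sketch below is a minimal NumPy version with illustrative variable names, not code from the original post:

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """Compare every query against every key in one matrix product,
    then use the softmaxed scores to mix the value vectors."""
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)          # (n_query, n_key)
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ values                           # (n_query, d_value)
```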
One issue, however, is that attention does not natively include a notion of position within a sequence. This means that all tokens could be scrambled and would produce the same result. To overcome this, one can explicitly add a positional encoding to each token. Ideally, such a positional encoding should reflect the relative distance between tokens when computing the query/key comparison, such that closer tokens are attended to more than further tokens. In Attention is all you need, Vaswani et al. propose the slightly mysterious sinusoidal positional encodings, which are simply added to the token embeddings:
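For reference, the encodings defined by Vaswani et al. are PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). Below is a minimal NumPy sketch of them (assuming an even d_model; max_len and d_model are illustrative names):

```python
import numpy as np

def sinusoidal_positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)); PE[pos, 2i+1] = cos(...)."""
    positions = np.arange(max_len)[:, None]          # (max_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]         # even dimension indices 2i
    angles = positions / np.power(10000, dims / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# The encoding is simply added to the token embeddings:
# embeddings = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
```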
Current strategies to predict structures of multiple protein conformational states
Since the release of AlphaFold2 (AF2), the problem of protein structure prediction has widely been considered solved. Current structure prediction tools, such as AF2, are able to model most proteins with high accuracy. These methods, however, have a major limitation: they have been trained to predict a single structure for a given protein. Proteins are highly dynamic molecules, and their function often depends on transitions between several conformational states. Despite research focusing on the task of predicting the structures of multiple conformations of a protein, no accurate and reliable method is currently available. In this blog post, I will provide a short overview of the strategies developed for predicting protein conformations. I have grouped these into three sets of related approaches. To conclude, I will also demonstrate how to run one of these strategies on your own.
A simple criterion can conceal a multitude of chemical and structural sins
We’ve been investigating deep learning-based protein-ligand docking methods, which often claim to be able to generate ligand binding modes within 2Å RMSD of the experimental one. We found, however, that this simple criterion can conceal a multitude of chemical and structural sins…
DeepDock attempted to generate the ligand binding mode from PDB ID 1t9b (light blue carbons, left), but gave pretzeled rings instead (white carbons, right).
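For context, the 2Å criterion referred to above is typically just a plain heavy-atom RMSD check between the predicted and experimental pose; a bare-bones sketch (illustrative function and array names, no symmetry correction or re-alignment) might look like this:

```python
import numpy as np

def pose_rmsd(pred_coords: np.ndarray, ref_coords: np.ndarray) -> float:
    """Heavy-atom RMSD between a predicted and a reference ligand pose.

    Both arrays are (n_atoms, 3) with atoms in the same order; poses are
    compared in the receptor frame, so no re-alignment is applied.
    """
    diff = pred_coords - ref_coords
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

# A pose "passes" the common criterion if pose_rmsd(pred, ref) < 2.0 (Å) --
# which, as the post argues, says nothing about ring geometry or chemistry.
```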
Lucubration or Gaslighting?
Or: The best lies have a nugget of truth in them.
Lucubration – The action or occupation of intensive study originally by candle or lamplight.
Gaslighting – Psychological abuse in which a person or group causes someone to question their own sanity, memories, or perception.
I was recently having a play with Google Bard. Bard, unlike ChatGPT, has access to live data. It also undergoes live feedback and quality control. I was hoping to see if it would find me any journals with articles on prion research which I’d previously overlooked.
Me: Please show me some recent articles about prion research.
(Because always be polite to our AI overlords, they’ll remember!)
What can you do with the OPIG Immunoinformatics Suite? v3.0
OPIG’s growing immunoinformatics team continues to develop and openly distribute a wide variety of databases and software packages for antibody/nanobody/T-cell receptor analysis. Below is a summary of all the latest updates (follows on from v1.0 and v2.0).
The State of Computational Protein Design
Last month, I had the privilege to attend the Keystone Symposium on Computational Design and Modeling of Biomolecules in beautiful Banff, Canada. This conference gave an incredible insight into the current state of the protein design field, as we are on the precipice of advances catalyzed by deep learning.
Here are my key takeaways from the conference:
Can AlphaFold predict protein-protein interfaces?
Since its release, AlphaFold has been the buzz of the computational biology community. It seems that every group in the protein science field is trying to apply the model in their respective areas of research. Already we are seeing numerous papers attempting to adapt the model to specific niche domains across a broad range of life sciences. In this blog post I summarise a recent paper’s use of the technology for predicting protein-protein interfaces.
The Most ReLU-iable Activation Function?
The Rectified Linear Unit (ReLU) activation function was first used in 1975, but its use exploded after Nair & Hinton employed it in their 2010 paper on Restricted Boltzmann Machines. ReLU and its derivative are fast to compute, and it has dominated deep neural networks for years. The main problem with the activation function is the so-called dead ReLU problem, where significant negative input to a neuron can cause its gradient to always be zero. To rectify this (har har), modified versions have been proposed, including leaky ReLU, GeLU and SiLU, wherein the gradient for x < 0 is not always zero.
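To make the dead-ReLU problem concrete, here is a minimal NumPy sketch of ReLU and a leaky variant (the 0.01 negative slope is just an illustrative default): plain ReLU has a gradient of exactly zero for every negative input, whereas the leaky version keeps a small slope so the neuron can recover.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, negative_slope=0.01):
    return np.where(x >= 0, x, negative_slope * x)

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
relu_grad = (x > 0).astype(float)           # 0 everywhere x < 0: the "dead" region
leaky_grad = np.where(x >= 0, 1.0, 0.01)    # small but non-zero slope for x < 0
```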
A 2020 paper by Naizat et al., which builds upon ideas set out in a 2014 Google Brain blog post, seeks to explain why ReLU and its variants seem to be better in general for classification problems than sigmoidal functions such as tanh and sigmoid.
Universal graph pooling for GNNs
Graph neural networks (GNNs) have quickly become one of the most important tools in computational chemistry and molecular machine learning. GNNs are a type of deep learning architecture designed for the adaptive extraction of vectorial features directly from graph-shaped input data, such as low-level molecular graphs. The feature-extraction mechanism of most modern GNNs can be decomposed into two phases:
- Message-passing: In this phase the node feature vectors of the graph are iteratively updated following a trainable local neighbourhood-aggregation scheme often referred to as message-passing. Each iteration delivers a set of updated node feature vectors which is then imagined to form a new “layer” on top of all the previous sets of node feature vectors.
- Global graph pooling: After a sufficient number of layers has been computed, the updated node feature vectors are used to generate a single vectorial representation of the entire graph. This step is known as global graph readout or global graph pooling. Usually only the top layer (i.e. the final set of updated node feature vectors) is used for global graph pooling, but variations are possible that involve all computed graph layers and even the set of initial node feature vectors. Commonly employed global graph pooling strategies include taking the sum or the average of the node features in the top graph layer (a minimal sketch of both is given after this list).
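As a concrete illustration, below is a minimal NumPy sketch of sum- and mean-pooling over the top-layer node features of a batch of graphs; node_features and graph_index are illustrative names, with graph_index[i] giving the graph that node i belongs to:

```python
import numpy as np

def global_pool(node_features: np.ndarray, graph_index: np.ndarray, mode: str = "sum") -> np.ndarray:
    """Pool per-node feature vectors (top message-passing layer) into one vector per graph.

    node_features: (n_nodes, d) array of updated node feature vectors
    graph_index:   (n_nodes,) array assigning each node to a graph in the batch
    """
    n_graphs = int(graph_index.max()) + 1
    pooled = np.zeros((n_graphs, node_features.shape[1]))
    np.add.at(pooled, graph_index, node_features)      # sum-pooling
    if mode == "mean":
        counts = np.bincount(graph_index, minlength=n_graphs).reshape(-1, 1)
        pooled = pooled / counts
    return pooled
```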
While a lot of research attention has been focused on designing novel and more powerful message-passing schemes for GNNs, the global graph pooling step has often been treated with relative neglect. As mentioned in my previous post on the issues of GNNs, I believe this to be problematic. Naive global pooling methods (such as simply summing up all final node feature vectors) can potentially form dangerous information bottlenecks within the neural graph learning pipeline. In the worst case, such information bottlenecks pose the risk of largely cancelling out the information signal delivered by the message-passing step, no matter how sophisticated the message-passing scheme.