The spatial (3D) structure of a molecule is particularly relevant to modeling its activity in QSAR. 3D structural information affects molecular properties and chemical reactivity, so it is important to incorporate it in deep learning models built for molecules. A key aspect of the spatial structure of a molecule is the flexible arrangement of its constituent atoms, known as its conformation. At a given temperature, the probability of each possible conformation is determined by its potential energy and follows a Boltzmann distribution [McQuarrie and Simon, 1997]: lower-energy conformations are exponentially more probable than higher-energy ones. Different conformations of a molecule can result in different properties and activity. Therefore, it is important to consider multiple conformers in molecular deep learning so that conformational flexibility is embedded in the model. The model should also be able to capture the Boltzmann distribution over the potential energies of the conformers.
Two strategies for incorporating conformer ensembles in molecular representation learning have appeared in the literature:
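To make the Boltzmann weighting concrete, the following minimal sketch converts a list of conformer energies into probabilities. The function name `boltzmann_weights`, the choice of kcal/mol as the energy unit, and the example energies are assumptions made for illustration; they do not come from the cited references.

```python
import math

# Molar gas constant (Boltzmann constant per mole) in kcal/(mol*K).
R_KCAL_PER_MOL_K = 0.0019872041


def boltzmann_weights(energies_kcal_per_mol, temperature_k=298.15):
    """Return the Boltzmann probability of each conformer.

    p_i = exp(-E_i / RT) / sum_j exp(-E_j / RT)
    """
    rt = R_KCAL_PER_MOL_K * temperature_k
    # Subtract the minimum energy for numerical stability; the
    # probabilities depend only on energy differences.
    e_min = min(energies_kcal_per_mol)
    factors = [math.exp(-(e - e_min) / rt) for e in energies_kcal_per_mol]
    partition = sum(factors)
    return [f / partition for f in factors]


# Hypothetical relative energies of three conformers (kcal/mol).
weights = boltzmann_weights([0.0, 0.5, 2.0])
print(weights)
```

The lowest-energy conformer receives the largest weight, and the weights always sum to one.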
- Conformer sampling
- Ensemble learning
Conformer sampling
One simple and direct approach to incorporating multiple conformers is to randomly sample one conformer of a given molecule during each training epoch. During inference, the lowest-energy conformer is used. The idea behind this strategy is to ensure that the learned molecular representation is conformation-invariant.
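The sampling rule can be sketched in a few lines. The data layout (each conformer as an `(energy, coordinates)` tuple) and the helper name `pick_conformer` are hypothetical and chosen only for this illustration.

```python
import random


def pick_conformer(conformers, training, rng=random):
    """Select one conformer of a molecule.

    conformers: list of (energy, coordinates) tuples (assumed layout).
    training:   True during training, False during inference.
    """
    if training:
        # A different conformer may be drawn in each epoch.
        return rng.choice(conformers)
    # At inference time, fall back to the lowest-energy conformer.
    return min(conformers, key=lambda c: c[0])


# Hypothetical molecule with three conformers; coordinates are placeholders.
molecule = [(1.2, "coords_a"), (0.0, "coords_b"), (0.7, "coords_c")]

for epoch in range(3):
    energy, coords = pick_conformer(molecule, training=True)
    # ... feed `coords` to the 3D model for this epoch ...

print(pick_conformer(molecule, training=False))
```

Note that uniform sampling treats all conformers as equally likely; drawing them with Boltzmann probabilities instead would be one possible variation.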
Ensemble learning
This strategy is more chemically nuanced in that the entire conformer ensemble is considered at once, without the need for any sampling. The core idea is to generate one embedding per conformer and combine the embeddings using a set encoder. There are several ways to combine the conformer embeddings, namely mean pooling, DeepSets [Zaheer et al.], and self-attention [Bahdanau et al.]. Mean pooling simply computes the mean of all conformer embeddings. This is not scientifically sound, because it weights every conformer equally, whereas the conformers should be weighted according to a Boltzmann distribution over their potential energies, as explained earlier. The DeepSets approach applies a multilayer perceptron to each conformer embedding and then aggregates the results through sum pooling, followed by another multilayer perceptron. Again, this approach does not weight the conformers according to their probabilities. Self-attention, by contrast, learns a weighted sum of the conformer embeddings and can therefore be more sensitive to the Boltzmann distribution of the conformers, which makes it the most scientifically nuanced of the three. For more details on these approaches, please refer to the paper by Zhu et al.
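The difference between uniform and weighted aggregation can be illustrated with a small sketch. It is a simplification: in a real self-attention encoder the scores are produced by learned parameters, whereas here they are passed in by hand, and the function names are hypothetical.

```python
import math


def mean_pool(embeddings):
    """Average the conformer embeddings, weighting each conformer equally."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(embeddings, scores):
    """Weighted sum of conformer embeddings with softmax-normalized scores.

    In a trained model the scores would be learned; here they are
    supplied directly (an assumption made for illustration).
    """
    weights = softmax(scores)
    return [
        sum(w * x for w, x in zip(weights, column))
        for column in zip(*embeddings)
    ]


# Two hypothetical 2-dimensional conformer embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0]]

print(mean_pool(embeddings))                      # equal weighting
print(attention_pool(embeddings, [10.0, -10.0]))  # dominated by conformer 0
```

If the score of conformer i equals -E_i / RT, the softmax weights coincide with the Boltzmann probabilities, which is why a learned weighted sum is at least capable of representing the physically motivated weighting, while mean pooling is not.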
In 3D-InfoMax, an approach for 3D molecular representation learning, Stärk et al. combine the two strategies by sampling n conformers per molecule and learning across them. They devise a mutual-information-based approach in which, for each molecule, the similarity between the embeddings of its n conformers and the embedding of its molecular graph is maximized through a novel loss function, as depicted below.
Many other approaches to incorporating multiple conformers are possible. The key to choosing or devising an appropriate one, however, is understanding whether it respects the Boltzmann distribution of the conformations.
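The exact 3D-InfoMax loss is not reproduced in this text. As a rough illustration of the general idea only, one common way to maximize agreement between two views is an InfoNCE-style contrastive objective. The sketch below is generic, is not claimed to be the loss used by Stärk et al., and uses hypothetical names and toy embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(graph_embs, conformer_embs, temperature=0.1):
    """Generic InfoNCE-style loss over a batch of molecules.

    graph_embs[i]:     embedding of the molecular graph of molecule i.
    conformer_embs[i]: list of embeddings of the conformers of molecule i.
    Positives pair a graph with its own conformers; negatives pair it
    with the conformers of the other molecules in the batch.
    """
    losses = []
    for i, g in enumerate(graph_embs):
        negatives = sum(
            math.exp(cosine(g, z) / temperature)
            for j, confs in enumerate(conformer_embs)
            if j != i
            for z in confs
        )
        for z in conformer_embs[i]:
            positive = math.exp(cosine(g, z) / temperature)
            losses.append(-math.log(positive / (positive + negatives)))
    return sum(losses) / len(losses)


# Toy batch: two molecules, two conformers each (made-up numbers).
graphs = [[1.0, 0.0], [0.0, 1.0]]
conformers = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[0.0, 1.0], [0.1, 0.9]],
]
print(contrastive_loss(graphs, conformers))
```

The loss decreases as each graph embedding becomes more similar to the embeddings of its own conformers than to those of other molecules.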