Conference feedback: AI in Chemistry 2023

Last month, a drift of OPIGlets attended the Royal Society of Chemistry's annual AI in Chemistry conference. Co-organised by the group's very own Garrett Morris and hosted in Churchill College, Cambridge, during a heatwave (!), the two-day conference covered applications of artificial intelligence and deep learning methods in chemistry. The programme included a mixture of keynote talks, a panel discussion, oral presentations, flash presentations, posters, and opportunities for open debate, networking and discussion amongst participants from academia and industry alike.

To kick off the series of shorter talks at the conference, Daniel Probst spoke about his work on the explainable prediction of catalysing enzymes from reactions. He employed DRFP fingerprints and Shapley values to enable the model to suggest which chemical groups in a molecule could have influenced the yield. A key takeaway from his talk, and a constant theme throughout the conference, was the lack of negative data, i.e., data for reactions that did not work, which hampers our models' ability to fully understand their domain.
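
For the curious, here is a minimal sketch of that kind of pipeline: encode reaction SMILES with DRFP, fit a simple model, and attribute its predictions to fingerprint bits with Shapley values. This is our own illustration rather than the speaker's actual code; the reaction SMILES, yields and choice of a random-forest model are placeholders.

```python
# Hedged sketch: DRFP fingerprints + SHAP attributions on toy data.
# Requires the `drfp`, `scikit-learn`, `shap` and `numpy` packages.
import numpy as np
import shap
from drfp import DrfpEncoder
from sklearn.ensemble import RandomForestRegressor

# Placeholder reaction SMILES and yields (not from the talk).
rxn_smiles = [
    "CO.O[C@@H]1CCNC1>>O[C@@H]1CCN(C)C1",
    "CCO.CC(=O)O>>CC(=O)OCC",
]
yields = np.array([0.72, 0.55])

# DRFP hashes circular substructures of each reaction into a binary fingerprint.
fps = np.array(DrfpEncoder.encode(rxn_smiles))

# Any regressor would do; a random forest keeps the SHAP step simple.
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(fps, yields)

# Shapley values attribute each prediction to individual fingerprint bits,
# which can then be traced back to the substructures that set them.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(fps)
```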

Next, Samuel Genheden from AstraZeneca discussed his group's work on retrosynthesis. Interestingly, his group had spent some time investigating which type of single-step tool, template-based or template-free, was more accurate in the context of a fully implemented retrosynthesis tool. He pointed out that single-step methods are typically evaluated only on their single-step prediction accuracy, and he found that this did not correlate with their accuracy when combined with a route search algorithm. Furthermore, they found that the popular USPTO-50K benchmark was inadequate for measuring the performance of these tools, owing to its limited size relative to the larger datasets now available.
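
As an aside, the kind of full multi-step search he described can be run with AstraZeneca's open-source AiZynthFinder package; the sketch below is our own illustration rather than code from the talk, and the config file, stock and policy names, and target SMILES are all placeholders.

```python
# Hedged sketch: multi-step retrosynthesis search with AiZynthFinder.
# "config.yml" must define the stock ("zinc") and expansion policy ("uspto")
# keys used below; all of these names are placeholders.
from aizynthfinder.aizynthfinder import AiZynthFinder

finder = AiZynthFinder(configfile="config.yml")
finder.stock.select("zinc")              # building-block stock
finder.expansion_policy.select("uspto")  # single-step (template-based) model
finder.target_smiles = "Cc1ccc(cc1)S(=O)(=O)N"  # placeholder target molecule
finder.tree_search()                     # search over multi-step routes
finder.build_routes()
print(finder.extract_statistics())
```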

On the conference’s second day, Laksh Aithani of Charm Therapeutics discussed their protein-ligand cofolding model, DragonFold. Although the tool is designed to dock ligands, and was compared with DiffDock and GLIDE, the cofolding aspect allows the protein to undergo conformational shifts. This could be very useful for proteins that are in the apo state or have been predicted computationally (without the ligand), as proteins commonly show slight to significant changes between the apo and the holo state.

Later, we saw a talk from Kohulan Rajan on DECIMER.ai, presenting the latest developments to the chemical image recognition tool. An interesting recent improvement was the use of synthetic hand-drawn-like images generated by RanDepict. Although the previous model (not trained on hand-drawn images) could identify hand-drawn structures with reasonable accuracy, the inclusion of synthetic hand-drawn-like images in the training data led to almost perfect accuracy in image-to-SMILES prediction.
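
Trying this out is pleasingly simple; a minimal usage sketch with the DECIMER Python package is shown below (the image path is a placeholder).

```python
# Hedged sketch: image-to-SMILES prediction with the DECIMER package.
from DECIMER import predict_SMILES

# Placeholder path to a (possibly hand-drawn) structure image.
smiles = predict_SMILES("hand_drawn_structure.png")
print(smiles)
```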

These were just a few of the talks that we enjoyed over the course of the conference: other highlights included a whistlestop tour of LLMs in Chemistry from Andrew White, a deep dive into geometric machine learning with Michael Bronstein, and, very excitingly, a talk from our very own Martin on PoseBusters. Overall, would absolutely AI in Chemistry again!

Written by Anna, Martin, Guy, Arun and Lucy
