Recently, I have become quite interested in how countries have been shaping their national AI strategies and frameworks. Since the launch of ChatGPT, several concerns have been raised about AI safety and how such groundbreaking AI technologies could augment or adversely affect our daily lives. To address the public’s concerns and to set standards and practices for AI development, some countries have recently released national AI frameworks. As a budding academic researcher in this space who is keen to make AI more useful for medicine and healthcare, two aspects of the frameworks I have looked at (specifically those of the US, UK and Singapore) interest me most: the multi-stakeholder approach and the focus on AI education. I will delve further into both in this post.
Category Archives: AI
Where do we go from here?
This is an experiment.
We will prompt a model with a piece of text and assess how well it understands the contents.
The model sits behind your eyes.
The data used to train this model is your entire life, up to and including this very moment.
Understanding positional encoding in Transformers
Transformers are a very popular architecture in machine learning. While they were first introduced in natural language processing, they have been applied to many fields such as protein folding and design.
Transformers were first introduced in the excellent paper Attention is all you need by Vaswani et al. The paper describes the key elements, including multi-headed attention, and how they come together to create a sequence-to-sequence model for language translation. The key advance in Attention is all you need is the replacement of all recurrent layers with pure attention + fully connected blocks. Attention is very efficient to compute and allows for fast comparisons over long distances within a sequence.
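To make the comparison operation concrete, here is a minimal single-head sketch of scaled dot-product attention in NumPy. The shapes are toy values of my choosing, and the learned projection matrices, masking and multi-head machinery of the full architecture are deliberately omitted:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head scaled dot-product attention.

    Q, K, V: arrays of shape (seq_len, d_k) -- toy, unbatched case.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # pairwise query/key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over keys
    return weights @ V                                   # weighted sum of values

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))        # 5 tokens, 8 dimensions each
out = scaled_dot_product_attention(x, x, x)  # self-attention
print(out.shape)  # (5, 8)
```

Every token attends to every other token in one matrix multiply, which is why comparisons over long distances are cheap.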
One issue, however, is that attention does not natively include a notion of position within a sequence. This means that all tokens could be scrambled and would produce the same result. To overcome this, one can explicitly add a positional encoding to each token. Ideally, such a positional encoding should reflect the relative distance between tokens when computing the query/key comparison such that closer tokens are attended to more than further tokens. In Attention is all you need, Vaswani et al. propose the slightly mysterious sinusoidal positional encodings which are simply added to the token embeddings:
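The paper defines these as PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). A small NumPy sketch (assuming an even d_model for simplicity):

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal encodings from 'Attention is all you need'.

    Even dimensions get sines, odd dimensions get cosines, with
    wavelengths forming a geometric progression in the dimension index.
    """
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]      # even dimension indices
    angle = pos / 10000 ** (i / d_model)       # (seq_len, d_model/2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

pe = sinusoidal_positional_encoding(50, 16)
# Token embeddings would then simply be: embeddings + pe
```

Because each position is a fixed pattern of sines and cosines, the encoding for position pos + k is a linear function of the encoding for pos, which is what lets attention pick up on relative offsets.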
Conference feedback: AI in Chemistry 2023
Last month, a drift of OPIGlets attended the Royal Society of Chemistry’s annual AI in Chemistry conference. Co-organised by the group’s very own Garrett Morris and hosted in Churchill College, Cambridge, during a heatwave (!), the two-day conference covered applications of artificial intelligence and deep learning methods in chemistry. The programme included a mixture of keynote talks, panel discussions, oral presentations, flash presentations, posters and opportunities for open debate, networking and discussion amongst participants from academia and industry alike.
Conference feedback — with a difference
At OPIG Group Meetings, it’s customary to give “Conference Feedback” whenever any of us has recently attended a conference. Typically, people highlight the most interesting talks—either to them or others in the group.
Having just returned from the 6th RSC-BMCS / RSC-CICAG AI in Chemistry Symposium, it was my turn last week. But instead of the usual perspective—of an attendee—I spoke briefly about how to organize a conference.
The Surprising Shape of Normal Distributions in High Dimensions
Multivariate Normal distributions are an essential component of virtually any modern deep learning method—be it to initialise the weights and biases of a neural network, perform variational inference in a probabilistic model, or provide a tractable noise distribution for generative modelling.
What most of us (including—until very recently—me) aren’t aware of, however, is that these Normal distributions begin to look less and less like the characteristic bell curve that we associate them with as their dimensionality increases.
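One quick way to see this for yourself: samples from a standard Normal in d dimensions concentrate in a thin shell of radius roughly √d around the origin, rather than near the mode. A toy NumPy check (sample count and seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample standard Normal vectors in increasing dimension and look at
# how far from the origin they typically fall.
for d in (1, 10, 100, 10_000):
    x = rng.standard_normal((1000, d))      # 1000 samples in R^d
    norms = np.linalg.norm(x, axis=1)
    print(f"d={d:>6}  mean |x| = {norms.mean():8.3f}  std = {norms.std():.3f}")
```

The mean distance grows like √d while its spread stays roughly constant, so in high dimensions essentially no sample lands anywhere near the peak of the density.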
AI Can’t Believe It’s Not Butter
Recently, I’ve been using a Convolutional Neural Network (CNN), and other methods, to predict the binding affinity of antibodies from their sequence. However, nine months ago, I applied a CNN to a far more important task – distinguishing images of butter from margarine. Please check out the GitHub link below to learn moo-re.
https://github.com/lewis-chinery/AI_cant_believe_its_not_butter
A simple criterion can conceal a multitude of chemical and structural sins
We’ve been investigating deep learning-based protein-ligand docking methods, which often claim to be able to generate ligand binding modes within 2Å RMSD of the experimental one. We found, however, that this simple criterion can conceal a multitude of chemical and structural sins…
DeepDock attempted to generate the ligand binding mode from PDB ID 1t9b (light blue carbons, left), but gave pretzeled rings instead (white carbons, right).
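To see how a 2Å cutoff can dilute local mangling, here is a toy NumPy sketch with hypothetical coordinates of my own invention, using plain RMSD with no alignment or symmetry correction: pushing six "ring" atoms 2.5Å out of place (chemically catastrophic, given typical bond lengths of ~1.5Å) still leaves the overall RMSD comfortably under 2Å:

```python
import numpy as np

def rmsd(a, b):
    """Plain RMSD between two already-aligned coordinate sets of shape (N, 3)."""
    return np.sqrt(((a - b) ** 2).sum(axis=1).mean())

rng = np.random.default_rng(0)
pose = rng.standard_normal((30, 3)) * 3.0           # toy 30-atom "ligand"
good = pose + rng.normal(scale=0.3, size=(30, 3))   # small error on every atom

bad = pose.copy()
bad[:6, 0] += 2.5   # six "ring" atoms shoved 2.5 A sideways -- a pretzeled ring

print(rmsd(pose, good))  # small: error spread thinly over all atoms
print(rmsd(pose, bad))   # ~1.12 A: passes the 2 A cutoff despite the mangled ring
```

Because RMSD averages squared deviations over all atoms, a severe distortion confined to a few atoms is washed out by the many atoms that sit almost perfectly, which is exactly why an additional chemical-plausibility check is needed.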
Lucubration or Gaslighting?
Or: The best lies have a nugget of truth in them.
Lucubration – The action or occupation of intensive study originally by candle or lamplight.
Gaslighting – Psychological abuse in which a person or group causes someone to question their own sanity, memories, or perception.
I was recently having a play with Google Bard. Bard, unlike ChatGPT, has access to live data. It also undergoes live feedback and quality control. I was hoping to see if it would find me any journals with articles on prion research that I’d previously overlooked.
Me: Please show me some recent articles about prion research.
(Because always be polite to our AI overlords, they’ll remember!)
What can you do with the OPIG Immunoinformatics Suite? v3.0
OPIG’s growing immunoinformatics team continues to develop and openly distribute a wide variety of databases and software packages for antibody/nanobody/T-cell receptor analysis. Below is a summary of all the latest updates (follows on from v1.0 and v2.0).