Speaking about Sequence and Structure at a Summit

A couple of weeks ago I was lucky enough to be asked to speak at the 5th Computational Drug Discovery & Development for Biologics Summit. This was my first virtual conference – it was a shame I didn’t get to visit Boston, and presenting to my empty room was slightly bizarre, but it was great to hear what people have been working on, and there’s definitely something to be said for attending a conference in fluffy socks…

A, antibody structure. An antibody is made up of four chains: two light (orange) and two heavy (blue). Each chain is made up of a series of domains. The variable domains of the light and heavy chains together are known as the Fv region (shown on the right; PDB entry 12E8). The Fv features six loops known as complementarity determining regions or CDRs (shown in dark blue); these are mainly responsible for antigen binding. B, example sequences for the VH and VL, highlighting the CDR regions and the genetic composition. It is estimated that the human antibody repertoire contains up to 10^13 unique sequences, enabling the immune system to respond to almost any antigen. This is possible through the recombination of V, D and J gene segments, junctional diversification, and somatic hypermutation.

My talk was focused on the content of a review paper Charlotte and I published earlier this year: “How Repertoire Data Are Changing Antibody Science” (Journal of Biological Chemistry). Having developed the Observed Antibody Space database, a collection of nearly 2 billion antibody sequences, much of the work in our group now looks into using that data to learn more about the immune system. Some key research projects are:

  • We have shown that antibody sequence repositories can potentially be mined for therapeutic leads. We searched OAS for matches to a set of 242 post-phase II therapeutic antibodies, and found sequences with over 90% identity for 90 of the heavy chains and 158 of the light chains. We also found 54 perfect H3 matches, which is many more than would be expected by chance.
  • Paratyping is a new method developed in OPIG for predicting which antibodies bind to the same epitope from their sequences. Unlike clonotyping, paratyping does not look at germline and only considers the residues which are likely to be involved in antigen binding, enabling us to ‘hop’ between different germline scaffolds.
  • By annotating antibody sequences with their likely structural templates, we have been able to identify a subset of structures that are common to multiple individuals (the public repertoire), and also look for structural differences between antibodies from different stages of the immune response.
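To give a feel for what "over 90% identity" means in the first point above, here is a minimal sketch in Python. It is not the pipeline used in the paper: it assumes the sequences have already been aligned to the same length (a real analysis would align or number them first), and the function names and toy sequences are made up purely for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences agree.

    Assumption: both sequences are already aligned to the same length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def find_close_matches(query: str, repertoire: list[str], threshold: float = 90.0) -> list[str]:
    """Return repertoire sequences with identity to the query strictly above the threshold.

    Sequences of a different length are skipped, because this sketch does no alignment.
    """
    return [
        seq
        for seq in repertoire
        if len(seq) == len(query) and percent_identity(query, seq) > threshold
    ]


# Toy data (hypothetical sequences, not taken from OAS or any therapeutic).
query = "CARDYYGSSYWYFDVWGQGT"
repertoire = [
    "CARDYYGSSYWYFDVWGQGT",  # identical to the query
    "CARDYYGSSYWYFDVWGQGS",  # one mismatch out of 20 positions
    "CAKDFYGSSYWYFDLWGQGT",  # three mismatches out of 20 positions
]
close = find_close_matches(query, repertoire)
perfect = [seq for seq in repertoire if seq == query]
```

With the default threshold, the first two toy sequences count as close matches and only the first is a perfect match; the "perfect H3 matches" in the text are the same idea applied to the CDRH3 loop alone.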

More details for each of these can be found in the papers I’ve linked to. Watch this space for more exciting research involving antibody repertoires in the future!
