This blogpost is a short notice about recent quality-of-life and feature updates to our Therapeutic Structural Antibody Database (Thera-SAbDab). We hope these changes will make the database more user-friendly and facilitate new analyses…
- The contents of the “Target” field have been revised to include alternative target names
Previously, antigen labels in Thera-SAbDab did not consider alternative target names. For example, some therapeutics were labelled as binding PDL1, and others as binding CD274, making it seem like they had different specificities, despite PDL1 and CD274 being alternative names for the same target. Therefore, we decided to mine the web for antigen synonyms and have integrated this information into the database.
You will now find antigen labels in the following syntax:
A/A’/A”
> where the therapeutic is monospecific, and binds target A with alternative names A’, A”
A/A’;B/B’/B”
> where the therapeutic is multispecific, with the first variable domain of the entry targeting A (alternative name A’) and the second targeting B (alternative names B’, B”).
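As a rough illustration of how this syntax could be consumed programmatically, here is a minimal Python sketch. It assumes that `;` only ever separates variable domains and `/` only ever separates synonyms, which may not hold if a target name itself contains either character. The function names `parse_target_label` and `binds_target` are hypothetical helpers, not part of Thera-SAbDab, and the example label is illustrative rather than copied from the database.

```python
from typing import List


def parse_target_label(label: str) -> List[List[str]]:
    """Split a Target label into one list of synonyms per variable domain.

    Assumption: ';' separates variable domains and '/' separates
    alternative names for the same antigen.
    """
    domains = []
    for domain in label.split(";"):
        names = [name.strip() for name in domain.split("/") if name.strip()]
        if names:  # skip empty segments, e.g. from a trailing ';'
            domains.append(names)
    return domains


def binds_target(label: str, query: str) -> bool:
    """Return True if any variable domain lists `query` as one of its names."""
    wanted = query.strip().upper()
    return any(
        wanted == name.upper()
        for names in parse_target_label(label)
        for name in names
    )


# Illustrative labels only; real entries may list names in a different order.
print(parse_target_label("PDL1/CD274"))
print(parse_target_label("A/A1;B/B1/B2"))
print(binds_target("PDL1/CD274", "cd274"))
```

Matching on whole names rather than substrings avoids false hits such as a search for "PD1" matching "PDL1"; the case-insensitive comparison is a convenience choice, not something the database requires.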
Where an antigen has alternative names, they are always exhaustively listed and in the same order. This should make it much quicker to work out how many/which therapeutics bind to a single target of interest.
- A new field has been added: “Alternative therapeutic names”
While the database is still indexed based on the International Nonproprietary Name (INN) for consistency, we recognise that drugs are known by several other names, including investigational and trade names. Therefore, we’ve mined the web for alternative names for each therapeutic antibody. This new field should help not only to identify the INN of a therapeutic from a search for the alternate name, but also to draw connections with published early-stage research on drug candidates, which often occurs prior to INN assignment.
- A new field has been added: “Genetics”
When we first built Thera-SAbDab, we assumed that a column for developmental origin was largely redundant due to the information provided by the suffix of the INN (-zumab, -ximab, etc.). However, as I covered in a previous blogpost, the change in naming convention adopted by the WHO in 2022 means that therapeutic INNs no longer capture developmental origins. Therefore, for therapeutics named since this change, it was impossible to tell from the INN alone which were humanised, chimeric, or fully human. We have now scraped this data from the text of the WHO proposed lists and incorporated it into our database. We note that “fully human” covers a range of origins, from transgenic mice, to human gene-based phage libraries, to human B cells; hence the decision to title the column “Genetics” rather than “Origin”.
We continue to keep Thera-SAbDab updated with the latest sequence and structural data: it now contains 1045 variable domain sequences, of which 280 have structural information. You can download all the latest data here: .xlsx, .csv
And, in closing, a big thank you to all the users of Thera-SAbDab who have taken the time to get in touch with us over the years, drawing our attention to bugs or suggesting improvements (several of which related to the changes described above).