Computational protein design methods often use a known molecule with a well-characterised structure as a template or scaffold. The chosen scaffold is modified so that its function (e.g. what it binds) is repurposed. Ideally, one wants to be confident that the expressed protein’s structure will match the designed conformation. Successful designed proteins therefore tend to be rigid, to be built from regular secondary structure elements (e.g. α-helices and β-sheets), and to have active site shapes that do not deviate far from the scaffold’s backbone conformation (see this review).
A recent paper (Lapidoth et al 2015) from the Fleishman group proposes a new protocol to incorporate backbone variation (read: loop conformations) into computational protein design (Figure 1). Using an antibody as the chosen scaffold, their approach aims to design a molecule that binds a specific patch (epitope) on a target molecule (antigen).
Protein design works in the opposite direction to structure prediction: given a structure, which sequence will fold into that shape and bind a particular patch in the chosen way? To answer this, one first needs to select a shape that could feasibly be achieved in vivo. We would hope that any backbone conformation previously seen in the Protein Data Bank (PDB) belongs to this set of feasible shapes.
Lapidoth et al sample conformations by constructing a backbone torsion angle database from known antibody structures in the PDB. From the work of North et al and others we also know that certain loop shapes can be achieved with multiple different sequences (see KK’s recent post). The authors therefore reduce the number of possible backbone conformations by clustering them by structural similarity. Each conformational cluster is described by a representative structure and a position-specific scoring matrix (PSSM). The PSSM captures how the sequence can vary whilst maintaining the shape.
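To make the PSSM idea concrete, here is a minimal sketch of how one could be built from the aligned loop sequences in a single conformational cluster. It is not the authors' procedure: the uniform background frequency, the pseudocount, the function name `build_pssm` and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(sequences, pseudocount=1.0):
    """Return one dict per position mapping amino acid -> log-odds score.

    Assumptions (not taken from the paper): sequences are pre-aligned and of
    equal length, the background distribution is uniform, and a flat
    pseudocount is added to every amino acid.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences in a cluster must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    total = len(sequences) + pseudocount * len(AMINO_ACIDS)

    pssm = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts[aa] + pseudocount) / total
            # Positive score: residue is seen more often than the background.
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


# Hypothetical toy cluster: four loop sequences assumed to share one shape.
cluster_sequences = ["GFTFSSYA", "GFTFSDYA", "GYTFSSYA", "GFTFSSYG"]
pssm = build_pssm(cluster_sequences)
```

In this toy example, a residue conserved across the cluster (G at the first position) gets a positive score, while residues never observed at that position score below zero, so a design step that draws mutations from the matrix would favour substitutions already seen with that backbone shape.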
The Rosetta design pipeline that follows uses the pre-computed torsion database to make a scaffold antibody structure (1x9q) adopt different backbone conformations. Proposed sequence mutations are sampled from the PSSM corresponding to each conformation. Shapes, and the sequences that can adopt them, are ranked with respect to a docked pose with the antigen using several structure-based filters and Rosetta energy scores. A trade-off is made between predicted binding and stability energies using a ‘fuzzy logic’ scheme.
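One way to picture a fuzzy-logic trade-off between binding and stability is sketched below. Each energy is mapped onto a 0-to-1 "how acceptable is this" value with a sigmoid, and the two values are combined with a fuzzy AND (here, a product). This is only an illustration of the general idea: the function names, the midpoints and the steepness values are hypothetical and are not taken from the paper or from Rosetta.

```python
import math


def sigmoid_membership(energy, midpoint, steepness):
    """Map an energy (lower is better) to a value in (0, 1).

    Values near 1 mean the energy is comfortably below the midpoint.
    The exponent is capped to avoid overflow for very unfavourable energies.
    """
    exponent = min((energy - midpoint) * steepness, 700.0)
    return 1.0 / (1.0 + math.exp(exponent))


def fuzzy_and(*memberships):
    """Fuzzy AND as a product: one poor term drags the whole score down."""
    result = 1.0
    for value in memberships:
        result *= value
    return result


def design_score(binding_energy, stability_energy):
    """Combine binding and stability into one score (hypothetical thresholds)."""
    binding_ok = sigmoid_membership(binding_energy, midpoint=-15.0, steepness=0.5)
    stability_ok = sigmoid_membership(stability_energy, midpoint=-300.0, steepness=0.05)
    return fuzzy_and(binding_ok, stability_ok)


# A balanced design versus one with excellent binding but poor stability.
balanced = design_score(binding_energy=-20.0, stability_energy=-320.0)
lopsided = design_score(binding_energy=-40.0, stability_energy=-250.0)
```

With these made-up thresholds the balanced design scores higher than the lopsided one, which is the point of the scheme: a design cannot compensate for poor stability simply by having a very favourable binding energy.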
After several rounds of optimisation, the pipeline produces a predicted structure and sequence that should bind the chosen epitope patch and fold to form a stable protein when expressed. The benchmark results show promise in terms of structural similarity to known molecules that bind the same site (polar interactions, buried surface area). Sequence similarity between the predicted and known binders is perhaps lower than expected. However, as different natural antibody molecules can bind the same antigen, convergence between a ‘correct’ design and the known binder is not guaranteed anyway.
In conclusion, my take-home message from this paper is that, to sample backbone conformations sensibly for protein design, one should use the variation seen in known structures. The method presented demonstrates a way of predicting more structurally diverse designs and sampling the sequences that will allow the protein to adopt these shapes. Although, as the authors highlight, it is difficult to assess the performance of the protocol without experimental validation, important lessons can be learned for the computational design of both antibodies and proteins in general.