Category Archives: Journal Club

Proteins evolve on the edge of supramolecular self-assembly

Inspired by Eoin’s interesting talks on prions and prion diseases, and Nick’s discussion of how cryo-electron microscopy is going to be the end of an era for crystallography, I thought I’d look at a paper that discusses aggregation of protein complexes, with some cryo-electron microscopy thrown in for good measure.

a, A molecule gaining a single self-interacting patch forms a finite dimer. A self-interacting patch repeated on opposite sides of a symmetric molecule can result in infinite assembly. b, A point mutation in a dihedral octamer creates a new self-interacting patch (red), triggering assembly into a fibre.

Supramolecular assemblies form when folded protein complexes associate into much larger units. This can be triggered by a mutation in the subunits of a homomeric complex that creates a self-interacting patch. If such a patch forms on a non-symmetric complex, the result is likely to be a finite assembly with a limited number of copies of the complex. However, if the complex has dihedral symmetry, so that the patch is accessible at multiple, well-separated locations, then the complex can potentially form near-infinite supramolecular assemblies.

Slowing the progress of prion diseases

At present, the jury is still out on how prion diseases affect the body, let alone how to cure them. We don’t know if amyloid plaques cause neurodegeneration or if they’re the result of it. Due to highly variable glycophosphatidylinositol (GPI) anchors, we don’t know the structure of prions. Due to their incredible resistance to proteolysis, we don’t know a simple way to destroy prions, even using an autoclave. The current recommendation[0] by the World Health Organisation includes the not-so-subtle: “Immerse in a pan containing 1N sodium hydroxide and heat in a gravity displacement autoclave at 121°C”.

There are several species, including water buffalo, horses and dogs, which are immune to prion diseases. Until relatively recently it was thought that rabbits were immune too. “Despite rabbits no longer being able to be classified as resistant to TSEs, an outbreak of ‘mad rabbit disease’ is unlikely”.[1] That being said, other than the presence of some extra salt bridges and additional H-bonds, little is known to set these animals apart, and we don’t know if that’s why they are immune.

We do know that at least two species of lichen (P. sulcata and L. pulmonaria) have not only discovered a way to naturally break down prions, but have evolved two completely independent pathways to do so.[2] How do they accomplish this? We’re still not sure; in fact, it was only last year that it was discovered that lichens may be composed of three symbiotic partners and not two as previously thought.[3]

With all this uncertainty, one thing is known: PrPSc, the pathogenic form of the prion, converts PrPC, the cellular form, into more PrPSc. Just preventing the production of PrPC may not be a good idea, mainly because we don’t know what it’s there for in the first place. Previous studies using PrP knockouts have hinted that:

Hematopoietic stem cells express PrP on their cell membrane. PrP-null stem cells exhibit increased sensitivity to cell depletion. [4]
In mice, cleavage of PrP proteins in peripheral nerves causes the activation of myelin repair in Schwann Cells. Lack of PrP proteins caused demyelination in those cells. [5]
Mice lacking genes for PrP show altered long-term potentiation in the hippocampus. [6]
Prions have been implicated in cell-cell adhesion and intracellular signalling.[7]

However, an alternative approach bypasses most of the unknowns above: if it were possible to make off with the substrate which PrPSc uses, the progress of the disease might be slowed. A study by R. Diaz-Espinoza et al. showed that, by infecting animals with a self-replicating, non-pathogenic prion, it was possible to slow the fatal 263K scrapie agent. From their paper [8]: “results show that a prophylactic inoculation of prion-infected animals with an anti-prion delays the onset of the disease and in some animals completely prevents the development of clinical symptoms and brain damage.”

[0] https://www.cdc.gov/prions/cjd/infection-control.html
[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3323982/
[2] https://blogs.scientificamerican.com/artful-amoeba/httpblogsscientificamericancomartful-amoeba20110725lichens-vs-the-almighty-prion/
[3] http://science.sciencemag.org/content/353/6298/488
[4] “Prion protein is expressed on long-term repopulating hematopoietic stem cells and is important for their self-renewal”. PNAS. 103 (7): 2184–9. doi:10.1073/pnas.0510577103
[5] Abbott A (2010-01-24). “Healthy prions protect nerves”. Nature. doi:10.1038/news.2010.29
[6] Maglio LE, Perez MF, Martins VR, Brentani RR, Ramirez OA (Nov 2004). “Hippocampal synaptic plasticity in mice devoid of cellular prion protein”. Brain Research. Molecular Brain Research. 131 (1-2): 58–64. doi:10.1016/j.molbrainres.2004.08.004
[7] Málaga-Trillo E, Solis GP, et al. (Mar 2009). Weissmann C, ed. “Regulation of embryonic cell adhesion by the prion protein”. PLoS Biology. 7 (3): e55. doi:10.1371/journal.pbio.1000055
[8] http://www.nature.com/mp/journal/vaop/ncurrent/full/mp201784a.html

Journal Club: Statistical database analysis of the role of loop dynamics for protein-protein complex formation and allostery

As I’ve mentioned on this blog a few (ok, more than a few) times before, loops are often very important regions of a protein, allowing it to carry out its function effectively. In my own research, I develop methods for loop structure prediction (in particular for antibody CDR H3), and look at loop conformational changes and flexibility. So, when I came across a paper that has the words ‘loops’, ‘flexibility’ and ‘antibody’ in its abstract, it was the obvious choice to present at my most recent journal club!

In the paper, entitled “Statistical database analysis of the role of loop dynamics for protein-protein complex formation and allostery”, the authors focus on how loop dynamics change upon the formation of protein-protein complexes. To do this, they use an algorithm they previously published called ToeLoop – given a protein structure, this classifies the loop regions as static, slow, or fast, based on both sequential and structural features:

relative amino acid frequencies;
the frequency of loop secondary structure types as annotated by DSSP (bends, β-bridges etc.);
the average solvent accessible surface area;
the average hydrophobicity index for the loop residues;
loop length;
contacts between atoms of the loop and the rest of the protein.

Two scores are calculated using the properties listed above: one that distinguishes ‘static’ loops from ‘mobile’ loops (with a reported 81% accuracy), and another that further categorises the mobile loops into ‘slow’ and ‘fast’ (74% accuracy). Results from the original ToeLoop paper indicate that fast loops are shorter, have more negatively charged residues, larger solvent accessibilities, lower hydrophobicity, and fewer contacts.
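The two-score scheme described above amounts to a small decision cascade. The sketch below only illustrates that cascade: the function name, the score values and the zero cutoffs are assumptions made for illustration, not ToeLoop's actual scoring functions or thresholds.

```python
def classify_loop(mobility_score, speed_score, mobility_cutoff=0.0, speed_cutoff=0.0):
    """Two-stage call: static vs mobile first, then slow vs fast for mobile loops.

    Both scores and both cutoffs are hypothetical stand-ins for illustration.
    """
    if mobility_score <= mobility_cutoff:
        return "static"
    return "fast" if speed_score > speed_cutoff else "slow"


# Made-up (mobility_score, speed_score) pairs for three loops.
example_scores = [(-1.2, 0.4), (0.8, -0.3), (0.5, 1.1)]
labels = [classify_loop(m, s) for m, s in example_scores]
print(labels)
```

Note that the second score is only consulted for loops that the first score has already called mobile, which is why the two reported accuracies apply to different subsets of loops.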

Gu et al. use ToeLoop to investigate the dynamic behaviour of loops during protein-protein complex formation. For a set of 230 protein complexes, they classified the loops of the proteins in both their free and complexed forms (illustrated by the figure below).

The loops from 230 protein complexes, in both free and bound forms, were categorised as fast, slow, or static using the ToeLoop algorithm. The loops are coloured according to their predicted dynamics. Allosteric loops, defined as those whose mobility increases upon binding, are indicated using blue arrows.

In the uncomplexed form, the majority of loops were annotated as static (63.6%), followed by slow (26.2%) and finally fast (10.2%). This indicates that most loops are inflexible. After complex formation, the number of static loops increases and the number of mobile loops decreases (67.8%, 23.0%, and 9.2% for static, slow and fast respectively). Mobility, on the whole, is therefore reduced upon binding, which is as expected – the presence of a binding partner restricts the range of possible movement.

The authors then divided the loops into two groups, interface and non-interface, according to the average minimum distance of each loop residue to the binding partner (cutoff values from 4 to 8 Å were tested and each gave broadly similar results). The dynamics of non-interface loops changed less upon binding than those of the interface loops (again, this was as expected). However, an interesting result is that slow loops are more common at the interface than any other parts of the protein, with 37.2% of interface loops being annotated as slow compared to 24.8% of non-interface loops. It is suggested by the authors that this is due to protein promiscuity; i.e. slow loops allow proteins to bind to different partners.
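One way to read the interface criterion is sketched below: for every residue in a loop, take the shortest atom-atom distance to the binding partner, average those values over the loop, and compare the average with a cutoff. The function names, the toy coordinates and the 6 Å default (simply a value inside the 4 to 8 Å range mentioned above) are assumptions for illustration, not the authors' code.

```python
import math


def average_min_distance(loop_residues, partner_atoms):
    """Mean over loop residues of the shortest atom-atom distance to the partner.

    loop_residues: list of residues, each a list of (x, y, z) atom coordinates.
    partner_atoms: list of (x, y, z) coordinates for the binding partner.
    """
    per_residue = []
    for residue_atoms in loop_residues:
        shortest = min(math.dist(a, b) for a in residue_atoms for b in partner_atoms)
        per_residue.append(shortest)
    return sum(per_residue) / len(per_residue)


def is_interface_loop(loop_residues, partner_atoms, cutoff=6.0):
    """Call a loop 'interface' if its average minimum distance is within the cutoff."""
    return average_min_distance(loop_residues, partner_atoms) <= cutoff


# Toy coordinates in angstroms (hypothetical).
partner = [(0.0, 0.0, 0.0)]
near_loop = [[(3.0, 0.0, 0.0), (5.0, 0.0, 0.0)], [(0.0, 4.0, 0.0)]]
far_loop = [[(20.0, 0.0, 0.0)], [(0.0, 30.0, 0.0)]]

near_distance = average_min_distance(near_loop, partner)
print(near_distance, is_interface_loop(near_loop, partner))
print(is_interface_loop(far_loop, partner))
```

Rerunning the classification with several cutoffs, as the authors did, is then just a matter of changing the `cutoff` argument.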

The 4600 loops analysed in the study were split into two groups based on their proximity to the interface. As expected, interface loops are affected more by binding than non-interface loops. Slow loops are more prevalent at the interface than elsewhere on the protein.

Binding-induced dynamic changes were then investigated in more detail, by dividing the loops into 9 categories based on the transition (i.e. static-static, slow-static, slow-fast etc.). The dynamic behaviour of most loops (4120 out of 4600) does not change, and those loops whose mobility decreased upon binding were found close to the interface (average distance of ~12 Å). A small subset of the loops (termed allosteric by the authors) demonstrated an increase in flexibility upon complex formation (142 out of 4600); these tended to be located further away from the interface (average distance of ~30 Å).
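The nine transition categories are simply the pairs of (free, bound) labels for each loop. A minimal sketch of the bookkeeping, using made-up labels rather than data from the paper, might look like this:

```python
from collections import Counter

# Order the states from least to most mobile so transitions can be compared.
STATES = ("static", "slow", "fast")
RANK = {state: i for i, state in enumerate(STATES)}


def tally_transitions(free_labels, bound_labels):
    """Count each (free state, bound state) pair; there are 3 x 3 = 9 possible pairs."""
    return Counter(zip(free_labels, bound_labels))


def summarise(transitions):
    """Collapse the nine categories into unchanged / less mobile / more mobile."""
    summary = Counter()
    for (free, bound), count in transitions.items():
        if RANK[bound] > RANK[free]:
            summary["more mobile"] += count
        elif RANK[bound] < RANK[free]:
            summary["less mobile"] += count
        else:
            summary["unchanged"] += count
    return summary


# Hypothetical labels for five loops, before and after binding.
free = ["static", "slow", "fast", "static", "slow"]
bound = ["static", "static", "fast", "slow", "slow"]

transitions = tally_transitions(free, bound)
summary = summarise(transitions)
print(transitions)
print(summary)
```

In this vocabulary, the loops the authors call allosteric are the ones that land in the "more mobile" group.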

One of these allosteric loops was investigated further as part of a case study. The complex in question was an antibody-antigen complex, in which one loop distant from the binding site transitioned from static to slow upon binding. The loops directly involved in binding (the CDRs) either displayed reduced flexibility or remained static. The presence of an allosteric loop was supported by experimental data – the loop is shown to change conformation upon binding (RMSD of 3.6 Å between bound and unbound crystal structures from the PDB), and the average B-factor for the loop atoms increased on complex formation from around 26 Å² to approximately 140 Å². The authors also carried out MD simulations of the unbound antibody and antigen as well as the complex, and showed that the loop moved more in the complex than in the free antibody. The authors propose that the increased flexibility of the loop offsets the entropy loss that occurs due to binding, thereby increasing the strength of binding. ToeLoop could, therefore, be a useful tool in the development of antibody therapies (or other protein drugs) – it could be used in tandem with an antibody modelling protocol, allowing the dynamic behaviour of loop regions to be monitored and possibly designed to increase affinities.

Finally, the authors explored the link between loop dynamics and binding affinity. Again, they used ToeLoop to predict the flexibility of loops, but this time the complexes were from a set of 170 with known affinity. They demonstrated that affinity is correlated with the number of static loop residues present at the interface – ‘strong’ binders (those with picomolar affinity) tend to contain more static residues than more weakly binding pairs of proteins. This is in accordance with the theory that the rigidification of flexible loops upon binding leads to lower affinities, due to the loss of entropy.

When Does Chemical Elaboration Induce a Ligand To Change Its Binding Mode?

For my journal club in June, I chose to present a Journal of Medicinal Chemistry article entitled “When Does Chemical Elaboration Induce a Ligand To Change Its Binding Mode?” by Malhotra and Karanicolas. This article uses a large scale collection of ligand pairs to investigate the circumstances in which elaborations of a ligand change the original binding mode.

One of the primary goals in medicinal chemistry is the optimisation of biological activity by chemical elaboration of a hit compound. This hit-to-lead optimisation often assumes that addition of functional groups to a given hit scaffold will not change the original binding mode.

In order to investigate the circumstances in which this assumption holds true and how often it holds true, they built up a large-scale collection of 297 related ligand pairs solved in complex with the same protein partner. Each pair consisted of a larger and smaller ligand; the larger ligand could have arisen from elaboration of the smaller ligand. They found that for 41 out of the 297 pairs (14%), the binding mode changed upon elaboration of the smaller ligand.

They investigated many physicochemical properties of the ligand, the protein-ligand complex and the protein binding pocket. They summarise the statistical significance and predictive power of the investigated properties with the table shown below.

They found that the property with the lowest p-value was the “rmsd after minimisation of the aligned complex” (RMAC). They developed this metric to probe whether the larger ligand could be accommodated in the protein without changing binding mode. They did so by aligning the shared substructure of the larger ligand onto the smaller ligand’s complex and then carrying out an energy minimisation. By monitoring the RMSD difference of the larger ligand relative to the initial pose (RMAC), they can gauge how compatible the larger ligand is with the protein. Larger RMAC values indicate greater incompatibility, hence a greater likelihood for the binding mode to not be preserved.
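The final step of the RMAC calculation is an RMSD between two poses of the same ligand in the same protein frame, so no superposition is applied. The sketch below shows only that step, with made-up coordinates; the alignment and energy minimisation themselves are not shown, and the function name is an assumption.

```python
import math


def in_place_rmsd(initial_pose, minimised_pose):
    """RMSD between two poses of the same ligand, without any re-superposition.

    Each pose is a list of (x, y, z) coordinates with atoms in the same order.
    """
    if len(initial_pose) != len(minimised_pose):
        raise ValueError("poses must contain the same atoms in the same order")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(initial_pose, minimised_pose))
    return math.sqrt(squared / len(initial_pose))


# Hypothetical three-atom ligand: every atom shifts by 1 angstrom during minimisation.
initial = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
minimised = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

rmac = in_place_rmsd(initial, minimised)
print(rmac)
```

Skipping the superposition matters here: a ligand that is pushed out of the pocket as a rigid body would score close to zero after optimal superposition, but gives a large value when measured in place.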

The authors generated receiver operating characteristic (ROC) plots to compare the predictive power of the properties considered. ROC curves are made by plotting the true positive rate (TPR) against the false positive rate (FPR). A random classifier would yield the dotted line from the bottom left to the top right, shown in the plots below. The best predictors would give a point in the top left corner of the plot. The properties that do well include RMAC, pocket volume, molecular weight, lipophilicity and potency.
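To make the construction concrete, here is a minimal sketch of how ROC points and the area under the curve can be computed from a list of scores. The scores and labels are invented for illustration (think of a score such as RMAC, with label 1 meaning the binding mode changed); they are not values from the paper.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by sweeping a threshold from high to low.

    labels: 1 for a positive case, 0 for a negative case.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            true_pos += 1
        else:
            false_pos += 1
        # Only emit a point once all cases tied at this score have been counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (FPR, TPR) points."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical data: higher score should mean "binding mode more likely to change".
scores = [2.5, 1.8, 1.2, 0.9, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(auc)
```

A random classifier gives an area of about 0.5 (the diagonal), while a perfect one gives 1.0, which corresponds to the curve passing through the top left corner.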

They also combined properties to enhance predictive power and conclude that RMAC and molecular weight together offer good predictivity.

Finally, the authors look at the pairs that have low RMAC values (i.e. the elaboration should be compatible with the protein pocket), yet show a change in binding mode. For these cases, a specific substitution may enable formation of a new, stronger interaction or, for pseudosymmetric ligands, the alternate pose can mimic many of the interactions of the original pose.

Experimental Binding Modes of Small Molecules in Protein-Ligand Docking

Protein-ligand docking tends to be very good at generating binding modes that resemble experimental binding modes from X-ray crystallography and other methods (assuming we have a high quality structure…); but it is also very good at generating plausible models for ligands that don’t bind. These so-called “false positives” lead to reduced accuracy in structure-based virtual screening campaigns.

Structure-based methods are not the only way of approaching virtual screening: when all we know is the chemical structure of an active molecule, but nothing about its target (or targets), we can use ligand-based virtual screening methods, which operate on the principle of molecular similarity (Maggiora et al., 2014).

But what if we combine both methods?


Prions

The most recent paper presented to the OPIG journal club was from PLOS Pathogens: “The Structural Architecture of an Infectious Mammalian Prion Using Electron Cryomicroscopy”. But prior to that, I presented a bit of background on prions in general.

In the 1960s, work was being undertaken by Tikvah Alper and John Stanley Griffith on the nature of a transmissible infection which caused scrapie in sheep. They were interested in studies showing that the infection was somehow resistant to ionizing radiation. Infectious agents such as bacteria or viruses are normally destroyed by radiation, with the amount of radiation required related to the size of the infectious particle. However, the scrapie agent appeared to be too small to be even a virus.

In 1982, Stanley Prusiner successfully purified the infectious agent, discovering that it consisted of a protein. “Because the novel properties of the scrapie agent distinguish it from viruses, plasmids, and viroids, a new term “prion” was proposed to denote a small proteinaceous infectious particle which is resistant to inactivation by most procedures that modify nucleic acids.”
Prusiner’s discovery led to him being awarded the Nobel Prize in 1997.

Whilst there are many different forms of infection, such as parasites, bacteria, fungi and viruses, all of these have a genome. Prions, on the other hand, are just proteins. They come in two forms: the naturally occurring cellular form (PrP^C) and the infectious form (PrP^Sc, with Sc referring to scrapie). Through an as yet unknown mechanism, PrP^Sc prions are able to reproduce by forcing benign PrP^C forms into the wrong conformation. It’s believed that this conformational change causes the following diseases:

Bovine spongiform encephalopathy (mad cow disease)
Scrapie in:
- Sheep
- Goats

Chronic wasting disease in:
- Deer
- Elk
- Moose
- Reindeer

Ostrich spongiform encephalopathy
Transmissible mink encephalopathy
Feline spongiform encephalopathy
Exotic ungulate encephalopathy
- Nyala
- Oryx
- Greater Kudu

Creutzfeldt-Jakob disease in humans

Whilst it’s commonly accepted that prions are the cause of the above diseases, there’s still debate about whether the fibrils which are formed when prions misfold are the cause of the disease or are caused by it. Due to the nature of prions, attempting to cure these diseases proves extremely difficult. PrP^Sc is extremely stable and resistant to denaturation by most chemical and physical agents. “Prions have been shown to retain infectivity even following incineration or after being subjected to high autoclave temperatures”. It is thought that chronic wasting disease is normally transmitted through the saliva and faeces of infected animals. However, it has been proposed that grass plants bind, retain, take up and transport infectious prions, which then persist in the environment and infect the animals that eat the plants.

It’s not all doom and gloom, however: lichens may long have had a way to degrade prion fibrils. And not just one way; because it’s apparently no big thing to them, they appear to have done so twice. Tests on three different lichen species (Lobaria pulmonaria, Cladonia rangiferina and Parmelia sulcata) indicated at least two logs of reduction (100-fold), including reduction “following exposure to freshly-collected P. sulcata or an aqueous extract of the lichen”. Lichens therefore have the potential to inactivate the infectious particles persisting in the landscape, or to be a source of agents that degrade prions.

Addressing the Role of Conformational Diversity in Protein Structure Prediction

For my journal club last week, I chose to look at a recent paper entitled “Addressing the Role of Conformational Diversity in Protein Structure Prediction”, by Palopoli et al. [1]. In the study of proteins, structures are incredibly useful tools, offering information about how they carry out their function, and allowing informed decisions to be made in many areas (e.g. drug design). Since experimental structure determination is difficult, however, the computational prediction of protein structures has become very important (and a number of us here at OPIG work on this!).

A problem, however, in both experimental structure determination and computational structure prediction, is that proteins are generally treated as static – the output of an X-ray crystallography experiment is a single structure, and in the majority of cases the goal of structure prediction is to produce one model that closely resembles the native structure. The accuracy of structure prediction algorithms is also normally measured by comparing the resulting model to a single, known experimentally-determined structure. The issue here is that proteins are not static – they are constantly moving and may adopt a number of different conformations; the structure observed experimentally is just a snapshot of that motion. The dynamics of a protein may even play an important role in its function; an example is haemoglobin, which after binding to oxygen changes conformation to increase affinity for further binding. It may be more appropriate, then, to represent a protein as an ensemble of structures, and not just one.

Conformational diversity helps the protein haemoglobin carry out its function (the transportation of oxygen in the blood). Haemoglobin has four subunits, each containing a haem group, shown in red. When oxygen binds to this group (blue), a histidine residue moves, shifting the position of an alpha helix. This movement is propagated throughout the entire structure, and increases the affinity for oxygen of the other subunits – binding therefore becomes increasingly easy (this is known as co-operative binding). Gif shown is from the PDB-101 Molecule of the Month series: S. Dutta and D. Goodsell, doi:10.2210/rcsb_pdb/mom_2003_5

How, though, could this be incorporated into protein structure prediction? This is the question being considered by the authors of this paper. They consider conformational diversity by looking at different conformers of the same protein – there are many proteins whose structures have been solved experimentally multiple times, and as such have a number of structures available in the PDB. Information about this is stored in a useful database called CoDNaS [2], which was developed by some of the authors of the paper under discussion. In some cases, there are model (or decoy) structures available for these proteins, generated by various structure prediction algorithms – for example, all models submitted for the CASP experiments [3], where the current accuracy of structure prediction is monitored through blind prediction, are freely available for download. The authors curated a collection of decoy sets for 91 different proteins for which multiple experimental structures are present in the PDB.

As mentioned previously, the accuracy of a model is normally evaluated by measuring its structural similarity to one known (or reference) structure – only one conformer of the protein is considered. The authors show that the model rankings achieved by this are highly dependent on the chosen reference structure. If the possible choices (i.e. the observed conformers) are quite similar the effect is small, but if there is a large difference, then two completely different decoys could be designated as the most accurate depending on which reference structure is used.

The key figure from this paper, in my opinion, is the one shown below. For the two most dissimilar experimentally-observed conformers for each protein in the set, the RMSD of the best decoy in relation to one conformer is plotted against the RMSD of the best decoy when measured against the other:

The straight line on this graph indicates what would be observed if there are decoys in the set that equally represent the two conformers; for example, if the best decoy with reference to conformer 1 has an RMSD of 1 Å, then there is also a decoy that is 1 Å away from conformer 2. Most points are on or near this line – this means that the sets of decoy structures are not biased towards one of the conformers. Therefore, structure prediction algorithms seem to be able to generate models for multiple conformations of proteins, and so the production of an ensemble of models is not an impossible dream. Several obstacles remain, however – although of equal distance to both conformers, the decoys could still be of poor quality; and decoy selection is often inaccurate, and so finding these multiple conformations amongst all others is a challenge.
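Each point on such a plot comes from a simple lookup: for each reference conformer, find the decoy with the lowest RMSD. The sketch below uses invented decoy names and RMSD values (not data from the paper) to show how the best decoy, and hence the ranking, can change with the reference structure.

```python
def best_decoy(rmsd_to_reference):
    """Return (decoy name, RMSD) for the decoy closest to one reference conformer."""
    name = min(rmsd_to_reference, key=rmsd_to_reference.get)
    return name, rmsd_to_reference[name]


# Hypothetical RMSDs (in angstroms) of three decoys to two conformers of one protein.
rmsd_to_conformer_1 = {"decoy_a": 1.0, "decoy_b": 2.8, "decoy_c": 4.1}
rmsd_to_conformer_2 = {"decoy_a": 3.9, "decoy_b": 2.5, "decoy_c": 1.2}

best_1 = best_decoy(rmsd_to_conformer_1)
best_2 = best_decoy(rmsd_to_conformer_2)

# One point on the plot: best RMSD against conformer 1 vs best RMSD against conformer 2.
plot_point = (best_1[1], best_2[1])
print(best_1, best_2, plot_point)
```

Here a different decoy is judged "most accurate" depending on the reference, yet the point (1.0, 1.2) still lies close to the diagonal, because the decoy set contains a good model for each conformer.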

[1] – Palopoli, N., Monzon, A. M., Parisi, G., and Fornasari, M. S. (2016). Addressing the Role of Conformational Diversity in Protein Structure Prediction. PLoS One, 11, e0154923.

[2] – Monzon, A. M., Juritz, E., Fornasari, S., and Parisi, G. (2013). CoDNaS: a database of conformational diversity in the native state of proteins. Bioinformatics, 29, 2512–2514.

[3] – Moult, J., Pedersen, J. T., Judson, R., and Fidelis, K. (1995). A Large-Scale Experiment to Assess Protein Structure Prediction Methods. Proteins, 23, ii–iv.

The Emerging Disorder-Function Paradigm

It’s rare to find a paper that connects all of the diverse areas of research of OPIG, but “The rules of disorder or why disorder rules” by Gsponer and Babu (2009) is one such paper. Protein folding, protein-protein interaction networks, protein loops (Schlessinger et al., 2007), and drug discovery all play a part in this story. What’s great about this paper is that it gives numerous examples of proteins and the evidence supporting that they are partially or completely unstructured. These are the so-called intrinsically unstructured proteins or IUPs, although more recently they are also being referred to as intrinsically disordered proteins, or IDPs. Intrinsically disordered regions (IDRs) “are polypeptide segments that do not contain sufficient hydrophobic amino acids to mediate co-operative folding” (Babu, 2016).

Such proteins contradict the classic “lock and key” hypothesis of Fischer, and challenge …

Do we need the constant regions of Antibodies and T-cell receptors in molecular simulations?

At this week’s journal club I presented my latest results on the effect of the constant regions of antibodies (ABs) and T-cell receptors (TCRs) on the dynamics of the overall system. Not including constant regions in such simulations is a commonly used simplification that is found throughout the literature. This is mainly due to a massive saving in computational runtime, as illustrated below:

The constant regions contain about 210 residues but an additional speed up comes from the much smaller solvation box. If a cubic solvation box is used then the effect is even more severe:

But the question is: “Is it OK to remove the constant regions of an AB or TCR and simulate without them?”.

Using replica simulations we found that simulations with and without constant regions lead to (on average) significantly different results. The detail of our analysis will soon be submitted to a scientific journal. The current working title is “Why constant regions are essential in antibody and T-cell receptor Molecular Dynamics simulations”.

Oxford Protein Informatics Group

or "OPIG" to friends