Conformational Diversity in Proteins, Revisited

A while ago I blogged about CoDNaS, the Conformational Diversity of the Native State database of protein conformations (Monzon et al., 2013). It’s worth revisiting to highlight more recent developments.

Version 2.6 was released in March 2021, building on CoDNaS 2.0, published in 2016 (Monzon et al., 2016). More recently, CoDNaS variants have been published that compile the native-state conformational diversity of RNA (González Buitrón et al., 2022) and of the quaternary structure of protein assemblies (Escobedo et al., 2022).

CoDNaS 2.0

The 2.0 version of CoDNaS doubled the number of entries and added NMR structures alongside X-ray crystal structures, bringing its coverage of the PDB to ~70%. CoDNaS 2.0 characterizes conformers by their experimental conditions, including pH, temperature, the presence of ligands, mutations, post-translational modifications (PTMs), and, importantly, the presence of intrinsically disordered regions (IDRs). It also cross-references other databases, including UniProt, CATH, Enzyme Commission, and Gene Ontology; in addition to the PDB, it draws data from PDBsum, BioLip, PISA, MobiDB, and SIFTS. The new web server supports sequence searches, displays structural flexibility profiles, and lets users browse by structural class of protein, as well as analyze conformational diversity as a function of taxonomy, experimental conditions, and biological function.

The new version highlights the influence of PTMs, experimental conditions, and oligomeric state on the conformational diversity observed in proteins. It lets users compare and visualize all possible pairs of conformers for a given protein and download reports. The latest version, 2.6, contains 29,148 different proteins and 430,151 protein chain conformations (an average of 14.8 conformers per protein).
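To make the idea of comparing all possible pairs of conformers concrete, here is a minimal Python sketch. It is not CoDNaS code: the function names and the toy coordinates are hypothetical. It assumes the conformers have already been structurally superposed and share the same atoms in the same order, so no alignment step is performed. Summarizing a protein by its largest pairwise RMSD is likewise a choice made for this illustration.

```python
from itertools import combinations
from math import sqrt


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes the two conformers are already superposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("conformers must have the same non-zero number of atoms")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return sqrt(squared / len(coords_a))


def pairwise_rmsd(conformers):
    """Return {(name_1, name_2): rmsd} for every unordered pair of conformers."""
    return {
        (m, n): rmsd(conformers[m], conformers[n])
        for m, n in combinations(sorted(conformers), 2)
    }


def max_rmsd_pair(conformers):
    """Return the most dissimilar pair of conformers and its RMSD."""
    pairs = pairwise_rmsd(conformers)
    pair = max(pairs, key=pairs.get)
    return pair, pairs[pair]


# Hypothetical two-atom "conformers", purely for illustration.
toy = {
    "conf_A": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "conf_B": [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
    "conf_C": [(0.0, 0.0, 0.0), (1.0, 0.0, 4.0)],
}

if __name__ == "__main__":
    for pair, value in pairwise_rmsd(toy).items():
        print(pair, round(value, 3))
    print("most dissimilar pair:", max_rmsd_pair(toy))
```

A protein with n conformers yields n(n-1)/2 pairs, so the number of comparisons grows quadratically with the number of deposited structures. Real structures would also need a superposition step (for example, a Kabsch fit on Cα atoms) before the RMSD is meaningful.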

An analysis of protein conformational diversity across species is shown below for cAMP-dependent protein kinase catalytic subunit α. The greatest variation is seen for Homo sapiens and Mus musculus, which also happen to be the most prevalent species in the PDB (H. sapiens: ~31%; M. musculus: ~4.5%), while the species with the next most diverse range of conformational variation, Arabidopsis thaliana and Plasmodium falciparum, represent ~0.7% and ~0.5% of the structures in the PDB, respectively.

Pairwise RMSD between all conformers per protein, by species, for the catalytic subunit alpha of cAMP-dependent protein kinase, revealing a wide range of observed conformational diversity across species (Escobedo et al., 2022).

References

  • Monzon, A.M., E. Juritz, M.S. Fornasari, and G. Parisi. 2013. “CoDNaS: A Database of Conformational Diversity in the Native State of Proteins.” Bioinformatics 29 (19): 2512–14. doi:10.1093/bioinformatics/btt405.
  • Monzon, A.M., C.O. Rohr, M.S. Fornasari, and G. Parisi. 2016. “CoDNaS 2.0: A Comprehensive Database of Protein Conformational Diversity in the Native State.” Database (Oxford) 2016: baw038. doi:10.1093/database/baw038.
  • González Buitrón, M., R.R. Tunque Cahui, E. García Ríos, L. Hirsh, G. Parisi, M.S. Fornasari, and N. Palopoli. 2022. “CoDNaS-RNA: A Database of Conformational Diversity in the Native State of RNA.” Bioinformatics 38 (6): 1745–48. doi:10.1093/bioinformatics/btab858.
  • Escobedo, N., R.R. Tunque Cahui, G. Caruso, E. García Ríos, L. Hirsh, A.M. Monzon, G. Parisi, and N. Palopoli. 2022. “CoDNaS-Q: A Database of Conformational Diversity of the Native State of Proteins with Quaternary Structure.” Bioinformatics 38 (21): 4959–61. doi:10.1093/bioinformatics/btac627.
