Timeline of protein structure prediction


This is a timeline of protein structure prediction.

Sample questions

The following are some interesting questions that can be answered by reading this timeline:

Big picture

Time period Development summary More details

Full timeline

Year Event type Details
1951–1953 Conceptual foundation Linus Pauling and colleagues describe the α-helix and β-sheet, establishing the fundamental principles of protein secondary structure.[1]
1958–1960 Experimental milestone The first three-dimensional protein structures, myoglobin (John Kendrew) and hemoglobin (Max Perutz), are solved using X-ray crystallography, providing ground truth for future prediction methods.
1963 Theoretical framework Ramachandran, Ramakrishnan, and Sasisekharan introduce the Ramachandran plot, defining sterically allowed backbone conformations.
1969 Thermodynamic principle Cyrus Levinthal formulates Levinthal’s paradox, highlighting the improbability of random conformational search and motivating algorithmic approaches to folding.
1973 Structural biology Donald Wetlaufer proposes the concept of protein domains, recognizing that many proteins are composed of independently folding structural units.
1973 Energy modeling Empirical force fields for proteins begin to be developed, enabling early energy-based folding simulations.
1974 Secondary structure theory Peter Chou and Gerald Fasman publish the Chou–Fasman method, which formalizes amino-acid propensities for α-helices and β-sheets.
1970s Computational method Early statistical approaches attempt secondary structure prediction directly from amino acid sequences.
1978 Classification system Early attempts to classify protein folds lay the groundwork for later structural taxonomies.
1978 Secondary structure prediction Garnier, Osguthorpe, and Robson publish the GOR method, introducing an information-theoretic approach to secondary structure prediction.
1980s Computational method Homology (comparative) modeling becomes the dominant prediction strategy as experimentally solved structures accumulate.
1987 Database infrastructure The Protein Data Bank (PDB), established in 1971, becomes the central public repository for experimentally determined protein structures, enabling large-scale comparative modeling.
early 1990s Secondary structure prediction Neural-network-based predictors improve the accuracy of secondary structure classification from sequence.
1992 Protein secondary structure prediction program launch PredictProtein is released.
1992 Structural alignment Algorithms for structural alignment enable quantitative comparison between predicted and experimental protein structures.
1994 Benchmarking The first CASP (Critical Assessment of protein Structure Prediction) experiment is launched, establishing a community-wide blind evaluation framework.
1995 Fold libraries Curated libraries of known protein folds are assembled to support threading and comparative modeling.
1995 Structural classification SCOP (Structural Classification of Proteins) is introduced, providing a hierarchical organization of protein structures.
mid-1990s Computational method Threading and fold-recognition techniques are developed to detect distant structural similarity beyond sequence homology.
1997 Structural classification The CATH classification system is introduced, offering an alternative domain-based protein structure taxonomy.
1998 Protein secondary structure prediction program launch Jpred is released.
1998 Evaluation metric Root-mean-square deviation (RMSD) and related structural similarity metrics become standardized for evaluating prediction accuracy.
1999 Protein secondary structure prediction program launch PSIPRED is released.
late 1990s Computational method Ab initio (de novo) protein folding methods attempt structure prediction from physical principles without templates.
1999–2005 Algorithmic advance The Rosetta framework demonstrates practical de novo structure prediction for small proteins using fragment assembly and energy minimization.
2002 Protein secondary structure prediction program launch GOR V, a version of the GOR method that adds evolutionary information from multiple sequence alignments, is released.
2002 Automation Fully automated homology modeling pipelines (e.g., MODELLER-based workflows) become widely used in structural biology.
2003 Physics-based modeling Molecular dynamics simulations reach nanosecond scales for small proteins, improving validation of predicted folds.
2006 Distributed computing Large-scale distributed computing projects (e.g., Folding@home) simulate protein folding dynamics, contributing to force-field refinement.
2007 Template detection Profile–profile alignment methods significantly improve detection of remote homologs.
2008 Fragment-based modeling Fragment assembly approaches are refined, improving sampling efficiency in de novo structure prediction.
2009 Quality assessment Model quality assessment programs (MQAPs) are developed to estimate the reliability of predicted structures.
2010 Meta-prediction Consensus and meta-predictor systems combine multiple prediction methods to improve robustness and accuracy.
2011 Protein secondary structure prediction program launch RaptorX is released.
2011–2014 Data-driven method Coevolutionary analysis methods such as Direct Coupling Analysis infer residue–residue contacts from large multiple sequence alignments.
2012 Contact-assisted folding Predicted residue contacts are routinely incorporated as restraints in 3D folding pipelines.
2013 Contact prediction Sparse inverse covariance estimation techniques strengthen contact prediction from sequence coevolution data.
2015 Residue distance modeling Distance-based constraints replace binary contact maps, enabling more accurate 3D reconstruction.
2016–2018 Machine learning Deep learning approaches significantly improve contact and distance map prediction, surpassing traditional statistical methods.
2017 Deep representation learning Deep residual networks dramatically improve long-range contact prediction accuracy.
2018 End-to-end learning Neural networks begin predicting atomic coordinates directly rather than assembling structures from predicted contacts.
2020 AI breakthrough AlphaFold2 achieves near-experimental accuracy at CASP14, marking a paradigm shift in protein structure prediction.[2]
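2021 Infrastructure DeepMind and EMBL-EBI launch the AlphaFold Protein Structure Database, initially covering the human proteome and 20 other model organisms; a 2022 expansion brings it to over 200 million predicted structures, covering most catalogued proteins.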
2021 Multimer prediction Deep learning systems are extended to predict protein–protein interfaces and multimeric assemblies.
2021 AI system RoseTTAFold introduces a three-track neural network architecture for end-to-end protein structure prediction.
2022 Model confidence Per-residue confidence scores (such as pLDDT) and predicted aligned error (PAE) become standard outputs of structure prediction systems.
2022 Post-prediction analysis Structure prediction pipelines incorporate automated error estimation and structural refinement stages.
2023 Community access Open-source and cloud-based structure prediction tools significantly lower barriers for non-specialists.
2023 Disorder modeling Increasing attention is given to intrinsically disordered regions, which challenge classical structure prediction paradigms.
2023–2024 Scope expansion Structure prediction methods extend to protein complexes, multimers, and protein–ligand interactions.
2024 Functional annotation Predicted structures are systematically linked to enzyme function, binding sites, and mutational effects.
2024–2026 Research frontier Integration of structure prediction with protein design, functional inference, and modeling of protein dynamics and ensembles becomes a central research focus.
2025 Hybrid modeling Integration of experimental data (cryo-EM, NMR restraints) with AI-based predictions becomes routine in structural workflows.
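
Two quantities that recur in the timeline, RMSD as an evaluation metric and the step from binary contact maps to inter-residue distances, can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the timeline: the two coordinate sets are already superimposed (no optimal superposition is computed), each residue is represented by a single point, and the 8 Å contact cutoff is a commonly used convention rather than a fixed standard. The function names and the toy coordinates are hypothetical and chosen only for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) points.

    Assumes the two structures are already superimposed; no rotation or
    translation is applied here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for p, q in zip(coords_a, coords_b):
        squared += sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(squared / len(coords_a))


def contact_map(coords, threshold=8.0):
    """Binary contact map: True where two points are closer than `threshold`.

    The 8.0 default is an assumed cutoff in angstroms, not a universal value.
    """
    n = len(coords)
    return [
        [math.dist(coords[i], coords[j]) < threshold for j in range(n)]
        for i in range(n)
    ]


# Toy three-residue example (hypothetical coordinates, in angstroms).
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 3.0, 0.0)]

print(round(rmsd(predicted, experimental), 3))  # 1.732
print(contact_map(predicted)[0][2])             # True  (7.6 < 8.0)
print(contact_map(experimental)[0][2])          # False (about 8.17 >= 8.0)
```

In this toy case a 3 Å shift of one residue changes the RMSD only modestly but flips a binary contact, which is one way to see why continuous distance information carries more signal than a thresholded contact map.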

Meta information on the timeline

How the timeline was built

The initial version of the timeline was written by Sebastian.

Funding information for this timeline is available.

Feedback and comments

Feedback for the timeline can be provided at the following places:

  • FIXME

What the timeline is still missing

Timeline update strategy

See also

References

  1. Eisenberg, David (9 September 2003). "The discovery of the α-helix and β-sheet, the principal structural features of proteins". Proceedings of the National Academy of Sciences of the United States of America. 100 (20): 11207–11210. doi:10.1073/pnas.2034522100. PMC 208735. PMID 12966187. Retrieved 26 February 2026.
  2. Eraslan, Gökcen (2022). "Title of the Paper". arXiv:2212.07702. Retrieved 26 February 2026.