Timeline of protein structure prediction
This is a timeline of protein structure prediction.
Sample questions
The following are some interesting questions that can be answered by reading this timeline:
Big picture
| Time period | Development summary | More details |
|---|---|---|
Full timeline
| Year | Event type | Details |
|---|---|---|
| 1951–1953 | Conceptual foundation | Linus Pauling and colleagues describe the α-helix and β-sheet, establishing the fundamental principles of protein secondary structure.[1] |
| 1958–1960 | Experimental milestone | The first high-resolution protein structures (e.g., myoglobin and hemoglobin) are solved using X-ray crystallography, providing ground truth for future prediction methods. |
| 1961 | Structural biology | The concept of protein domains emerges, recognizing that many proteins are composed of independently folding structural units. |
| 1963 | Theoretical framework | Ramachandran, Ramakrishnan, and Sasisekharan introduce the Ramachandran plot, defining sterically allowed backbone conformations. |
| 1969 | Thermodynamic principle | Cyrus Levinthal formulates Levinthal’s paradox: a protein cannot plausibly reach its native fold by randomly searching all possible conformations, which motivates algorithmic approaches to folding. |
| 1973 | Energy modeling | Empirical force fields for proteins begin to be developed, enabling early energy-based folding simulations. |
| 1970s | Computational method | Early statistical approaches attempt secondary structure prediction directly from amino acid sequences. |
| 1978 | Classification system | Early attempts to classify protein folds lay the groundwork for later structural taxonomies. |
| 1981 | Secondary structure theory | Chou–Fasman parameters formalize amino-acid propensities for α-helices and β-sheets. |
| 1980s | Computational method | Homology (comparative) modeling becomes the dominant prediction strategy as experimentally solved structures accumulate. |
| 1984 | Secondary structure prediction | The GOR (Garnier–Osguthorpe–Robson) method introduces information-theoretic approaches to secondary structure prediction. |
| 1987 | Database infrastructure | The Protein Data Bank (PDB) becomes the central public repository for experimentally determined protein structures, enabling large-scale comparative modeling. |
| early 1990s | Secondary structure prediction | Neural-network-based predictors improve the accuracy of secondary structure classification from sequence. |
| 1992 | Protein secondary structure prediction program launch | PredictProtein is released. |
| 1992 | Structural alignment | Algorithms for structural alignment enable quantitative comparison between predicted and experimental protein structures. |
| 1994 | Benchmarking | The first CASP (Critical Assessment of protein Structure Prediction) experiment is launched, establishing a community-wide blind evaluation framework. |
| 1995 | Fold libraries | Curated libraries of known protein folds are assembled to support threading and comparative modeling. |
| mid-1990s | Computational method | Threading and fold-recognition techniques are developed to detect distant structural similarity beyond sequence homology. |
| 1998 | Protein secondary structure prediction program launch | Jpred is released. |
| 1998 | Evaluation metric | Root-mean-square deviation (RMSD) and related structural similarity metrics become standardized for evaluating prediction accuracy. |
| 1999 | Protein secondary structure prediction program launch | PSIPRED is released. |
| late 1990s | Computational method | Ab initio (de novo) protein folding methods attempt structure prediction from physical principles without templates. |
| 1999–2005 | Algorithmic advance | The Rosetta framework demonstrates practical de novo structure prediction for small proteins using fragment assembly and energy minimization. |
| 2000 | Structural classification | SCOP (Structural Classification of Proteins) is introduced, providing a hierarchical organization of protein structures. |
| 2001 | Structural classification | CATH classification system is launched, offering an alternative domain-based protein structure taxonomy. |
| 2002 | Protein secondary structure prediction program launch | GOR V, an updated version of the GOR method, is released. |
| 2002 | Automation | Fully automated homology modeling pipelines (e.g., MODELLER-based workflows) become widely used in structural biology. |
| 2003 | Physics-based modeling | Molecular dynamics simulations reach nanosecond scales for small proteins, improving validation of predicted folds. |
| 2006 | Distributed computing | Large-scale distributed computing projects (e.g., Folding@home) simulate protein folding dynamics, contributing to force-field refinement. |
| 2007 | Template detection | Profile–profile alignment methods significantly improve detection of remote homologs. |
| 2008 | Fragment-based modeling | Fragment assembly approaches are refined, improving sampling efficiency in de novo structure prediction. |
| 2009 | Quality assessment | Model quality assessment programs (MQAPs) are developed to estimate the reliability of predicted structures. |
| 2010 | Meta-prediction | Consensus and meta-predictor systems combine multiple prediction methods to improve robustness and accuracy. |
| 2011 | Protein secondary structure prediction program launch | RaptorX is released. |
| 2011–2014 | Data-driven method | Coevolutionary analysis methods such as Direct Coupling Analysis infer residue–residue contacts from large multiple sequence alignments. |
| 2012 | Contact-assisted folding | Predicted residue contacts are routinely incorporated as restraints in 3D folding pipelines. |
| 2013 | Contact prediction | Sparse inverse covariance estimation techniques strengthen contact prediction from sequence coevolution data. |
| 2015 | Residue distance modeling | Distance-based constraints replace binary contact maps, enabling more accurate 3D reconstruction. |
| 2016–2018 | Machine learning | Deep learning approaches significantly improve contact and distance map prediction, surpassing traditional statistical methods. |
| 2017 | Deep representation learning | Deep residual networks dramatically improve long-range contact prediction accuracy. |
| 2018 | End-to-end learning | Neural networks begin predicting atomic coordinates directly rather than assembling structures from predicted contacts. |
| 2020 | AI breakthrough | AlphaFold2 achieves near-experimental accuracy at CASP14, marking a paradigm shift in protein structure prediction.[2] |
| 2021 | Infrastructure | DeepMind and EMBL-EBI release the AlphaFold Protein Structure Database, initially covering the human proteome and 20 other model organisms; a 2022 expansion extends it to more than 200 million predicted structures. |
| 2021 | Multimer prediction | Deep learning systems are extended to predict protein–protein interfaces and multimeric assemblies. |
| 2021 | AI system | RoseTTAFold introduces a three-track neural network architecture for end-to-end protein structure prediction. |
| 2022 | Model confidence | Per-residue confidence scores and predicted alignment error metrics become standard outputs of structure prediction systems. |
| 2022 | Post-prediction analysis | Structure prediction pipelines incorporate automated error estimation and structural refinement stages. |
| 2023 | Community access | Open-source and cloud-based structure prediction tools significantly lower barriers for non-specialists. |
| 2023 | Disorder modeling | Increasing attention is given to intrinsically disordered regions, which challenge classical structure prediction paradigms. |
| 2023–2024 | Scope expansion | Structure prediction methods extend to protein complexes, multimers, and protein–ligand interactions. |
| 2024 | Functional annotation | Predicted structures are systematically linked to enzyme function, binding sites, and mutational effects. |
| 2024–2026 | Research frontier | Integration of structure prediction with protein design, functional inference, and modeling of protein dynamics and ensembles becomes a central research focus. |
| 2025 | Hybrid modeling | Integration of experimental data (cryo-EM, NMR restraints) with AI-based predictions becomes routine in structural workflows. |
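The RMSD metric listed in the 1998 entry can be illustrated with a minimal Python sketch. It assumes that the predicted and experimental structures are already superimposed and have the same atoms in the same order; real evaluation pipelines usually perform an optimal superposition first, which is omitted here. The function name `rmsd` and the toy coordinates are hypothetical and chosen only for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumption: the two structures are already superimposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    # Sum of squared distances between corresponding atoms.
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy coordinates in ångströms, not taken from any real structure.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
experimental = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

print(rmsd(predicted, experimental))  # every atom is displaced by 1 Å, so this prints 1.0
```

A lower value means the predicted coordinates lie closer to the experimental ones; identical coordinate sets give an RMSD of 0.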
Meta information on the timeline
How the timeline was built
The initial version of the timeline was written by Sebastian.
Funding information for this timeline is available.
Feedback and comments
Feedback for the timeline can be provided at the following places:
- FIXME
What the timeline is still missing
- Protein structure prediction
- Protein function prediction
- List of protein secondary structure prediction programs
- List of protein structure prediction software
- De novo protein structure prediction
- Protein–protein interaction prediction
- Protein–DNA interaction site predictor
- Structural genomics
- Protein Structure Initiative
- Category:Proteomics
- International Protein Index
Timeline update strategy
See also
External links
References
- [1] Eisenberg, David (9 September 2003). "The discovery of the α-helix and β-sheet, the principal structural features of proteins". Proceedings of the National Academy of Sciences of the United States of America. 100 (20): 11207–11210. doi:10.1073/pnas.2034522100. PMC 208735. PMID 12966187. Retrieved 26 February 2026.
- [2] Eraslan, Gökcen (2022). "Title of the Paper". arXiv:2212.07702. Retrieved 26 February 2026.