Timeline of protein structure prediction

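This is a timeline of protein structure prediction, the computational inference of a protein's three-dimensional structure from its amino acid sequence.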

Sample questions

The following are some interesting questions that can be answered by reading this timeline:

Big picture

Time period	Development summary	More details

Full timeline

Year	Event type	Details
1951–1953	Conceptual foundation	Linus Pauling and colleagues describe the α-helix and β-sheet, establishing the fundamental principles of protein secondary structure.^[1]
1958–1960	Experimental milestone	The first high-resolution protein structures (e.g., myoglobin and hemoglobin) are solved using X-ray crystallography, providing ground truth for future prediction methods.
1961	Structural biology	The concept of protein domains emerges, recognizing that many proteins are composed of independently folding structural units.
1963	Theoretical framework	Ramachandran, Ramakrishnan, and Sasisekharan introduce the Ramachandran plot, defining sterically allowed backbone conformations.
1969	Thermodynamic principle	Cyrus Levinthal formulates Levinthal’s paradox, highlighting the improbability of random conformational search and motivating algorithmic approaches to folding.
1973	Energy modeling	Empirical force fields for proteins begin to be developed, enabling early energy-based folding simulations.
1974	Secondary structure theory	Chou and Fasman publish conformational parameters that formalize amino-acid propensities for α-helices and β-sheets.
1970s	Computational method	Early statistical approaches attempt secondary structure prediction directly from amino acid sequences.
1978	Classification system	Early attempts to classify protein folds lay the groundwork for later structural taxonomies.
1978	Secondary structure prediction	The GOR (Garnier–Osguthorpe–Robson) method introduces information-theoretic approaches to secondary structure prediction.
1980s	Computational method	Homology (comparative) modeling becomes the dominant prediction strategy as experimentally solved structures accumulate.
1987	Database infrastructure	The Protein Data Bank (PDB), established in 1971, serves as the central public repository for experimentally determined protein structures, enabling large-scale comparative modeling.
early 1990s	Secondary structure prediction	Neural-network-based predictors (e.g., PHD) improve the accuracy of secondary structure classification from sequence.
1992	Protein secondary structure prediction program launch	PredictProtein is released.
1992	Structural alignment	Algorithms for structural alignment enable quantitative comparison between predicted and experimental protein structures.
1994	Benchmarking	The first CASP (Critical Assessment of protein Structure Prediction) experiment is launched, establishing a community-wide blind evaluation framework.
1995	Fold libraries	Curated libraries of known protein folds are assembled to support threading and comparative modeling.
1995	Structural classification	SCOP (Structural Classification of Proteins) is introduced, providing a hierarchical organization of protein structures.
mid-1990s	Computational method	Threading and fold-recognition techniques are developed to detect distant structural similarity beyond sequence homology.
1997	Structural classification	The CATH classification system is launched, offering an alternative domain-based protein structure taxonomy.
1998	Protein secondary structure prediction program launch	Jpred is released.
1998	Evaluation metric	Root-mean-square deviation (RMSD) and related structural similarity metrics become standardized for evaluating prediction accuracy.
1999	Protein secondary structure prediction program launch	PSIPRED is released.
late 1990s	Computational method	Ab initio (de novo) protein folding methods attempt structure prediction from physical principles without templates.
1999–2005	Algorithmic advance	The Rosetta framework demonstrates practical de novo structure prediction for small proteins using fragment assembly and energy minimization.
2002	Protein secondary structure prediction program launch	GOR V, an updated version of the GOR method that incorporates evolutionary information, is released.
2002	Automation	Fully automated homology modeling pipelines (e.g., MODELLER-based workflows) become widely used in structural biology.
2003	Physics-based modeling	Molecular dynamics simulations reach nanosecond scales for small proteins, improving validation of predicted folds.
2006	Distributed computing	Large-scale distributed computing projects (e.g., Folding@home) simulate protein folding dynamics, contributing to force-field refinement.
2007	Template detection	Profile–profile alignment methods significantly improve detection of remote homologs.
2008	Fragment-based modeling	Fragment assembly approaches are refined, improving sampling efficiency in de novo structure prediction.
2009	Quality assessment	Model quality assessment programs (MQAPs) are developed to estimate the reliability of predicted structures.
2010	Meta-prediction	Consensus and meta-predictor systems combine multiple prediction methods to improve robustness and accuracy.
2011	Protein secondary structure prediction program launch	RaptorX is released.
2011–2014	Data-driven method	Coevolutionary analysis methods such as Direct Coupling Analysis infer residue–residue contacts from large multiple sequence alignments.
2012	Contact-assisted folding	Predicted residue contacts are routinely incorporated as restraints in 3D folding pipelines.
2013	Contact prediction	Sparse inverse covariance estimation techniques strengthen contact prediction from sequence coevolution data.
2015	Residue distance modeling	Distance-based constraints replace binary contact maps, enabling more accurate 3D reconstruction.
2016–2018	Machine learning	Deep learning approaches significantly improve contact and distance map prediction, surpassing traditional statistical methods.
2017	Deep representation learning	Deep residual networks dramatically improve long-range contact prediction accuracy.
2018	End-to-end learning	Neural networks begin predicting atomic coordinates directly rather than assembling structures from predicted contacts.
2020	AI breakthrough	AlphaFold2 achieves near-experimental accuracy at CASP14, marking a paradigm shift in protein structure prediction.^[2]
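2021	Infrastructure	DeepMind and EMBL-EBI launch the AlphaFold Protein Structure Database, initially covering the human proteome and the proteomes of 20 model organisms; a 2022 expansion extends it to more than 200 million predicted structures.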
2021	Multimer prediction	Deep learning systems are extended to predict protein–protein interfaces and multimeric assemblies.
2021	AI system	RoseTTAFold introduces a three-track neural network architecture for end-to-end protein structure prediction.
2022	Model confidence	Per-residue confidence scores (e.g., pLDDT) and predicted aligned error (PAE) metrics become standard outputs of structure prediction systems.
2022	Post-prediction analysis	Structure prediction pipelines incorporate automated error estimation and structural refinement stages.
2023	Community access	Open-source and cloud-based structure prediction tools significantly lower barriers for non-specialists.
2023	Disorder modeling	Increasing attention is given to intrinsically disordered regions, which challenge classical structure prediction paradigms.
2023–2024	Scope expansion	Structure prediction methods extend to protein complexes, multimers, and protein–ligand interactions.
2024	Functional annotation	Predicted structures are systematically linked to enzyme function, binding sites, and mutational effects.
2024–2026	Research frontier	Integration of structure prediction with protein design, functional inference, and modeling of protein dynamics and ensembles becomes a central research focus.
2025	Hybrid modeling	Integration of experimental data (cryo-EM, NMR restraints) with AI-based predictions becomes routine in structural workflows.
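
To make the 1969 entry on Levinthal's paradox concrete, the following is a minimal Python sketch of the back-of-the-envelope arithmetic commonly associated with it. The figures used (3 conformations per residue, a 100-residue chain, 10^13 conformations sampled per second) are illustrative assumptions rather than values taken from this timeline, and the function names are hypothetical.

```python
# Illustrative arithmetic for Levinthal's paradox.
# All numeric defaults below are assumptions chosen for illustration.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def conformations(n_residues: int, states_per_residue: int = 3) -> int:
    """Number of backbone conformations if each residue independently
    adopts one of `states_per_residue` discrete states."""
    return states_per_residue ** n_residues


def exhaustive_search_years(
    n_residues: int,
    states_per_residue: int = 3,
    samples_per_second: float = 1e13,
) -> float:
    """Years needed to visit every conformation once at a fixed sampling rate."""
    total = conformations(n_residues, states_per_residue)
    return total / samples_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    n = 100  # assumed chain length
    print(f"conformations: {conformations(n):.3e}")
    print(f"years for exhaustive search: {exhaustive_search_years(n):.3e}")
```

Under these assumptions an exhaustive search would take on the order of 10^27 years, far longer than the age of the universe, whereas real proteins typically fold in milliseconds to seconds. That gap is why the entry describes random conformational search as improbable.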

Meta information on the timeline

How the timeline was built

The initial version of the timeline was written by Sebastian.

Funding information for this timeline is available.

Feedback and comments

Feedback for the timeline can be provided at the following places:

FIXME

What the timeline is still missing

Timeline update strategy

External links

References

[1] Eisenberg, David (9 September 2003). "The discovery of the α-helix and β-sheet, the principal structural features of proteins". Proceedings of the National Academy of Sciences of the United States of America. 100 (20): 11207–11210. doi:10.1073/pnas.2034522100. PMC 208735. PMID 12966187. Retrieved 26 February 2026.
[2] Eraslan, Gökcen (2022). "Title of the Paper". arXiv:2212.07702. Retrieved 26 February 2026.
