Advancing Protein Simulation with Machine Learning
Image Credit: Thomas Splettstoesser
Prof. Cecilia Clementi
Image Credit: Rice University
The research team led by Professor Dr. Cecilia Clementi has achieved a breakthrough in the computer simulation of proteins – the building blocks of life.
In the July issue of the journal Nature Chemistry, the international team presented CGSchNet, a machine-learned coarse-grained (CG) model that can accurately and efficiently simulate proteins like never before.
Operating significantly faster than traditional all-atom molecular dynamics, CGSchNet enables larger proteins and complex systems to be explored, offering potential applications in drug discovery and protein engineering that could, for example, advance cancer treatment methods.
News from Jul 28, 2025
Developing a general CG model capable of capturing protein folding and dynamics has been a persistent challenge for scientists over the last fifty years. “This work is the first to demonstrate that deep learning can overcome this barrier and lead to a simulation system that approximates all-atom protein simulations without explicitly modeling solvent or atomic detail,” says Cecilia Clementi.
In CGSchNet, Clementi’s team trained a graph neural network to learn the effective interactions between the particles of the coarse-grained protein simulation, so that the model reproduces the dynamics of a diverse set of thousands of all-atom simulations. Unlike structure prediction tools, CGSchNet models the dynamical process, including intermediate states relevant to misfolding processes such as the formation of amyloids, the pathological protein aggregates that appear in cases of Alzheimer’s disease, for example.
The model also simulates transitions between folded states – key to protein function – and generalizes to proteins outside its training set, demonstrating strong chemical transferability. Moreover, it accurately predicts metastable states of folded, unfolded, and disordered proteins, which constitute the majority of biologically active proteins. Such predictions were extremely difficult in the past due to the flexibility of these proteins.
The model is also able to estimate the relative folding free energies of protein mutants, which previous simulation methods could not achieve due to computational limitations.
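To make the idea of a relative folding free energy concrete, here is a minimal sketch, assuming a simple two-state (folded/unfolded) picture in which the folded fraction of each protein variant has already been estimated from a simulation. The function names and the example fractions are hypothetical illustrations and are not taken from the CGSchNet publication; they only show the textbook relation ΔG = -k_B·T·ln(p_folded / p_unfolded) and the difference ΔΔG between mutant and wild type.

```python
import math

# Molar gas constant in kcal/(mol*K); used here as k_B per mole.
K_B = 0.0019872041


def folding_free_energy(p_folded, temperature=300.0):
    """Two-state folding free energy in kcal/mol (assumption: two states only).

    Negative values mean the folded state is favored.
    """
    if not 0.0 < p_folded < 1.0:
        raise ValueError("p_folded must lie strictly between 0 and 1")
    p_unfolded = 1.0 - p_folded
    return -K_B * temperature * math.log(p_folded / p_unfolded)


def relative_folding_free_energy(p_folded_wt, p_folded_mut, temperature=300.0):
    """Difference mutant minus wild type; positive means the mutation destabilizes the fold."""
    return (folding_free_energy(p_folded_mut, temperature)
            - folding_free_energy(p_folded_wt, temperature))


# Hypothetical folded fractions, chosen only for illustration.
ddg = relative_folding_free_energy(p_folded_wt=0.9, p_folded_mut=0.5)
```

In this illustration the mutant is folded less often than the wild type, so `ddg` comes out positive, which in this sign convention indicates a destabilizing mutation.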
Read publication in Nature Chemistry
Read press release of Freie Universität Berlin from July 18, 2025
Keywords
- biophysics
- Cecilia Clementi
- CG model
- CGSchNet
- Machine Learning
- machine-learned coarse-grained model
- Nature
- Nature Chemistry
- physics
- protein engineering
- protein simulation
- publication