Learning millisecond protein dynamics from what is missing in NMR spectra
Tuesday April 30th, 4-5pm EST | Gina El Nesr, PhD student (Stanford) and Hannah K. Wayment-Steele, PhD (UW-Madison)
Abstract: Many proteins' biological functions rely on interconversions between multiple conformations occurring at micro- to millisecond (μs-ms) timescales. A lack of standardized, large-scale experimental data has hindered obtaining a more predictive understanding of these motions. After curating >100 Nuclear Magnetic Resonance (NMR) relaxation datasets, we realized an observable for μs-ms motion was hiding in plain sight. Millisecond motions can cause NMR signals to broaden beyond detection, leaving some residues not assigned in the chemical shift datasets of ~10,000 proteins deposited in the Biological Magnetic Resonance Data Bank (BMRB). We made the bold assumption that residues missing assignments are exchange-broadened due to μs-ms motions, and trained various deep learning models to predict missing assignments as markers for such dynamics. Strikingly, these models also predict μs-ms motion directly measured in NMR relaxation experiments. The best of these models, which we named Dyna-1, leverages an intermediate layer of the multimodal language model ESM-3. Notably, dynamics directly linked to biological function—including enzyme catalysis and ligand binding—are particularly well predicted by Dyna-1, which parallels our findings that residues with μs-ms motions are highly conserved. We anticipate the datasets and models presented here will be transformative in unlocking the common language of dynamics and function.
Paper: https://www.biorxiv.org/content/10.1101/2025.03.19.642801v1
Gina is a Ph.D. candidate in Biophysics at Stanford University. Her doctoral research focuses on developing deep learning methods for de novo protein design toward enzymatic function and allostery. She is more broadly interested in modeling the dynamic nature of proteins for new-to-nature mechanisms. Gina received her bachelor's degree from Johns Hopkins University in computer science, applied mathematics & statistics, and biophysics. She is currently an NSF Graduate Research Fellow and serves as an organizer of the MLSB Workshop at NeurIPS.
Hannah completed her Ph.D. in Chemistry at Stanford University, where she worked with Vijay Pande and Rhiju Das, using machine learning to tackle problems related to protein and RNA conformational ensembles. She was a Jane Coffin Childs postdoctoral fellow with Dorothee Kern at Brandeis University and a visiting researcher at Google Brain with Lucy Colwell. Hannah is starting as an assistant professor of Biochemistry at the University of Wisconsin-Madison, where her lab is using deep learning and data science in conjunction with experiment to access the dynamical underpinnings of biology.