Dynamic-backbone protein-ligand complex structure prediction with multiscale generative diffusion models

** RESCHEDULED FOR MARCH 21ST, 2023 **

Zhuoran Qiao, PhD — Lead Machine Learning Scientist at Entos, Inc.

The binding complexes formed by proteins and small-molecule ligands are ubiquitous, and predicting their structures can facilitate both biological discoveries and the design of novel enzymes or drug molecules. Despite great recent advancements in protein structure prediction, existing algorithms cannot yet systematically predict the binding ligands along with their regulatory effects on protein folding. To address this gap, we present NeuralPLexer, a computational approach capable of directly predicting protein-ligand complex structures and their conformational changes at atomistic resolution. NeuralPLexer adopts a deep generative model to sample the 3D structures of the binding complex using protein sequence and molecular graphs as inputs, combined with auxiliary features from protein language models and templates from folding networks. The generative model leverages a learned diffusion process that incorporates essential biophysical constraints, and a multi-scale neural network system to iteratively sample residue-level contact maps and all heavy-atom coordinates in a hierarchical manner. NeuralPLexer achieves state-of-the-art performance compared to existing physics-based and learning-based methods on benchmarks for both rigid-protein blind ligand docking and flexible binding site structure prediction. Moreover, owing to its specificity for sampling both ligand-free and ligand-bound state ensembles, NeuralPLexer on average outperforms AlphaFold2 in terms of global protein structure prediction accuracy on contrasting apo-holo structure pairs and recently determined structures for which the protein folding landscapes are significantly altered by small-molecule ligand binding.
Our results reveal that a data-driven approach can accurately capture the structural cooperativity among protein and small-molecule entities, showing promise for the computational identification of novel drug targets and the design of functional small molecules and ligand-binding proteins.

Recording: https://youtu.be/73blwIx9QUg

Preprint: https://arxiv.org/abs/2209.15171

Website: https://zrqiao.github.io/