Mutational Effect Transfer Learning for protein design
Tuesday, June 20th, 4-5 pm EST | Sam Gelman, PhD Candidate, UW-Madison
Abstract: Neural networks have tremendous potential to assist with protein design by predicting protein sequence-function relationships from labeled examples. However, small or biased training datasets can restrict a model's ability to generalize beyond the training data, decreasing its practical utility. To overcome these challenges, we propose Mutational Effect Transfer Learning (METL), a method for predicting quantitative protein function that bridges the gap between traditional biophysics-based and machine learning approaches. We pretrain a transformer encoder on millions of molecular simulations to capture the relationship between protein sequence, structure, energetics, and stability. We then fine-tune the neural network to harness these fundamental biophysical signals and apply them when predicting protein functional scores from experimental assays. Across nine experimental datasets, METL improves generalization performance over existing baselines and protein language models when trained on only tens to hundreds of sequence-function examples.
Preprint:
Google Scholar: https://scholar.google.com/citations?user=WRSRfy8AAAAJ&hl=en&oi=ao
Recording link: https://youtu.be/38M6kOTR5gI?si=7bD-jpWBTH21tM79