About

Recent advances in high-throughput experimental methods and machine learning approaches have fueled interest in ML-driven protein design. These advances may enable more rapid development of designed proteins, with applications ranging from biopharmaceuticals and catalysis to materials design and basic science research. However, this excitement has exposed important research questions across the foundation of this emerging engineering discipline. For example:

  • What experimental approaches can feed the data-driven design cycle?

  • Which machine learning models and parameterizations of proteins hold the right inductive biases?

  • What are the limits of the growing structural and evolutionary data in the PDB and UniProt?

  • How do we use our trained models to guide data collection?

We think these questions will be best addressed by a collaborative, interdisciplinary community. Thus, the ML4Protein Engineering community runs a bi-weekly seminar series to address these advances and other outstanding problems, such as high-throughput screening, model-based optimization, and representation learning.

For announcements, please follow us on Twitter! You can also visit our YouTube channel to see recordings of past talks, and be sure to join our new Slack Community, where we discuss opportunities beyond the seminar series!

Upcoming Seminars

Every other Tuesday 4-5pm EST unless otherwise noted

For a list of past seminars and recordings, check the full schedule page.

September-October 2024:

September 3rd — Kaiyi Jiang, PhD Candidate (MIT)

Rapid protein evolution by few-shot learning with a protein language model

September 17th — Jeff Ruffolo, PhD (Profluent Bio)

Adapting protein language models for structure-conditioned design

October 1st — Amy Lu, PhD student (UC Berkeley)

Tokenized and Continuous Embedding Compressions of Protein Sequence and Structure

October 15th — Kapil Devkota, PhD (Duke)

Template-based protein editing using Raygun

October 29th — Andre Cornman, PhD (Tatta Bio)

The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling

Organizers

Meg Taylor
Duke University Biomedical Engineering PhD Student

Divya Nori
MIT Computer Science BS & MEng Candidate

Jason Yang
Caltech Chemical Engineering PhD Candidate

Ria Vinod
Brown University Computational Biology PhD Candidate

Bo Qiang
University of Washington Computer Science PhD Student

Past Organizers

Kevin K. Yang
Senior Researcher, Microsoft Research

Brian L. Trippe
Postdoctoral Fellow, Columbia University

Ava P. Soleimany
Senior Researcher, Microsoft Research

Lucy Colwell
Research Scientist, Google Research

Jody Mou
MIT HST PhD Student

Amy Lu
UC Berkeley EECS PhD Student

Alex X. Lu
Senior Researcher, Microsoft Research

Marshall Case
Computational Biologist, Manifold Bio

Tianyu Lu
Stanford Bioengineering PhD Student

David Belanger
Research Scientist, Google Research

Andreea Gane
Research Scientist, Google Research
