Machine Learning-Assisted Protein Engineering with ftMLDE and evSeq

Tuesday Feb 15th, 4-5pm EST | Bruce Wittman, California Institute of Technology Bioengineering

Applying machine learning to protein engineering comes with unique challenges in both computation and application. I will highlight some of these challenges and introduce new, practically applicable tools and strategies for overcoming them. I will first discuss the challenge of applying machine learning to proteins whose fitness landscapes are dominated by “holes” (protein variants with zero or extremely low fitness). Using a strategy known as “focused training machine learning-assisted directed evolution” (ftMLDE) as an example, I will demonstrate how auxiliary information from protein sequence and structure greatly improves machine learning-assisted navigation of “holey” protein fitness landscapes. I will also discuss the practical and financial challenges of collecting the sequence-fitness information needed to train machine learning models, and present every variant sequencing (evSeq) as a low-cost, democratized solution.


This talk will feature an introduction by Professor Frances Arnold, Linus Pauling Professor of Chemical Engineering at Caltech.

Preprint: https://doi.org/10.1101/2021.11.18.469179

Recording Link: https://youtu.be/hae6IpcDCc0

Please note that the Zoom link will only be available to mailing list members.