Kaggle’s 1st place winner for “CAFA 5 Protein Function Prediction”, GOCurator
Tuesday March 5th, 7-8pm EST | Shaojun Wang et al. — Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
Abstract: Protein function prediction is a significant and classic challenge in bioinformatics. The Critical Assessment of Functional Annotation (CAFA) aims to advance automated function prediction (AFP) performance through community efforts, helping researchers better understand the roles proteins play in living organisms. In CAFA5, we collected multi-source information and proposed a novel pipeline, GOCurator, built upon NetGO 3.0. We introduced new component methods to extract functional insights from protein 3D structures, textual descriptions, and scientific literature. Using a learning-to-rank framework, we integrated the outputs of these component methods into a single prediction. GOCurator was the top performer on both the public and private leaderboards, achieving a 5.8% improvement over the second-place entry.
Kaggle solution: https://www.kaggle.com/competitions/cafa-5-protein-function-prediction/discussion/466917
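
For readers unfamiliar with the learning-to-rank idea mentioned in the abstract, the following is a minimal sketch of how scores from several component methods might be combined into one ranking of candidate GO terms. It is not the GOCurator implementation: the pairwise logistic (RankNet-style) loss, the linear model, the three feature columns, the toy scores, and the names `train_pairwise_ranker`, `rank_terms`, and `term_A` to `term_C` are all assumptions made for illustration.

```python
import math
import random


def train_pairwise_ranker(examples, n_features, epochs=200, lr=0.1, seed=0):
    """Fit linear weights with a pairwise logistic (RankNet-style) loss.

    examples: list of (features, label) pairs for candidate terms, where
    features are scores from component methods (hypothetical here) and
    label is 1 for a true annotation, 0 otherwise.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    positives = [f for f, y in examples if y == 1]
    negatives = [f for f, y in examples if y == 0]
    # Every (true term, false term) pair should be ordered true-first.
    pairs = [(p, n) for p in positives for n in negatives]
    for _ in range(epochs):
        rng.shuffle(pairs)
        for pos, neg in pairs:
            diff = [a - b for a, b in zip(pos, neg)]
            margin = sum(w * d for w, d in zip(weights, diff))
            # Loss is log(1 + exp(-margin)); its gradient step scales diff
            # by sigmoid(-margin). The clamp avoids overflow in exp().
            g = 1.0 / (1.0 + math.exp(min(margin, 50.0)))
            weights = [w + lr * g * d for w, d in zip(weights, diff)]
    return weights


def rank_terms(weights, candidates):
    """Return candidate term names sorted by combined score, best first."""
    scored = [
        (sum(w * x for w, x in zip(weights, feats)), term)
        for term, feats in candidates.items()
    ]
    return [term for _, term in sorted(scored, reverse=True)]


# Toy training data (invented): columns stand for three hypothetical
# component scores, e.g. sequence-based, structure-based, text-based.
train = [
    ([0.9, 0.8, 0.7], 1),
    ([0.6, 0.9, 0.5], 1),
    ([0.7, 0.2, 0.1], 0),
    ([0.3, 0.1, 0.4], 0),
]
weights = train_pairwise_ranker(train, n_features=3)

# Placeholder candidate terms for a new protein, with invented scores.
candidates = {
    "term_A": [0.8, 0.1, 0.2],
    "term_B": [0.5, 0.9, 0.6],
    "term_C": [0.2, 0.2, 0.1],
}
ranking = rank_terms(weights, candidates)
print(weights)
print(ranking)
```

The point of the sketch is that the ranker learns how much to trust each component from ordered pairs of terms, rather than averaging component scores with fixed weights. A real system would train on many proteins and would likely use a stronger ranking model.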