Computational prediction of protein functional annotations
Abstract
Protein function prediction is a central task in bioinformatics and computational biology because it supports the understanding of disease mechanisms, the development of new therapeutics, and the improvement of crop yields. Despite significant advances, most protein functions remain unknown or poorly annotated, which limits our understanding of biological systems. This review provides a comprehensive overview of the available methods for protein function prediction, categorizing them into eight classes according to the sources of information they use. We examine over 35 methods, including traditional sequence-based approaches and recent advances in machine learning and natural language processing. We also discuss the incorporation of background knowledge encoded in the Gene Ontology, as well as zero-shot prediction. We highlight the need for more accurate and robust methods that integrate multiple sources of information. Finally, we provide several practical notes on choosing protein function prediction methods and interpreting their results.