Computational methods for functional metagenomics: from protein functions to multi-scale interactions

Overview

Multi-scale systems methods for characterising the functions of microbial communities through protein function prediction. Downstream work produced DeepGOPlus and its successor models.

Period: 2022–2024

Funding

  • KAUST Competitive Research Grant — Grant ID: URF/1/4675-01-01 (PI) — USD 247,500

Team

  • Robert Hoehndorf — PI (KAUST, Professor of Computer Science)
  • Takashi Gojobori — CoI (KAUST, CBRC)
  • Maxat Kulmanov — PhD (alumnus), Postdoc (KAUST, Research Scientist)
  • Rund Tawfiq — PhD (alumnus) (Sano Centre Krakow, Postdoctoral researcher)
  • Daulet Toibazar — MSc (alumnus)
  • Amal Alhelal — MSc (alumnus)
  • Md Nurul Muttakin — MSc (alumnus)
  • Shahad Qatan — MSc (alumnus)
  • Kexin Niu — MSc (alumnus)
  • Asaad Mohammedsaleh — MSc (alumnus)

Publications acknowledging this project (16)

  • (2025) Lattice-based $\mathcal{ALC}$ ontology embeddings with saturation
  • (2024) Predicting protein functions using positive-unlabeled ranking with ontology-based priors (Supplementary Material)
  • (2024) Neuro-symbolic AI in Life Sciences
  • (2023) DeepGOMeta: Functional Insights into Microbial Communities with Deep Learning-Based Protein Function Prediction
  • (2022) Exploring the Use of Ontology Components for Distantly-Supervised Disease and Phenotype Named Entity Recognition
  • (2022) Context-based protein function prediction in bacterial genomes
  • (2022) INDIGENA: inductive prediction of disease–gene associations using phenotype ontologies (Supplementary Material)
  • (2022) mOWL: revision document
  • (2022) Large-Scale Knowledge Integration for Enhanced Molecular Property Prediction
  • (2018) Ontology Embedding: A Survey of Methods, Applications and Resources
  • (2015) The application of Large Language Models to the phenotype-based prioritization of causative genes in rare disease patients
  • (2015) The application of Large Language Models to the phenotype-based prioritization of causative genes in rare disease patients
  • (2012) Exploring the Use of Ontology Components for Distantly-Supervised Disease and Phenotype Named Entity Recognition
  • (2012) Improving the classification of cardinality phenotypes using collections
  • (2012) STARVar: Symptom-based Tool for Automatic Ranking of Variants using evidence from literature and genomes
  • … and 1 more.

Topics: Applied Ontology, Microbial communities, Neuro-symbolic AI, Protein function