Ontology engineering

Ontology engineering and semantic interoperability address the practical problem of turning hundreds of independently developed biomedical ontologies into an infrastructure that can be queried, reasoned over, and combined at scale. Our group designs architectures for processing large, heterogeneous datasets using Semantic Web standards, with a particular emphasis on bringing automated reasoning into routine data-access workflows. The angle taken at KAUST is engineering-led: we treat ontology-based data access as a service that must be fast enough for interactive use, expressive enough to exploit OWL semantics, and robust against the inconsistencies that inevitably arise when many ontologies are combined.

AberOWL and reasoning as a service

The AberOWL infrastructure ("AberOWL: an ontology portal with OWL EL reasoning") was developed to make OWL EL reasoning available as a routine query primitive over hundreds of bio-ontologies. "Aber-OWL: a framework for ontology-based data access in biology" set out the underlying architecture, and "Using Aber-OWL for fast and scalable reasoning over BioPortal ontologies", together with "Experiences with Aber-OWL, an Ontology Repository with OWL EL Reasoning", showed that reasoning at repository scale is tractable when the right OWL fragment is targeted. The recent "Evaluating Different Methods for Semantic Reasoning Over Ontologies" revisits these choices in light of newer reasoners. Building on this stack, "Vec2SPARQL: integrating SPARQL queries and knowledge graph embeddings" and "Using SPARQL to Unify Queries over Data, Ontologies, and Machine Learning Models in the PhenomeBrowser Knowledgebase" extend SPARQL endpoints with embedding-based similarity, so that vector-space queries and logical queries can be expressed in a common language. "DeepGOWeb: fast and accurate protein function prediction on the (Semantic) Web" shows how the same approach exposes machine-learning models themselves as Semantic Web resources.
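The idea of answering a logical query and a vector-space query together can be illustrated with a small, self-contained Python sketch. It does not use the AberOWL or Vec2SPARQL interfaces: the class names, the toy subclass axioms, the two-dimensional embeddings, and the helper functions superclasses and similar_instances are all hypothetical, chosen only to show an inferred class-membership filter combined with a cosine-similarity ranking.

```python
import math

# Toy subclass axioms (child -> direct parents); hypothetical, not from a real ontology.
SUBCLASS_OF = {
    "CardiacPhenotype": {"Phenotype"},
    "Arrhythmia": {"CardiacPhenotype"},
    "RenalPhenotype": {"Phenotype"},
}

# Toy instance typing and 2-d embeddings (assumed values, for illustration only).
INSTANCE_TYPE = {"p1": "Arrhythmia", "p2": "CardiacPhenotype", "p3": "RenalPhenotype"}
EMBEDDING = {"p1": (1.0, 0.0), "p2": (0.8, 0.6), "p3": (0.0, 1.0)}


def superclasses(cls):
    """Return cls together with every class it is transitively a subclass of."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def cosine(u, v):
    """Cosine similarity of two non-zero 2-d vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def similar_instances(query, required_class):
    """Rank instances that are inferred members of required_class by similarity to query."""
    hits = [
        (name, cosine(EMBEDDING[query], EMBEDDING[name]))
        for name, cls in INSTANCE_TYPE.items()
        if name != query and required_class in superclasses(cls)
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Logical filter (must be a cardiac phenotype) plus similarity ranking to p1.
print(similar_instances("p1", "CardiacPhenotype"))
```

In this sketch the logical part is only a transitive closure over named subclass axioms; a real deployment would delegate that step to an OWL EL reasoner and the similarity step to trained embeddings.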

Interoperability across distributed knowledge

A second strand of work targets the interoperability of ontologies and the data annotated with them. "Interoperability between biomedical ontologies through relation expansion, upper-level ontologies and automatic reasoning" and "A common layer of interoperability for biomedical ontologies based on OWL EL" introduced reasoning-driven mechanisms for aligning content across ontologies, while "Interoperability between phenotype and anatomy ontologies" and "Quantitative evaluation of ontology design patterns for combining pathology and anatomy ontologies" evaluated these design patterns empirically. "The RICORDO approach to semantic interoperability for biomedical data and models" and "An infrastructure for ontology-based information systems in biomedicine: RICORDO case study" applied the same ideas to physiology data and models. Standards-level contributions include "FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation", "FoodOn: a harmonized food ontology to increase global food traceability, quality control and data integration", and "The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery". The recent "An open source knowledge graph ecosystem for the life sciences" documents how these standards can be assembled into an integrated translational-research substrate.

The expressiveness gained from formal axioms is not free: combining ontologies tends to introduce contradictions and unintended entailments. "Towards semantic interoperability: finding and repairing hidden contradictions in biomedical ontologies" and "To MIREOT or not to MIREOT? A case study of the impact of using MIREOT in the Experimental Factor Ontology (EFO)" quantify these effects and propose semi-automatic repair strategies, while "Formal axioms in biomedical ontologies improve analysis and interpretation of associated data" demonstrates that the additional axiomatic content nevertheless pays for itself in downstream analysis. "Large-Scale Reasoning over Functions in Biomedical Ontologies" tackles the related problem of reasoning over function representations at the scale of the OBO Foundry, and "SPARQL2OWL: Towards Bridging the Semantic Gap Between RDF and OWL" addresses the boundary between graph and description-logic semantics. Explanation-oriented tooling such as "Klarigi: Characteristic explanations for semantic biomedical data" rounds out the engineering toolkit.
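How a contradiction can stay hidden until ontologies are combined can be shown with a minimal Python sketch. It is an illustration under strong assumptions, not the method of the papers above: it handles only named subclass axioms and pairwise disjointness, and the ontologies ONTOLOGY_A and ONTOLOGY_B, the class names, and the helpers merge and unsatisfiable_classes are hypothetical.

```python
def superclasses(cls, subclass_of):
    """Return cls together with every class it is transitively a subclass of."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def unsatisfiable_classes(subclass_of, disjoint_pairs):
    """Classes whose inferred superclasses include two classes declared disjoint."""
    result = set()
    for cls in subclass_of:
        supers = superclasses(cls, subclass_of)
        if any(a in supers and b in supers for a, b in disjoint_pairs):
            result.add(cls)
    return result


def merge(*ontologies):
    """Union the subclass axioms of several ontologies without mutating them."""
    merged = {}
    for onto in ontologies:
        for child, parents in onto.items():
            merged.setdefault(child, set()).update(parents)
    return merged


# Hypothetical axioms: each ontology is consistent on its own.
ONTOLOGY_A = {"Secretion": {"Process"}, "HormoneSecretion": {"Secretion"}}
ONTOLOGY_B = {"Secretion": {"MaterialEntity"}}
# Assumed upper-level axiom shared by both: processes are not material entities.
DISJOINT = [("Process", "MaterialEntity")]

print(unsatisfiable_classes(ONTOLOGY_A, DISJOINT))
print(unsatisfiable_classes(ONTOLOGY_B, DISJOINT))
print(sorted(unsatisfiable_classes(merge(ONTOLOGY_A, ONTOLOGY_B), DISJOINT)))
```

Neither toy ontology has an unsatisfiable class in isolation, but after merging, Secretion inherits from two disjoint classes and the problem propagates to its subclass HormoneSecretion. This propagation is why a single conflicting axiom can affect many classes in a combined ontology.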

These methods are delivered through AberOWL, Vec2SPARQL, Onto2Graph, UNMIREOT, OntoFunc, and the mOWL library, and underpin our work on explainable machine learning with biomedical ontologies. The same infrastructure is reused across projects on variant prioritization, microbial cell factories, functional metagenomics, and the Bio2Vec analytics platform.

Publications (45)

  • (2026) Mashkova, Zhapa-Camacho, Hoehndorf. DELE: Deductive EL++ Embeddings for Knowledge Base Completion. Neurosymbolic Artificial Intelligence.
  • (2026) Song, Ma, Liu, Luo et al. Robust Knowledge Graph Embedding via Denoising. The Semantic Web -- ESWC 2026.
  • (2026) Zhapa-Camacho, Hoehndorf. Fully Geometric Multi-hop Reasoning on Knowledge Graphs with Transitive Relations. The Semantic Web -- ESWC 2026.
  • (2024) Callahan, Tripodi, Stefanski, Cappelletti et al. An open source knowledge graph ecosystem for the life sciences. Scientific Data.
  • (2023) Luke T. Slater, John A. Williams, Paul N. Schofield, Sophie Russell et al. Klarigi: Characteristic explanations for semantic biomedical data. Computers in Biology and Medicine.
  • (2023) Fernando Zhapa-Camacho, Robert Hoehndorf. Evaluating Different Methods for Semantic Reasoning Over Ontologies. Joint Proceedings of Scholarly QALD 2023 and SemREC 2023 co-located with 22nd International Semantic Web Conference ISWC 2023, Athens, Greece, November 6-10, 2023.
  • (2022) Ali Syed, Senay Kafkas, Maxat Kulmanov, Robert Hoehndorf. Using SPARQL to Unify Queries over Data, Ontologies, and Machine Learning Models in the PhenomeBrowser Knowledgebase. Proceedings of the 13th International Conference on Semantic Web Applications and Tools for Health Care and Life Sciences, SWAT4HCLS 2022.
  • (2021) Maxat Kulmanov, Fernando Zhapa-Camacho, Robert Hoehndorf. DeepGOWeb: fast and accurate protein function prediction on the (Semantic) Web. Nucleic Acids Research.
  • (2020) Sara Althubaiti, Senay Kafkas, Marwa Abdelhakim, Robert Hoehndorf. Combining lexical and context features for automatic ontology extension. Journal of Biomedical Semantics.
  • (2020) Smaili, Gao, Hoehndorf. Formal axioms in biomedical ontologies improve analysis and interpretation of associated data. Bioinformatics.
  • (2020) Luke T. Slater, Georgios V. Gkoutos, Robert Hoehndorf. Towards semantic interoperability: finding and repairing hidden contradictions in biomedical ontologies. BMC Medical Informatics and Decision Making.
  • (2020) Rutger A. Vos, Toshiaki Katayama, Hiroyuki Mishima, Shin Kawano et al. BioHackathon 2015: Semantics of data for life sciences and reproducible research. F1000Research.
  • (2020) JOWO 2020: The Joint Ontology Workshops: Proceedings of the Joint Ontology Workshops co-located with the Bolzano Summer of Knowledge (BOSK 2020). CEUR-WS.
  • (2019) Sarah M. Alghamdi, Beth A. Sundberg, John P. Sundberg, Paul N. Schofield et al. Quantitative evaluation of ontology design patterns for combining pathology and anatomy ontologies. Scientific Reports.
  • (2019) Katayama, Kawashima, Micklem, Kawano et al. BioHackathon series in 2013 and 2014: improvements of semantic interoperability in life science data and services. F1000Research.
  • (2018) Maxat Kulmanov, Senay Kafkas, Andreas Karwath, Alexander Malic et al. Vec2SPARQL: integrating SPARQL queries and knowledge graph embeddings. Proceedings of the 11th International Conference Semantic Web Applications and Tools for Life Sciences, SWAT4LS 2018, Antwerp, Belgium, December 3-6, 2018.
  • (2018) Damion M. Dooley, Emma J. Griffiths, Gurinder S. Gosal, Pier L. Buttigieg et al. FoodOn: a harmonized food ontology to increase global food traceability, quality control and data integration. Science of Food.
  • (2018) Keenan, McKerlie, Gkoutos, Ward et al. A Review of Current Standards and the Evolution of Histopathology Nomenclature for Laboratory Animals. ILAR Journal.
  • (2017) Alshahrani, Khan, Maddouri, Kinjo et al. Neuro-symbolic representation learning on biological knowledge graphs. Bioinformatics.
  • (2017) Salhi, Negrao, Essack, Morton et al. DES-TOMATO: A Knowledge Exploration System Focused On Tomato Species. Scientific Reports.
  • (2017) Kafkas, Sarntivijai, Hoehndorf. Usage of cell nomenclature in biomedical literature. BMC Bioinformatics.
  • (2016) Bolleman, Mungall, Strozzi, Baran et al. FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation. Journal of Biomedical Semantics.
  • (2016) Salhi, Essack, Radovanovic, Marchand et al. DESM: portal for microbial knowledge exploration systems. Nucleic Acids Research.
  • (2016) Slater, Rodriguez-Garcia, O'Shea, Schofield et al. Experiences with Aber-OWL, an Ontology Repository with OWL EL Reasoning. Ontology Engineering: 12th International Experiences and Directions Workshop on OWL, OWLED 2015, co-located with ISWC 2015, Bethlehem, PA, USA, October 9-10, 2015, Revised Selected Papers.
  • (2016) Robert Hoehndorf, Liam Mencel, Georgios V. Gkoutos, Paul N. Schofield. Large-Scale Reasoning over Functions in Biomedical Ontologies. Formal Ontology in Information Systems.
  • (2016) Luke Slater, Georgios V. Gkoutos, Paul N Schofield, Robert Hoehndorf. To MIREOT or not to MIREOT? A case study of the impact of using MIREOT in the Experimental Factor Ontology (EFO). International Conference on Biomedical Ontology and BioCreative (ICBO BioCreative 2016).
  • (2016) Mona Alshahrani, Hussein Almashouq, Robert Hoehndorf. SPARQL2OWL: Towards Bridging the Semantic Gap Between RDF and OWL. Proceedings of the Joint International Conference on Biological Ontology and BioCreative, Corvallis, Oregon, United States, August 1-4, 2016.
  • (2015) Robert Hoehndorf, Luke Slater, Paul N Schofield, Georgios V Gkoutos. Aber-OWL: a framework for ontology-based data access in biology. BMC Bioinformatics.
  • (2015) Luke Slater, Georgios Gkoutos, Paul N. Schofield, Robert Hoehndorf. Using Aber-OWL for fast and scalable reasoning over BioPortal ontologies. Proceedings of International Conference on Biomedical Ontologies (ICBO).
  • (2015) Luke Slater, Georgios Gkoutos, Paul N. Schofield, Robert Hoehndorf. AberOWL: an ontology portal with OWL EL reasoning. Proceedings of International Conference on Biomedical Ontologies (ICBO).
  • (2014) Dumontier, Baker, Baran, Callahan et al. The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery. Journal of Biomedical Semantics.
  • (2013) Hoehndorf, Schofield, Gkoutos. An integrative, translational approach to understanding rare and orphan genetically based diseases. Interface Focus.
  • (2012) Hoehndorf, Harris, Herre, Rustici et al. Semantic integration of physiology phenotypes with an application to the Cellular Phenotype Ontology. Bioinformatics.
  • (2012) Gkoutos, Hoehndorf. Ontology-based cross-species integration and analysis of Saccharomyces cerevisiae phenotypes. Journal of Biomedical Semantics.
  • (2012) Gkoutos, Schofield, Hoehndorf. The Units Ontology: a tool for integrating units of measurement in science. Database.
  • (2012) Georgios V. Gkoutos, Paul N. Schofield, Robert Hoehndorf. Chapter Four - The Neurobehavior Ontology: An Ontology for Annotation and Integration of Behavior and Behavioral Phenotypes. Bioinformatics of Behavior: Part 1.
  • (2012) Robert Hoehndorf, Michel Dumontier, Georgios V. Gkoutos. Integration of knowledge for personalized medicine: a pharmacogenomics case-study. Proceedings of the Virtual Physiological Human Conference 2012 (VPH2012).
  • (2011) Hiroshi Masuya, Georgios V. Gkoutos, Nobuhiko Tanaka, Kazunori Waki et al. Investigation of the fundamental strategy for interoperability of description of biological measurements. Proceedings of the Second International Conference on Biomedical Ontology.
  • Hoehndorf, Oellrich, Rebholz-Schuhmann. Interoperability between phenotype and anatomy ontologies. Bioinformatics.
  • Robert Hoehndorf, Michel Dumontier, Anika Oellrich, Sarala Wimalaratne et al. A common layer of interoperability for biomedical ontologies based on OWL EL. Bioinformatics.
  • Robert Hoehndorf, Michel Dumontier, Anika Oellrich, Dietrich Rebholz-Schuhmann et al. Interoperability between biomedical ontologies through relation expansion, upper-level ontologies and automatic reasoning. PLOS ONE.
  • de Bono, Hoehndorf, Wimalaratne, Gkoutos et al. The RICORDO approach to semantic interoperability for biomedical data and models: strategy, standards and solutions. BMC Research Notes.
  • Hoehndorf, Ngonga Ngomo, Pyysalo, Ohta et al. Ontology design patterns to disambiguate relations between genes and gene products in GENIA. Journal of Biomedical Semantics.
  • Jupp, Stevens, Hoehndorf. Logical Gene Ontology Annotations (GOAL): exploring gene ontology annotations with OWL. Journal of Biomedical Semantics.
  • Wimalaratne, Grenon, Hoehndorf, Gkoutos et al. An infrastructure for ontology-based information systems in biomedicine: RICORDO case study. Bioinformatics.