Phenotype informatics
Phenotypes are the observable consequences of genotype, environment, and their interaction, and they remain the principal currency by which disease is recognised, model organisms are characterised, and plant traits are catalogued. Our work develops the informatics infrastructure that makes phenotype data computable across species and clinical settings: the phenotype ontologies themselves, the cross-species crosswalks that link them, the tools that capture and standardise phenotype descriptions from text and images, and the computational pipelines that connect phenotype evidence back to genes, variants, and diseases.
Ontologies and cross-species integration
A sustained strand of work has built the formal scaffolding for phenotype data. "The anatomy of phenotype ontologies: principles, properties and applications" synthesises the design principles underlying the Human Phenotype Ontology (HPO), the Mammalian Phenotype Ontology (MP), and their counterparts in zebrafish, plants, and yeast. The Entity-Quality formalism that powers these ontologies was articulated in "Towards improving phenotype representation in OWL" and "Interoperability between phenotype and anatomy ontologies", which showed how phenotype classes can be decomposed into affected entities and qualities, then reasoned over with description logic. Cross-species integration is delivered by "PhenomeNET: a whole-phenome approach to disease gene discovery" and "Integrating phenotype ontologies with PhenomeNET", which transform species-specific ontologies into a common semantic space and enable comparison of mouse, fish, fly, worm, yeast, and human phenotypes for disease gene prioritisation. "Quantitative comparison of mapping methods between Human and Mammalian Phenotype Ontology" and "Semantic integration of physiology phenotypes with an application to the Cellular Phenotype Ontology" extend the framework across model systems and physiological scales. Domain-specific ontologies built in this programme include "The flora phenotype ontology (FLOPO): tool for integrating morphological traits and phenotypes of vascular plants", "An ontology approach to comparative phenomics in plants", "DermO; an ontology for the description of dermatologic disease", and the neurobehaviour ontology described in "Best behaviour? Ontologies and the formal description of animal behaviour".
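To make the idea of comparing phenotypes in a common semantic space concrete, the following is a minimal sketch, not the PhenomeNET implementation. It assumes a hypothetical toy ontology (`PARENTS`), closes each phenotype profile under its superclasses, and ranks candidate genes by plain Jaccard overlap; the class names, gene names, and the choice of Jaccard as the similarity measure are all illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology: each class maps to its direct superclasses.
PARENTS: Dict[str, Set[str]] = {
    "enlarged heart": {"abnormal heart morphology"},
    "abnormal heart morphology": {"abnormal cardiovascular system"},
    "abnormal cardiovascular system": {"phenotypic abnormality"},
    "small eye": {"abnormal eye morphology"},
    "abnormal eye morphology": {"phenotypic abnormality"},
    "phenotypic abnormality": set(),
}


def closure(terms: Set[str]) -> Set[str]:
    """Return the given terms together with all of their superclasses."""
    closed: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term in closed:
            continue
        closed.add(term)
        stack.extend(PARENTS.get(term, ()))
    return closed


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two profiles after superclass closure."""
    ca, cb = closure(a), closure(b)
    union = ca | cb
    return len(ca & cb) / len(union) if union else 0.0


def rank_genes(
    patient: Set[str], gene_profiles: Dict[str, Set[str]]
) -> List[Tuple[str, float]]:
    """Rank genes by similarity of their phenotype profile to the patient's."""
    scored = [(gene, jaccard(patient, prof)) for gene, prof in gene_profiles.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical model-organism profiles already mapped to the shared ontology.
    profiles = {
        "geneA": {"abnormal heart morphology"},
        "geneB": {"small eye"},
    }
    for gene, score in rank_genes({"enlarged heart"}, profiles):
        print(gene, round(score, 3))
```

Closing profiles under superclasses is what lets a specific human phenotype match a more general model-organism phenotype. Production systems typically weight classes by information content rather than counting them equally.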
From phenotypes to genes, variants, and diagnosis
On the predictive side, "DeepPheno: Predicting single gene loss-of-function phenotypes using an ontology-aware hierarchical classifier" learns gene-to-HPO associations from function and expression data while respecting the ontology hierarchy, and "Ontology-based validation and identification of regulatory phenotypes" uses background knowledge to infer regulatory phenotypes from omics data. "Phenotype-driven discovery of digenic variants in personal genome sequences" and the more recent "CAGI6 ID panel challenge: assessment of phenotype and variant predictions in 415 children with neurodevelopmental disorders (NDDs)" apply these resources to clinical genome interpretation. For uncurated narrative text, "Multi-faceted semantic clustering with text-derived phenotypes" and "Improved characterisation of clinical text through ontology-based vocabulary expansion" build phenotype profiles directly from clinical free text, while "Taxon and trait recognition from digitized herbarium specimens using deep convolutional neural networks" demonstrates the same idea for plant specimen images. Large-scale evidence comes from "Analysis of mammalian gene function through broad-based phenotypic screens across a consortium of mouse clinics" and "Systematic Analysis of Experimental Phenotype Data Reveals Gene Functions", with the contribution of model-organism data to disease genetics quantified in "Contribution of model organism phenotypes to the computational identification of human disease genes" and reviewed in "Mouse genetic and phenotypic resources for human genetics" and "The informatics of developmental phenotypes".
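One generic way for a classifier to respect an ontology hierarchy is to post-process its per-class scores so that every class scores at least as high as any of its subclasses. The sketch below illustrates that constraint only; it is not a description of DeepPheno's architecture. The function name `propagate_scores`, the toy class labels, and the assumption that the hierarchy is acyclic are all hypothetical.

```python
from typing import Dict, Set


def propagate_scores(
    scores: Dict[str, float], parents: Dict[str, Set[str]]
) -> Dict[str, float]:
    """Make scores hierarchy-consistent: each class receives the maximum of
    its own score and the scores of all of its descendants.

    `parents` maps a class to its direct superclasses and is assumed to
    describe a directed acyclic graph.
    """
    # Invert the parent map so descendants can be walked from each class.
    children: Dict[str, Set[str]] = {}
    for child, supers in parents.items():
        for sup in supers:
            children.setdefault(sup, set()).add(child)

    memo: Dict[str, float] = {}

    def consistent(term: str) -> float:
        if term not in memo:
            best = scores.get(term, 0.0)  # unscored classes default to 0.0
            for child in children.get(term, ()):
                best = max(best, consistent(child))
            memo[term] = best
        return memo[term]

    all_terms = set(scores) | set(parents) | set(children)
    return {term: consistent(term) for term in all_terms}


if __name__ == "__main__":
    # Hypothetical hierarchy: c is-a b is-a a, and d is-a a.
    hierarchy = {"b": {"a"}, "c": {"b"}, "d": {"a"}}
    raw = {"a": 0.1, "b": 0.2, "c": 0.9, "d": 0.3}
    print(propagate_scores(raw, hierarchy))
```

After propagation, a confident prediction for a specific phenotype automatically implies its more general superclasses, so a thresholded prediction set can never contain a class without its ancestors.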
This infrastructure is the substrate on which our diagnostic software stack runs. PhenomeNET-VP, DeepSVP, EmbedPVP, STARVar, INDIGENA, GenomeLinter, predCAN, and DeepViral together translate ontologies, cross-species phenotype data, and gene-phenotype predictions into tools for variant prioritisation, structural variant interpretation, cancer driver prediction, and pathogen-host interaction analysis. The same phenotype machinery also supports environmental and ecological work through plant trait ontologies and herbarium-scale trait extraction.
Software
- PhenomeNET-VP — Phenotype-driven variant prioritization for whole-exome and whole-genome sequencing data; widely used implementation of the phenotype-aware variant ranking approach.
- DeepSVP — Prioritizes structural and copy-number variants by combining patient phenotype with gene-function similarity learned from biomedical ontologies.
- EmbedPVP — Embedding-based phenotype-aware variant predictor that ranks candidate causative variants using joint sequence- and phenotype-derived representations.
- STARVar — Symptom-based tool for automatic ranking of variants using evidence from the biomedical literature and population genomes; combines text mining with phenotype matching.
- INDIGENA — Inductive prediction of disease–gene associations from phenotype ontologies; generalises to unseen diseases via ontology-aware embeddings.
- GenomeLinter — AI-powered clinical decision-support tool that ingests annotated VCFs and synthesises diagnostic interpretations for rare-disease patients without requiring deep bioinformatics expertise.
- predCAN — Ontology-based prediction of cancer driver genes by integrating phenotype, pathway and function knowledge with somatic-variant features.
- DeepViral — Predicts virus–host protein-protein interactions from sequence and infectious-disease phenotypes; trained jointly across coronaviruses, influenza, and other RNA viruses.
Publications (31)
- (2025) Aspromonte, Del Conte, Zhu, Tan et al. CAGI6 ID panel challenge: assessment of phenotype and variant predictions in 415 children with neurodevelopmental disorders (NDDs). Human Genetics.
- (2025) Schofield, Hoehndorf, Gkoutos, Smith. The informatics of developmental phenotypes. Kaufman’s Atlas of Mouse Development Supplement.
- (2022) Sarah Alghamdi, Paul N. Schofield, Robert Hoehndorf. Contribution of model organism phenotypes to the computational identification of human disease genes. Disease Models & Mechanisms.
- (2021) Luke T. Slater, John A. Williams, Andreas Karwath, Hilary Fanning et al. Multi-faceted semantic clustering with text-derived phenotypes. Computers in Biology and Medicine.
- (2021) Luke T. Slater, William Bradlow, Simon Ball, Robert Hoehndorf et al. Improved characterisation of clinical text through ontology-based vocabulary expansion. Journal of Biomedical Semantics.
- (2020) Kulmanov, Hoehndorf. DeepPheno: Predicting single gene loss-of-function phenotypes using an ontology-aware hierarchical classifier. PLOS Computational Biology.
- (2019) Timothy K. Cooper, Kathleen A. Silva, Victoria E. Kennedy, Sarah M. Alghamdi et al. Hyaline Arteriolosclerosis in 30 Strains of Aged Inbred Mice. Veterinary Pathology.
- (2019) Linn, Mustonen, Silva, Kennedy et al. Nail abnormalities identified in an ageing study of 30 inbred mouse strains. Experimental Dermatology.
- (2018) Georgios V. Gkoutos, Paul N. Schofield, Robert Hoehndorf. The anatomy of phenotype ontologies: principles, properties and applications. Briefings in Bioinformatics.
- (2018) Kulmanov, Schofield, Gkoutos, Hoehndorf. Ontology-based validation and identification of regulatory phenotypes. Bioinformatics.
- (2018) Sohaib Younis, Claus Weiland, Robert Hoehndorf, Stefan Dressler et al. Taxon and trait recognition from digitized herbarium specimens using deep convolutional neural networks. Botany Letters.
- (2017) Imane Boudellioua, Maxat Kulmanov, Paul N Schofield, Georgios V Gkoutos et al. Phenotype-driven discovery of digenic variants in personal genome sequences. Proceedings of VarI-SIG.
- (2016) Fisher, Hoehndorf, Bazelato, Dadras et al. DermO; an ontology for the description of dermatologic disease. Journal of Biomedical Semantics.
- (2016) Hoehndorf, Alshahrani, Gkoutos, Gosline et al. The flora phenotype ontology (FLOPO): tool for integrating morphological traits and phenotypes of vascular plants. Journal of Biomedical Semantics.
- (2016) Miguel Rodriguez-Garcia, Georgios V. Gkoutos, Paul N. Schofield, Robert Hoehndorf. Integrating phenotype ontologies with PhenomeNET. Proceedings of Ontology Matching Workshop 2016.
- (2015) Gkoutos, Hoehndorf, Tsaprouni, Schofield. Best behaviour? Ontologies and the formal description of animal behaviour. Mammalian Genome.
- (2015) Martin Hrabě de Angelis, George Nicholson, Mohammed Selloum, Jacqueline K White et al. Analysis of mammalian gene function through broad-based phenotypic screens across a consortium of mouse clinics. Nature Genetics.
- (2015) Oellrich, Walls, Cannon, Cannon et al. An ontology approach to comparative phenomics in plants. Plant Methods.
- (2013) Hoehndorf, Hardy, Osumi-Sutherland, Tweedie et al. Systematic Analysis of Experimental Phenotype Data Reveals Gene Functions. PLoS ONE.
- (2012) Hoehndorf, Harris, Herre, Rustici et al. Semantic integration of physiology phenotypes with an application to the Cellular Phenotype Ontology. Bioinformatics.
- … and 11 more.