Phenotype informatics

Phenotypes are the observable consequences of genotype, environment, and their interaction, and they remain the principal currency by which disease is recognized, model organisms are characterized, and plant traits are cataloged. Our work develops the informatics infrastructure that makes phenotype data computable across species and clinical settings: the phenotype ontologies themselves, the cross-species crosswalks that link them, the tools that capture and standardize phenotype descriptions from text and images, and the computational pipelines that connect phenotype evidence back to genes, variants, and diseases.

Ontologies and cross-species integration

A sustained strand of work has built the formal scaffolding for phenotype data. "The anatomy of phenotype ontologies: principles, properties and applications" synthesizes the design principles underlying the Human Phenotype Ontology (HPO), the Mammalian Phenotype Ontology (MP), and their counterparts in zebrafish, plants, and yeast. The Entity-Quality formalism that powers these ontologies was articulated in "Towards improving phenotype representation in OWL" and "Interoperability between phenotype and anatomy ontologies", which showed how phenotype classes can be decomposed into affected entities and qualities, then reasoned over with description logic. Cross-species integration is delivered by "PhenomeNET: a whole-phenome approach to disease gene discovery" and "Integrating phenotype ontologies with PhenomeNET", which transform species-specific ontologies into a common semantic space and enable comparison of mouse, fish, fly, worm, yeast, and human phenotypes for disease gene prioritization. "Quantitative comparison of mapping methods between Human and Mammalian Phenotype Ontology" and "Semantic integration of physiology phenotypes with an application to the Cellular Phenotype Ontology" extend the framework across model systems and physiological scales. Domain-specific ontologies built in this program include "The flora phenotype ontology (FLOPO): tool for integrating morphological traits and phenotypes of vascular plants", "An ontology approach to comparative phenomics in plants", "DermO; an ontology for the description of dermatologic disease", and the neurobehaviour ontology described in "Best behaviour? Ontologies and the formal description of animal behaviour".
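To make the Entity-Quality idea concrete, the following is a minimal sketch, not the PhenomeNET implementation. It assumes a toy pair of is-a hierarchies (the labels in ENTITY_PARENT and QUALITY_PARENT are hypothetical and are not real ontology identifiers), represents each phenotype as an (entity, quality) pair, expands it to every more general statement it implies, and ranks model-organism profiles against a disease profile with plain, unweighted Jaccard similarity. Published systems typically use description-logic reasoners and information-content-weighted measures instead; the names EQ, closure, jaccard, and rank_models are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EQ:
    """A phenotype decomposed into an affected entity and a quality."""
    entity: str
    quality: str


# Hypothetical toy is-a hierarchies (child -> parent), for illustration only.
ENTITY_PARENT = {"heart": "organ", "liver": "organ", "organ": "anatomical entity"}
QUALITY_PARENT = {"increased size": "size", "decreased size": "size", "size": "quality"}


def ancestors(term, parent):
    """Return the term together with all of its ancestors in a single-parent hierarchy."""
    out = {term}
    while term in parent:
        term = parent[term]
        out.add(term)
    return out


def closure(eq):
    """All EQ statements implied by one EQ: generalize the entity, the quality, or both."""
    return {
        EQ(e, q)
        for e in ancestors(eq.entity, ENTITY_PARENT)
        for q in ancestors(eq.quality, QUALITY_PARENT)
    }


def profile_closure(profile):
    """Union of the closures of every phenotype in a profile."""
    out = set()
    for eq in profile:
        out |= closure(eq)
    return out


def jaccard(a, b):
    """Unweighted Jaccard similarity between two ancestor-closed phenotype profiles."""
    ca, cb = profile_closure(a), profile_closure(b)
    union = ca | cb
    return len(ca & cb) / len(union) if union else 0.0


def rank_models(disease, models):
    """Rank model-organism profiles by similarity to a disease profile, best first."""
    scored = [(name, jaccard(disease, prof)) for name, prof in models.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Because entities are species-neutral, a human and a mouse phenotype can share one EQ.
disease = [EQ("heart", "increased size")]
models = {
    "mouse_gene_A": [EQ("heart", "increased size")],
    "mouse_gene_B": [EQ("liver", "decreased size")],
}
ranking = rank_models(disease, models)
print(ranking)
```

The design choice worth noting is that similarity is computed over the inferred closure rather than over the asserted annotations, so two profiles with no identical phenotype can still score above zero when they share a more general entity or quality.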

From phenotypes to genes, variants, and diagnosis

On the predictive side, "DeepPheno: Predicting single gene loss-of-function phenotypes using an ontology-aware hierarchical classifier" learns gene-to-HPO associations from function and expression data while respecting the ontology hierarchy, and "Ontology-based validation and identification of regulatory phenotypes" uses background knowledge to infer regulatory phenotypes from omics data. "Phenotype-driven discovery of digenic variants in personal genome sequences" and the more recent "CAGI6 ID panel challenge: assessment of phenotype and variant predictions in 415 children with neurodevelopmental disorders (NDDs)" apply these resources to clinical genome interpretation. For uncurated narrative text, "Multi-faceted semantic clustering with text-derived phenotypes" and "Improved characterisation of clinical text through ontology-based vocabulary expansion" build phenotype profiles directly from clinical free text, while "Taxon and trait recognition from digitized herbarium specimens using deep convolutional neural networks" demonstrates the same idea for plant specimen images. Large-scale evidence comes from "Analysis of mammalian gene function through broad-based phenotypic screens across a consortium of mouse clinics" and "Systematic Analysis of Experimental Phenotype Data Reveals Gene Functions", with the contribution of model-organism data to disease genetics quantified in "Contribution of model organism phenotypes to the computational identification of human disease genes" and reviewed in "Mouse genetic and phenotypic resources for human genetics" and "The informatics of developmental phenotypes".
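What "respecting the ontology hierarchy" means for a classifier can be illustrated with a small sketch. This is not DeepPheno's code; it shows one common way to make per-class scores hierarchy-consistent, under the assumption that a gene annotated with a specific phenotype is implicitly annotated with every more general one. Each class therefore receives a score at least as high as any of its descendants. The class labels, the PARENTS map, and the function names are hypothetical.

```python
# Hypothetical toy phenotype hierarchy (child -> list of parents); a DAG is allowed.
PARENTS = {
    "abnormal heart morphology": ["abnormal cardiovascular system"],
    "abnormal blood vessel morphology": ["abnormal cardiovascular system"],
    "abnormal cardiovascular system": ["phenotypic abnormality"],
}


def all_ancestors(term, parents):
    """Collect every ancestor of a term by walking parent links (the term itself is excluded)."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def propagate_scores(scores, parents):
    """Raise each ancestor's score to the maximum score among its descendants."""
    result = dict(scores)
    for term, score in scores.items():
        for ancestor in all_ancestors(term, parents):
            result[ancestor] = max(result.get(ancestor, 0.0), score)
    return result


# Raw, hierarchy-unaware scores for one gene, as a flat classifier might emit them.
raw = {
    "abnormal heart morphology": 0.9,
    "abnormal blood vessel morphology": 0.3,
    "abnormal cardiovascular system": 0.4,
    "phenotypic abnormality": 0.2,
}
consistent = propagate_scores(raw, PARENTS)
for label, value in consistent.items():
    print(f"{label}: {value}")
```

After propagation, no general class scores lower than a specific class beneath it, so thresholding the scores cannot yield a prediction set that contradicts the ontology.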

This infrastructure is the substrate on which our diagnostic software stack runs. PhenomeNET-VP, DeepSVP, EmbedPVP, STARVar, INDIGENA, GenomeLinter, predCAN, and DeepViral together translate ontologies, cross-species phenotype data, and gene-phenotype predictions into tools for variant prioritization, structural variant interpretation, cancer driver prediction, and pathogen-host interaction analysis. The same phenotype machinery also supports environmental and ecological work through plant trait ontologies and herbarium-scale trait extraction.

Publications (31)

  • (2012) Oellrich, Gkoutos, Hoehndorf, Rebholz-Schuhmann. Quantitative comparison of mapping methods between Human and Mammalian Phenotype Ontology. Journal of Biomedical Semantics.
  • (2012) Loebe, Stumpf, Hoehndorf, Herre. Towards improving phenotype representation in OWL. Journal of Biomedical Semantics.
  • (2012) Gkoutos, Hoehndorf. Ontology-based cross-species integration and analysis of Saccharomyces cerevisiae phenotypes. Journal of Biomedical Semantics.
  • (2012) Georgios V. Gkoutos, Paul N. Schofield, Robert Hoehndorf. Chapter Four - The Neurobehavior Ontology: An Ontology for Annotation and Integration of Behavior and Behavioral Phenotypes. Bioinformatics of Behavior: Part 1.
  • (2011) Georgios V. Gkoutos, Robert Hoehndorf. Ontology-based cross-species integration and analysis of Saccharomyces cerevisiae phenotypes. Proceedings of the 3rd Workshop for Ontologies in Biomedicine and Life sciences (OBML).
  • (2011) Anika Oellrich, Robert Hoehndorf, Georgios V. Gkoutos, Dietrich Rebholz-Schuhmann. Quantitative comparison of mapping methods between Human and Mammalian Phenotype Ontology. Proceedings of the 3rd Workshop for Ontologies in Biomedicine and Life sciences (OBML).
  • (2011) Frank Loebe, Frank Stumpf, Robert Hoehndorf, Heinrich Herre. Towards Improving Phenotype Representation in OWL. Proceedings of the 3rd Workshop for Ontologies in Biomedicine and Life sciences (OBML).
  • Hoehndorf, Oellrich, Rebholz-Schuhmann. Interoperability between phenotype and anatomy ontologies. Bioinformatics.
  • Hoehndorf, Schofield, Gkoutos. PhenomeNET: a whole-phenome approach to disease gene discovery. Nucleic Acids Research.
  • Schofield, Sundberg, Hoehndorf, Gkoutos. New approaches to the representation and analysis of phenotype knowledge in human diseases and their animal models. Briefings in Functional Genomics.
  • Schofield, Hoehndorf, Gkoutos. Mouse genetic and phenotypic resources for human genetics. Human Mutation.