Safana Bakheet
About
Safana Bakheet completed her MSc in Bioengineering at KAUST in 2025 under the supervision of Robert Hoehndorf. Her work, carried out in collaboration with Fernando Zhapa-Camacho, developed methods for prioritising disease-causing genes by combining ontology-based phenotype representations with supervised knowledge graph embeddings.
Her thesis, "An Inductive, Supervised Gene-Disease Associations Method", introduces a ranking model that scores candidate disease-causing genes by how similar their associated phenotypes are to those of a disease, with phenotypes represented in the Mammalian Phenotype Ontology (MP) and the Human Phenotype Ontology (HP). The method is designed to be both supervised and inductive, generalising to genes and diseases that were not seen during training. The thesis systematically compares supervised knowledge graph embedding models (including TransD) under two scoring strategies: best-match average (BMA) over MP/HP phenotype sets, and direct scoring of MGI/OMIM entities. It also evaluates the effect of varying the ontology graph structure, tests generalisation to unseen genes and diseases via folded evaluation, and compares the method against a Resnik semantic-similarity baseline. TransD with BMA-based phenotype scoring achieved the best performance, and the inductive setup outperformed the Resnik baseline on unseen entities.
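As a rough illustration of the best-match-average idea mentioned above, the sketch below scores a gene against a disease by averaging, in both directions, each phenotype's best match in the other set. It is a minimal sketch of the general BMA technique, not the thesis implementation: the function names, the phenotype identifiers, and the toy pairwise scores are all hypothetical, and in the thesis the pairwise score would come from a trained knowledge graph embedding model such as TransD rather than a lookup table.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def best_match_average(
    set_a: Iterable[str],
    set_b: Iterable[str],
    score: Callable[[str, str], float],
) -> float:
    """Symmetric best-match average between two phenotype sets.

    For each phenotype in one set, take its best score against the other
    set, average those maxima, and then average the two directions.
    """
    a, b = list(set_a), list(set_b)
    if not a or not b:
        return 0.0  # assumption: an empty phenotype set scores zero
    a_to_b = sum(max(score(x, y) for y in b) for x in a) / len(a)
    b_to_a = sum(max(score(x, y) for x in a) for y in b) / len(b)
    return 0.5 * (a_to_b + b_to_a)


def rank_genes(
    gene_phenotypes: Dict[str, List[str]],
    disease_phenotypes: List[str],
    score: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Rank candidate genes for one disease, highest BMA score first."""
    scored = [
        (gene, best_match_average(phenos, disease_phenotypes, score))
        for gene, phenos in gene_phenotypes.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical pairwise scores standing in for a learned embedding model.
TOY_SCORES = {
    ("MP:x1", "HP:y1"): 0.75,
    ("MP:x2", "HP:y1"): 0.25,
}


def toy_score(mp_term: str, hp_term: str) -> float:
    return TOY_SCORES.get((mp_term, hp_term), 0.0)


# Hypothetical candidate genes and one hypothetical disease.
genes = {"gene_A": ["MP:x1", "MP:x2"], "gene_B": ["MP:x2"]}
disease = ["HP:y1"]

ranking = rank_genes(genes, disease, toy_score)
print(ranking)
```

Because the score is computed from phenotype sets rather than from a learned vector for the gene or disease itself, a scheme of this shape can be applied to genes and diseases that did not appear in training, provided their phenotype annotations are available.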
The work contributes to the group's research programme on variant prioritisation, ontology-based phenotype analysis, and neuro-symbolic methods over biomedical knowledge graphs. The results are being prepared for submission as a short paper to ISCB.