Mohammad Ashhad successfully defends MSc thesis on machine learning for survival analysis

About

Congratulations to Mohammad Ashhad, who successfully defended his Master’s thesis “Machine Learning Methods for Survival Analysis: From Outcome-Conditioned Data Synthesis to Decoupled Ranking and Calibration” on 3 May 2026 in KAUST’s Bioengineering Program. Mohammad will now continue in the group as a PhD student in the Computer Science Program.

The thesis tackles two persistent obstacles to building accurate, generalizable survival models from biomedical data: the scarcity of shareable training data, and the trade-off between expressiveness and robustness that both classical and deep-learning survival models run into. Mohammad’s work develops two complementary contributions.

Outcome-conditioned synthesis of survival data

Existing tabular generative models struggle with survival data because they cannot reproduce the joint distribution of observed and censored event times. Mohammad reverses the conventional conditioning direction: rather than sampling covariates and then predicting event times, the method samples event times and censoring indicators directly from a one-dimensional Dirichlet Process Mixture Model, then generates covariates conditioned on these outcomes using any standard tabular generator (VAEs, GANs, diffusion models, or LLMs). The approach guarantees that synthetic event-time distributions match the real data and removes the need for survival-specific generators. Evaluations on five real-world medical datasets show consistent gains in covariate fidelity, event-time distributions, and downstream-model performance.
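The reversed conditioning direction can be illustrated with a small standard-library sketch. It is not the thesis implementation: the mixture is a truncated stick-breaking draw with hand-picked components rather than a Dirichlet Process Mixture Model fitted to data, the per-component event probability is an assumed way of producing censoring indicators, and `toy_conditional_generator` is a hypothetical stand-in for a real tabular generator (VAE, GAN, diffusion model, or LLM). All function names, column names, and parameter values are made up for illustration.

```python
import math
import random


def stick_breaking_weights(alpha, num_components, rng):
    """Truncated stick-breaking draw of mixture weights (they sum to 1)."""
    weights, remaining = [], 1.0
    for _ in range(num_components - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component takes the leftover stick
    return weights


def sample_outcome(weights, components, rng):
    """Step 1: draw (time, event) from a one-dimensional mixture over log-time."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    mean_log_t, sd_log_t, p_event = components[k]
    time = math.exp(rng.gauss(mean_log_t, sd_log_t))
    event = 1 if rng.random() < p_event else 0  # assumption: per-component event rate
    return time, event


def toy_conditional_generator(time, event, rng):
    """Step 2: hypothetical stand-in for a tabular generator conditioned on outcomes."""
    age = rng.gauss(70.0 - 5.0 * math.log(time), 8.0)
    biomarker = rng.gauss(1.0 + 0.5 * event, 0.3)
    return {"age": age, "biomarker": biomarker}


def synthesize(n, seed=0):
    """Sample outcomes first, then covariates given those outcomes."""
    rng = random.Random(seed)
    # (mean of log-time, sd of log-time, event probability); illustrative values only
    components = [(0.5, 0.4, 0.8), (2.0, 0.5, 0.5), (3.0, 0.3, 0.2)]
    weights = stick_breaking_weights(1.0, len(components), rng)
    rows = []
    for _ in range(n):
        time, event = sample_outcome(weights, components, rng)
        row = toy_conditional_generator(time, event, rng)
        row.update(time=time, event=event)
        rows.append(row)
    return rows
```

Calling `synthesize(5)` returns five dictionaries, each holding the sampled `time` and `event` alongside covariates generated from them. Because the outcomes are drawn before any covariate, their distribution is fixed by the outcome model alone and does not depend on which covariate generator is plugged in.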

GRAFT: Gated Residual Accelerated Failure Time models

The second contribution is GRAFT, a new survival model that decouples prognostic ranking from probability calibration. GRAFT combines a linear accelerated-failure-time (AFT) component with a non-linear residual network, using Stochastic Gates for automatic feature selection so the model stays robust in high-dimensional and noisy settings. Training optimizes a differentiable approximation of Spearman’s rank correlation, with censored observations imputed stochastically from local Kaplan–Meier estimators. A lightweight post-training step converts the learned prognostic scores into well-calibrated survival probabilities. Across six benchmark datasets, GRAFT outperforms both classical and deep-learning baselines in discrimination and calibration, with stochastic gating providing the strongest robustness against irrelevant features.
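A minimal forward-pass sketch shows how a gated linear AFT term and a non-linear residual might fit together. It rests on several assumptions that the description above does not confirm: the gates multiply the input features, both components see the same gated input, the residual is a single tanh hidden layer, and the output is a log survival time. The gate form, clip(mu + sigma * eps, 0, 1) with Gaussian eps, follows the common Stochastic Gates formulation. Training with the rank-correlation objective, Kaplan–Meier imputation, and the calibration step are omitted, and every parameter value is invented.

```python
import math
import random


def gate_values(mu, sigma, rng=None):
    """Stochastic gates: clip(mu + sigma * eps, 0, 1); noise-free when rng is None."""
    gates = []
    for m in mu:
        noise = sigma * rng.gauss(0.0, 1.0) if rng is not None else 0.0
        gates.append(min(1.0, max(0.0, m + noise)))
    return gates


def residual_net(x, hidden_w, hidden_b, out_w):
    """One-hidden-layer tanh network that returns a scalar correction."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    return sum(w * h for w, h in zip(out_w, hidden))


def predict_log_time(x, params, rng=None):
    """Prognostic score: linear AFT term plus non-linear residual on gated inputs."""
    z = gate_values(params["gate_mu"], params["gate_sigma"], rng)
    gated = [zi * xi for zi, xi in zip(z, x)]
    linear = sum(w * g for w, g in zip(params["beta"], gated)) + params["bias"]
    residual = residual_net(
        gated, params["hidden_w"], params["hidden_b"], params["out_w"]
    )
    return linear + residual


# Hypothetical parameters for three features; the third gate mean is negative,
# so without noise it clips to 0 and that feature is switched off.
params = {
    "gate_mu": [1.2, 0.6, -0.3],
    "gate_sigma": 0.5,
    "beta": [0.8, -0.5, 2.0],
    "bias": 1.0,
    "hidden_w": [[0.3, -0.2, 0.9], [-0.4, 0.1, 0.7]],
    "hidden_b": [0.0, 0.1],
    "out_w": [0.5, -0.5],
}
```

With `rng=None` the gates are deterministic, so `predict_log_time([1.0, 2.0, 100.0], params)` and `predict_log_time([1.0, 2.0, -100.0], params)` give the same score: the closed third gate removes that feature from both the linear and the residual path. Passing a `random.Random` instance adds the gate noise used during training in this style of model.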

Mohammad’s PhD work in the Computer Science Program will continue the line of research connecting survival analysis with neuro-symbolic methods, including integrating ontology-derived background knowledge into prognostic models.