Rapid advances in omics technologies, coupled with major improvements in computational and analytical tools, have enabled the generation of high-quality, cost-efficient multi-omics data that are now driving the emergence of precision health and precision medicine approaches. Yet current medical practice remains largely reactive and reductionist, addressing symptoms rather than the systemic molecular causes of disease. As discussed by Mardinoglu et al. (2025, Molecular Systems Biology), this non-holistic paradigm neglects the complex, multi-layered interactions among genes, proteins, metabolites, and environmental factors that collectively determine human health and disease. Despite major initiatives such as the Human Protein Atlas (HPA) and The Cancer Genome Atlas (TCGA), current artificial intelligence (AI) models remain predominantly correlative and lack causal interpretability.
This PhD project aims to develop a novel AI-driven, semantic retrobiosynthetic framework for mechanistic discovery in cancer systems biology. Retrobiosynthesis, traditionally a tool for metabolic pathway design, is repurposed here to infer upstream regulatory and metabolic processes from observed multi-omics phenotypes. The proposed Graph–Language Diffusion Model (GLDM) will combine three components: graph neural networks (GNNs) to encode biological topology, denoising diffusion processes to generate plausible mechanistic trajectories, and large language model (LLM)-guided reasoning to ensure semantic and biological interpretability.
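To make the denoising diffusion component more concrete, the following is a minimal sketch of the forward (noising) step used in standard denoising diffusion models, applied to a feature vector attached to one graph node. It is an illustration under stated assumptions, not the GLDM itself: the linear noise schedule, the function names, and the feature values are all hypothetical, and the learned reverse (denoising) network, the GNN encoder, and the LLM layer are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule; the values are illustrative only."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) up to step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_features(x0, alpha_bar_t, rng):
    """Sample x_t from N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [scale * value + sigma * rng.gauss(0.0, 1.0) for value in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alpha(betas)

rng = random.Random(0)             # fixed seed so the sketch is repeatable
node_features = [0.8, -1.2, 0.3]   # made-up values for a single graph node
noisy = noise_features(node_features, alpha_bars[499], rng)
```

In a full model, a network conditioned on the graph structure would be trained to reverse this corruption step by step. Sampling from that reverse process is what would generate candidate mechanistic trajectories.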
The research will first integrate multi-omics data from HPA and TCGA into a unified, ontology-aligned knowledge graph, and will then develop the retrobiosynthetic generative model and the semantic reasoning layer. Applications will focus on several cancer types, such as liver hepatocellular carcinoma and lung adenocarcinoma, which together exemplify metabolically and transcriptionally driven cancers.
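As a rough illustration of the integration step, the sketch below merges records from two sources into one graph keyed by a shared gene identifier, then walks the regulatory edges backwards from an observed node to collect candidate upstream regulators. Everything in it is assumed for illustration: the records are hand-written placeholders rather than real HPA or TCGA values, the gene names and edges are invented, and a plain breadth-first search stands in for the proposed generative model.

```python
from collections import deque

# Hypothetical placeholder records; not real HPA or TCGA data.
hpa_records = [
    {"gene": "GENE_A", "tissue": "liver", "protein_level": "high"},
    {"gene": "GENE_B", "tissue": "lung", "protein_level": "medium"},
]
tcga_records = [
    {"gene": "GENE_A", "cohort": "LIHC", "mutation_frequency": 0.12},
    {"gene": "GENE_C", "cohort": "LUAD", "mutation_frequency": 0.30},
]
# Invented directed edges, read as (regulator, target).
edges = [("GENE_C", "GENE_B"), ("GENE_B", "GENE_A")]


def build_graph(sources, edges):
    """Merge per-source attributes under one node per gene identifier."""
    nodes = {}
    for source_name, records in sources.items():
        for record in records:
            attrs = {k: v for k, v in record.items() if k != "gene"}
            nodes.setdefault(record["gene"], {})[source_name] = attrs
    upstream = {}  # target -> set of direct regulators
    for regulator, target in edges:
        nodes.setdefault(regulator, {})
        nodes.setdefault(target, {})
        upstream.setdefault(target, set()).add(regulator)
    return nodes, upstream


def upstream_candidates(upstream, observed):
    """Breadth-first walk against edge direction, starting at the observed node."""
    seen, queue = set(), deque([observed])
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, ()):
            if parent not in seen and parent != observed:
                seen.add(parent)
                queue.append(parent)
    return seen


nodes, upstream = build_graph({"HPA": hpa_records, "TCGA": tcga_records}, edges)
candidates = upstream_candidates(upstream, "GENE_A")
```

Here `nodes["GENE_A"]` carries attributes from both sources, and `candidates` contains `GENE_B` and `GENE_C`. A real pipeline would first map source-specific identifiers onto a shared ontology, which this sketch assumes has already been done.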
By uniting systems biology with semantic, generative, and graph-based AI, this project will produce mechanistically interpretable digital-twin models capable of simulating patient-specific molecular states. The research will advance explainable AI for biomedicine, offering a transferable framework for causal inference, biomarker discovery, and predictive modelling in precision oncology.

