This project aims to combine state-of-the-art multimodal and interaction-based analysis methods with multi-omics datasets to improve our understanding of disease mechanisms, particularly in neurodegenerative disorders, and to advance personalized diagnostics and prognostics. With the rapid growth of high-throughput technologies, researchers can now test millions of biological factors, such as gene expression levels or genetic variants, for associations with disease traits. However, traditional univariate testing, which examines each factor independently, has limited power to detect weak or rare associations once results are corrected for multiple testing, and it overlooks complex, non-linear interactions among biological factors. These limitations are especially problematic for complex diseases, which result from interactions between genetic, metabolic, and environmental factors.
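As a toy illustration of the interaction problem described above, the following sketch simulates two hypothetical binary factors whose combination, but neither factor alone, determines a binary outcome. The data, the XOR rule, the 10% label noise, and all variable names are assumptions chosen for illustration; this is not the project's method or data.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


rng = random.Random(0)  # fixed seed so the simulation is reproducible
n = 2000

# Two hypothetical binary factors (e.g. presence/absence of a variant).
x1 = [rng.randint(0, 1) for _ in range(n)]
x2 = [rng.randint(0, 1) for _ in range(n)]

# Assumed outcome model: the trait appears when exactly one factor is
# present (XOR), with 10% of labels flipped at random as noise.
y = [(a ^ b) if rng.random() > 0.1 else 1 - (a ^ b) for a, b in zip(x1, x2)]

# Univariate view: each factor tested on its own.
r_x1 = pearson(x1, y)
r_x2 = pearson(x2, y)

# Interaction view: the combination of both factors tested jointly.
interaction = [a ^ b for a, b in zip(x1, x2)]
r_joint = pearson(interaction, y)

print(f"x1 alone:    r = {r_x1:+.3f}")
print(f"x2 alone:    r = {r_x2:+.3f}")
print(f"x1 XOR x2:   r = {r_joint:+.3f}")
```

In this simulation each factor taken alone shows almost no correlation with the outcome, while the combined feature is strongly correlated with it. Real interactions are rarely this clean, but the example shows why testing factors one at a time can miss a signal that is only visible in combination.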
Multimodal and interaction-based analysis methods address these limitations by identifying combinations of features, drawn from multiple data modalities, that are jointly associated with a measurable outcome. By combining machine learning with prior biological knowledge, these methods can explore large datasets efficiently while maintaining statistical power. Previous work has shown their utility in studying genetic variants (SNPs) and gene expression, enabling patient stratification into biologically distinct subgroups. Yet the integration of multiple omics layers, such as genomics, transcriptomics, proteomics, and methylomics, remains underexplored.
Using extensive multi-omics datasets from healthy controls and from individuals with neurodegenerative and neuropsychiatric disorders (including ALS, Alzheimer’s disease, Parkinson’s disease, and schizophrenia), the project will develop a comprehensive pipeline for multimodal and interaction-based analysis. The goal is to uncover disease drivers, stratify patients by molecular and clinical features, and identify cross-disease mechanisms that contribute to multiple conditions. A range of machine learning and conventional statistical methods will be applied to integrate diverse biological data types. Ultimately, this work will produce a robust methodology that makes fuller use of complex biological data, advancing personalized medicine and offering insights applicable beyond neurodegenerative diseases.

