Project Code

2025_014

Start date

1 October 2026

Primary supervisor

Dr Joel Winston

Secondary supervisor

Professor Mark Richardson

Topic Areas

AI, Machine Learning, and Multimodal Data, EHRs, NLP, and LLMs

Co-Funded

Yes (A*STAR Institute, Singapore)

Improving clinical phenotyping in epilepsy using machine learning with free text clinical records and multimodal clinical data

Up to three 4-year, fully funded ‘Joint A*STAR – EPSRC DRIVE-Health Studentships’ are available to support PhDs commencing October 2026, covering tuition fees, stipend, and bench fee.

Students recruited to these studentships must spend a minimum of 18 months and a maximum of 24 months at an A*STAR Research Institute in Singapore with the named A*STAR supervisor(s) as part of the research and training programme. This is called the “attachment” period, and it will start in the student's second academic year.

Applications are accepted from citizens of the UK, the EU, the USA, Canada, Latin America, and Australia.

Please read the specific Key Dates and How to Apply sections on this collaboration with A*STAR. In addition to the regular DRIVE-Health entry requirements and application process, A*STAR applicants will have a two-tier interview process: one interview with a KCL academic panel and one with a panel from Singapore.

Epilepsy is one of the most common neurological diseases worldwide. Current digitized clinical records contain a large proportion of free text, which must be read through slowly to glean a summary of the patient’s journey. The free text is also largely inaccessible for research, because extracting information from it requires labour-intensive manual work. Faster and better methods to extract these valuable data from clinical notes could lead to better phenotyping, population segmentation, prognostication and management. This project will focus on developing a machine learning approach to extract information from clinical notes and narrative investigational reports, and to integrate the derived insights with multimodal clinical data for downstream applications. In doing so, we will improve our understanding of the spectrum of epilepsy and the granularity of our phenotyping, to better guide prognostication and management of this complex disease.

We aim to use real-world multimodal health record data from two large, unique healthcare systems in the United Kingdom and Singapore to develop and validate the machine learning approaches and models, ensuring that they generalize across different healthcare systems and populations. This will improve ascertainment of seizure control and clinical phenotyping, helping to identify different patients across the spectrum of epilepsy, and will build foundations for integrating narrative clinical investigation reports and multimodal data into digital solutions for clinical platforms.