Background
Modern electronic health records (EHRs) combine free-text narratives with structured clinical data. However, converting this rich information into reliable, actionable guidance at the point of care remains a major challenge. Predictive models often show strong performance within a single institution but degrade when applied elsewhere. Risk estimates may be poorly calibrated, explanations are often lacking, and outputs rarely suggest specific clinical actions. Earlier work in our department (Foresight) demonstrated that forecasting clinical events from patient timelines is possible but highlighted key limitations in interpretability, generalisability, and day-to-day clinical usability. Meanwhile, recent large language models (LLMs) can answer medical questions with impressive accuracy but struggle to match the nuanced reasoning and decision-making of experienced clinicians in real-world settings.
Novelty & Importance
Foresight XL addresses these challenges by integrating LLMs with context engineering (CE) and post-training (PT) technologies to build a clinically aligned, generalisable EHR forecasting model. It will be the world’s first LLM whose clinical alignment is developed with substantial involvement from practising doctors, with the aim of safe, explainable, and useful behaviour in which hallucinations are minimised. The model will be co-developed with NHS partners and evaluated across diverse populations, settings, and datasets. In later phases, a novel mixture-of-experts (doctors) architecture will combine specialty-specific models to reflect multidisciplinary care in hospital settings.
Aims & Objectives
Foresight XL will incorporate CE methods such as prompt engineering and retrieval-augmented generation, alongside PT techniques including parameter-efficient fine-tuning and reinforcement learning from human (clinician) feedback. It will produce transparent, interpretable outputs, combining structured and unstructured EHR data into concise summaries with evidence-linked predictions. The model will be developed using public datasets such as MIMIC-IV, eICU, and EHRSHOT, as well as private NHS datasets through collaboration with King’s College Hospital and the CogStack platform. Key goals include improving predictive accuracy and calibration while enhancing clinician trust and usability.
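To make the calibration goal concrete, the sketch below shows two standard ways a predicted risk can be compared with observed outcomes: the Brier score and a binned expected calibration error (ECE). It is illustrative only and is not the project's evaluation protocol; the function names, the choice of 10 equal-width bins, and the example values are assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple


def _validate(probs: Sequence[float], outcomes: Sequence[int]) -> None:
    # Shared input checks for both metrics.
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and the same length")
    if any(p < 0.0 or p > 1.0 for p in probs):
        raise ValueError("predicted risks must lie in [0, 1]")


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    _validate(probs, outcomes)
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between average predicted risk and observed event rate.

    Predictions are grouped into n_bins equal-width bins over [0, 1]
    (an assumption of this sketch; other binning schemes exist).
    """
    _validate(probs, outcomes)
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # A risk of exactly 1.0 is placed in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_risk = sum(p for p, _ in members) / len(members)
        event_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_risk - event_rate)
    return ece


if __name__ == "__main__":
    # Hypothetical predicted risks and outcomes, for illustration only.
    example_risks = [0.1, 0.2, 0.8, 0.9]
    example_outcomes = [0, 0, 1, 1]
    print("Brier score:", brier_score(example_risks, example_outcomes))
    print("ECE:", expected_calibration_error(example_risks, example_outcomes))
```

Lower values are better for both measures. A model can rank patients well yet still score poorly here if its stated risks are systematically too high or too low, which is why calibration is listed as a goal alongside accuracy.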

