Clinical reports are essential for communicating diagnostic findings, yet current reporting practices are time-consuming, inconsistent, and often inaccessible to patients. This project aims to develop an explainable, multimodal AI framework for automated clinical report generation that supports cardiovascular healthcare for both clinicians and patients through transparent and interactive interfaces.
The research will advance two key dimensions of interpretability. First, clinical interpretability: the system will align physiological signals, such as ECG traces, with generated textual findings, allowing clinicians to understand how data features inform the report and to verify or adjust outputs with confidence. Second, transparency and patient engagement: an interactive question-answering interface will enable patients to explore their reports in natural language, promoting accessibility and trust. Supported by large, labelled, text-paired medical datasets such as MIMIC-IV and LongHealth, a multimodal learning approach will integrate diverse healthcare data streams, including physiological signals, text records, and imaging summaries, to produce coherent and clinically robust reports.
Over four years, the project will progress from baseline model development and multimodal alignment to interactive evaluation and user-centred co-design. The student will also have the opportunity to work closely with cardiologists at King’s College Hospital, ensuring the generated reports are clinically accurate, contextually appropriate, and aligned with real-world diagnostic workflows. The outcome will be a next-generation clinical reporting system that enhances explainability, efficiency, and patient understanding, directly contributing to EPSRC’s goals for Next-Generation Clinical User Interfaces and Multimodal Patient Data Streams in data-driven healthcare.

