There is pressing clinical demand for tools that can predict the course of psychotic disorders at the individual patient level, enabling earlier, preventive interventions, yet such tools have proved elusive (Coutts et al., 2023). Our prior work showed that Natural Language Processing (NLP) markers of altered language use, such as reduced semantic coherence, have significant predictive power for psychosis (Morgan et al., 2021; Spencer et al., 2021). However, most studies focus on group-level differences in NLP metrics rather than modelling longitudinal changes within individual patients.
This project aims to bridge that gap by building digital twins of patients that integrate language and clinical data to predict significant shifts in clinical state ahead of time. To that end, we will combine anonymised NLP features extracted from patients’ electronic messages with data from their clinical records, creating human behavioural models that profile how each patient’s state evolves over time (Ferdousi et al., 2021). The ultimate goals of the digital twin model are to predict future disease trajectories and to identify timepoints when clinical interventions might be most effective.
The first part of the project will focus on developing an app to extract summary NLP features from previously sent electronic messages, which the student will then use to collect time series of NLP features from patients with psychotic disorders. The student will then create a human digital twin, combining NLP features and clinical data to build a detailed, multimodal representation of each patient. Finally, we will assess whether the digital twin can predict psychotic episodes and hospitalisations that occur at later timepoints. Ultimately, predicting individual trajectories for patients with psychotic disorders ahead of time could provide new opportunities for preventive interventions.
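
As an illustration of what a summary NLP feature time series could look like, the sketch below computes a simple coherence score for each message and orders the scores by date. It is a toy stand-in, not the method of the cited studies: it assumes coherence can be approximated by the mean cosine similarity between word-count vectors of consecutive sentences, whereas a real pipeline would more plausibly use learned sentence embeddings. The function names (`coherence`, `feature_timeseries`) and the `(date, text)` input format are hypothetical. Only the numeric score leaves the function, in keeping with the aim of storing summary features rather than message text.

```python
import math
import re
from collections import Counter
from datetime import date


def sentence_vectors(text):
    """Split text into sentences and return one word-count vector per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [Counter(re.findall(r"[a-z']+", s.lower())) for s in sentences]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coherence(text):
    """Mean similarity of consecutive sentences; None if fewer than two sentences."""
    vecs = sentence_vectors(text)
    if len(vecs) < 2:
        return None
    sims = [cosine(a, b) for a, b in zip(vecs, vecs[1:])]
    return sum(sims) / len(sims)


def feature_timeseries(messages):
    """Map (date, text) pairs to a date-sorted list of (date, coherence).

    Messages too short to score are skipped. The raw text is not retained.
    """
    scored = [(d, coherence(t)) for d, t in messages]
    return sorted((d, c) for d, c in scored if c is not None)


if __name__ == "__main__":
    # Made-up example messages, for illustration only.
    example = [
        (date(2024, 1, 2), "I went to the shop. The shop was closed."),
        (date(2024, 1, 1), "The train was late. Purple ideas sleep."),
    ]
    for day, score in feature_timeseries(example):
        print(day.isoformat(), round(score, 3))
```

A per-patient series of this kind, aligned by date with clinical records, is the sort of input a digital twin model could be fitted to.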

