AI Models Mine Medical Records for Cancer Prognosis
Source Publication: Scientific Reports
Primary Authors: Sun, Hadjiiski, Bruno et al.

Doctors rely on accurate data from Electronic Medical Records (EMRs) to predict patient outcomes, but extracting this information from unstructured notes is often a laborious manual process. A recent study investigates whether 'Large Language Models' (LLMs)—sophisticated AI systems trained on vast amounts of text—can effectively shoulder this burden for bladder cancer prognosis.
The researchers evaluated several models, including GPT-4, Dolly, Vicuna, and Llama, using records from 163 patients. They assessed how factors such as input length and model generation affected performance. GPT-4 emerged as the clear leader, achieving accuracy consistently above 93% and agreement scores (Fleiss' Kappa, a chance-corrected measure of agreement among multiple ratings) exceeding 0.97.
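For readers unfamiliar with Fleiss' Kappa, the following is a minimal sketch of the standard formula, not the study's own code. It assumes, purely for illustration, that each "rater" is one repeated run of a model on the same record; the function name and the stage labels ("T1", "T2") are hypothetical and are not data from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' Kappa for a list of items, each a list of labels (one per rater).

    Every item must have the same number of raters (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: fraction of rater pairs that agree, averaged over items.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement by chance, from the overall share of each category.
    shares = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_expected = sum(s ** 2 for s in shares)

    if p_expected == 1.0:
        # Only one category was ever used, so kappa is undefined (0/0).
        return float("nan")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical example: three repeated runs on four records.
runs = [
    ["T2", "T2", "T2"],
    ["T1", "T1", "T2"],  # one run disagrees
    ["T1", "T1", "T1"],
    ["T2", "T2", "T2"],
]
print(round(fleiss_kappa(runs), 3))
```

A value of 1 means the ratings always agree, and a value near 0 means agreement is no better than chance, so scores above 0.97 indicate nearly identical outputs.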
For settings where patient privacy requires running models offline, the Llama-2.0-13b and Llama-3.3-70b models proved the most reliable alternatives. This research underscores the potential for AI to streamline predictive modelling in oncology by turning messy clinical notes into usable data for prognosis, though challenges regarding model variability remain.