AI Proves Superior at Decoding Complex ICU Clinical Notes
Source Publication: Journal of the American Medical Informatics Association
Primary Authors: Guggilla, Kang, Bak et al.

Identifying patients with weakened immune systems in the Intensive Care Unit (ICU) is a critical yet notoriously difficult task for automated systems. Doctors’ clinical notes are often unstructured and complex, making it challenging for standard computer programmes to extract accurate data. Traditionally, researchers have relied on rule-based structured data algorithms or basic Natural Language Processing (NLP) to scan diagnosis codes and medication orders, but these methods frequently suffer from limited accuracy and poor generalisability.
A new study has pitted these established methods against the cutting edge of Artificial Intelligence: Large Language Models (LLMs). Researchers analysed hospital admission notes from a primary cohort of 827 ICU patients at Northwestern Memorial Hospital. They evaluated how well the AI, specifically GPT-4o, could identify seven immunosuppressive conditions and six specific medications compared to standard structured data algorithms and older NLP approaches.
The results were compelling. Structured algorithms achieved F1 scores (the harmonic mean of precision and recall, where 1 is a perfect result) ranging widely from 0.30 to 0.97, while the LLM consistently matched or outperformed them, scoring between 0.51 and 1 across all variables. Crucially, the AI proved its robustness in a separate validation group of 200 patients at Beth Israel Deaconess Medical Center, achieving a perfect score for 8 of the 13 variables. These findings suggest that modern AI can interpret complex clinical text with a level of nuance that older, rigid systems cannot match.
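For readers unfamiliar with the metric, the F1 score balances precision (how many flagged patients truly have the condition) against recall (how many patients with the condition were flagged). The following is a minimal sketch of that calculation. The function name and the patient counts are hypothetical, chosen for illustration only, and are not taken from the study.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    # With no predicted or no actual positives, report 0.0 by convention.
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; not figures from the study.
# 40 patients correctly flagged, 10 flagged in error, 10 missed.
print(round(f1_score(40, 10, 10), 2))

# Every patient with the condition flagged and no false alarms.
print(f1_score(25, 0, 0))
```

Because F1 is a harmonic mean, a method scores 1 only when it makes no false alarms and misses no cases, which is why a perfect score on a variable is a demanding result.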