AI Model Rapidly Deciphers Breast Cancer Pathology Reports

Manually extracting data from thousands of medical records is a slow, laborious bottleneck for clinical research. To address this bottleneck, researchers developed an automated extraction system using natural language processing (NLP), a branch of AI that helps computers interpret human language.

They tested four different machine learning models on a set of 1,795 breast cancer pathology reports. The standout performer was a model called PubMedBERT, which became even more accurate after additional training on a general question-answering dataset.

The final model achieved an overall accuracy of 97.4% in extracting key details from the complex reports, surpassing the 95.6% accuracy of a previous rule-based algorithm. By automating this crucial data-gathering step, the new system gives researchers a reliable and scalable tool that promises to enhance research efficiency and could ultimately help improve clinical outcomes for patients.
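To make the accuracy comparison concrete, here is a minimal sketch of how an extraction system's output might be scored against manually annotated reports. It assumes that "overall accuracy" means the fraction of extracted field values that match the manual annotation, which the summary does not specify. The field names (er_status, grade), the sample values, and the helper functions are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and trim a field value so trivial formatting differences do not count as errors."""
    return "" if value is None else str(value).strip().lower()


def field_accuracy(predicted, gold):
    """Score extracted fields against manual annotations.

    predicted and gold are lists of dicts (one per report) mapping field name -> value.
    Returns (per_field_accuracy, overall_accuracy).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, ref in zip(predicted, gold):
        for field, ref_value in ref.items():
            total[field] += 1
            if normalize(pred.get(field)) == normalize(ref_value):
                correct[field] += 1
    if not total:
        raise ValueError("no annotated fields to score")
    per_field = {field: correct[field] / total[field] for field in total}
    overall = sum(correct.values()) / sum(total.values())
    return per_field, overall


# Hypothetical annotations and model outputs for two reports.
gold = [
    {"er_status": "Positive", "grade": "2"},
    {"er_status": "Negative", "grade": "3"},
]
predicted = [
    {"er_status": "positive ", "grade": "2"},
    {"er_status": "negative", "grade": "2"},  # grade is wrong for this report
]

per_field, overall = field_accuracy(predicted, gold)
print(per_field)
print(overall)
```

Scoring a rule-based algorithm and a learned model with the same function and the same annotated reports is what makes figures such as 95.6% and 97.4% directly comparable.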
