Medbrief

AI Matches Pathologists in Diagnosing Celiac Disease

TOPLINE:

A machine learning model trained on duodenal biopsy images achieves human pathologist-level performance in diagnosing celiac disease (CD), with sensitivity and specificity above 95%.

METHODOLOGY:

  • There is a critical shortage of pathologists in both developing and developed countries, which often leads to long delays in patient diagnosis. In addition, studies have found high levels of disagreement among pathologists when diagnosing CD from histologic features in a duodenal biopsy, the gold standard.
  • To learn whether artificial intelligence (AI) tools can address this clinical need, researchers trained a machine learning classifier for CD diagnosis using 3383 whole-slide images of hematoxylin- and eosin-stained duodenal biopsies, along with their clinical diagnoses. The images came from four hospitals in the United Kingdom and were produced on five different scanners.
  • An independent test dataset of 644 previously unseen biopsy scans from a fifth UK hospital was used to assess generalizability.
  • Model predictions on a subset of 30 images from the test data were compared to diagnoses from four specialist gastrointestinal pathologists. 

TAKEAWAY:

  • In cross-validation in the training set, the model achieved mean accuracy, sensitivity, and specificity of 96.8%, 95.4%, and 97.2%, respectively.
  • Performance remained high in the test dataset: 97.5% accuracy, 95.5% sensitivity, and 97.8% specificity, with an area under the receiver operating characteristic curve exceeding 99%, suggesting that the model has the potential to outperform human pathologists.
  • The model's agreement with the pathologists was statistically indistinguishable from the agreement among the pathologists themselves, underscoring the potential of AI to automatically diagnose CD.
  • The model performed equally well in male and female patients for all ages older than 19 years, with slightly lower performance at ages 10-19 years, perhaps due to the small sample size. 
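
For readers less familiar with these measures, the sketch below shows how accuracy, sensitivity, and specificity are derived from a confusion matrix. It is a minimal illustration only: the function name and the counts are hypothetical, are not taken from the study, and do not reflect how the authors computed their results.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp: CD biopsies the model called CD (true positives)
    fp: non-CD biopsies the model called CD (false positives)
    tn: non-CD biopsies the model called non-CD (true negatives)
    fn: CD biopsies the model called non-CD (false negatives)
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    total = tp + fp + tn + fn
    return {
        # Share of all biopsies classified correctly
        "accuracy": (tp + tn) / total,
        # Share of true CD cases the model detects
        "sensitivity": tp / (tp + fn),
        # Share of non-CD cases the model correctly rules out
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts chosen for easy arithmetic, NOT the study's data
metrics = diagnostic_metrics(tp=90, fp=5, tn=195, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because accuracy depends on the mix of CD and non-CD cases in the dataset, sensitivity and specificity are usually reported alongside it, as in the results above.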

IN PRACTICE:

“This represents a crucial step toward the clinical implementation of machine-learning-assisted pathology for diagnosing CD. This study shows AI achieving human-level performance in CD diagnosis on a genuine, multicenter, clinically representative cohort of patient samples. This level of generalizability is crucial for deploying AI models in real-world clinical environments, where variability in staining protocols and scanner technology can significantly impact diagnostic accuracy,” the authors wrote. 

SOURCE:

The study, with first author Florian Jaeckle, PhD, Department of Pathology, University of Cambridge, Cambridge, England, and Lyzeum Ltd., was published online in NEJM AI.

LIMITATIONS:

The chief limitation is the accuracy, or the ground truth, of the diagnostic data used to train and evaluate the model, due to known disagreements in CD diagnosis among pathologists. The study team aimed to mitigate this limitation with the pathologist concordance experiment, but the subset only included 30 cases and does not provide perfect ground truth diagnoses either.

DISCLOSURES:

Funding was provided by Coeliac UK and Innovate UK, UK National Institute for Health and Care, and the Cambridge Centre for Data-Driven Discovery. Author disclosures are available at ai.nejm.org.
