Automated categorisation of clinical incident reports using statistical text classification
Mei-Sing Ong, Farah Magrabi, Enrico Coiera

Centre for Health Informatics, University of New South Wales, Sydney, Australia

Correspondence to Ms Mei-Sing Ong, Centre for Health Informatics, University of New South Wales, Sydney 2052, Australia; meisingong{at}gmail.com

Abstract

Objectives To explore the feasibility of using statistical text classification techniques to automatically categorise clinical incident reports.

Methods Statistical text classifiers based on Naïve Bayes and Support Vector Machine algorithms were trained and tested on incident reports submitted by public hospitals to identify two classes of clinical incidents: inadequate clinical handover and incorrect patient identification. Each classifier was trained on 600 reports (300 positives, 300 negatives), and tested on 372 reports (248 positives, 124 negatives). The results were evaluated using standard measures of accuracy, precision, recall, F-measure and area under the receiver operating characteristic (ROC) curve (AUC). Classifier learning rates were also evaluated by measuring classifier accuracy as a function of training set size.
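To make the approach concrete, the following is a minimal sketch of a binary Naïve Bayes text classifier together with the accuracy, precision, recall and F-measure calculations. It is not the implementation used in the study: the whitespace tokeniser, the add-one smoothing, the function names and the four toy reports are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lower-cased whitespace tokens; the study's preprocessing may differ.
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for c in class_counts:
        score = math.log(class_counts[c] / total_docs)  # log prior
        denom = sum(word_counts[c].values()) + len(vocab)
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[c][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = c, score
    return best_label


def evaluate(y_true, y_pred, positive):
    """Accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f_measure": f}


# Hypothetical toy reports, not taken from the study data.
train_docs = [
    "patient transferred without handover of medication chart",
    "no handover given at shift change",
    "wrong patient label on blood sample",
    "patient fell in bathroom overnight",
]
train_labels = ["handover", "handover", "other", "other"]
model = train_naive_bayes(train_docs, train_labels)

test_docs = ["handover not given at transfer", "patient fell in bathroom"]
test_labels = ["handover", "other"]
predictions = [predict(model, d) for d in test_docs]
metrics = evaluate(test_labels, predictions, positive="handover")
```

In this sketch one classifier separates a single incident class from everything else, mirroring the positive/negative framing described above; a second classifier of the same form would be trained for the other incident class.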

Results All classifiers performed well in categorising clinical handover and patient identification incidents. Naïve Bayes attained the best performance on handover incidents, correctly identifying 86.29% of reporter-classified incidents (precision=0.84, recall=0.90, F-measure=0.87, AUC=0.93) and 91.53% of expert-classified incidents (precision=0.87, recall=0.98, F-measure=0.92, AUC=0.97). For patient identification incidents, the best results on reporter-classified reports were obtained with a Support Vector Machine using a radial basis function kernel (accuracy=97.98%, precision=0.98, recall=0.98, F-measure=0.98, AUC=1.00), and the best results on expert-classified reports were obtained with Naïve Bayes (accuracy=95.97%, precision=0.95, recall=0.98, F-measure=0.96, AUC=0.99). A relatively small training set was found to be adequate, with most classifiers achieving an accuracy above 80% when the training set size was as small as 100 samples.
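As a worked check on the figures above, the F-measure can be recomputed from each reported precision and recall, assuming it is the balanced F-measure (the harmonic mean of the two). The sketch below uses only the values quoted in the abstract; the row labels and the 0.01 tolerance are illustrative choices, and because the inputs are already rounded to two decimals the agreement is approximate.

```python
def f_measure(precision, recall):
    # Balanced F-measure: harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# (description, precision, recall, reported F-measure) as quoted in the abstract.
reported = [
    ("handover, reporter-classified, Naive Bayes", 0.84, 0.90, 0.87),
    ("handover, expert-classified, Naive Bayes", 0.87, 0.98, 0.92),
    ("identification, reporter-classified, SVM (RBF)", 0.98, 0.98, 0.98),
    ("identification, expert-classified, Naive Bayes", 0.95, 0.98, 0.96),
]

# True for each row where the recomputed value agrees with the reported one.
consistent = [abs(f_measure(p, r) - f) < 0.01 for _, p, r, f in reported]

for (name, p, r, f), ok in zip(reported, consistent):
    print(f"{name}: recomputed {f_measure(p, r):.3f}, reported {f:.2f}, consistent={ok}")
```

For example, precision 0.84 and recall 0.90 give 2 × 0.84 × 0.90 / (0.84 + 0.90) ≈ 0.869, which rounds to the reported 0.87.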

Conclusions This study demonstrates the feasibility of using text classification techniques to automatically categorise clinical incident reports.

  • Incident reporting
  • adverse event
  • patient safety
  • machine learning
  • text classification

Footnotes

  • Funding This study was funded by the Australian Commission on Safety and Quality in Health Care (ACSQHC) and undertaken as part of the Learning from patient safety incidents: patient identification and clinical handover project. This research is also supported in part by grants from the Australian Research Council (ARC) LP0775532 and NHMRC Programme Grant 568612. FM is supported by an (ARC) APDI Fellowship and the University of New South Wales, Faculty of Medicine. M-SO is supported by an ARC APA(I) Scholarship.

  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.