Abstract
Progress in reducing diagnostic errors remains slow partly due to poorly defined methods to identify errors, high-risk situations, and adverse events. Electronic trigger (e-trigger) tools, which mine vast amounts of patient data to identify signals indicative of a likely error or adverse event, offer a promising method to efficiently identify errors. The increasing amounts of longitudinal electronic data and maturing data warehousing techniques and infrastructure offer an unprecedented opportunity to implement new types of e-trigger tools that use algorithms to identify risks and events related to the diagnostic process. We present a knowledge discovery framework, the Safer Dx Trigger Tools Framework, that enables health systems to develop and implement e-trigger tools to identify and measure diagnostic errors using comprehensive electronic health record (EHR) data. Safer Dx e-trigger tools detect potential diagnostic events, allowing health systems to monitor event rates, study contributory factors and identify targets for improving diagnostic safety. In addition to promoting organisational learning, some e-triggers can monitor data prospectively and help identify patients at high risk for a future adverse event, enabling clinicians, patients or safety personnel to take preventive actions proactively. Successful application of electronic algorithms requires health systems to invest in clinical informaticists, information technology professionals, patient safety professionals and clinicians, all of whom work closely together to overcome development and implementation challenges. We outline key future research, including advances in natural language processing and machine learning, needed to improve effectiveness of e-triggers. Integrating diagnostic safety e-triggers in institutional patient safety strategies can accelerate progress in reducing preventable harm from diagnostic errors.
- electronic health records
- health information technology
- triggers
- medical informatics
- patient safety
- diagnostic errors
- diagnostic delays
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0
Nearly two decades after the Institute of Medicine’s report ‘To Err is Human’,1 medical errors remain frequent.2–4 Methods are needed to efficiently and effectively identify high-risk situations to prevent harm as well as identify patient safety events to enable organisational learning for error prevention.5 6 Measurement needed for improving diagnosis is particularly challenging due to the complexity of an evolving diagnostic process.7 Use of health information technology (HIT) is essential to monitor patient safety8 but has received limited application in diagnostic error detection. Widespread recent adoption of comprehensive electronic health records (EHR) and clinical data warehouses has advanced our ability to collect, store, use and analyse vast amounts of electronic clinical data that help map the diagnostic process.
Triggers have helped measure safety in hospitals; for example, use of inpatient naloxone administration outside of the postanaesthesia recovery room could suggest oversedation due to opioid administration. Trigger development and use have steadily increased over the past decade in prehospital,9 emergency room,10 inpatient,11 ambulatory care12 and home health settings,13 and helped identify adverse drug reactions,10 surgical complications14 15 and other potentially preventable harm.14 Electronic trigger (e-trigger) tools,16 which mine vast amounts of clinical and administrative data to identify signals for likely adverse events,17–19 offer a promising method to detect patient safety events. Such tools are more efficient and effective in detecting adverse events as compared with voluntary reporting or use of patient safety indicators20 21 and offer the ability to quickly mine large data sets, reducing the number of records requiring human review to those at highest risk of harm. While most e-triggers rely on structured (non-free text) data, some can detect specific words within progress notes or reports.22
The most widely used trigger tools (the Institute for Healthcare Improvement’s Global Trigger Tools)23 include both manual and e-trigger tools to detect inpatient events.20 24–28 However, they were not designed to detect diagnostic errors. Meanwhile, other types of trigger tools have been developed for diagnostic errors and are getting ready for integration within existing patient safety surveillance systems.4 29 30 To stimulate progress in this area, we present a knowledge discovery framework, the Safer Dx Trigger Tools Framework, that could enable health systems to develop and implement e-trigger tools that measure diagnostic errors using comprehensive EHR data. Health systems would also need to leverage and/or develop their existing safety and quality improvement infrastructure and personnel (such as clinical leadership, HIT professionals, safety managers and risk management) to operationalise this framework. In addition to showcasing the application of diagnostic safety e-trigger tools, we highlight several strategies to bolster their development and implementation. Triggers can identify diagnostic events, allowing health systems to monitor event rates and study contributory factors, and thus potentially learn from these events and prevent similar events in the future. Some e-triggers additionally allow monitoring of data more prospectively and help identify patients at high risk for future adverse events, enabling clinicians, patients or safety personnel to take preventive actions proactively.
Conceptualising diagnostic safety e-triggers
Triggers are not new to patient safety measurement. Several existing triggers focus on identifying errors related to medications, such as administering incorrect dosages, or procedure complications, such as returning to the operating room. Only recently has this concept been adapted to detect potential problems with diagnostic processes, such as patterns of care suggestive of missed or delayed diagnoses.19 For instance, a clinic visit followed several days later by an unplanned hospitalisation or subsequent visit to the emergency department could be indicative of something missed at the first visit.31 Similarly, misdiagnosis could be suggested by an unusually prolonged hospital stay for a given diagnosis19 or an unexpected inpatient transfer to a higher level of care,19 32 particularly when considering younger patients with minimal comorbidity.33 Event identification can promote organisational learning with the goal of addressing underlying factors that led to the error, similar to what was proposed earlier in the 2015 Safer Dx framework for measuring diagnostic errors.34 Additionally, e-triggers enable tracking of events over time to allow an assessment of the impact of efforts to reduce adverse events.
Certain e-trigger tools can additionally monitor for high-risk situations prospectively, such as when risk of harm is high, even if no harm has yet occurred. Several studies have shown that e-trigger tools offer promise in detecting errors of omission, such as detecting delays in care after an abnormal test result suspicious for cancer,26–30 35 kidney failure,29 36 infection29 and thyroid conditions,37 as well as patients at risk of delayed action on pregnancy complications.38 39 Such triggers can identify situations where earlier intervention can potentially improve patient outcomes. Future e-triggers could explore other process breakdowns associated with diagnostic errors, such as when insufficient history has been collected or diagnostic testing is not completed for a high-risk symptom (eg, no documented fever assessment or temperature recording in patients with back pain, where an undiagnosed spinal epidural abscess might be missed).40 41 In table 1, we provide several examples of ‘Safer Dx’ e-trigger tools that align with the process dimensions of the Safer Dx framework. To promote the uptake of Safer Dx trigger tools by health systems, we now discuss essential steps for their development and implementation.
Safer Dx Trigger Tools Framework
Overview
e-Trigger development may be viewed as a form of data mining or pattern matching to discover knowledge about clinical processes. Several knowledge discovery frameworks have evolved from fields such as statistics, machine learning and database research. Hripcsak et al proposed a framework for mining complex clinical data for patient safety research, which is composed of seven iterative steps: define the target events, access the clinical data repository, use natural language processing (NLP) for interpreting narrative data, generate queries to detect and classify events, verify target detection, characterise errors using system or cognitive taxonomies, and provide feedback.42 We build on essential components of Hripcsak’s framework to demonstrate the steps of Safer Dx e-trigger tools design and development (figure 1), with an emphasis on operationalising them using a multidisciplinary approach.
Development methods
These development methods (table 2) have now been validated to identify several diagnostic events of interest.25–29 31 35 36 43
Step 1: Identify and prioritise diagnostic error of interest
The choice of which diagnostic error to focus on could be guided by high-risk areas identified in prior research and/or local priorities.44 Because defining error is challenging, we recommend focusing on risk areas where clear evidence exists of a missed opportunity to make a correct or timely diagnosis,1 45–48 since this emphasises preventability (focusing efforts where improvement is more feasible) and accounts for the evolution of diagnosis over time.
Take, for example, a potentially missed diagnosis of lung cancer related to delayed follow-up after an abnormal chest radiograph.26 35 A robust body of literature suggests that poor outcomes and malpractice suits can result from delays in follow-up of abnormal imaging when potential lung malignancies are missed.49–51
Step 2: Operationally define criteria to detect diagnostic error
Developing operational definitions involves creating unambiguous language to objectively describe all inclusion and exclusion characteristics to identify the event. For example, an operational definition of ‘unexpected readmission’ might be ‘unplanned readmission to the same hospital for the same patient within 14 days of discharge’. In many cases, standard definitions will not exist and will need to be developed by patient safety and clinical stakeholders. Published literature, clinical guidelines from academic societies, and input from clinicians, staff and other stakeholders with expertise or involvement in related care processes will allow development of rigorous criteria matched to local processes and site characteristics. Consensus may be achieved by having a designated team review and approve all final criteria or by using Delphi-like methods52 with iterative revisions based on individual feedback and re-review by the group.
In the lung cancer example above, defining what is an ‘abnormal’ radiograph, what counts as a follow-up action and what length of time should be considered a ‘delay’ may seem straightforward but, in the absence of any standards, is a key step. ‘Abnormal radiographs’ could include any plain chest radiograph where the radiologist documents findings suspicious for new lung malignancy, and ‘timely’ follow-up could include repeat imaging or a lung biopsy performed within 30 days of the initial radiograph. While the 30-day time frame is longer than what is required to act on an abnormal radiograph, it is short enough to ‘catch’ an abnormality before clinically significant disease progression, allowing an opportunity to intervene. Consensus on this time frame might involve primary care physicians, pulmonologists, oncologists and patient safety experts, and definitions may vary from site to site. The criteria should also exclude patients where additional diagnostic evaluation is unnecessary, such as those with known lung cancer or terminal illness.
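To illustrate how an operational definition leaves no room for interpretation once every term is pinned down, the ‘unexpected readmission’ definition given earlier in this step can be sketched in Python. This is only a sketch under stated assumptions: the `Admission` record, its field names and the choice to count day 14 as inside the window are hypothetical and would need to be agreed locally.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Admission:
    """Hypothetical admission record; field names are illustrative only."""
    patient_id: str
    hospital_id: str
    admit_date: date
    discharge_date: Optional[date]  # None while the patient is still admitted
    planned: bool                   # True for elective or scheduled admissions


def is_unexpected_readmission(index: Admission, later: Admission,
                              window_days: int = 14) -> bool:
    """Unplanned readmission of the same patient to the same hospital
    within `window_days` of discharge (assumption: day 14 is included)."""
    if index.discharge_date is None:
        return False  # the index stay has not ended, so no readmission yet
    if later.patient_id != index.patient_id:
        return False
    if later.hospital_id != index.hospital_id:
        return False
    if later.planned:
        return False
    gap = (later.admit_date - index.discharge_date).days
    return 0 <= gap <= window_days


index_stay = Admission("p1", "h1", date(2024, 1, 1), date(2024, 1, 5), False)
return_stay = Admission("p1", "h1", date(2024, 1, 12), None, False)
print(is_unexpected_readmission(index_stay, return_stay))
```

Each clause of the written definition maps to one check, which makes it easier for the consensus group to see exactly what is, and is not, being counted.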
Step 3: Determine potential electronic data sources
The nature and quality of the available data will determine to what extent the trigger can reliably capture the desired cohort, and operational definitions will often require refinements based on available data. All safety triggers ultimately involve manual medical record reviews to both validate (during trigger development) and act on (during trigger implementation) trigger output. EHR built-in functionality may provide sufficient data access and querying capabilities for e-trigger development where only a few simple criteria are required, but a data warehouse may be required when numerous inclusion and exclusion criteria are present. In addition to access to clinical and administrative data, e-trigger development relies on query software to develop, refine and test algorithms, as well as temporary storage for holding data from identified records.
Structured data, unstructured data, or both can be used. Structured, or ‘coded’, data, such as International Classification of Diseases (ICD) codes and lab results, can be used to objectively identify data items. More advanced text mining algorithms, like NLP, can optionally be added to an e-trigger to allow use of the vast amounts of unstructured (ie, free-text) data, particularly when a structured data field for a key criterion does not exist, but the relevant data are contained in progress notes or reports. For example, a structured Breast Imaging Reporting and Data System (BIRADS) code may be helpful in detecting possible cancers on mammography results; however, no analogous coding system is widely used for detecting liver masses on abdominal imaging tests. Instead, an NLP algorithm could scan abdominal imaging result text for radiologist interpretations describing the presence of liver masses.53 While NLP methods are being actively explored, barriers to further deployment include limited access to shared data for comparisons, lack of annotated data sets for training, lack of user-centred development and concerns regarding reproducibility of results in different settings.54 Deployment of NLP systems usually requires an expert developer to build algorithms specific to the concepts being queried in the free-text data, and these algorithms are often not easily reused in subsequent projects.54 This may put NLP-based triggers beyond current user capabilities, requiring more developer support and limiting wider use. Similarly, machine learning, where computers act without being explicitly programmed, can help develop and improve triggers.55 Such algorithms could potentially ‘learn’ to identify patterns in clinical data and make predictions on potential diagnostic errors. However, application of machine learning to make triggers ‘smarter’ requires more research and development and is not ready for widespread implementation.
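As a rough illustration of the simplest kind of free-text criterion (detecting specific words in a report), the following Python sketch scans an imaging report for a mention of a liver mass. It is a keyword match with a crude same-sentence negation check, not an NLP system and not the method used in the cited work; the word lists and the function name `mentions_liver_mass` are assumptions made for this example and would need clinical review and validation against real reports.

```python
import re

# Hypothetical vocabularies; a real trigger would need locally validated terms.
MASS_PATTERN = re.compile(r"\b(?:mass(?:es)?|lesions?|nodules?)\b", re.IGNORECASE)
LIVER_PATTERN = re.compile(r"\b(?:liver|hepatic)\b", re.IGNORECASE)
NEGATION_PATTERN = re.compile(r"\b(?:no|without|negative for)\b", re.IGNORECASE)


def mentions_liver_mass(report_text: str) -> bool:
    """Return True if any sentence names the liver and a mass-like finding
    and contains no simple negation cue."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", report_text):
        has_liver = LIVER_PATTERN.search(sentence) is not None
        has_mass = MASS_PATTERN.search(sentence) is not None
        negated = NEGATION_PATTERN.search(sentence) is not None
        if has_liver and has_mass and not negated:
            return True
    return False


print(mentions_liver_mass("There is a 3 cm hypodense mass in the right hepatic lobe."))
print(mentions_liver_mass("No focal liver lesion. Gallbladder is normal."))
```

A rule this simple will misclassify hedged, historical or cross-sentence statements, which is one reason such criteria need the record review described in the later steps.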
Step 4: Construct an e-trigger algorithm to obtain cohort of interest
The clinical logic for selecting a cohort of interest can be converted into the necessary query language to extract electronic data. This requires individuals with database and query programming expertise, such as Structured Query Language (SQL) programming knowledge. A detailed understanding of the clinical event of interest and of the available data sources is needed to generate patient cohorts for subsequent validation, which requires clinical experts to work closely with the query programmer.
While inclusion criteria will initially identify at-risk patients, a robust set of exclusions is needed to narrow down the population of interest. These exclusions could remove patients in hospice care or those unlikely to have a diagnostic error, such as patients where timely follow-up actions were already performed (eg, imaging test or biopsy done within 30 days) or patients hospitalised electively for procedures rather than unexpected admissions after primary care visits. The remaining cohort will include an enriched sample of patients with the highest risk for error.
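The inclusion and exclusion logic described above might be expressed as a single SQL query. The sketch below uses Python's built-in `sqlite3` module with an in-memory database so that it runs on its own. The table names, column names, action codes and demo rows are all hypothetical; a real data warehouse schema will differ and the query would need to be adapted to it.

```python
import sqlite3
from typing import List

# Hypothetical minimal schema; dates are stored as ISO 'YYYY-MM-DD' text.
SCHEMA = """
CREATE TABLE radiographs (patient_id TEXT, study_date TEXT, suspicious INTEGER);
CREATE TABLE followups   (patient_id TEXT, action_date TEXT, action_type TEXT);
CREATE TABLE exclusions  (patient_id TEXT, reason TEXT);
"""

TRIGGER_QUERY = """
SELECT DISTINCT r.patient_id
FROM radiographs AS r
WHERE r.suspicious = 1                      -- inclusion: suspicious radiograph
  AND NOT EXISTS (                          -- exclusion: evaluation unnecessary
        SELECT 1 FROM exclusions AS e
        WHERE e.patient_id = r.patient_id)
  AND NOT EXISTS (                          -- exclusion: timely follow-up done
        SELECT 1 FROM followups AS f
        WHERE f.patient_id = r.patient_id
          AND f.action_type IN ('repeat_imaging', 'lung_biopsy')
          AND julianday(f.action_date) - julianday(r.study_date)
              BETWEEN 0 AND :window_days)
ORDER BY r.patient_id
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few invented patients."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO radiographs VALUES (?, ?, ?)", [
        ("A", "2024-01-02", 1),  # suspicious, no follow-up at all
        ("B", "2024-01-02", 1),  # suspicious, biopsy after 18 days
        ("C", "2024-01-02", 1),  # suspicious, imaging after 59 days
        ("D", "2024-01-02", 1),  # suspicious, but in hospice care
        ("E", "2024-01-02", 0),  # radiograph not suspicious
    ])
    conn.executemany("INSERT INTO followups VALUES (?, ?, ?)", [
        ("B", "2024-01-20", "lung_biopsy"),
        ("C", "2024-03-01", "repeat_imaging"),
    ])
    conn.execute("INSERT INTO exclusions VALUES ('D', 'hospice')")
    return conn


def run_trigger(conn: sqlite3.Connection, window_days: int = 30) -> List[str]:
    """Return 'e-trigger positive' patient IDs for manual record review."""
    rows = conn.execute(TRIGGER_QUERY, {"window_days": window_days}).fetchall()
    return [row[0] for row in rows]


print(run_trigger(build_demo_db()))
```

Keeping the follow-up window as a query parameter makes it straightforward to rerun the same logic with a different time frame if the consensus definition changes.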
Step 5: Test e-trigger on data source and review medical records
Depending on algorithm complexity, individual inclusion and exclusion criteria should be validated by reviewing a small sample of records. This may isolate potential algorithm or data-related issues (eg, additional ICD codes that need to be considered) not immediately apparent when initially testing the full algorithm. For instance, a small sample of records will reveal whether exclusions such as terminal illness, known lung cancer, imaging testing within 30 days and biopsy testing within 30 days were indeed made accurately.
Application of the full e-trigger algorithm yields a list of patients at high risk of diagnostic error (‘e-trigger positive’ patients). Medical records of e-trigger-positive patients should be reviewed by a clinician to assess for presence or absence of diagnostic error. For instance, when timely follow-up was performed at an outside institution or when the return visit was planned and mentioned only in a free-text portion of a progress note, the record will be a false positive. Reviews also help determine whether initial criteria require refinement to increase future predictive value. A review of patients excluded from the cohort (‘e-trigger negative’ patients) may identify information to help refine the e-trigger (eg, ensuring appropriate data are used for selection and whether additional patient information should be incorporated into the trigger). Directed interviews of involved clinicians (eg, physicians, nurses) and subject matter experts may also yield information to modify criteria.42
Step 6: Assess e-trigger algorithm performance
Several assessment measures can be used to evaluate e-triggers, including positive predictive values (PPV) based on the number of records flagged by the e-trigger tool confirmed as diagnostic error on review (numerator) divided by the total number of records flagged (denominator).56 If ‘negative’ records (ie, those not flagged by the trigger) are reviewed, negative predictive values (NPV; number of patients not flagged by the e-trigger who are confirmed to have no diagnostic error divided by all patients not flagged by the e-trigger), sensitivity and specificity can additionally be calculated. Use of criteria to select higher risk populations will often yield higher PPVs (eg, including lipase orders to identify patients presenting with acute abdominal pain to the emergency department).57 Trade-offs will often be needed to achieve the best discrimination of patients of interest from patients without the target or event of interest. e-Triggers with higher PPVs will reduce resources spent on manual confirmatory reviews, while those with higher NPVs will miss fewer records that contain the event of interest. With uncommon events, such as seen in patient safety research, it may only be possible to provide an estimated NPV by reviewing a modest sample of records (eg, 100) because the number of ‘e-trigger negative’ records that need to be reviewed to find a single event is vast and cost prohibitive. Higher sensitivity may be desirable for certain e-triggers where the importance of all events being captured outweighs the additional review burden introduced by false positives.
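The measures defined above can be computed directly from review counts. The following Python sketch assumes that each reviewed record has been classified as a true positive, false positive, false negative or true negative; the function name `trigger_performance` and the counts in the usage example are invented for illustration and are not results from any study.

```python
from typing import Dict, Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def trigger_performance(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Optional[float]]:
    """Compute standard performance measures from manual review counts.

    tp: flagged records confirmed as diagnostic error
    fp: flagged records with no diagnostic error
    fn: unflagged records found to contain a diagnostic error
    tn: unflagged records confirmed to have no diagnostic error
    """
    return {
        "ppv": _ratio(tp, tp + fp),          # confirmed errors / all flagged
        "npv": _ratio(tn, tn + fn),          # confirmed non-errors / all unflagged
        "sensitivity": _ratio(tp, tp + fn),  # errors flagged / all errors
        "specificity": _ratio(tn, tn + fp),  # non-errors unflagged / all non-errors
    }


# Invented counts: 50 flagged records and 100 sampled unflagged records reviewed.
results = trigger_performance(tp=30, fp=20, fn=2, tn=98)
for name, value in results.items():
    print(name, value)
```

When only a sample of unflagged records is reviewed, the NPV from the sample is an estimate, and sensitivity and specificity computed from raw counts would need to be weighted by the sampling fraction before they can be interpreted.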
The PPV helps plan for human resources to review records and to act on e-trigger output. Clinical personnel would intervene in high-risk situations to prevent harm, whereas patient safety personnel would investigate events and factors that contributed to errors. Process improvement and organisational learning activities would follow. Reviews and actions for missed opportunities to close the loop on abnormal test results will require just a few minutes per patient, allowing a single individual to handle many records per week. However, others related to whether a cognitive error occurred and subsequent investigation and debriefs about what transpired will take much longer.
Step 7: Iteratively refine e-triggers to improve trigger performance
Using the knowledge gained from the previous steps, the e-trigger may be iteratively refined to improve capture of the defined cohort. This may involve simply changing the value of a structured field or potentially redesigning the entire algorithm to better capture the clinical event. Similar to initial development, revision should be informed by content experts and iterative review of the available data. Clinicians can also suggest revisions based on clinical circumstances.
Trigger implementation and use
The ultimate goal of Safer Dx e-trigger development is to improve patient safety through better measurement and discovery of diagnostic errors by leveraging electronic data. After e-trigger tools are developed and validated to capture the desired cohort of patients with acceptable performance, safety analysis activities and potential solutions can result based on what is learnt.58 59 Use of e-triggers as diagnostic safety indicators is promising for identifying historic trends, generating feedback and learning, facilitating understanding of underlying contributory processes and informing improvement strategies. Additionally, certain e-triggers can help health systems intervene to prevent patient harm if applied prospectively.
In addition to having leadership support, health systems will need to either leverage existing infrastructure or build the additional infrastructure necessary to develop and implement diagnostic safety triggers. In organisations with advanced safety programmes, development and implementation will require only modest additional investment of resources; but for others in initial stages, trigger tools could provide a useful starting point. We envision many validated algorithms could be freely shared across institutions to reduce development workload.60 61 All health systems will need to convene a multidisciplinary team to harvest knowledge generated by the e-trigger tool. This group should address implementation factors related to how best to use e-trigger results, including who should receive them and how. Prospective application requires that these findings be communicated to clinicians so that they can take action. Traditional methods of communicating such findings have posed challenges62; thus, additional work to reliably deliver such information is needed. Certain detected events may require further investigation, and dedicated patient safety and process improvement teams will need time and resources to collect and analyse data and recommend improvement strategies. Such a group should be composed of clinicians involved in the care processes, informaticists, patient safety professionals and patients, and should bring together multidisciplinary expertise for understanding data and safety events and for creating and implementing effective solutions. While there is a need to invest in additional resources and infrastructure, building such an institutional programme could make significant advances in the quality, accuracy and timeliness of diagnoses.
Discussion
We demonstrate the application of a knowledge discovery framework to guide development and implementation of e-triggers to identify targets for improving diagnostic safety. This approach has shown early promise to identify and describe diagnostic safety concerns within health systems using comprehensive EHRs.25–28 35 This discovery approach could ensure progress towards the goal of using the EHR to monitor and improve patient safety, the most advanced and challenging aspect of EHR use.8 63
Application of the Safer Dx Trigger Tools Framework is not without limitations. First, a sizeable proportion of healthcare information is contained in free-text notes or documents. This may necessitate use of NLP-based methods if e-trigger performance is inadequate to detect the cohort of interest, but NLP requires additional expertise, and methods to improve portability and accessibility of NLP tools are still being explored.64 65 Use of statistical models to estimate the likelihood of a diagnostic error or machine learning55 66 to program a computer to ‘learn’ from data patterns and make subsequent predictions may allow subsequent improvements in performance. Maturation of these techniques will stimulate the development and use of more sophisticated and effective e-trigger tools. Second, data availability and quality remain important issues that impact trigger feasibility and performance. Even at organisations that provide comprehensive and longitudinal care, we have found data sharing across institutions to be incomplete, requiring deliberate processes to actively collect and record external findings.60 This highlights the need for more meaningful sharing of data across institutions in a manner computers can use. Efforts to improve data sharing are already under way, but are in early stages (eg, view-only versions of data from external organisations). As data sharing improves, e-trigger tools will have better opportunities to impact patient safety.67 Furthermore, even when all care is delivered within a single organisation, absent, incomplete, outdated or incorrect data can affect trigger tool performance. Similarly, certain elements of patients’ histories, exams or assessments may not be recorded in the medical record, limiting both e-trigger performance and subsequent chart reviews used to verify trigger results.68 69 However, this is a limitation of most current safety measurement methods.
Conclusion
Use of HIT and readily available electronic clinical data can enable better patient safety measurement. The Safer Dx Trigger Tools Framework discussed here has potential to advance both real-time and retrospective identification of opportunities to improve diagnostic safety. Development and implementation of diagnostic safety e-trigger tools along with institutional investments to do so can improve our knowledge on reducing harm from diagnostic errors and accelerate progress in patient safety improvement.
References
Footnotes
Contributors All authors contributed to the development, review and revision of this manuscript.
Funding Work described is heavily drawn from research funded by the Veterans Affairs Health Services Research and Development Service CREATE grant (CRE-12-033), the Agency for Healthcare Research and Quality (R18HS017820) and the Houston VA HSR&D Center for Innovations in Quality, Effectiveness and Safety (CIN 13–413). Dr Murphy is additionally funded by an Agency for Healthcare Research & Quality Mentored Career Development Award (K08-HS022901), and Dr Singh is additionally supported by the VA Health Services Research and Development Service (Presidential Early Career Award for Scientists and Engineers USA 14-274), the VA National Center for Patient Safety, the Agency for Healthcare Research and Quality (R01HS022087) and the Gordon and Betty Moore Foundation. Drs Sittig and Thomas are supported in part by the Agency for Healthcare Research and Quality (P30HS023526). These funding sources had no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; and preparation, review or approval of the manuscript.
Competing interests None declared.
Patient consent Not required.
Provenance and peer review Not commissioned; externally peer reviewed.