Implementation of diagnostic pauses in the ambulatory setting
  1. Grace C Huang1,2,
  2. Gila Kriegel1,2,
  3. Carolyn Wheaton1,
  4. Scot Sternberg1,
  5. Kenneth Sands3,
  6. Jeremy Richards1,2,
  7. Katherine Johnston2,4,
  8. Mark Aronson1,2
  1. Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
  2. Harvard Medical School, Boston, Massachusetts, USA
  3. Hospital Corporation of America Healthcare, Nashville, Tennessee, USA
  4. Department of Medicine, Massachusetts General Hospital, Boston, USA
  Correspondence to Dr Grace C Huang, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA; ghuang{at}


Background Diagnostic errors result in preventable morbidity and mortality. The outpatient setting may carry heightened risk, given the time constraints, the indolent nature of many outpatient complaints and the single decision-maker practice models that predominate there.

Methods We developed a self-administered diagnostic pause to address diagnostic error. Clinicians (physicians and nurse practitioners) in an academic primary care setting received the tool when seeing urgent care patients who had already been seen in urgent care within the previous 2 weeks. We used pre-intervention and postintervention surveys, focus groups and chart audits 6 months after the urgent care visit to assess the impact of the intervention on participant perceptions and actions.

Results We piloted diagnostic pauses in two phases (3 months and 6 months, respectively); 9 physicians participated in the first phase, and 16 physicians and 2 nurse practitioners in the second phase. Subjects received 135 alerts for diagnostic pauses and responded to 82 (61% response rate). Thirteen per cent of alerts resulted in clinicians reporting new actions as a result of the diagnostic pauses. Thirteen per cent of cases at a 6-month chart audit showed diagnostic discrepancies, defined as differences between the final diagnosis and the initial working diagnosis. Focus groups reported that the diagnostic pauses were brief and fairly well integrated into the overall workflow but would have been more useful as a real-time application targeting patients at higher risk for diagnostic error.

Conclusion This pilot represents the first known examination of diagnostic pauses in the outpatient setting, and this work potentially paves the way for more broad-based systems and/or electronic interventions to address diagnostic error.

  • diagnostic errors
  • cognitive biases
  • primary care


Diagnostic error is a ‘diagnosis that has been missed, wrong, or delayed, as detected by some subsequent definitive test or finding’.1 Claims related to diagnostic error are more than twice as common in the outpatient setting;2 3 contributing factors include short visits, the extended duration over which presentations may evolve and the often benign nature of ambulatory diagnoses, which results in low vigilance for rare illnesses.4

While systems issues in the diagnostic process (eg, failure to communicate abnormal results) play an important role, the largest contributor to diagnostic error is cognitive error.1 5–7 Diagnostic timeouts, which include checklists,8 prompting questions9 or cognitive debiasing strategies,7 have been proposed to combat diagnostic error caused by cognitive factors.8 10 To our knowledge, a cognitive forcing function in the diagnostic process similar to a diagnostic timeout has not been studied in a primary care practice. Therefore, we describe the implementation of ‘diagnostic pauses’ (occurring 2 days after an index visit) in the ambulatory setting.


Methods

Our primary care clinic was situated within an academic medical centre and comprised 63 faculty internists (MDs), 134 internal medicine residents and 9 nurse practitioners (NPs), with 42 000 patients over 100 000 annual visits.

Root-cause analyses of a cluster of diagnostic errors revealed cognitive biases related to multiple visits—anchoring bias (eg, a patient thought to have persistent gout was found to have a fractured toe) and vertical line failure7 (eg, a visit for back pain preceded by a visit for urinary tract infection turned out to be vertebral osteomyelitis). Informed by this pattern of diagnostic risk, a review of the literature3 and consideration of practice logistics, we ultimately defined our high-risk cohort, namely our inclusion criteria, as patients returning for an urgent care visit twice in 2 weeks, with the second visit occurring with a study participant. We excluded patients cared for by residents.

We developed a diagnostic pause tool (box), involving review of the tool by experts in medical cognition, cognitive interviews of clinicians and pilot testing. The final version met our additional goals of being feasible (sent via email) and brief (completed in a few minutes). We queried the scheduling database to identify patients meeting inclusion criteria then emailed an alert to the provider 2 days after the second clinic visit with a link to the diagnostic pause tool.
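The trigger logic—flag a second urgent care visit within 2 weeks and email the alert 2 days after that visit—can be sketched as follows. This is an illustrative reconstruction, not our actual scheduling database query; the record fields (`patient_id`, `visit_type`, `visit_date`) are hypothetical names.

```python
from datetime import date, timedelta

def find_alert_candidates(visits, window_days=14, alert_delay_days=2):
    """Return (patient_id, alert_date) pairs for patients whose urgent care
    visit follows a prior urgent care visit within window_days.
    visits: list of dicts with hypothetical keys patient_id, visit_type,
    visit_date (datetime.date)."""
    alerts = []
    last_seen = {}  # patient_id -> date of most recent urgent care visit
    for v in sorted(visits, key=lambda v: v["visit_date"]):
        if v["visit_type"] != "urgent_care":
            continue
        prior = last_seen.get(v["patient_id"])
        if prior is not None and (v["visit_date"] - prior).days <= window_days:
            # Second urgent care visit within 2 weeks: email the diagnostic
            # pause alert 2 days after this (second) visit.
            alerts.append((v["patient_id"],
                           v["visit_date"] + timedelta(days=alert_delay_days)))
        last_seen[v["patient_id"]] = v["visit_date"]
    return alerts

visits = [
    {"patient_id": "A", "visit_type": "urgent_care", "visit_date": date(2014, 7, 1)},
    {"patient_id": "A", "visit_type": "urgent_care", "visit_date": date(2014, 7, 10)},
    {"patient_id": "B", "visit_type": "urgent_care", "visit_date": date(2014, 7, 1)},
    {"patient_id": "B", "visit_type": "urgent_care", "visit_date": date(2014, 7, 20)},
]
print(find_alert_candidates(visits))  # A qualifies (9 days apart); B does not
```

In practice the exclusion of residents' patients and the restriction to study participants would be additional filters on the same extract.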

Box: Diagnostic pause tool

You are receiving this form because one of your patients has returned for an urgent care visit for the second time in 2 weeks.

  1. What is the working diagnosis for this presentation?

  2. What features of the case go against this diagnosis?

  3. Could it be a can’t miss diagnosis such as ‘cancer or clot’?

    ☐ Doubt it ☐ Maybe ☐ Actually, yes

  4. Having had the chance to reflect on this case and to see test results, do you think you will do anything different?

    ☐ I’m OK with my current plan of action for this patient.

    ☐ Now that I’ve thought about it some more, I will (check all that apply)

      ☐ Look up some information (web, UpToDate)

      ☐ Order another test

      ☐ Make a medication change

      ☐ Have the patient come back for follow-up

      ☐ Refer the patient to a specialist

      ☐ Contact the primary care physician to do one of the above

      ☐ Ask the triage nurse to call patient in 3–5 days

      ☐ Other

We recruited volunteer clinicians through meetings, emails to the entire practice of 72 faculty internists and NPs, and individual outreach. We offered $10-per-patient stipends to participants. By a priori design, we conducted the study in two incremental phases, from May to August 2013 and from July 2014 to January 2015; the first phase was designed as a proof of concept to limit disruption to the clinic practice, with a planned expansion to more clinicians contingent on positive receptivity and encouraging preliminary analyses.

We developed a survey instrument to elicit perceptions about diagnostic reasoning. The postintervention version included additional questions about the diagnostic pause. We piloted the survey among six internists using cognitive pretesting11 and revised the survey accordingly.

We conducted a chart audit after the visit, examining the working diagnosis at the index visit and second visit, and the ultimate diagnosis at 6 months after the second visit. We defined diagnostic discrepancy as a final diagnosis clinically distinct from the working diagnosis, and ‘diagnostic error caused by cognition’ as an instance in which cognitive bias was clearly present in the diagnostic process.

We compared preintervention with postintervention survey results as paired data, using McNemar’s test for dichotomous variables (eg, agreement dichotomised as ‘strongly agree’ or ‘agree’ vs other options) or Wilcoxon signed-rank test for ordinal variables. We conducted two focus groups after each phase, composed of volunteers from the subject population, solicited by group email. Focus group questions centred on the experience of subjects completing diagnostic pauses. We audio-recorded the focus group sessions and used a third-party vendor to transcribe the recordings. We used thematic analysis12 to identify major motifs. Quantitative analyses were conducted using Stata V.13.0.
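The paired comparisons described above can be sketched in Python with toy data (not the study's actual survey responses). The exact McNemar test is computed here as a binomial test on the discordant pairs; the Wilcoxon signed-rank test comes from SciPy.

```python
from scipy.stats import wilcoxon, binomtest

# Hypothetical paired pre/post responses for one dichotomised survey item:
# 1 = 'strongly agree' or 'agree', 0 = other options.
pre  = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
post = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1]

# Exact McNemar test: only discordant pairs carry information; under the
# null, each discordant pair is equally likely to flip in either direction.
b = sum(1 for x, y in zip(pre, post) if x == 1 and y == 0)  # agree -> other
c = sum(1 for x, y in zip(pre, post) if x == 0 and y == 1)  # other -> agree
mcnemar_p = binomtest(b, n=b + c, p=0.5).pvalue

# Wilcoxon signed-rank test for a paired ordinal item (eg, a 1-5 Likert scale).
pre_likert  = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
post_likert = [4, 4, 3, 5, 4, 4, 3, 3, 4, 4]
stat, wilcoxon_p = wilcoxon(pre_likert, post_likert)

print(f"McNemar exact p = {mcnemar_p:.3f}, Wilcoxon p = {wilcoxon_p:.3f}")
```

With small samples such as ours, the exact (binomial) form of McNemar's test is preferable to the chi-squared approximation.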


Results

Nine primary care physicians (PCPs) participated in the first phase, 16 PCPs and 2 NPs in the second. Fifteen were female, the mean number of years since medical (or nursing) school was 16.5 years and the average number of clinic sessions per week was 2.9. Subjects received a total of 135 alerts for diagnostic pauses (mean=6.1 per provider receiving alerts). Subjects completed 82 alerts (response rate 61%) at an average of 6.4 days after the second visit (SD 5.3 days).

Categories of working diagnoses are summarised in table 1. Most (80%) of the chief complaints for the second visits were related to the first visits. In 48 (59%) instances, clinicians identified features inconsistent with the working diagnosis (eg, lack of response to empiric treatment, persistence of symptoms or negative first-line tests). Clinicians reported intending to take additional actions (eg, order another test, refer the patient to a specialist, make a medication change) after a minority (13%) of diagnostic pauses.

Table 1

Diagnostic pause results

Chart reviews 6 months after the index visit revealed 11 (13%) diagnostic discrepancies (table 2), with no significant difference between cases that resulted in additional actions after the pause and those that did not. No definite diagnostic errors caused by cognitive biases were detected; however, four patients received multiple serial courses of antibiotics for infectious symptoms that should have resolved. One diagnostic discrepancy resulted in life-curtailing or life-threatening injury (ie, musculoskeletal back pain was found to be caused by metastatic cancer).

Table 2

Diagnostic discrepancies between initial diagnosis and final diagnosis

Clinicians reported a spectrum of diagnostic reasoning patterns, summarised in table 3. The majority (66%) stated that a diagnosis was often present in their minds by the end of the visit; fewer reported looking up references (3%) or examining the medical record (52%). They reported a heavy reliance on clinical judgement as a driver of diagnostic reasoning (93%), followed by experience (76%) and, less often, intuition (28%). Changes in diagnostic reasoning patterns after the intervention were not statistically significant. Free-text responses revealed sentiments that the diagnostic pauses did not usually help with the target cases, that they were not frequent enough to make a difference and that they were redundant with the self-reflection that already occurs naturally when patients present more than once. Suggestions for improving the intervention centred on the desire for diagnostic pauses to occur at the time of the visit rather than afterwards, and for them to be integrated with the EMR.

Table 3

Subjects’ reported diagnostic reasoning patterns

Thematic analysis of focus group transcripts revealed several common qualitative themes, summarised in table 4. First, participants acknowledged contributors to diagnostic error, which included the isolated nature of decision-making and time pressures. Second, the diagnostic pause was not perceived to take a significant amount of time. Several commented that the pause would have been more valuable in the form of an alert at the time of the visit. Some participants felt that the process had an additional impact on patients not targeted by the diagnostic pause and in different settings. Lastly, they recognised that many of their patients did not seem to need an opportunity for re-reflection and did not necessarily represent the highest risk for a diagnostic error.

Table 4

Thematic analysis of focus groups after diagnostic pauses


Discussion

We implemented a mixed-methods pilot study of diagnostic pauses in an academic ambulatory setting and found that a minority of cases led to a reported change in action as a result of the stimulated reflection. Although several diagnostic discrepancies occurred, no cognitive errors were apparent. Participating clinicians highlighted the benefit of a brief tool and its impact on their diagnostic reasoning in general. To our knowledge, this work represents the first description of a process for ambulatory diagnostic pauses and paves the way for more broad-based systems and/or electronic interventions to address diagnostic error.

A majority of clinicians did not alter their diagnostic plans as a result of the pause. Qualitative results suggested that physicians already have mechanisms to reflect on cases, whether on their own or with colleagues. Additionally, this study took place in a highly resourced academic practice among very experienced clinicians, which may have buffered against the kinds of diagnostic uncertainty that may arise among more junior physicians or in underserved healthcare settings.

Only a few final diagnoses differed from the working diagnosis in this cohort of patients considered high risk for diagnostic error, which suggests that diagnostic accuracy was not a prominent issue. Furthermore, rarely did morbidity or mortality result from these errors; presenting symptoms in the outpatient setting are often self-limited in nature, which makes it difficult to implement a tool that alters the diagnostic process in a meaningful fashion.

Participants reported that the diagnostic pause affected not only the cases included in the study but also influenced their approach to other patients, suggesting that this pilot raised awareness of cognitive pitfalls and may have promoted metacognition13 (‘thinking about one’s own thinking’). This observation also underscores the scalability of cognitive forcing functions in debiasing clinicians,7 8 though it should be noted that curtailing heuristics through cognitive forcing functions should not be viewed as a panacea for diagnostic error.14

One of our findings highlighted the potentially greater value of having a pause in the moment (ie, a true diagnostic timeout) rather than after the visit as we initially designed it. On the one hand, a timeout in the moment (ie, reflection-in-action15) may abort a cognitive bias before it occurs or has time to gain momentum. On the other hand, an after-the-fact reflection (ie, reflection-on-action15) is well supported by the literature, better facilitates measurement of impact and supports retrieval of information, another evidence-based approach.16

This work builds on proposed strategies and research data on diagnostic error. We used a general checklist approach but eschewed resource-intensive differential diagnosis or disease-specific checklists.8 17 We tailored our implementation to target cases with inherent uncertainty18 rather than on all cases to increase the yield of the diagnostic pause. We relied on self-reflection19 to activate more deliberate, analytic pathways of reasoning.20 We enforced a consideration of alternatives, a commonly cited debiasing strategy.7 8 10 19

Among our limitations was that our patient cohort may not have represented the highest risk patients who would benefit from a diagnostic pause. Study participation was voluntary, represented only a small subset of eligible clinicians and was not derived from a random sample of all eligible providers; it may therefore have selected for particularly motivated and/or self-reflective clinicians who are less likely to commit diagnostic errors. The chart review of final diagnoses was not blinded to initial diagnoses and would not have captured data from patients subsequently seeking care outside of our institution. We were only able to identify discrepancies in diagnosis rather than conclusively detect diagnostic error; definitive determination of diagnostic error caused by cognitive factors requires in-depth probing of clinician thinking, which was not performed in our study. Heightened attention to the diagnostic process may have derived primarily from the identification of high-risk patients rather than from the diagnostic exercise itself. Our pilot was not intended, and was underpowered, to detect changes in clinical outcomes. We did not follow up patients who did not undergo a diagnostic pause to identify whether their outcomes differed. Our response rate was moderate and subject to selection bias towards cases with higher diagnostic uncertainty, when in fact circumstances in which clinicians are highly confident about their diagnoses may be more appropriate targets for diagnostic pauses.

In the process of conducting this work, we learnt that identification of the high-risk population is both essential and challenging in instituting a reflective process that yields change. This area of investigation has not been the focus of initiatives to combat diagnostic error and should be part of the research agenda to improve systems in addressing diagnostic error.13 21 Our development process—the identification of an at-risk population for diagnostic error, the workflow design of a diagnostic pause, a mixed-methods evaluation of impact—also depicts a roadmap by which other institutions and practices can individualise interventions to address diagnostic errors. Larger clinical trials will be necessary to assess the effectiveness of diagnostic pauses and other cognitive debiasing strategies in other settings to combat diagnostic error.




  • Funding The study was funded by CRICO/Risk Management Foundation of the Harvard Medical Institutions.

  • Competing interests None declared.

  • Ethics approval Beth Israel Deaconess Medical Center Institutional Review Board.

  • Provenance and peer review Not commissioned; externally peer reviewed.
