Impact of medical education on patient safety: finding the signal through the noise
Jasmine Hwang, Rachel Kelz
Department of Surgery, University of Pennsylvania, Philadelphia, Pennsylvania, USA
Correspondence to Dr Rachel Kelz, Surgery, University of Pennsylvania, Philadelphia, PA 19104, USA; rachel.kelz{at}

Medical education and patient care are inextricably linked. At this time, given the limitations of simulation training and our continued dependence on educated physicians for clinical decision-making, one domain cannot exist without the other. Because medical trainees are involved in patient care, it is vital that the impact of changes to medical training programmes on patient outcomes is assessed with well-designed studies. The study, “National improvements in resident physician-reported patient safety after limiting first-year resident physicians’ extended duration work shifts,”1 by Weaver and colleagues, published in this month’s issue of BMJ Quality and Safety, signals the need for a robust discussion of education policy research within the field of medicine.

Before specifically addressing the approach used by Weaver et al,1 a review of the complexities involved in studying the impact of medical education policy is warranted.2 The field of ‘education outcomes research’ examines the impact on patients of changes to the educational process and clinical learning environment. As with all robust research, studies must be designed to answer a specific question. Typically, this necessitates an explicit definition of the exposure and outcomes to be examined. The exposures under investigation, such as the change in education policy evaluated by Weaver et al, are often straightforward, but the outcomes are frequently complex. For example, changes to education policy may influence the care delivered by trainees immediately or years later, depending on the nature of the change. A further challenge to a well-conceived education outcomes study is the selection of the outcome measure to be evaluated. Ideally, studies of the impact on patient safety should focus on an objective outcome measure rather than a proxy. However, this may not always be feasible. Finally, changes to education policy or practice are not made in isolation. There are usually multiple other influences occurring at any one time that can impact any level of training (figure 1). As such, each study must carefully consider how other simultaneous influences, such as new technology (eg, robotic surgery) and alterations in practice patterns (eg, shift to outpatient procedures), might impact the study findings. Each of the aforementioned challenges must be addressed in order to permit the conduct of robust, reliable and reproducible work.

Figure 1

Effects of educational shocks on learners and patients across the educational continuum.

To provide context for international readers, in the USA, first-year residents are typically in the first year of postgraduate training after completion of an undergraduate degree and medical school. These physicians-in-training are often referred to as interns or first-year residents. They serve as frontline physicians who place orders for patients and are often the first to receive calls from the nursing staff in the inpatient setting. Long work hours have been associated with resident impairment and medical errors.3 4 In 2011, the Accreditation Council for Graduate Medical Education (ACGME) implemented restrictions to resident duty hours which included an 80-hour limit to the work week, 16-hour limit of continuous in-hospital duty and at least 1 day off every week on average over 4 weeks. A resident’s shift is typically anywhere from 12 to 30 hours. The 16-hour work limit reduced the consecutive number of hours that first-year residents could work in a single shift from 30 to 16. The duty hour reduction was revoked in 2017 because several large well-designed studies demonstrated that flexible work hours were non-inferior to those imposed by the 16-hour limit.5–7

With that background, we can now turn to the work by Weaver et al, which aimed to assess the effects of the ACGME 16-hour work limit for first-year residents on patient safety. The authors found that the 16-hour work rule was associated with a 32% lower risk of resident physician-reported significant medical errors (relative risk (RR) 0.68; 0.64–0.72), a 34% lower risk of reported preventable adverse events (RR 0.66; 0.59–0.74) and a 63% lower risk of reported medical errors resulting in patient death (RR 0.37; 0.28–0.49). The prospective nature of data collection and the use of the same survey instrument for two time periods (2002–2007; 2014–2017) are strengths of the study design. Furthermore, the study subjects represent the full complement of ACGME-accredited specialty programmes, which would normally improve the generalisability of the study findings. However, as only 9% of residents consented to participation, the generalisability may be limited.
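The percentage reductions quoted above follow from the reported relative risks: a relative risk of RR corresponds to a reduction of (1 − RR) × 100%. The short sketch below simply reproduces that arithmetic from the three point estimates cited in this editorial; the function name is illustrative and is not taken from the study.

```python
def percent_risk_reduction(rr: float) -> float:
    """Percentage reduction in risk implied by a relative risk (RR)."""
    return (1.0 - rr) * 100.0

# Point estimates as quoted in the editorial (intervals omitted).
reported_rr = {
    "significant medical errors": 0.68,
    "preventable adverse events": 0.66,
    "medical errors resulting in patient death": 0.37,
}

for outcome, rr in reported_rr.items():
    reduction = percent_risk_reduction(rr)
    print(f"{outcome}: RR {rr:.2f} -> {reduction:.0f}% lower reported risk")
```

These are reductions in self-reported risk relative to the earlier period, not absolute differences in event rates.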

Several design elements of the study limit confidence in the reliability of the findings. First, as acknowledged by the authors, the use of resident perceptions of errors may be subject to recall bias and may introduce self-reporting bias. As a proxy, it is also one step removed from the most important outcome: objectively measured medical error. This study relies on participants to self-report both the true exposure (duty hours) and the outcome (medical errors). It has been well established that residents may falsely report duty hours,8 especially early on in training,9 and that residents tend to selectively report errors depending on the outcome of the error.10 Furthermore, the study participants were asked to report ‘significant’ medical errors, which leaves it to the resident to determine what qualifies as significant. In addition, residents were asked to self-report whether ‘they’ had personally made significant medical errors. This can be difficult to discern, as one individual is rarely to blame for an adverse event: adverse events arise from multistep and complex processes. Adverse events are difficult to capture, and physicians in particular are known for self-reporting not only fewer but also a narrower spectrum of adverse events when compared with nurses or patients.11 12 Regarding the mode of survey distribution, residents are less likely to report adverse events in email queries, as used in this study, than in face-to-face interviews, which may have resulted in under-reporting.13

An additional measurement concern exists because Weaver et al did not consider the expansion of education on quality and safety that occurred in parallel with the implementation of the 16-hour rule. Because discrete educational sessions on safety event reporting result in increased reporting for several weeks,14 it is possible that the study participants enrolled after the 16-hour rule implementation reported a greater proportion of these events because they had recently been prompted to keep errors top of mind. Moreover, a coincident focus on safety culture, attributed to the implementation of the clinical learning environment initiative in 2012, may have increased the tendency to report adverse events10 15 in the post-period as well. If this were the case, it is possible that the protective effect was even greater than estimated by this study. Alternatively, there has been an increased presence of advanced practice providers on inpatient teams and units in the USA who may provide a safeguard for patients and first-year residents against medical errors. This safety feature did not exist prior to duty hour reform and likely expanded alongside the 16-hour work restrictions. These confounders could bias the study results in either direction, leading to either overestimation or underestimation of adverse events. It is difficult to evaluate the magnitude of the effect these simultaneous influences had on learners and patients, or on the results of this study.

The findings of Weaver and colleagues are at odds with several other high-quality published studies that found little to no association between the introduction of the 16-hour work rule and patient outcomes or resident wellness in surgery and internal medicine.16–18 Weaver et al address this discrepancy by asserting that the outcome measures used in other studies, specifically mortality and morbidity, are not directly related to the work of first-year resident physicians and are, therefore, not sensitive enough to detect the adverse impact of extended work hours on patient care. We argue against this assumption because first-year residents are often first responders and can provide early interventions that save lives. Similarly, the most common non-technical adverse event in surgery is medication related.19 As the primary caretakers of their patients, responsible for medication reconciliation and inpatient orders, first-year residents are inherently connected with medication-related adverse events.20 Finally, we agree that residents at different levels in their training may impact various safety measures in different areas and to different extents.

Questions related to best practices in medical education are important to our learners, our patients and the public at large. While there are no magic metrics, education outcomes research should move towards designs that can handle the complexity of the study questions and towards assessment metrics that objectively quantify the outcomes of interest. Multiple approaches to studying the impact of the 16-hour work-hour limit and other changes in medical education are needed, as they can serve to complement one another. Pragmatic trials and observational studies can be used to perform many of the patient-centred assessments. Because of concerns regarding the retrospective analysis in this study, a prospectively designed pragmatic trial, such as that employed by the FIRST trial, in which duty hours would be assigned at the programme level, might have been useful.21 Furthermore, the limited response rate could have been addressed by the use of data from mandatory surveys such as the ACGME resident survey. To address issues of self-reporting, objective data on adverse outcomes reported to state or federal agencies, such as the Centers for Medicare & Medicaid Services, could be used. A difference-in-difference approach would have been helpful to control for temporal factors other than the selected exposure.22 23 New educational approaches, materials and rules should be evaluated just as any new treatment that may impact patient care would be tested before receiving approval. Learner-centred assessments require thought and consideration to make sure they are free from bias and discrimination, particularly as there are many educational shocks throughout the educational continuum.
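To make the difference-in-difference idea concrete, the following minimal sketch computes the basic two-group, two-period estimate: the change in the exposed group minus the change in a comparison group over the same interval. The comparison group, the function name and every number here are hypothetical and chosen only for illustration; none of them come from Weaver et al or from any cited study.

```python
def difference_in_differences(exposed_pre, exposed_post,
                              comparison_pre, comparison_post):
    """Change in the exposed group minus change in the comparison group."""
    exposed_change = exposed_post - exposed_pre
    comparison_change = comparison_post - comparison_pre
    return exposed_change - comparison_change

# Made-up error rates per 100 resident-months (illustration only).
estimate = difference_in_differences(
    exposed_pre=10.0, exposed_post=6.0,        # group subject to the policy
    comparison_pre=9.0, comparison_post=8.0,   # hypothetical unexposed group
)
print(estimate)  # -3.0
```

In this invented example the exposed group improves by 4 and the comparison group by 1, so only 3 of the 4 points would be attributed to the policy; the remainder reflects the shared time trend. The estimate is only as credible as the assumption that both groups would have followed parallel trends without the policy.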

Ultimately, this study provides a new perspective on the effects of the 16-hour consecutive work-hour limit policy on patient care. The authors raise valid concerns about patient safety following the reversal of this policy in 2017. However, given the inherent design limitations, we suggest caution in re-implementing the 16-hour work-hour limit based solely on the findings of this study. Instead, medical errors and workforce safety should be closely monitored to make sure that the clinical environment remains safe following the rollback of the 16-hour work rule. Also, if the authors can collect a more robust sample of responses, a difference-in-difference study design might offer a better understanding of the effect of the 16-hour rule implementation on medical error.

Ethics statements

Patient consent for publication

Ethics approval

Not applicable.



  • Twitter @surgeryspice

  • Contributors Each author contributed to the concept, drafting and critical revision of the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Commissioned; internally peer reviewed.
