Task errors by emergency physicians are associated with interruptions, multitasking, fatigue and working memory capacity: a prospective, direct observation study
  1. Johanna I Westbrook (1),
  2. Magdalena Z Raban (1),
  3. Scott R Walter (1),
  4. Heather Douglas (2)
  1. (1) Centre for Health Systems and Safety Research, Australian Institute of Health Innovation, Faculty of Medicine and Health Sciences, Macquarie University, Sydney, NSW 2109, Australia
  2. (2) School of Psychology and Exercise Science, Murdoch University, Singapore, Singapore
  1. Correspondence to Professor Johanna I Westbrook, Centre for Health Systems and Safety Research, Australian Institute of Health Innovation, Macquarie University Faculty of Medicine and Health Sciences, Macquarie University, Sydney, NSW 2109, Australia; Johanna.westbrook{at}mq.edu.au

Abstract

Background Interruptions and multitasking have been demonstrated in experimental studies to reduce individuals’ task performance. These behaviours are frequently used by clinicians in high-workload, dynamic clinical environments, yet their effects have rarely been studied.

Objective To assess the relative contributions of interruptions and multitasking by emergency physicians to prescribing errors.

Methods 36 emergency physicians were shadowed over 120 hours. All tasks, interruptions and instances of multitasking were recorded. Physicians’ working memory capacity (WMC) and preference for multitasking were assessed using the Operation Span Task (OSPAN) and Inventory of Polychronic Values. Following observation, physicians were asked about their sleep in the previous 24 hours. Prescribing errors were used as a measure of task performance. We performed multivariate analysis of prescribing error rates to determine associations with interruptions and multitasking, also considering physician seniority, age, psychometric measures, workload and sleep.

Results Physicians experienced 7.9 interruptions/hour. 28 clinicians were observed prescribing 239 medication orders which contained 208 prescribing errors. While prescribing, clinicians were interrupted 9.4 times/hour. Error rates increased significantly if physicians were interrupted (rate ratio (RR) 2.82; 95% CI 1.23 to 6.49) or multitasked (RR 1.86; 95% CI 1.35 to 2.56) while prescribing. Having below-average sleep showed a >15-fold increase in clinical error rate (RR 16.44; 95% CI 4.84 to 55.81). WMC was protective against errors; for every 10-point increase on the 75-point OSPAN, a 19% decrease in prescribing errors was observed. There was no effect of polychronicity, workload, physician gender or above-average sleep on error rates.

Conclusion Interruptions, multitasking and poor sleep were associated with significantly increased rates of prescribing errors among emergency physicians. WMC mitigated the negative influence of these factors to an extent. These results confirm experimental findings in other fields and raise questions about the acceptability of the high rates of multitasking and interruption in clinical environments.

  • medication safety
  • communication
  • emergency department
  • human factors
  • interruptions

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/

Introduction

Increasing attention has been placed on the demands that the clinical work environment imposes on individual providers and teams, particularly as healthcare organisations strive for greater efficiency in high-throughput services such as emergency departments (EDs). Healthcare is a complex adaptive system, and as such stakeholders learn, adapt and self-organise.1 Individuals both influence and are influenced by others. The work of emergency physicians provides a window into the nature of this complex adaptive system. Individual physicians apply various strategies to manage their workload in the face of often unpredictable demands to deliver safe care to patients.2 3 The success of their task management will be influenced by their individual choices and the way in which team members interrelate to them. Relatively little research has characterised these work management behaviours or investigated the extent to which they are effective and safe.4–6 Communication within clinical teams has been considered an important element in error production.7 Multiple studies8–12 have demonstrated interruption rates for emergency physicians are high,13–15 and more recently, studies have identified multitasking as a frequently used15–17 and encouraged18 strategy to handle competing demands.

Experimental psychological research has sought to identify the ways an individual’s task performance may be impacted by interruptions and multitasking.19 20 These experiments show negative effects due to the additional cognitive demands these behaviours incur. Simulation studies of driver distraction have demonstrated significant hazards to task performance when drivers attempt to multitask, from listening to a passenger21 22 to using a mobile phone.23 Very few studies have attempted to investigate these same effects among clinicians. One simulation showed interruptions to nurses were associated with increased chemotherapy administration errors,24 and a second found anaesthetists who immediately responded to an interruption all failed to check a blood product before transfusing to a patient.25 A study26 of the effects of multitasking on diagnostic decision-making found reduced performance when subjects were asked to listen to, and remember, verbal updates about other patients. Studies in real-world clinical settings require direct observation and are rare.11 27 Reasons for this include the significant methodological challenges of studying these phenomena in real-world settings and the difficulty in identifying task errors that can be feasibly and reliably measured.

Individual characteristics may influence task errors. Experiments show an individual’s working memory capacity (WMC) is inversely correlated with task errors.28 Working memory allows for the temporary storage and active maintenance of task-relevant information in the face of distractions.29 30 Interruptions and multitasking make demands on WMC by requiring individuals to process information unrelated to their primary task and thus increase cognitive load.31 Individuals high in WMC are better able to actively maintain information in the focus of attention and can more efficiently retrieve information momentarily displaced due to disruption.29 WMC has thus been implicated as a variable that influences interruption effects32 33 and predicts the ability to multitask effectively.34 As such it has been hypothesised that individuals with lower WMC scores may exhibit increased task times and more errors when interrupted.35 WMC has been shown to be negatively associated with age.36 37

Fatigue similarly has been demonstrated to reduce task performance during clinical simulation studies, but there is very limited empirical evidence of the effects of fatigue on performance in clinical field studies.38–40

The aim of this study was to assess the relative contributions of interruptions and multitasking by emergency physicians to prescribing errors, while also considering a range of individual, physician and contextual characteristics such as WMC, preference for multitasking, age, seniority, workload and fatigue.

Methods

Design and participants

We conducted a quantitative, direct observation study of ED physicians in a 440-bed teaching hospital in Sydney, Australia, which annually treats ~50 000 patients. Doctors were invited to participate, comprising resident medical officers (RMOs), senior RMOs (SRMOs), registrars and staff specialists/consultants.

Data collection

Direct observations

An information session was held for physicians to inform them about the study. Physicians were informed that the study was to investigate work patterns in the ED. Individual physicians were then approached to participate. Following informed consent, observers closely shadowed physicians for intervals of up to 3 hours during day shifts (8:00–18:00) between July and October 2014. Observation periods of >3 hours when collecting such detailed data are taxing on observers and increase the risk of a reduction in data quality. A sampling matrix was prepared to ensure representation across the shift and doctors by seniority. Observers used this matrix to select participants by clinician group (eg, consultant, RMO) to observe at specific times of the day. For each participant, age, gender and position were recorded. The Work Observation Method by Activity Timing (WOMBAT) observational approach, which allows multidimensional capture of clinical tasks (http://aihi.mq.edu.au/project/wombat-work-observation-method-activity-timing), was applied.41 WOMBAT has previously been used to study clinicians in a range of countries and settings.14 42–44 Observers used the WOMBAT software on handheld computers to record tasks, interruptions and multitasking, which are automatically time-stamped when entered. Task definitions have been published in our study protocol.45 During prescribing tasks (which occurred on paper charts), observers recorded the patient’s medical record number to allow for subsequent review to identify and classify prescribing errors.45

Two observers conducted the observations. Study variables were strictly defined,45 and multiple training sessions (>30 pilot hours) tested the consistent collection of these in the field. Inter-rater reliability was assessed by two methods. First, we compared the proportions of tasks between observers, as well as proportions of time within task categories for the main analysis variables prior to and at several points during the study using Monte Carlo permutation tests. No significant differences between the two observers were detected using a significance level of α=0.05. We also applied Cohen’s kappa to the data divided into 1 s time windows to assess agreement on task type. For the five inter-rater reliability sessions conducted throughout the study period, the kappa score ranged from 0.65 to 0.82 showing good agreement between observers.46
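The agreement statistic described above can be illustrated with a minimal sketch. It assumes each observer's record has already been split into aligned 1 s windows carrying one task label each. The function name, the task labels and the ten-window example are hypothetical and are not study data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(rater_a)
    # Observed agreement: share of windows where both observers chose the same task.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the observers' marginal proportions, summed over labels.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical labels for ten consecutive 1 s windows (illustration only).
observer_1 = ["prescribe"] * 4 + ["talk"] * 3 + ["document"] * 3
observer_2 = ["prescribe"] * 3 + ["talk"] * 4 + ["document"] * 3
kappa = cohens_kappa(observer_1, observer_2)
```

In this made-up example the observers disagree on one window out of ten, and kappa discounts the raw 90% agreement by the agreement expected from the marginal label frequencies alone.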

Definitions

A prescribing task was defined as a physician writing one or more medication or fluid orders for administration to a patient while in hospital. An interruption was defined as an observable external stimulus resulting in a change in a physician’s task. Multitasking was defined as conducting two tasks in parallel (eg, typing on a computer while verbally answering a question). A physician may self-initiate multitasking or multitask in response to an external stimulus.

Assessment of prescribing errors

Following observations, an experienced hospital pharmacist reviewed medication orders observed during the study, while blind to the observational data. Medication orders were assessed for legal/procedural errors (eg, unapproved abbreviations, missing drug units), clinical errors (eg, wrong drug due to a drug–disease interaction) and severity using previously applied definitions.47 All clinical errors were verified by a second pharmacist and differences resolved by consultation. Prior to this verification, inter-rater reliability between the two pharmacists for error classification was assessed using a random sample of orders (kappa=0.64).46

Measurement of WMC, polychronicity and sleep

The WMC of physicians was assessed using an automated version of OSPAN.48 49 On a computer participants are presented with a series of trials involving an alternating sequence of arithmetic equations and to-be-remembered consonants. Participants must judge the correctness of the equation and encode the consonant for subsequent recall. The trials consist of 75 letters and 75 sums in total. OSPAN correlates well with other measures of WMC, has both good internal consistency (α=0.78) and test–retest reliability (0.83),49 and takes approximately 20 min to complete.

Polychronicity (ie, preference for multitasking and a belief that this is efficient) was assessed using the adapted version of the Inventory of Polychronic Values (IPV).50 Participants respond to 10 items (eg, ‘I believe people do their best work when they have many tasks to complete’) rated on a seven-point scale (strongly agree to strongly disagree). The IPV takes approximately 5 min to complete. The median internal scale reliability has been reported as 0.84, with test–retest reliability between 0.78 and 0.95.50 Physicians each completed the OSPAN and IPV prior to observation sessions. OSPAN and IPV scores are largely unaffected by an individual’s levels of fatigue, stress or time of day.51–54

After each observation session, physicians were asked about their hours of sleep in the preceding 24 hours, and whether this amount was average, below average or above average for them.

ED workload

Workload was measured with a modified version of an existing metric55 which estimates the time-specific ratio of patients to doctors, weighted for triage scores. This measure improves on the use of the number of daily presentations as it takes into account temporal variation and patient-mix.56 Workload data were extracted from the ED’s electronic information systems and merged with the observational WOMBAT data so that the time-specific workload measures were synchronised with the observational data of individual physicians. Patient age was included as a further potential indicator of patient complexity.

Statistical analysis

The error rate (error count per medication order) was modelled separately for clinical and legal/procedural errors using multivariate Poisson regression applied to task-level data. The binary variables representing whether a task received any interruptions and whether there was any multitasking were always retained in the models as they represent a key study focus. Other independent variables considered for inclusion comprised doctors’ seniority, age, gender, OSPAN score and self-reported sleep, patient age and time-specific department workload. A generalised estimating equations approach was used to account for correlation of task-level error rates within individual doctors. Model fit was assessed by testing the equality of the model deviance and the df, which was not significant, indicating satisfactory fit. A change-in-estimate method was used to select variables. This is preferable to P value-based automatic variable selection.56 Following a process of deletion and reselection of variables with progressively narrowing criteria, the final variables were retained if they changed at least one of the coefficients for interruptions or multitasking by >10%. The models had a minimum of 10 prescribing tasks per variable, which is adequate for estimation of coefficients and CIs.57 Models were implemented with the GENMOD procedure using SAS software V.9.4 (SAS Institute).
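The retention rule in the change-in-estimate step can be sketched as follows. This is an illustration under stated assumptions, not the SAS code used in the study. It assumes the change is measured relative to the coefficient from the model that includes the candidate variable, which the text does not specify. The function names, dictionary keys and coefficient values are hypothetical.

```python
def relative_change(full_coef, reduced_coef):
    """Proportional change in a coefficient when a candidate variable is dropped."""
    if full_coef == 0:
        raise ValueError("full-model coefficient must be non-zero")
    return abs(reduced_coef - full_coef) / abs(full_coef)

def keep_variable(full, reduced, focus=("interrupted", "multitasked"), threshold=0.10):
    """Retain the candidate if dropping it shifts any focus coefficient by more than threshold."""
    return any(relative_change(full[name], reduced[name]) > threshold for name in focus)

# Hypothetical log rate ratios (illustration only, not study estimates).
with_candidate = {"interrupted": 1.04, "multitasked": 0.62}
without_candidate = {"interrupted": 0.88, "multitasked": 0.60}
decision = keep_variable(with_candidate, without_candidate)  # interruption coefficient moves ~15%
```

The rule keeps a covariate because of its confounding effect on the two exposures of interest, not because of its own statistical significance.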

Results

Participant characteristics

In total, 36 of 39 physicians (92%) approached to participate agreed. Three declined, citing a preference not to be shadowed. Our final study sample represented ~50% of the total ED medical staff. We did not study medical staff at night or on weekends. These staff represented the remaining 50% of staff who worked in the department. Doctors were shadowed for 120 hours over 58 sessions. Twenty-eight doctors were observed prescribing: 5 (18%) RMOs, 9 (32%) SRMOs, 8 (29%) registrars and 6 (21%) consultants. Each physician was observed between one and three times, with an average of 1.6 sessions.

WMC, polychronicity and sleep

The mean OSPAN score was 40.9 out of 75 (SD 18.5). There were no significant differences in mean scores by doctor seniority (P=0.11). However, consistent with previous findings on WMC, scores decreased significantly with age (P=0.03).37

Results from the IPV showed that our participants demonstrated a neutral response to working on multiple things at once (score 3.87; SD 1.15). There were no significant differences in IPV scores by physician seniority or age (P=0.17).

Sleep-related questions were completed after 56 of the 58 observation sessions (after two sessions the physician was not available to answer these questions). Average sleep was reported in 64% of sessions (n=36), below average in 20% (n=11) and above average in 16% (n=9). The mean amount of sleep reported was 6.6 hours (range 4.0–8.5). Mean hours of sleep self-reported as ‘below average’, ‘average’ and ‘above average’ were 5.6, 6.7 and 7.8, respectively.

Prescribing errors

A total of 106 prescribing tasks, comprising 239 medication orders, for 69 patients (mean age 64 years) were reviewed for errors. A total of 208 errors were identified, 27 clinical (0.4/patient) and 181 legal/procedural (2.6/patient). Overall, 144 (60%) medication orders had ≥1 prescribing errors. The overall prescribing error rate was 0.87 errors/order; the clinical error rate was 0.11/order and legal/procedural error rate was 0.76/order. Tables 1 and 2 present examples of errors identified. Most errors (n=196, 94.2%) were rated as of insignificant or minor severity, and 12 (5.8%) of moderate severity (11 clinical and 1 legal/procedural).
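The rates quoted in this paragraph follow directly from the reported counts. The short check below recomputes them. Only the counts come from the text, and the variable names are illustrative.

```python
# Counts as reported in the text.
orders = 239
patients = 69
clinical_errors = 27
procedural_errors = 181
orders_with_error = 144

total_errors = clinical_errors + procedural_errors

# Errors per medication order, rounded as in the text.
overall_per_order = round(total_errors / orders, 2)
clinical_per_order = round(clinical_errors / orders, 2)
procedural_per_order = round(procedural_errors / orders, 2)

# Errors per patient, rounded as in the text.
clinical_per_patient = round(clinical_errors / patients, 1)
procedural_per_patient = round(procedural_errors / patients, 1)

# Percentage of orders containing at least one error.
percent_orders_with_error = round(100 * orders_with_error / orders)
```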

Table 1

Examples of clinical prescribing errors identified

Table 2

Examples of procedural and legal prescribing errors identified

Factors associated with prescribing errors

On average, physicians experienced 7.9 interruptions/hour overall and 9.4 interruptions/hour while prescribing. They spent 4.6% of their overall time multitasking, but 20.1% of their prescribing time.

Multitasking during prescribing was significantly associated with an increased rate of legal/procedural errors (rate ratio (RR) 1.86; 95% CI 1.35 to 2.56). Being interrupted during prescribing showed no evidence of an effect on legal/procedural error rates (RR 1.08; 95% CI 0.77 to 1.51) (table 3). Error rates differed significantly by seniority, with lower rates of legal/procedural errors among junior doctors (eg, residents) compared with their senior (consultants/staff specialists) colleagues (type 3 test P=0.031). A significant type 3 test provides evidence that the outcomes (error rates) in the categories of seniority (eg, RMO, registrar) do not come from the same population.

Table 3

Model estimates of factors associated with prescribing errors

For every one-point increase on the 75-point OSPAN scale, there was a 2% decrease in the legal/procedural error rate (RR 0.98; 95% CI 0.97 to 0.99). This is equivalent to a 19% reduction in the legal/procedural error rate for every 10-point increase in a physician’s OSPAN performance (RR 0.81; 95% CI 0.76 to 0.87). The variables representing patient age, physician age and sex, polychronicity, amount of sleep in the previous 24 hours, time of day and workload did not satisfy the model inclusion criteria.

Clinical errors increased almost threefold when physicians were interrupted (RR 2.82; 95% CI 1.23 to 6.49). These error rates also increased with each year of patient age (RR 1.05; 95% CI 1.02 to 1.08) and physician age (RR 1.07; 95% CI 1.00 to 1.16). Clinical error rates were inversely related to doctor seniority with RMOs having the highest error rate relative to consultants.

WMC was also significantly associated with clinical errors. A 2% decrease in the clinical error rate was observed for each point of increase in OSPAN score (RR 0.98; 95% CI 0.97 to 0.99). This is equivalent to a 19% reduction in the clinical error rate for each 10-point improvement in OSPAN score (RR 0.81; 95% CI 0.71 to 0.92).
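The per-point and per-10-point rate ratios reported for OSPAN are linked through the log-linear form of a Poisson model, in which a k-point change multiplies the rate by the per-point rate ratio raised to the power k. The sketch below applies this to the rounded per-point value. The function name is illustrative. The small gap between the result and the reported 0.81 is presumably because the published figure was derived from the unrounded coefficient, which is an inference and is not stated in the text.

```python
import math

def rescale_rate_ratio(rr_per_unit, units):
    """Rate ratio for a `units`-point change, given the per-point rate ratio."""
    # Coefficients are additive on the log scale, so a change of `units`
    # points multiplies the rate by rr_per_unit ** units.
    return math.exp(units * math.log(rr_per_unit))

rr_10 = rescale_rate_ratio(0.98, 10)   # uses the rounded per-point RR from the text
percent_reduction = (1 - rr_10) * 100  # roughly an 18% reduction from the rounded input
```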

Where doctors reported lower than average sleep in the previous 24 hours, the clinical error rate was >15 times greater than when doctors reported average sleep (RR 16.44; 95% CI 4.84 to 55.81). Physician gender, polychronicity, time of day and workload were not associated with clinical prescribing errors.

Discussion

Interruptions, multitasking, WMC and errors

To our knowledge, this study is the first to investigate and demonstrate significant negative associations of interruptions and multitasking with physicians’ error rates in a ‘real-life’ clinical environment.58 When interrupted, emergency physicians’ rate of clinical prescribing errors significantly increased, and when they multitasked their rate of legal/procedural prescribing errors increased. Further, our results showed a significant association between an individual’s WMC and their error rates, such that those with lower WMC made more errors.

Task complexity impacts on cognitive load. Being interrupted during more complex tasks taxes an individual’s processing limits which increases error risk.59 60 The most complex part of the prescribing task is to decide on the clinical elements, for example, the right drug. Other elements, such as selecting the correct abbreviations, are semiautomatic cognitive tasks. Thus, our finding that physicians made more clinical errors when interrupted, but legal/procedural errors were not impacted by interruptions may be explained by differences in task complexity. Our finding that multitasking was significantly associated with legal/procedural errors, but not clinical errors, was interesting and warrants further investigation. Experimental evidence indicates that a range of factors, such as the mode of multitasking (ie, whether the tasks being undertaken require similar or different skills, eg, visual, motor, auditory), as well as a subject’s ability to control when multitasking occurs, may influence the impacts observed.19

Multitasking is increasingly valued as a positive attribute in clinical care. In 2011, the American Board of Emergency Medicine model of clinical practice61 added multitasking and the ability to handle interruptions and task-switching as necessary skills to provide optimal care. Our findings suggest that continued encouragement of multitasking may contribute to task errors.

Sleep and prescribing errors

Our results present compelling evidence of the substantial effects of insufficient sleep on clinicians’ task performance. When physicians reported less than average sleep in the previous 24 hours, clinical error rates substantially (>15 times) exceeded those when physicians reported average sleep. Overall, emergency physicians reported receiving less than the recommended hours of sleep for adults (ie, 7–9 hours in 24 hours),62 and those in the ‘below-average group’ (5.6 hours), considerably fewer hours. Concerns about physician fatigue and error have motivated significant policy changes in reducing shift hours.63 64 However, two reviews of sleep and resident physicians64 65 concluded that while simulation and experimental evidence suggests that sleep loss should make a difference to the safety of clinical care, evidence to support this claim is lacking. Our study addresses this evidence deficit.

Implications for clinical work

These results should challenge thinking about the way in which clinicians are expected to manage work demands in dynamic clinical situations. Strategies involving immediate responses to interruptions and high rates of multitasking may be perceived by physicians as time-efficient16 but our results suggest that they may also have negative implications for the safe completion of tasks. Organisational and professional messages conveyed to clinicians are likely to contribute to driving the use of these work-demand strategies.66 This issue is of contemporary importance as demands on the clinical workforce rapidly expand, opportunities to multitask increase with technology and attention increasingly falls on improving work productivity.

We found that prescribing is particularly susceptible to interruption and multitasking compared with other tasks. This result parallels studies of nurses showing higher interruption rates during medication administration compared with other tasks.67 68 Blanket interventions aimed at reducing all interruptions are likely to be ineffective, inefficient and at times unsafe.68–72 Targeted interventions73 including limiting unnecessary interruptions through greater training about their potential effects6; identifying reasons, and reducing the need, for interruptions, for example, by making required information easily available74; redesigning work spaces to allow clinicians to perform more demanding cognitive tasks in areas less open to interruption; and introducing tools, including information technology, which can provide cues to allow more effective recovery from interruptions, should all be considered.68 69 The application of cognitive systems engineering to ED information systems shows promise.75

The individual differences in clinicians’ performance relative to their WMC introduce the possibility of intervening to train doctors in more effective individual work strategies. Individuals may adapt their task management strategies to compensate for lower WMC. For example, Szumowska et al76 found experimental subjects with lower WMC accommodated by being less likely to respond to interruptions through more selective attention to their primary task.

As in previous experimental studies of interruptions and multitasking, the nature of the task errors studied (in our instance prescribing errors—many of which were of minor clinical significance) is of less importance than demonstrating the effects of additional cognitive load on an individual’s performance. The significance of this study lies in applying a robust methodological approach and replicating experimental findings in a real-world clinical setting.19 A limitation of our study was that we only investigated one ED. However, the prescribing error rate observed was comparable to other EDs,77–79 as was the rate of interruptions,8–11 suggesting that our study site is likely to be representative of other large EDs.

Conclusions

In this rare study of emergency physicians, we demonstrated that the frequently used work management strategies of interruption and multitasking had a negative impact on task errors, confirming effects found in experimental studies. We demonstrated that individuals with higher WMC are likely to be better able to operate in this dynamic environment, but for all physicians, adequate sleep appears fundamental to performance. Our results illustrate the complex interplay between individual physician characteristics, communication strategies in a dynamic setting and error production. The results raise interesting new questions about the value of some traditional and accepted ways of working in EDs and how best to prepare and support physicians to deliver safe care in this increasingly demanding environment.80

Acknowledgments

The authors thank Ms Dana Strumpman and Dr Lisa Pont for their expert assistance in the identification and classification of medication errors; and Dr John MacKenzie and the emergency physicians who participated in the study and who provided valuable feedback on the study results.

References

Footnotes

  • Contributors All authors contributed to the development of the detailed study methods and protocol. SRW and MR collected the observational data. HD administered the psychometric tests. MR led the classification and analysis of the prescribing error data. SRW designed and led the statistical modelling. JIW conceived of the study, obtained funding and prepared the manuscript. All authors contributed to the interpretation of results, revisions to the paper and final approval of the manuscript.

  • Funding This study was funded by National Health and Medical Research Council (1054146) and an Australian Research Council Discovery grant (DP160100943).

  • Competing interests None declared.

  • Ethics approval South Eastern Sydney Local Health District Human Research Ethics Committee.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Correction notice This article has been updated since publication as minor typographical errors were missed during proofing.
