
Why patients’ disruptive behaviours impair diagnostic reasoning: a randomised experiment
  1. Sílvia Mamede1,
  2. Tamara Van Gog2,
  3. Stephanie C E Schuit3,
  4. Kees Van den Berge4,
  5. Paul L A Van Daele3,
  6. Herman Bueving5,
  7. Tim Van der Zee2,
  8. Walter W Van den Broek1,
  9. Jan L C M Van Saase3,
  10. H G Schmidt2
  1. Institute of Medical Education Research Rotterdam, Erasmus Medical Center, Rotterdam, The Netherlands
  2. Department of Psychology, Erasmus University Rotterdam, Rotterdam, The Netherlands
  3. Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands
  4. Department of Internal Medicine, Admiraal de Ruyter Hospital, Goes, The Netherlands
  5. Department of General Practice, Erasmus Medical Center, Rotterdam, The Netherlands
  Correspondence to Dr Sílvia Mamede, Institute of Medical Education Research Rotterdam, Erasmus Medical Center, Wytemaweg 80, Ae-242, Rotterdam 3015CN, The Netherlands; s.mamede{at}


Background Patients who display disruptive behaviours in the clinical encounter (the so-called ‘difficult patients’) may negatively affect doctors’ diagnostic reasoning, thereby causing diagnostic errors. The present study aimed at investigating the mechanisms underlying the negative influence of difficult patients’ behaviours on doctors’ diagnostic performance.

Methods A randomised experiment with 74 internal medicine residents. Doctors diagnosed eight written clinical vignettes that were exactly the same except for the patients’ behaviours (either difficult or neutral). Each participant diagnosed half of the vignettes in a difficult patient version and the other half in a neutral version in a counterbalanced design. After diagnosing each vignette, participants were asked to recall the patient's clinical findings and behaviours. Main measurements were diagnostic accuracy scores, time spent on diagnosis and the amount of information recalled from patients’ clinical findings and behaviours.

Results Mean diagnostic accuracy scores (range 0–1) were significantly lower for difficult than neutral patients’ vignettes (0.41 vs 0.51; p<0.01). Time spent on diagnosing was similar. Participants recalled fewer clinical findings (mean=29.82% vs mean=32.52%; p<0.001) and more behaviours (mean=25.51% vs mean=17.89%; p<0.001) from difficult than from neutral patients.

Conclusions Difficult patients’ behaviours induce doctors to make diagnostic errors, apparently because doctors spend part of their mental resources on dealing with the difficult patients’ behaviours, impeding adequate processing of clinical findings. Efforts should be made to increase doctors’ awareness of the potential negative influence of difficult patients’ behaviours on diagnostic decisions and their ability to counteract such influence.

  • Decision making
  • Diagnostic errors
  • Medical education



Faulty clinical reasoning has been observed in most diagnostic errors,1 ,2 and the literature has suggested that doctors’ emotions may play a role in causing reasoning flaws.3 Doctors’ emotional reactions to patients whose behaviours make the doctor-patient interaction particularly distressing, often named ‘difficult patients’, have been said to negatively affect clinical decisions.3–5 In a companion paper, we report on a study that provides initial experimental evidence for this claim.6 Doctors made more mistakes when diagnosing clinical cases with difficult patients than with neutral patients, even though the cases were exactly the same except for the patients’ behaviours. How difficult patients’ behaviours hinder doctors’ reasoning has yet to be determined.

Generating a diagnosis requires matching findings from the problem at hand to the representation of a particular disease that the doctor has stored in memory.7 Usually, an intuitive, largely automatic, recognition of a disease ‘pattern’ in the case is followed by a (more or less thorough) analytical review of relevant findings against the mental representation of that disease to verify the initial diagnosis. In the process, alternative disease hypotheses may be activated that may require further analytical review of the findings. So, in the diagnosis of a clinical case, intuitive and analytical processes alternate.8

How could doctors’ emotional reactions to disruptive patients’ behaviours affect this process? At least three hypotheses can be put forth based on psychological research on how emotions can negatively affect decision-making.9–11 (1) A ‘premature closure’ hypothesis. The intuitive first impression of a difficult patient may be so overwhelming that the doctor undervalues subsequent findings and sticks to his/her (possibly wrong) initial diagnosis. Studies have shown that decisions are indeed often dominated by an initial emotional appraisal of the problem, ignoring information that is presented subsequently.12 ,13 (2) An ‘intrusive thoughts’ hypothesis. The doctor may try to override an (erroneous) intuitive diagnosis by engaging in analytical processes to review the case findings, but thoughts associated with the emotional reaction to the difficult patient interfere with his/her thinking. Even if individual findings are extensively processed, intrusions would hinder a coherent understanding of the problem. Research has shown that emotions may indeed affect analytical mental processes,14–16 by triggering irrelevant thoughts.17 (3) A ‘resource depletion’ hypothesis. The emotion-triggering patient's behaviours may capture so many of the doctor’s mental resources that fewer resources are left to deal with the clinical findings of the case, thereby impairing decision-making. There is some evidence that people have only limited mental resources available, and problem-solving is hampered when emotion-provoking elements of the problem capture many of these resources.18–20

All three hypotheses predict diagnostic accuracy to be poorer with difficult patients. Because doctors’ reasoning processes cannot be observed directly, distinguishing between the three hypotheses requires measuring the by-products of those processes. Two by-products are particularly relevant: time needed to make a diagnosis and the nature and the amount of information remembered from a case. (It is known that information that has been processed more extensively tends to be recalled better and vice versa.)21 The premature closure hypothesis suggests that, under the difficult patient condition, doctors would tend to quickly reach a decision based on their first intuitive response, without extensively processing any part of the case. They would, therefore, spend less time and recall less about the patient's clinical findings and behaviours compared with the neutral condition. The intrusion hypothesis assumes that doctors’ reasoning, under the difficult patient condition, is repeatedly interrupted by emotional intrusions forcing them to repeatedly ‘restart’ thinking. A diagnostic decision would then take more time and, as the case findings are processed more extensively, they would be better recalled (however without producing a coherent understanding of the case). Finally, the resource depletion hypothesis assumes that attentional (mental) resources are limited and similar, independent of condition. Time to reach a diagnosis would then be equal under the two conditions. As the emotional experiences with the difficult patient capture much of the mental resources (and are therefore extensively processed), fewer resources remain for processing the clinical findings. Consequently, compared with the neutral condition, doctors in the difficult patient condition would recall more information about the patient's behaviour but less about the related clinical findings.

This experiment tested these predictions (summarised in table 1). We expected difficult patients’ behaviours to adversely affect diagnostic accuracy but made no a priori hypothesis about which of the three mechanisms explains this negative effect.

Table 1

Differential predictions of the three hypotheses (‘lower’ means lower than the neutral condition; ‘higher’ means higher than the neutral condition; ‘equal’ means equal to the neutral condition)



Participants were 74 first-year and second-year internal medicine residents (mean age 29.35 years; SD=2.22; 46 female) from the Erasmus MC, Rotterdam, the Netherlands. All residents (80) attending two educational meetings in March 2014 were invited to participate in the study, and volunteers were recruited. No incentive was provided for participation.

As the nature of the experiment prevented disclosure of its objectives beforehand, participants were informed about their tasks and debriefed later. All participants signed consent to use their data.

Materials and procedure

The study used eight written clinical cases, prepared by two board-certified internists (SCES, PLAVD) by adapting cases used in previous studies.22–24 All cases were prepared by internists based on real patients and had a confirmed diagnosis. Each case consisted of a brief description of a patient's medical history, present complaints, symptoms and findings from physical examination and diagnostic tests. The diagnoses were: inflammatory bowel disease, acute viral hepatitis, coeliac disease, Addison's disease, liver cirrhosis, aortic dissection, appendicitis, hyperthyroidism.

A fragment of text describing the patient's behaviours either in the present or in previous visits was added to each case. The fragment described either a difficult patient's behaviours or a neutral patient's behaviours, thereby leading to two versions of each clinical case. Three coauthors (WWVdB, KVdB, SM) prepared the patient behaviour descriptions based on experience with real patients and on the literature on difficult patients.4 ,5 ,25–27 We developed portrayals of patients with the following behaviours: (1) a ‘frequent demander’;25 (2) an aggressive patient; (3) a patient who questions his doctor's competence; (4) a patient who ignores his doctor's advice; (5) a patient who has low expectations of his doctor's support; (6) a patient who presents herself as utterly helpless; (7) a patient who threatens the doctor; and (8) a patient who accuses the doctor of discrimination (box S-1 in the online supplementary file presents a set of representative clinical cases, each one in the two versions). The complete set of clinical cases and patients’ portrayals is available on request.

The study employed a balanced within-subjects incomplete block design in which each participant diagnosed difficult-patient and neutral-patient cases. Rather than having participants diagnose the same case twice (with the second decision almost certainly affected by the first one), each participant diagnosed half of the cases in the difficult patient version and the other half in the neutral patient version, but which case was diagnosed in each version differed between participants. The cases were counterbalanced in such a way that, at the level of the group of participants, all the difficult versions and the neutral versions were diagnosed the same number of times. This balanced design was made possible by preparing four different variations of the materials in the computer program Qualtrics, which was employed to run the experiment. The four variations counterbalanced the order and the version in which each of the eight cases was presented (difficult or neutral), and participants were randomly assigned to one of the four variations. Each participant first diagnosed the eight clinical cases (four neutral and four difficult patient cases). After diagnosing each case, they performed the recall task, typing in all the information that they could remember from the case. Subsequently, each vignette was presented again, and participants rated, on a 5-point Likert item, how likable the patient was. The Qualtrics software automatically registered participants’ responses and the time required for each response.
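The counterbalancing logic described above can be illustrated with a minimal sketch. The article states only that four variations counterbalanced order and version; the two mirrored version patterns, the forward and reversed orders, the blocked random assignment and every name in the code are assumptions made for illustration, not the authors' actual scheme.

```python
import random

CASES = list(range(1, 9))  # eight clinical cases with hypothetical labels 1..8


def build_variations():
    """Four hypothetical variations: 2 version patterns x 2 presentation orders."""
    # Pattern A makes odd-numbered cases difficult; pattern B is its mirror image.
    pattern_a = {c: "difficult" if c % 2 else "neutral" for c in CASES}
    pattern_b = {c: "neutral" if c % 2 else "difficult" for c in CASES}
    orders = [CASES, CASES[::-1]]  # forward and reversed order (assumed)
    return [[(c, pattern[c]) for c in order]
            for pattern in (pattern_a, pattern_b)
            for order in orders]


def assign(participant_ids, seed=0):
    """Shuffle participants, then alternate version patterns so both are used equally."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Variation index = pattern * 2 + order; the pattern alternates with each participant.
    return {pid: (i % 2) * 2 + (i // 2) % 2 for i, pid in enumerate(ids)}


def version_counts(variations, assignment):
    """Count how often each case was seen in each version across all participants."""
    counts = {c: {"difficult": 0, "neutral": 0} for c in CASES}
    for var_idx in assignment.values():
        for case, version in variations[var_idx]:
            counts[case][version] += 1
    return counts


variations = build_variations()
assignment = assign(range(74))  # 74 participants, as in the study
counts = version_counts(variations, assignment)
print(counts[1])
```

Under these assumptions every participant sees four difficult and four neutral cases, and each case is diagnosed equally often in each version at the group level, which is the property the design requires.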

Data analysis

The accuracy of participants’ diagnoses was evaluated by two board-certified internists (PLAVD; JLCMVS) by considering the confirmed diagnosis of each case as a standard. The two internists independently evaluated each diagnosis made by the participants, without knowing the condition under which it was provided, as correct, partially correct or incorrect (scored as 1, 0.5 or 0 points, respectively). A response was considered correct whenever it mentioned the core diagnosis and partially correct when the core diagnosis was not cited, but a constituent element of the diagnosis was mentioned. There was agreement on 86% of the responses; differences were resolved through discussion.
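The scoring rule and the agreement figure above can be sketched in a few lines. The ratings below are invented for illustration, and the function names are hypothetical; only the 1, 0.5 and 0 point values and the idea of simple percentage agreement come from the text.

```python
# Point values taken from the scoring rule described in the text.
SCORES = {"correct": 1.0, "partially correct": 0.5, "incorrect": 0.0}


def percent_agreement(rater_a, rater_b):
    """Percentage of responses that two raters judged identically."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must judge the same responses")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def mean_accuracy(final_ratings):
    """Mean diagnostic accuracy score (range 0-1) over a set of rated diagnoses."""
    return sum(SCORES[r] for r in final_ratings) / len(final_ratings)


# Hypothetical ratings of one participant's four cases in one condition.
rater_a = ["correct", "incorrect", "partially correct", "correct"]
rater_b = ["correct", "incorrect", "incorrect", "correct"]
agreed = ["correct", "incorrect", "partially correct", "correct"]  # after discussion

print(percent_agreement(rater_a, rater_b))  # 75.0
print(mean_accuracy(agreed))                # 0.625
```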

For the analysis of the recall task, two authors (SM, WWVdB) first identified, by a consensus model, the idea units28 (defined in footnote i) referring to clinical findings and to the patient's behaviour present in each case. Subsequently, three authors (SM, KVdB, TVdZ) counted the number of idea units of the two types present in 10% of the participants’ responses, and, as the agreement was high (clinical findings, 88%; patient behaviours, 91%), the count proceeded with a single evaluator.

A repeated-measures ANOVA with patient behaviour as within-subject factor (difficult patient vs neutral patient) was performed on the mean diagnostic accuracy scores to check whether difficult patients cause doctors to make mistakes. To verify the predictions described in the introduction, a similar ANOVA was performed on time needed to make the diagnosis. The predictions were also examined through the analysis of the recall task. A repeated-measures MANOVA with case type (difficult patient vs neutral patient) as within-subject factor was performed on four dependent variables: the percentage of correct clinical findings recalled, the percentage of correct patient behaviour recalled, the frequency of clinical findings incorrectly recalled and the frequency of patients’ behaviours incorrectly recalled.

Finally, a repeated-measures ANOVA compared the ratings for patient likability of difficult and neutral patients.
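For a single within-subject factor with only two levels, as in the ANOVAs above, the F statistic is the square of the paired t statistic, with 1 and n−1 degrees of freedom. The sketch below computes it from per-participant difference scores. The five data points are invented, and the function name is hypothetical; this is not the authors' analysis script.

```python
import math
import statistics


def paired_f(difficult, neutral):
    """F for a two-level within-subject factor, computed as the squared paired t."""
    diffs = [d - n for d, n in zip(difficult, neutral)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    t = statistics.mean(diffs) / standard_error
    return t * t, (1, n - 1)


# Hypothetical mean accuracy scores of five participants in each condition.
difficult = [0.25, 0.5, 0.375, 0.5, 0.25]
neutral = [0.5, 0.5, 0.625, 0.75, 0.375]

f_value, dfs = paired_f(difficult, neutral)
print(round(f_value, 2), dfs)  # 12.25 (1, 4)
```

With 74 participants the same calculation gives the degrees of freedom (1, 73) that appear in the reported F tests.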


Participants made more mistakes when diagnosing difficult patient cases relative to neutral patient cases, F(1, 73)=7.17, p=0.009, partial η2=0.09. Time spent on diagnosis did not differ, F(1, 73)=1.14, p=0.29, partial η2=0.01 (see table 2).

Table 2

Mean diagnostic accuracy scores (range 0–1; SDs in brackets) and mean time spent diagnosing the cases (seconds) as a function of patients’ behaviours, N=74

The analyses of the recall task showed a significant multivariate effect of patient behaviour across information type (V=0.54, F(4, 70)=20.53, p<0.001, partial η2=0.54). Participants recalled fewer clinical findings from difficult than from neutral patient cases (F(1, 73)=16.28, p<0.001, partial η2=0.18) and made more errors (F(1, 73)=14.97, p<0.001, partial η2=0.17), that is, more clinical findings were mistakenly attributed to the patient. By contrast, participants recalled more information about the behaviours of difficult than neutral patients (F(1, 73)=49.72, p<0.001, partial η2=0.45), while the number of errors was similar (F(1,73)=0.25, p=0.62, partial η2=0.003) (see table 3).

Table 3

Mean percentage of clinical findings and patients’ behaviours recalled per case and mean frequency of incorrectly recalled clinical findings and incorrectly recalled patients’ behaviours per case as a function of patient behaviour (SDs in brackets), N=74

The averaged likability ratings were lower for the difficult patient versions than for the neutral versions of the cases (F(1, 73)=317.71, p<0.001, partial η2=0.81).
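The effect sizes reported above follow from the F statistics through the standard identity partial η2 = (F × df effect) / (F × df effect + df error). A minimal sketch of that conversion, using F values quoted in this section (the function name is hypothetical):

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


print(round(partial_eta_squared(7.17, 1, 73), 2))    # 0.09, diagnostic accuracy
print(round(partial_eta_squared(16.28, 1, 73), 2))   # 0.18, clinical findings recalled
print(round(partial_eta_squared(20.53, 4, 70), 2))   # 0.54, multivariate effect
print(round(partial_eta_squared(317.71, 1, 73), 2))  # 0.81, likability ratings
```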


These findings show that difficult patients’ behaviours can indeed adversely affect doctors’ reasoning, causing them to make diagnostic errors. Participants provided less accurate diagnoses when the patient in the vignette displayed difficult rather than neutral behaviours, even though the clinical cases were exactly the same except for the patient's behaviours. No difference in the time needed for diagnosis between difficult and neutral versions of the cases was found. These results are in agreement with findings from a previous study conducted by the same research group.7 More importantly, our current findings also shed light on why our doctors made mistakes, that is, on the underlying cognitive mechanism through which difficult patients’ behaviours affect doctors’ reasoning. The results of the recall task seem at variance with both a premature closure and an intrusion account of the difficult patient phenomenon. The premature closure hypothesis predicted shorter times needed to reach a diagnosis and poorer recall of the case as a whole, that is, including clinical findings and patient behaviours. The intrusion account, on the other hand, predicted longer times to diagnosis and better overall recall. We failed to find support for either hypothesis. No differences were found in time to reach a diagnosis, and participants recalled fewer clinical findings and more patient behaviours in difficult patient than in neutral patient cases.

Taken together, these findings support the resource depletion hypothesis as an explanation for the increase in diagnostic errors in difficult patients. Recall that the assumption was that if limited cognitive resources were allocated to more extensively processing the emotion-provoking behaviours of difficult patients, this would occur at the expense of processing the clinical findings. Total processing time was not expected to be different because doctors would not have more resources available for difficult patients than for neutral patients. Because troublesome behaviours consume part of scarce mental resources, diagnostic accuracy suffers.
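The elimination argument can be made explicit as a simple lookup. The sketch below encodes the predictions as they are described in the text (reading the intrusion account's "better overall recall" as higher recall of both clinical findings and behaviours) and, as a simplification, treats a non-significant difference as "equal". All names are hypothetical.

```python
# Predicted direction of each measure in the difficult condition, relative to neutral.
PREDICTIONS = {
    "premature closure": {
        "time": "lower", "clinical_recall": "lower", "behaviour_recall": "lower"},
    "intrusive thoughts": {
        "time": "higher", "clinical_recall": "higher", "behaviour_recall": "higher"},
    "resource depletion": {
        "time": "equal", "clinical_recall": "lower", "behaviour_recall": "higher"},
}


def consistent_hypotheses(observed):
    """Return the hypotheses whose predictions match the observed pattern exactly."""
    return [name for name, predicted in PREDICTIONS.items() if predicted == observed]


# Pattern reported in the Results: similar time, fewer findings, more behaviours recalled.
observed = {"time": "equal", "clinical_recall": "lower", "behaviour_recall": "higher"}
print(consistent_hypotheses(observed))  # ['resource depletion']
```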

Arguably, the salience of aversive behaviours per se could explain why the participants recalled more behaviours from difficult than neutral patients. However, this increased recall of behaviours from difficult patients was associated with a decreased recall of clinical findings. The salience of the difficult behaviours cannot explain less processing of clinical findings unless the amount of mental resources that doctors (like every person) are able (or willing) to allocate to solving a problem is limited. If this is true, then processing parts of the case more extensively necessarily implies that other parts receive less attention.

This resource depletion hypothesis is consistent with research employing a variety of tasks and participants in other domains.29 When people are requested to solve problems, the same limited supply of mental resources is used for many different processes, including regulating emotions, inhibiting impulses and controlling attention.18 ,19 If some of these processes capture part of the supply, the resources available for solving the problem are depleted, and performance is impaired. In several studies in other domains, participants required to inhibit their emotions (or to prevent their attention from being distracted) while dealing with elements of the to-be-solved problem underperformed compared with participants who did not have to exert control over their cognitive processes.19 ,20 ,30 The adverse influence of resource depletion on performance occurs particularly in problems that require complex thinking such as logical reasoning or thoughtful comprehension.30 These studies employed everyday emotion-provoking stimuli and problems that did not assume any specialised knowledge. Nevertheless, their findings seem to translate to clinical practice. To diagnose most clinical problems, doctors need to systematically review the case findings to distinguish relevant from irrelevant information and integrate findings into a coherent mental representation of the problem, while sustaining attention and overriding incipient responses. If resource depletion affects simpler, everyday problems, it is not surprising that these highly complex cognitive processes are impaired if a substantial proportion of mental resources is seized by the confrontation with emotional experiences triggered by patients’ troublesome behaviours.

The detrimental effect of the confrontation with difficult patients on doctors’ diagnostic performance was substantial, especially if we consider the high stakes of diagnostic decisions. A medium effect size emerged31 even though the methodological approaches that we adopted tend to weaken the effect relative to what it would be in real life. First, our treatment was rather subtle. We depicted the patient's behaviours in a few written sentences, while aversive behaviours presented by real patients might have a stronger effect. Second, we used written vignettes, and all the information necessary to reach a diagnosis was therefore already available. Real patients require doctors to determine by themselves which information is relevant and to gather it, which possibly makes reasoning more vulnerable to flaws. The adverse influence of difficult patients on diagnostic reasoning observed with internal medicine doctors in the present study replicates the findings of our previous study with general practitioners (HG Schmidt et al, Submitted 2015). This replication of the difficult patient phenomenon with doctors from different specialties, in a different phase of their training, solving partially different clinical cases, speaks to the generalisability of the findings.

Throughout medical education and professional practice, students and doctors are advised not to let their emotional reactions to patients interfere with their clinical judgments. Doctors seem to believe that they in fact succeed in doing so.32 The poor awareness of the influence of doctors’ emotions on clinical judgments is reflected in the literature on diagnostic errors, which has given little attention to the problem.3 Our findings indicate that efforts are required to make students and practising clinicians aware that difficult behaviours of patients can interfere with diagnostic reasoning, adversely affecting clinical judgments. Interventions that can decrease doctors’ vulnerability to such interference also deserve the attention of researchers. We are not aware of any approach whose effectiveness has been investigated. Nevertheless, research in other domains provides some indication that regular practice of exercises that exert control over one's thoughts or feelings makes people less susceptible to depletion of mental resources while solving problems.33 ,34 These studies employed simple, daily life problems. It is for future research to say whether they are of any use for the conception of interventions that help doctors counteract the influence of emotional reactions to difficult patients.

Our study has limitations. First, we investigated three potential underlying mechanisms of the decrease in diagnostic performance in difficult patients. We are not aware of research that provides a basis for a different cognitive explanation of the phenomenon, but it cannot be ruled out that another such mechanism exists. Second, the limited experience of our participants may also be viewed as a limitation of the study. It cannot be said whether findings would hold for more experienced doctors. It is possible that, over time, doctors somehow develop an ability to prevent their emotional reactions from interfering with their reasoning. That would imply that more experienced physicians’ diagnostic reasoning would suffer less under these conditions. However, having dealt with many difficult patients might make negative attitudes towards them stronger and, consequently, more easily activated in the mind.10 Emotional reactions would then tend to be more frequent, making experienced doctors more susceptible to the resource depletion mechanism. Whether experience with difficult patients helps or hinders is, therefore, still to be determined.


The authors thank the residents who dedicated their limited time to participate in the study. We would like to thank Prof. Geoff Norman for his valuable contribution during the review of the article.




  • Twitter Follow Tim Van der Zee at @Research_Tim

  • Contributors SM, TVG and HGS conceived and designed the study. SM, SCES, KVdB, PLAVD, HB, TVdZ, WWVdB and JLCMVS prepared the materials and acquired the data. SM, TVG, SCES, KVdB, PLAVD, HB, TVdZ, JLCMVS and WWVdB analysed the data. SM and HGS wrote the paper. TVG, SCES, KVdB, PLAVD, HB, TVdZ, WWVdB and JLCMVS revised the paper. All authors approved the final version of the manuscript. SM and HGS contributed equally to the work and are the guarantors. All authors had full access to all of the data, including statistical reports and tables in the study and can take responsibility for the integrity of the data and the accuracy of the data analysis.

  • Competing interests None declared

  • Ethics approval The Ethics Committee (ECP) from the Department of Psychology, Erasmus University Rotterdam, approved both experiments comprised in this study (decision letter issued on 15 December 2011).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Case level data are available from the corresponding author.

  • i An idea unit is the smallest meaningful idea that can be identified in a fragment of text.
