
Do patients' disruptive behaviours influence the accuracy of a doctor's diagnosis? A randomised experiment
  1. H G Schmidt1,
  2. Tamara van Gog1,
  3. Stephanie CE Schuit2,
  4. Kees Van den Berge3,
  5. Paul LA Van Daele2,
  6. Herman Bueving4,
  7. Tim Van der Zee1,
  8. Walter W Van den Broek5,
  9. Jan LCM Van Saase2,
  10. Sílvia Mamede5
  1. 1Department of Psychology, Erasmus University Rotterdam, Rotterdam, The Netherlands
  2. 2Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands
  3. 3Department of Internal Medicine, Admiraal de Ruyter Hospital, Goes, The Netherlands
  4. 4Department of General Practice, Erasmus Medical Center, Rotterdam, The Netherlands
  5. 5Institute of Medical Education Research Rotterdam, Erasmus Medical Center, Rotterdam, The Netherlands
  1. Correspondence to Dr Sílvia Mamede, Institute of Medical Education Research Rotterdam, Erasmus Medical Center, Wytemaweg 80, Ae-242, Rotterdam 3015CN, The Netherlands; s.mamede{at}


Background Literature suggests that patients who display disruptive behaviours in the consulting room fuel negative emotions in doctors. These emotions, in turn, are said to cause diagnostic errors. Evidence substantiating this claim is however lacking. The purpose of the present experiment was to study the effect of such difficult patients’ behaviours on doctors’ diagnostic performance.

Methods We created six vignettes in which patients were depicted as difficult (displaying distressing behaviours) or neutral. Three clinical cases were deemed to be diagnostically simple and three deemed diagnostically complex. Sixty-three family practice residents were asked to evaluate the vignettes and make the patient's diagnosis quickly and then through deliberate reflection. In addition, amount of time needed to arrive at a diagnosis was measured. Finally, the participants rated the patient's likability.

Results Mean diagnostic accuracy scores (range 0–1) were significantly lower for difficult than for neutral patients (0.54 vs 0.64; p=0.017). Overall diagnostic accuracy was higher for simple than for complex cases. Deliberate reflection upon the case improved initial diagnostic accuracy, regardless of case complexity and of patient behaviours (0.60 vs 0.68, p=0.002). Amount of time needed to diagnose the case was similar regardless of the patient's behaviour. Finally, average likability ratings were lower for difficult than for neutral-patient cases.

Conclusions Disruptive behaviours displayed by patients seem to induce doctors to make diagnostic errors. Interestingly, the confrontation with difficult patients does not, however, cause the doctor to spend less time on such cases. Time can therefore not be considered an intermediary between the way the patient is perceived, his or her likability and diagnostic performance.

  • Diagnostic errors
  • Medical education
  • Decision making



Doctors are often engaged in clinical encounters that are emotionally charged. However, most of these fall within the limits of what is to be expected in clinical practice: patients who fear that something is wrong with them might respond emotionally in the interaction with their doctor. Yet, some patients display behaviours that make the doctor–patient interaction particularly stressful. These patients have been named ‘difficult’,1,2 ‘heartsink’,3 ‘frustrating’4 or even ‘hateful’,5 labels that express the negative feelings their behaviours arouse in their doctors.6,7

The reasons why some patients are considered difficult are diverse. They include care avoiders; demanding, argumentative or even aggressive patients; patients who do not trust their doctor and ignore his or her advice; and utterly helpless patients.8,9 Some authors assume that such difficult patients are to be characterised as having personality disorders or even major psychopathology.2,6 Encounters with disruptive behaviours displayed in the consulting or the emergency room are by no means rare. Doctors have reported seeing hard-to-deal-with patients in around 15% of the outpatient population.1,2

It therefore comes as no surprise that being confronted with such patients causes emotional reactions in doctors.10 According to Smith and Zimney, the highest degree of emotional reaction is caused by threats to the physician's integrity or self-esteem, followed by situations where patients are demanding or upset.7 Although experiencing an emotional response to such patients is understandable, such a response does not necessarily imply that one's diagnostic ability is negatively affected. There is a literature suggesting that doctors’ reactions to these patients influence their diagnostic decisions adversely.2,7,11 To the best of our knowledge, however, there is to date no empirical evidence to corroborate this hypothesis. The purpose of this article, hence, is to present results of an attempt to fill this gap.

To study the influence of difficult patients’ behaviours on diagnostic accuracy, we presented residents in family medicine with one of two versions of six patient vignettes: a version describing the behaviour of a difficult patient or a version of the same patient but now described without the aversive behaviours. We hypothesised that difficult patients’ behaviours would adversely influence diagnostic reasoning, inducing doctors to make diagnostic errors. We also hypothesised that such an effect would be particularly strong with complex cases. In addition, time to reach a diagnosis was measured. Some authors suggest that doctors avoid extensive processing of the information provided by difficult patients. If this mechanism plays a role, we expected our participants to spend less time on a difficult patient than on his or her neutral counterpart. Subsequently, we presented the participants a second time with the same cases, applying a ‘deliberate-reflection’ procedure encouraging doctors to process each case in a more analytical fashion.12,13 This procedure was invoked to check whether deliberate reflection would help overcome the diagnostic errors initially made.14 Finally, we asked the participants to rate the cases in terms of their likability.



Participants were 63 family practice residents (mean age M=31.34; SD=4.00; 44 women) from the Erasmus Medical Center, Rotterdam. All residents were in the last quarter of the third year of their training. All 83 residents of this group were invited to participate in the study between December 2012 and March 2014, and volunteers were recruited. No incentive was provided for participation. The ethics review committee from the Department of Psychology, Erasmus University Rotterdam approved this study. As the nature of the experiment prevented disclosure of its objectives beforehand, participants were informed about their tasks and debriefed later. All participants signed consent to use their data.


Six clinical vignettes, prepared by two board-certified general practitioners (HB; NW) and based on real patients, were employed in this experiment. All cases had a confirmed diagnosis and consisted of a brief description of a patient's history, complaints, symptoms and findings from physical examination and tests. The diagnoses were (1) community-acquired pneumonia, (2) pulmonary embolism, (3) meningoencephalitis, (4) hyperthyroidism, (5) appendicitis and (6) acute alcoholic pancreatitis. Based on performance patterns in previous studies,14–16 the first three were considered simple cases, and the last three were considered complex.

In each case, a few sentences described aspects of the patient's behaviour. These sentences portrayed either a difficult patient or a neutral patient, effectively producing two versions of the same clinical case. Three coauthors (WWVB; KVB; SM) prepared the descriptions based on the difficult-patient literature reviewed in the Introduction section and on experiences with actual patients. The descriptions of the difficult patients consisted of (1) a ‘frequent demander,’ (2) an aggressive patient, (3) a patient who questions his doctor's competence, (4) a patient who ignores his doctor's advice, (5) a patient who has low expectations of his doctor's support and (6) a patient who presents herself as utterly helpless. In all other respects, the different versions were identical, leading to the same diagnosis. Online supplementary appendix 1 presents an example of two versions of the same case. (The full set of cases is available upon request.)


The study employed a within-subjects design, so each participant was confronted with both difficult-patient and neutral-patient versions of the cases. A full within-subjects design would imply the presentation of both versions of each case to the participants. Such presentation of two versions of the same case would however likely lead to carry-over effects: when one has seen one version, diagnosing the second may become easier (or more difficult). An alternative is to present to each participant half of the cases in difficult and half of the cases in neutral format. In other words: every participant received three neutral and three difficult patients such that all six cases were seen once. Such balanced within-subjects incomplete block design enabled us to compare mean diagnostic performance scores under the two experimental conditions.
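The allocation logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not report how many booklet versions were prepared or how they were generated, so the function names (`make_booklets`, `assign`), the choice to enumerate all possible splits, and the seed are hypothetical.

```python
import itertools
import random

# The six diagnoses named in the Materials description.
CASES = [
    "community-acquired pneumonia",
    "pulmonary embolism",
    "meningoencephalitis",
    "hyperthyroidism",
    "appendicitis",
    "acute alcoholic pancreatitis",
]


def make_booklets():
    """Enumerate every way of showing 3 of the 6 cases in the difficult version.

    Assumption: all 20 possible splits are used. With all splits, each case
    appears in the difficult version in exactly half of the booklets.
    """
    booklets = []
    for difficult in itertools.combinations(CASES, 3):
        booklets.append(
            {case: ("difficult" if case in difficult else "neutral") for case in CASES}
        )
    return booklets


def assign(participant_ids, seed=0):
    """Hypothetical random distribution of booklets with shuffled case order."""
    rng = random.Random(seed)
    booklets = make_booklets()
    assignment = {}
    for pid in participant_ids:
        versions = rng.choice(booklets)
        order = CASES[:]
        rng.shuffle(order)
        assignment[pid] = [(case, versions[case]) for case in order]
    return assignment
```

Every participant in this sketch sees each case exactly once, three in each version, which is the property the incomplete block design relies on.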

Participants were requested to carry out three tasks. First, each participant received a booklet containing the six vignettes, three presented in the difficult and three in the neutral version. Different versions of the booklets were prepared, counterbalancing the order of cases and the version in which each case was presented, and the researcher in charge of each meeting randomly distributed the different versions of the booklets among the participants. The first task required participants to read the case and write down the most likely diagnosis as fast as possible but not at the expense of accuracy. A large digital clock was visible in the room, and the participants were requested to register the time immediately before starting reading the case and after having provided the diagnosis.

After having diagnosed all six cases, the second task requested participants to deliberately reflect upon the cases by following a procedure that has been employed in previous studies.14–16 They were presented with the same cases again, one by one, and requested to (1) read the case; (2) write down the diagnosis previously given for the case; (3) list the findings in the case description that support this diagnosis, the findings that speak against it and the findings that would be expected to be present if this diagnosis were true but that were not described in the case; (4) list alternative diagnoses if the initial diagnosis generated for the case had proved to be incorrect and (5) proceed with the same analysis (step 3) for each alternative diagnosis. Finally, they were to write down their final diagnosis for the case.

Finally, participants were requested to rate, on a five-point Likert scale, how likable the patient was. This was done to check whether our experimental manipulation worked.

Data analysis

The accuracy of participants’ diagnoses was evaluated by considering the confirmed diagnosis of each case as a standard. Two board-certified general practitioners (HB; NW) independently evaluated each diagnosis, without knowing the condition under which it was provided, as correct, partially correct or incorrect (scored as 1, 0.5 or 0 points, respectively). A response was considered correct whenever it mentioned the core diagnosis, and partially correct when the core diagnosis was not cited but a constituent element of the diagnosis was mentioned. For example, in a case of thyrotoxic crisis, ‘hyperthyroidism’ was considered correct, and ‘hypokalaemia-induced muscle weakness’ was evaluated as partially correct. The two experts agreed in 80% of the diagnoses and solved discrepancies through discussion.
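As a minimal sketch of the scoring scheme and the agreement figure described above, the following Python snippet maps the three rating categories to points and computes simple percent agreement between two raters. The ratings shown are invented for illustration and the function names are hypothetical; the paper does not state whether any agreement statistic other than raw percentage was computed.

```python
# Points per rating category, as described in the text.
SCORES = {"correct": 1.0, "partially correct": 0.5, "incorrect": 0.0}


def mean_accuracy(ratings):
    """Mean diagnostic accuracy (range 0-1) for a list of rating labels."""
    return sum(SCORES[r] for r in ratings) / len(ratings)


def percent_agreement(rater_a, rater_b):
    """Share of diagnoses on which two raters gave the same label."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same diagnoses")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical ratings of five diagnoses by two independent raters.
rater_a = ["correct", "incorrect", "partially correct", "correct", "incorrect"]
rater_b = ["correct", "incorrect", "correct", "correct", "incorrect"]

print(percent_agreement(rater_a, rater_b))  # 0.8
print(mean_accuracy(rater_a))               # 0.5
```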

A repeated-measures analysis of variance (ANOVA) with patient behaviour (difficult vs neutral), case difficulty (complex vs simple) and reasoning mode (non-analytical vs reflective) as within-subjects factors was performed on the mean diagnostic accuracy scores. This analysis tested the hypothesis that difficult patients’ behaviours would negatively affect diagnostic accuracy, particularly in complex cases, and that deliberately reflecting upon the case would improve on initial diagnoses. To check whether difficult patients’ behaviours led doctors to speed up the diagnostic process, we performed a repeated-measures ANOVA with patient behaviour and case complexity as within-subjects factors on time spent to make the initial diagnosis. Finally, separate ANOVAs with patient behaviour as within-subjects factor were performed on participants’ ratings for patient likability (see footnote i).
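For readers unfamiliar with the test, the main effect of a two-level within-subjects factor can be illustrated with a simplified sketch: when each participant contributes one mean score per level, the repeated-measures F with (1, n−1) degrees of freedom equals the square of the paired t statistic. The snippet below is not the authors' SPSS procedure, and the scores in the usage note are invented; the function name `paired_f` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_f(difficult, neutral):
    """F statistic for a two-level within-subjects factor.

    Each list holds one mean accuracy score per participant, in the same
    participant order. Returns (F, (df_effect, df_error)).
    """
    if len(difficult) != len(neutral):
        raise ValueError("each participant needs a score in both conditions")
    diffs = [d - n for d, n in zip(difficult, neutral)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # paired t statistic
    return t * t, (1, n - 1)
```

With four hypothetical participants, `paired_f([0.5, 0.5, 1.0, 0.0], [1.0, 0.5, 1.0, 0.5])` gives F of approximately 3.0 on (1, 3) degrees of freedom.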

Significance levels were set at p<0.05 for all comparisons. SPSS V.20.0 (SPSS, Chicago, Illinois, USA) was used for the statistical analyses.


Participants made more errors when diagnosing cases with difficult patients than with neutral patients, independent of case complexity, F(1, 62)=6.05, p=0.017, partial η²=0.09. As expected, overall diagnostic accuracy was higher for simple than for complex cases, F(1, 62)=270.04, p<0.001, partial η²=0.81. Deliberate reflection upon the case improved initial diagnostic accuracy, regardless of case complexity and of patient behaviours, F(1, 62)=10.24, p=0.002, partial η²=0.14. The observed effect sizes range from medium to large.17 No significant interaction effects emerged. See table 1.
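The reported effect sizes can be checked against the reported F values with the standard identity: partial eta squared equals F times the effect degrees of freedom, divided by that product plus the error degrees of freedom. The short Python check below uses only the figures quoted in the paragraph above; the function name and the dictionary labels are illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported in the text.
reported = {
    "patient behaviour": (6.05, 1, 62),
    "case complexity": (270.04, 1, 62),
    "reasoning mode": (10.24, 1, 62),
}

for name, (f, df1, df2) in reported.items():
    print(f"{name}: {partial_eta_squared(f, df1, df2):.2f}")
# patient behaviour: 0.09
# case complexity: 0.81
# reasoning mode: 0.14
```

All three values agree with the effect sizes reported in the text.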

Table 1

Mean diagnostic accuracy scores (range 0–1; SD between brackets) as a function of patient behaviour, case complexity and reasoning mode, N=63

The amount of time (see footnote ii) needed to diagnose the case was similar regardless of the patient's behaviour (difficult patients: M=0.31, SD=0.06; neutral patients: M=0.30, SD=0.07), F(1, 59)=1.70, p=0.20, partial η²=0.03. Participants tended to spend more time to diagnose complex than simple cases, but this difference was not significant (complex cases: M=0.31, SD=0.06; simple cases: M=0.30, SD=0.06), F(1, 59)=3.47, p=0.07, partial η²=0.06. The interaction effect was not significant.
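The time measure was normalised by case length (time per word, as stated in the footnotes). A minimal sketch of that normalisation follows; the time unit, the whitespace-based word count and the function name are assumptions, since the paper does not specify them.

```python
def time_per_word(elapsed, case_text):
    """Time spent on a case divided by the number of words in its description.

    `elapsed` is in whatever unit the clock readings were recorded in
    (unit not stated in the paper); words are split on whitespace.
    """
    n_words = len(case_text.split())
    if n_words == 0:
        raise ValueError("case text is empty")
    return elapsed / n_words


# Hypothetical ten-word vignette read and diagnosed in 3.0 time units.
vignette = "A man presents with fever, cough and pleuritic chest pain."
print(time_per_word(3.0, vignette))  # 0.3
```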

Averaged likability ratings were lower for difficult than for neutral-patient cases, F(1, 60)=103.49, p<0.001, partial η²=0.63, suggesting that our experimental manipulation was successful.


Recent literature suggests that difficult patients may drive their doctors into despair7,10 and even affect their ability to arrive at an accurate diagnosis.2,7,11 The findings reported in this article demonstrate that difficult patients can indeed adversely affect doctors’ diagnostic reasoning. While diagnosing clinical cases that were exactly the same, except for the description of the patient's behaviours, participants provided less accurate diagnoses when the patient presented with disruptive behaviours than with neutral behaviours. The effect of our experimental manipulation was by no means small. Participants made 42% more mistakes with difficult than with neutral patients when the cases were relatively complex. (When the cases were simple, the effect was smaller: 6%.) Given the opportunity to analytically reflect upon the case, the doctors’ diagnoses improved significantly, but this largely left the initial differences intact: in fact, both conditions profited to the same extent. It seems that deliberate reflection, unlike its role in previous studies involving other determinants of diagnostic error such as availability bias,13 was not able to overcome the adverse effect of difficult patient behaviours. This finding attests to the strength of the effect.

How is it possible that identical clinical cases were diagnosed so differently? One possibility we raised was that doctors avoid extensive processing of the information provided by difficult patients. Such tendency to avoid difficult-to-handle patients might lead to ignoring particular signs or symptoms. If this idea were to have any merit, we expected our participants to spend less time on a difficult patient than on his or her neutral counterpart. However, our results did not support this hypothesis. Time on task per se cannot explain our findings. How exactly difficult-patient information is cognitively processed differently from identical information presented by a neutral patient should be the focus of future research. For a first attempt to address this issue see the article by Mamede et al.18

Our study has a number of limitations. First, to demonstrate the effect of difficult patients on diagnostic accuracy, we used written vignettes in which all the relevant information required to reach the most likely diagnosis was provided. There is no way to check to what extent this effect also generalises to actual clinical practice. On the other hand, in real life, doctors have to search for information themselves, making clinical reasoning more demanding and perhaps even more susceptible to mistakes. Additionally, the potential negative effect of difficult behaviours displayed by real patients is likely to be stronger than what can emerge from a few sentences about the patient's behaviour in a written case. If an adverse effect of such subtle manipulation emerged, confrontation with real behaviours is likely to produce even stronger effects. More generally, experimental evidence shows that case vignettes are a good proxy for the study of doctors’ behaviours in real-world settings as they accurately and reliably enable detecting group-level differences in clinical decision making.19–21

A second potential limitation of our approach is that the difficult behaviours may have appeared to the participants as clinical findings associated with another disease, and this may have caused the errors. However, if this were the case, we would have expected identical or similar mistakes given a particular disruptive behaviour. No such patterns of similar mistakes related to specific behaviours were found, making this alternative explanation for our findings unlikely.

Third, our participants were residents, and it is unclear to what extent our findings would apply to more experienced physicians as well. However, as physicians gain experience, they will encounter difficult patients more frequently, which might make emotional reactions more likely to occur. Whether more experienced physicians are better able to counteract their effect (as they would tend to learn how to deal with them) or whether they would be even more harmed by them is still to be studied.

The findings have implications for clinical practice and for medical education. Medical students and doctors, from residents to experienced clinicians, have reported emotional reactions such as anger, fear or impatience towards their patients.6,7 Acknowledging that negative feelings towards patients do occur is indeed not so difficult. Many clinicians have seen their interest turn into impatience by frequent attenders with vague complaints, repeated interruptions during a consultation or insistence in requesting unnecessary tests. Most doctors would, however, tend to deny that these feelings influence their judgements.22 The prevailing view in medicine is that doctors should stay above the emotional pull in clinical encounters, preventing emotional reactions to patients from interfering with reasoning for the sake of clear clinical judgements. Whatever we believe, however, the fact is that difficult patients trigger reactions that may intrude on reasoning, adversely affect judgements and cause errors. It may be more beneficial, therefore, to aim at increasing students’ and doctors’ awareness that emotional responses to difficult patients can affect clinical reasoning and threaten patient safety11 and to enhance their ability to counteract the influence of such emotions. How this goal can be achieved is a subject for future research.


The authors are thankful to the residents who dedicated their limited time to participate in the study and to Dr Niek Wisse, from the Department of General Practice, Erasmus MC, for collaborating, without any compensation, in the analysis of the diagnoses provided by the participants in Experiment 1. We would like to thank Prof. Geoff Norman for his valuable contribution during the review of the article.




  • Twitter Follow Tim Van der Zee at @Research_Tim

  • Contributors HGS, TG and SM conceived and designed the study. SM, SCS, KVB, PLVD, HB, TVZ, WWVB and JLS prepared the materials and acquired the data. HGS, SM, TG, SCS, KVB, PLVD, HB, TVZ, JLS and WWVB analysed the data. HGS and SM wrote the paper. TG, SCS, KVB, PLD, HB, TVZ, WWB and JLVS revised the paper. All authors approved the final version of the manuscript. HGS and SM contributed equally to the work and are the guarantors. All authors had full access to all of the data, including statistical reports and tables in the study and can take responsibility for the integrity of the data and the accuracy of the data analysis.

  • Competing interests None declared.

  • Ethics approval The Ethics Committee (ECP) from the Department of Psychology, Erasmus University Rotterdam approved both experiments comprised in this study (decision letter issued on 15 December 2011).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • i We also asked participants to rate their experience with similar patients’ behaviours, and their experience with the diseases that were central to our study. Since these ratings turned out to be inconsequential for our study (participants had somewhat less experience with patients displaying the difficult behaviours, and their experience with the diseases central to this study was average) they are omitted here. Data can be obtained from the authors.

  • ii To control for differences in case length, we computed time per word for the analysis; M=mean.
