‘This time is different’: physician knowledge in the age of artificial intelligence
Gurpreet Dhaliwal
Department of Medicine, University of California San Francisco, San Francisco, California, USA
Correspondence to Dr Gurpreet Dhaliwal; gurpreet.dhaliwal{at}ucsf.edu

Great diagnosticians are often portrayed as recognising rare diseases that evade the efforts of mere mortals. This makes for great TV and local legends, but does not reflect daily practice, where the most common diagnostic challenge is discriminating between common conditions like pneumonia and heart failure or appendicitis and gastroenteritis.

Questions about how to train the brain to make those distinctions are central to the efforts of many clinician educators. An unresolved issue is whether the structure of knowledge (about diseases and diagnostic pathways) in the physician’s long-term memory or the clinician’s mode of cognition (intuitive or analytical thinking) is the stronger determinant of diagnostic success. A study1 in this issue of BMJQS sheds light on that debate, but also invites a broader question: is physician cognition still essential for this task at all?

A test of lookalikes

In a two-phase experiment, Mamede et al 1 asked 68 internal medicine residents to recall from memory the key clinical features of six conditions (vitamin B12 deficiency, inflammatory bowel disease, hyperthyroidism, adrenal insufficiency, appendicitis, endocarditis). Physicians were categorised as high knowledge (HK) or low knowledge (LK) based on their recall of discriminating features, which are essential to differentiate one condition from common competing diagnoses.

One week later, the residents were given related clinical vignettes and asked to render a diagnosis. Half of the vignettes had a salient distracting feature (SDF), a clinical finding that may prompt the physician to suspect a condition other than the correct diagnosis. For example, a vignette of a confused patient included a family history of dementia, which was irrelevant in the face of strong evidence for vitamin B12 deficiency. The authors used the SDF as a model for activating the anchoring heuristic, which is a tendency to adhere to an early judgement triggered by a data point. Essentially, the authors created a trap and wanted to see who fell for it—and why.

The main outcome was the frequency of a specific diagnostic error—selecting the SDF-concordant diagnosis (eg, selecting dementia instead of vitamin B12 deficiency). The authors found that the LK and HK residents had equal diagnostic accuracy for cases without an SDF. But when a case had an SDF, LK physicians were more likely to offer the diagnosis linked to the SDF than HK physicians. This raised the question of why HK physicians were less susceptible to the trap. Was it due to their greater knowledge or because they were more analytical? The authors used a proxy of analytical reasoning, time taken to make a diagnosis, to find out.

When faced with distracting information, both LK and HK residents thought longer about the case, but the LK physicians were still more likely to get it wrong. Extended deliberation did not explain the diagnostic differences between the two groups, but knowledge did.

Analysing the analytical mode

Mamede et al’s study aligns with other investigations which find that steering the brain toward the analytical mode has limited prospects for improving diagnosis.2–5 Vignette studies lack ecological validity (eg, text cases cannot capture the context of a patient encounter),6 but their design allows for the interrogation of concepts that inform medical education based on sound logic but limited evidence. Such studies have helped dispel notions that there is a hierarchy between intuition and analysis,7 8 that heuristics and biases are reliably mitigated by analytical thought,9 or that heuristics trade accuracy for efficiency.10

Heuristics are human predispositions to think; they attract the name ‘bias’ when things go wrong. When we are well informed and well practised on a matter, these predispositions (eg, early perception of mottled skin in a patient with impending sepsis) are incredibly powerful and are a hallmark of expertise. When we are ill-informed or inexperienced, these predispositions (eg, latching on to a report of night sweats and immediately favouring tuberculosis) are a liability and get assigned a name like anchoring bias.11 Heuristics turn into biases in the void of expertise. The study by Mamede et al illustrates that knowledge can immunise against bias12 but that analytical thinking cannot reliably overcome it.

The equal time the LK and HK residents spent contemplating the SDF cases suggests that the analytical mode is heterogeneous across clinicians. When diagnosing, a physician can spend their time in analytical mode seeking contradictory evidence, rejoicing in confirmatory evidence or pondering irrelevant evidence. One thing is certain though: we all have the best chance of making use of that mode when we possess the relevant knowledge. If I am unaware that ataxia, macrocytosis and neutrophil hypersegmentation point to vitamin B12 deficiency, no amount of analysis will get me to the right diagnosis. But maybe a computer will.

Machine learning

This may be an odd moment to highlight the power of the brain in sorting between competing diagnoses. Machine learning systems have convincingly demonstrated the capacity to distinguish melanoma from a benign nevus13 or detect diabetic retinopathy,14 often matching or exceeding the accuracy of physicians.15 To some, these milestones herald the declining need for physicians to store information in long-term memory. For others, this is a familiar story.

Every advance in information technology, from the written word to the printing press to the internet, has sparked debate about the impact on learning and knowing. Mass-produced books in the 15th century triggered fears that the incentive to memorise and internalise concepts would be diminished. The internet brought a similar prediction—that humans (particularly students) would be liberated from memorisation because every fact would be available at their fingertips. Memorisation would be replaced with critical thinking skills. Recall would be replaced by search skills.

But it was not so. Infinite access did not produce infinite wisdom for any subject.16 It still takes a sustained intellectual investment to speak about a topic critically. Owning an English–Japanese dictionary does not make someone fluent in Japanese; that still takes unrelenting memorising, practice and failure. And physicians rarely have time to research the questions that arise in their daily work17; instead, they continue to practise almost entirely using their knowledge and experience. But now, each of these tasks (erudite academic treatises produced with one well-crafted prompt, real-time language translation, and instantaneous answers to medical questions) has been performed admirably by artificial intelligence systems. Such remarkable feats make me think that perhaps this technological revolution is truly different from its predecessors.

Yet, back in the clinic and hospital, patients arrive with unique scenarios and intersecting problems that neither the clinician nor the computer has ever seen before. For the time being, the upper hand in generating new solutions at the point of care goes to clinicians. Although the computer has the advantage in accessing and remixing all the old solutions that humans have created, it lacks understanding of context, emotion and common sense and operates without the deep conceptual knowledge that physicians draw on when old solutions fail.18

Memory is dead. Long live memory

Memorisation in the sense of keeping basic facts on speed dial forever is waning in relevance. But the importance of human knowledge in solving complex health problems remains as relevant as ever.

Medicine is a multifaceted endeavour that asks more of the clinician than rendering a diagnosis at one point in time. It requires skills in teamwork, communication, health system navigation, and troubleshooting. For the focused portion of the job that requires making A versus B diagnostic comparisons, computers may soon learn enough to outperform clinicians. I suspect they would do a splendid job on the cases in Mamede et al’s study19 and that they will continue to amaze us by yielding insights that evade the human eye and brain, such as detecting the fingerprints of atrial fibrillation in a normal sinus rhythm ECG.20 But if experience is any guide, whenever technology solves one problem, it creates another.

Genetic tests establish variants of unknown significance, head-to-toe imaging leaves incidentalomas in its wake and complex risk equations change adjectives into numbers (eg, ‘medium risk’ becomes 9%). Technology rarely resolves uncertainty—it just shifts the frontier.21 And when it does, it is still up to the physician to manage it. As the complexity and stakes of the uncertainty relocate—perhaps from ‘is this congestive heart failure or pneumonia?’ to ‘what are the stakes for this matriarch if we get that diagnosis wrong given her recent transplant, recurrent Clostridium difficile infections and family turmoil?’—the situation will call for a compassionate clinician who has deep knowledge, rich experience and an endless desire to grow both.

I love the optimism in the phrase ‘this time is different,’ and I sincerely hope it is true. But for now, the results of Mamede et al’s study bolster my standing advice to students and residents: do not worry about machine learning; worry more about how you can become a learning machine.

Ethics statements

Patient consent for publication

Ethics approval

Not applicable.

References

Footnotes

  • Contributors GD conceived of, drafted and revised the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests GD is a board member of the Society to Improve Diagnosis in Medicine.

  • Provenance and peer review Commissioned; internally peer reviewed.
