Lung cancer, the most common cancer worldwide, has comparatively poor survival in the UK.1 Most patients with lung cancer first present to primary care, but diagnostic delays are well documented: patients with lung cancer have more consultations in primary care before investigation than those with many other cancers.2 In addition, while intervals from presentation to diagnosis have reduced for other common cancers over time, they remain unchanged for lung cancer.3 It has been suggested that missed opportunities for lung cancer diagnosis in primary care may contribute to poor lung cancer survival.4
Primary care physicians, referred to throughout this paper as general practitioners (GPs), have direct access to lung cancer diagnostic tools, including chest X-ray. GPs may not consider lung cancer as a differential diagnosis because patients with lung cancer commonly present to primary care with non-specific symptoms that are more often due to benign causes.5 Non-specific symptoms and rare disease occurrence therefore present diagnostic difficulty for GPs.6 Reducing diagnostic delays requires an understanding of how GPs decide which patients with common, non-specific symptoms to investigate for lung cancer. It is unclear how GPs decide who requires further investigation by chest X-ray or by specialist referral, and inequalities by patient age, gender and socioeconomic circumstances have been identified in retrospective analyses of routine data.1 ,2 ,7 ,8 Most previous research has examined the diagnostic process using retrospective data in patients with cancer only,5 thus missing a key dimension, that is, how GPs decide which patients with symptoms do not require investigation.
Examining decision-making in a standardised way in clinical practice presents substantial methodological challenges.9 ,10 Direct observation of real physician–patient encounters offers no opportunity to control patients' clinical and sociodemographic characteristics, and so requires observation of very large numbers of consultations to obtain the necessary numbers in specific risk or demographic categories. The use of fictional patient profiles (vignettes) can provide a valid and efficient approach to examining clinician behaviour,11 and studies have already produced useful insights into sources of error in clinicians' decision-making processes, due to both patient factors (eg, symptom characteristics)12 and physician factors (eg, cognitive biases).12 ,13 As Blumenthal-Barby and others recognise, however, there are limits to the applicability of written vignettes and other vignette designs that do not simulate key features of real consultations.14 In particular, when vignettes offer little or no opportunity for physicians to seek information from or about the vignette patient, they can inappropriately frame the decision for the physician by cueing what they should notice about the patient or by offering participants only a limited selection of response options. This risks priming participating physicians to consider certain actions and biasing their responses.
In this vignette study, we therefore sought to simulate key features of consultations. We designed a website using interactive multimedia vignettes with videos of actor ‘patients’, which enabled participating GPs to ask questions in their own words and receive real-time responses. We used this intervention in a factorial randomised experimental study to examine GPs' decisions to initiate lung cancer investigation across different combinations of patient clinical and sociodemographic characteristics.
We constructed 36 simulated consultations comprising video vignettes of actor ‘patients’ and comprehensive clinical information, including previous medical history, comorbidities and examination findings, and sociodemographic characteristics. The symptomatic information provided adhered to material in the latest available National Institute for Health and Care Excellence (NICE) referral guidelines for suspected cancer (published in 2005),15 with cancer risk based on data from the CAPER case–control study.16 Each consultation was designed to take participating GPs approximately 10 min to complete, so that it mirrored the length of a ‘real’ clinical encounter in primary care in the UK National Health Service.
At the start of each ‘consultation’, a video was shown where the actor ‘patient’ volunteered a description of their presenting symptom. Participants could then elicit further information in real time on the presenting symptom, other symptoms and risk factors by typing in questions to which they received the ‘patient's’ video response. They could also, if they wished, click on a drop-down menu to obtain information on behavioural and familial risk factors, previous medical history, family history, sociodemographic information and examination findings (figure 1). A demonstration is available at: http://www.ucl.ac.uk/stream/media/swatch?v=c22f1a2b58b8.
We applied a factorial experimental design, where GPs undertook one consultation from each of six clinical profiles across three lung cancer risk levels (table 1); no GP saw the same actor twice. Within these constraints, allocation of GPs to vignettes was random. This achieved approximate balance of patient characteristics by clinical profile, gender, ethnicity and socioeconomic circumstances. The study protocol is available at: http://www.ucl.ac.uk/dahr/research-pages/gp_study
Recruitment and participation
Qualified GPs and registrars nearing the end of their specialist GP training were invited through nine primary care research networks across England in 2012 and 2013 to participate in a study of decision-making (without explicit reference to lung cancer). Those that returned an expression of interest were sent further information. For GPs that wished to take part, their internet browsers were checked for compatibility with the study software.
GP participants were first trained to use the online simulated consultations. This was done using a web-based video in advance of the study, with access to support from the research team during or between study consultations. Each participating GP used the study website to ‘consult’ with six ‘patients’, and at the end of the ‘consultation’, entered their management plan. GPs also completed a brief questionnaire about their practice characteristics and years since qualifying.
The application's development followed the steps recommended by Adler et al17 for developing simulations:
Case concept: developing the vignette design and content
Review and revision by content experts
Outline and flow development: a typical online consultation in the study
Translation of content into simulation platform: vignette interactive website
Pilot testing and revisions
A detailed description of each step is given in online supplementary file S1. In brief, the structure of the factorial experiment required 36 unique vignette combinations to cover the four experimental factors known to be associated with variations in lung cancer survival, but whose effect on inequalities in GPs' rates of referral for investigation or to secondary care is uncertain:8
Ethnicity: three variations (white, black Caribbean, South Asian)
Gender: two variations (male, female)
Socioeconomic circumstances: two variations (advantaged or disadvantaged)
Clinical risk of lung cancer: three variations (low-risk, medium-risk and high-risk), with two profiles for each level of risk. Age was not included as a separate experimental factor, but was instead incorporated into profiles because older age increases the risk of cancer associated with most symptom combinations.16 We constructed six clinical profiles, two for each risk level, using different combinations of symptoms, age and smoking status (table 1). The positive predictive values (PPV) of lung cancer were drawn from PPVs generated by analysis of symptom combinations in the CAPER case–control dataset and interpretation of these symptoms and their characteristics informed by the latest available NICE guidance on investigation of suspected cancer15 ,16 (described further in online supplementary data).
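The count of 36 vignettes follows from crossing the four factors listed above (3 × 2 × 2 × 3). The short sketch below simply enumerates those combinations; the variable names and level labels are illustrative and are not taken from the study software, and the assignment of the two clinical profiles within each risk level is not modelled here.

```python
from itertools import product

# Factor levels as listed in the text; the labels are illustrative only.
ETHNICITY = ["white", "black Caribbean", "South Asian"]
GENDER = ["male", "female"]
SES = ["advantaged", "disadvantaged"]
RISK = ["low", "medium", "high"]

# Full factorial crossing: every combination of the four factors.
vignettes = [
    {"ethnicity": e, "gender": g, "ses": s, "risk": r}
    for e, g, s, r in product(ETHNICITY, GENDER, SES, RISK)
]

# Each risk level appears in 12 of the 36 combinations.
per_risk = {r: sum(v["risk"] == r for v in vignettes) for r in RISK}

print(len(vignettes))
print(per_risk)
```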
To maximise the clinical authenticity of the cases, GPs specialising in cancer diagnosis and non-academic GPs reviewed the proposed vignettes. The website content and functionality were also informed by patient representatives' comments. For example, these influenced the types of responses ‘patients’ provided, because patient representatives corroborated previous research that patients may well not disclose certain symptoms with their doctors without being directly asked about them.18
The translation of content into the online study application website (virtual patient application) required filming actors portraying patients, creating and populating the website with that content. The website architecture and application software was produced by Athenaeum Educational Technologies. It involved the development of a bespoke system using natural language-processing principles to recognise GPs' free text questions and play a video clip in response (see ref. 19 for an explanation of the principles). This system was underpinned by databases on symptoms or risk factors and the features of those symptoms (eg, what exacerbates or relieves the symptom or how long it has been present).
Every action performed by GPs on the website (ie, all the questions asked of ‘patients’, drop-down menus accessed, free text entered in management plans) was captured by the study website. This information was used to measure the duration of each consultation and to generate three indicators about GPs' information requests in each consultation and the capacity of the research application to respond to these requests:
data sought: average number of data items sought (questions asked or drop-down menu items accessed) by GP and by individual vignette
errors: error messages displayed as a proportion of all data items sought, calculated for all consultations, consultation 1 and consultations 2–6 only, assuming that in the first consultation GPs were familiarising themselves with the application
key information elicited: proportion of GPs that elicited information on the vignette’s second, but unvolunteered, lung cancer symptom.
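As a rough illustration of how the three indicators above could be derived from captured actions, the sketch below assumes a hypothetical log format in which each data item sought is a record with a `kind`, a `topic` and an `error` flag. The record structure and function name are assumptions made for this example; the study's actual data model is not described in the text.

```python
def consultation_indicators(events, second_symptom):
    """Summarise one consultation from a hypothetical action log.

    events: list of dicts with keys 'kind', 'topic' and 'error'
            (error=True means an error message was shown instead of a response).
    second_symptom: topic label of the unvolunteered second symptom.
    """
    sought = len(events)
    errors = sum(1 for e in events if e["error"])
    elicited = any(
        e["topic"] == second_symptom and not e["error"] for e in events
    )
    return {
        "data_sought": sought,
        "error_proportion": errors / sought if sought else 0.0,
        "second_symptom_elicited": elicited,
    }


# Illustrative log for a single consultation (invented values).
example_events = [
    {"kind": "question", "topic": "cough", "error": False},
    {"kind": "question", "topic": "weight loss", "error": True},
    {"kind": "question", "topic": "weight loss", "error": False},
    {"kind": "menu", "topic": "smoking history", "error": False},
]
summary = consultation_indicators(example_events, "weight loss")
```

Averaging `data_sought` and `error_proportion` over consultations, and taking the share of consultations with `second_symptom_elicited` set, would give figures of the kind reported in the results.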
GPs also had the opportunity to provide free text comments on any aspect of the application in an online survey after all the consultations were completed. These comments were not treated as a representative survey of all participants' experiences, but were examined to provide insights into GPs' experiences of the application and their perceptions of its utility as a research tool for eliciting the decision-making process.
The primary outcome was the proportion of ‘patients’ for whom lung cancer investigation was included in the management plan. This included ordering appropriate imaging or referral for a specialist opinion, for example, from a respiratory consultant, whether participants' management plan stated this investigation was for lung cancer or not. This outcome variable was constructed from free text responses entered by participants in their management plan, according to predefined criteria. A clinician confirmed the validity of every constructed primary outcome.
Data were analysed by fitting multilevel logistic regression models using Markov chain Monte Carlo for estimation,20 allowing variation between participants and between vignettes within participants. This allowed for a correlation between outcomes within a given GP but independent outcomes for two vignettes viewed by different GPs. Estimation of ORs and 95% credible intervals was carried out using the RStan library in R V.3.0.2 (Stan Development Team. RStan: the R interface to Stan. V.2.5; 2014. http://mc-stan.org/rstan.html). Significance testing was carried out using Wald tests, based on the means and posterior variances of the estimates.
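The text does not give the formula used, but a Wald test based on a posterior mean and variance is conventionally computed as below, assuming the posterior for a log-odds coefficient is approximately normal. The function names are illustrative, and the inputs would come from the fitted model rather than from this sketch.

```python
import math


def wald_test(posterior_mean, posterior_variance):
    """Two-sided Wald test of H0: coefficient = 0 (normal approximation)."""
    z = posterior_mean / math.sqrt(posterior_variance)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def odds_ratio(posterior_mean):
    """Convert a log-odds coefficient to an odds ratio."""
    return math.exp(posterior_mean)


# Hypothetical coefficient on the log-odds scale (not a study estimate).
z, p = wald_test(posterior_mean=-0.4, posterior_variance=0.03)
```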
Variations in outcome were examined by ‘patient’ gender, ethnicity, socioeconomic circumstances and risk profile, an indicator variable for whether participants sought the second symptom and GP characteristics (demographics, experience and region). Two models were built in order to examine differences by (a) clinical profile and (b) by age. A supplementary analysis was conducted to examine whether findings were affected by difficulties in obtaining information sought from the application, by including the indicator on errors as another covariate in each model. To examine selection bias, the gender and age of participating GPs and their practices' cancer referral characteristics were compared with national data.21 ,22
The required sample size was calculated on the basis that a minimum difference in investigations of 10% was considered of clinical importance and realistic, given variations in cancer investigations in other studies.23 A response from 216 participants was sought to give 1296 vignettes (ie, each of the 36 vignettes viewed 36 times). Each risk and ethnic group would therefore be viewed 432 times, and each gender and socioeconomic group 648 times. Assuming a 20% variance inflation factor for clustering of GPs/‘patients', 432 in each risk and ethnic group would give 95% power to detect a difference of 10%. For differences between gender and socioeconomic groups, 648 in each group would give 85% power for a difference of 5%.
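The group sizes quoted above follow directly from the design. The sketch below reproduces that arithmetic; the final step, dividing by 1.2 to obtain an effective sample size, is one common way of applying a 20% variance inflation factor and is an assumption here, since the text does not spell out how the inflation was applied. The power figures themselves are not recomputed because the assumed baseline investigation rate is not stated.

```python
n_gps = 216                 # target number of participants
consultations_per_gp = 6    # one consultation per clinical profile
n_vignettes = 36            # unique vignette combinations

total_views = n_gps * consultations_per_gp       # 1296
views_per_vignette = total_views // n_vignettes  # 36

# Factors with three levels (risk, ethnicity) and two levels (gender, SES).
views_per_three_level_group = total_views // 3   # 432
views_per_two_level_group = total_views // 2     # 648

# Assumed use of the 20% variance inflation factor: effective sample size.
variance_inflation = 1.2
effective_three_level = views_per_three_level_group / variance_inflation  # about 360
effective_two_level = views_per_two_level_group / variance_inflation      # about 540
```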
Two hundred and twenty-seven GPs completed the study. This was 76% of the 300 GPs who registered for the study and 41% of the 556 GPs in total that initially expressed an interest in taking part (see online supplementary file S2A). There were no demographic differences between registered GPs who did and did not complete the study, but GP participants were younger than the national GP population, and practices had higher cancer referrals than non-participating practices (see online supplementary file S2B).
Out of 1362 vignettes, 14 (1%) were excluded due to missing participant demographic data in one GP (n=6, 0.4%), when participants asked about second symptoms but did not receive a response (n=4, 0.3%) or when participants did not enter a management plan (n=4, 0.3%).
GPs spent on average 13 min on the first consultation and 11 min on consultations 2–6, and sought 47 items of information per consultation (by asking text questions of the patient, looking up patient history or personal information, conducting ‘examinations’ or ‘bedside tests’). GPs received error messages in response to an average of 4.6% of data sought for consultations 2–6 (range 4%–22%) (see online supplementary file S2C).
Lung cancer investigations
Participants initiated investigations in 1000 (74%) vignettes. There was little difference in investigation between low-risk, medium-risk and high-risk levels (72%–75%) but large variation between clinical profiles (59%–86%). There were no variations by ‘patient’ gender or socioeconomic circumstances, but there was a gradient in investigation by ethnicity, with ‘patients’ of black ethnicities least likely and white ethnicities most likely to be investigated (71% vs 77%) (table 2).
GPs asked for additional, relevant information about second symptoms in 778 (58%) of cases overall, with marked variation by clinical profile, ranging from 48 (21%) in profile 1 to 214 and 216 (95%) in profiles 2 and 3, respectively. There was a significant interaction between seeking a relevant second symptom and clinical profile (p<0.001). Ninety-one per cent of GPs who discovered the presence of weight loss initiated investigation, compared with just 46% of those who did not seek this information. In contrast, knowing ‘patients’ experienced fatigue did not significantly change the likelihood of investigation (table 3).
While obtaining second symptom information was associated with more investigation (adjusted OR (AOR): 3.18 (2.27 to 4.70), p<0.001), there was still underinvestigation in ‘patients’ with appetite or weight loss (profiles 4 and 6) compared with ‘patients’ with chest pain and cough (profile 3) (AORs: 0.25 (0.14 to 0.42), p<0.001 and 0.5 (0.29 to 0.91), p=0.02, respectively) (table 4a). GPs were less likely to investigate older than younger ‘patients’ (AOR: 0.52 (0.39 to 0.70), p<0.001) and less likely to investigate ‘patients’ of black compared with white ethnicities (AOR: 0.68 (0.48 to 0.95), p=0.03) (table 4b).
Associations were similar when the variable for errors received was included (see online supplementary file S2D).
Comments volunteered by GP participants on their experiences of the application and their perceptions of its utility as a research tool for eliciting the decision-making process are summarised in online supplementary file S3.
In this factorial experiment using vignettes in simulated consultations, GPs' decisions to investigate lung cancer were influenced by whether they sought out additional, relevant clinical information about the presence of common symptoms. Even when participating GPs elicited sufficient information about symptoms, inequalities by age and ethnicity in investigation decisions remained.
Comparisons with existing literature
Our data were collected during 2012–2013, and our finding that GPs investigated a high proportion (72%–75%) of cases is in line with literature from 2013.23 However, it is higher than might have been expected if GPs were following the latest national guidance for suspected cancer investigation available during the study period.15 Participants may have proposed more tests for vignette ‘patients’ than they would in reality, because they were not subject to the resource constraints of clinical practice or may have ordered X-rays primarily to investigate diagnoses other than cancer. Alternatively, they may have been aware of, and responding to, epidemiological evidence, presumed patient preferences and policy published since the 2005 NICE guidance, all of which support a lower threshold for cancer investigation.24–27 Indeed, updated NICE guidance on referral of suspected cancer, published in 2015 (after our data were collected), includes a substantially lower investigation threshold than that recommended in their earlier guideline,28 such that all our vignettes would now suggest investigation.
We found that in 42% of cases, GPs did not seek additional information that would help to make an informed decision regarding referral and that was available on request. This accords to some extent with international studies of missed opportunities in cancer diagnosis.29 ,30 In the UK, the updated NICE guidance explicitly recognises that patients with combinations of common symptoms may be more likely to have lung cancer than patients with any one of these symptoms alone,28 ,31 but patients may not volunteer all the symptoms they experience in consultations, perhaps due to real or perceived time constraints in the consultation.31 The importance of data gathering for reaching a timely diagnosis was highlighted in the recent Institute of Medicine report into improving diagnosis in healthcare.32 The study by Zwaan et al33 of breathlessness using expert review of medical records found evidence of inappropriately selective information gathering in a third of cases, with some evidence that diagnostic error and patient harm occurred in a proportion of these cases. Our study extends the field by providing objective evidence of non-clinical variations in data gathering by physicians in a large vignette study and demonstrates associations between gathering sufficient data and appropriate decision-making.
We also found that the effect of eliciting this second symptom on decision-making varied by symptom. It made little difference whether participants knew that patients had a cough or fatigue, but made a significant difference to decision-making if participants knew of appetite and weight loss. For weight loss in particular (a key question when clinicians are considering whether cancer is a possible diagnosis), in 91% of cases where GP participants had elicited information about weight loss, they initiated investigation, compared with just 46% where GPs were unaware the patient had lost weight. It is important to acknowledge that neither in real life nor in the vignettes are the factors (symptom, age and smoking) that constituted each profile independent of one another. Therefore, while we contend the results are interpretable and reliable, they are not as definitive as a randomised controlled trial result; so, this finding has to be treated with some caution. However, the finding accords with the recent ‘think aloud’ study by Kostopoulou et al,34 which suggests that when physicians have an idea of cancer early in the consultation, they ask pertinent questions and initiate appropriate investigations to ensure a cancer diagnosis is reached. Therefore, it still seems likely that routinely questioning patients with ongoing respiratory symptoms about weight loss would expedite the diagnosis of some lung cancers.
Our finding that GPs were less likely to investigate older ‘patients’ is consistent with several observational studies of primary care cancer referral and investigation.35 ,36 Scott et al37 propose that as patients grow older, they are increasingly likely to attribute bodily changes to normal ageing processes rather than to disease. If clinicians also apply this ‘normal ageing’ heuristic, it may explain why GPs in this study were less likely to investigate older patients, despite knowing their symptoms. In contrast, patient experience survey data indicate more referral delays in younger (aged 55–64 years) than older patients (over 75 years). However, survey data may be biased if older patients (with lower overall survival) were underrepresented because they had died or were too ill to participate in the survey (which was undertaken 6–12 months after diagnosis).2
We also found smaller ethnic variations in GPs' investigation behaviour, with fewer investigations initiated in black (and to some extent South Asian) ‘patients’ than white ‘patients’. This is consistent with survey data where non-white patients with cancer report more referral delays than white patients.2 One possible explanation is that GPs were less ready to consider a lung cancer diagnosis in individual non-white ‘patients’ who presented with high-risk clinical profiles, because they placed weight on knowledge that lung cancer risk factors and prevalence are lower in black and South Asian than white populations.38 However, there is no evidence that patients of different ethnicities exposed to the same risk factors with similar symptoms are at different risk of lung cancer; so, differential investigation by ethnicity is not clinically warranted. Another possible explanation is that investigation likelihood is influenced by GPs' ethnicity. In this study, there were only seven GPs identified as black; so, it was not possible to examine this, but the mechanism by which observed ethnic variations in decision-making occur remains an important question to address.
Strengths and limitations
Our novel approach, using vignettes in an interactive website that delivered real-time responses, obtained comprehensive information on decision-making in over 99% of consultations and in a timeframe comparable with a typical consultation. The method simulated more components of the decision-making process in real time than has been achieved in previous studies.39–41
Of equal importance is the fact that we applied a randomised, factorial, experimental design, with exact balance on profile and risk, and approximate balance, with random allocation, to GPs, on sociodemographic factors. This allowed us to examine the effects of patients' sociodemographic and clinical characteristics on GPs' decision-making. We were not able to achieve total orthogonality in design of every patient characteristic, but the randomisation and approximate balance give some confidence in the general applicability of our results.
Despite the advances we achieved in simulating real consultations, the online vignettes were limited mainly due to the constraints of the natural language system. These constraints meant the website was unable to provide responses to all GPs' information requests. In the postconsultation survey, 12 GP participants (5%) reported difficulty in obtaining information, which caused some of them frustration, and a small number (n=4, 1.8%) observed that this may have altered their decision-making behaviour. The process itself of typing in questions may also have prompted GP participants to consider their clinical reasoning more than they would in their routine clinical practice. Conversely, the opportunity to select from the extensive drop-down selections of examinations without facing any of the logistical constraints faced in a real consultation (eg, time required to measure weight) may have led them to seek more information with less consideration than they would do in routine clinical practice. However, it is important to note that all approaches to simulating consultations have some drawbacks. For example, while other vignette studies have enabled physicians to ‘ask’ questions of the patient, this has required a researcher to type responses online as ‘the patient’, sometimes resulting in longer ‘consultations’ than real consultations.39–41 Moreover, there are several reasons why these simulations still provide valuable insights into GPs' decision-making. First, our sensitivity analysis indicates that results were very close to the main analysis even after taking into account GPs' difficulties in obtaining responses from the application.
Second, shortcomings in doctor–patient communication during the clinical encounter are well recognised, such that patients in real consultations do not volunteer all the information clinicians would need to make informed decisions.18 Third, it is the divergence from reality that makes simulated consultations useful for studying phenomena or circumstances not possible to observe or investigate in real life.42 In this study, this divergence enabled the systematic manipulation of patient characteristics to examine their effects on GPs' decisions in isolation of the complex range of patient expectations and comorbidities that might explain variations in decision-making in real life. The divergence also meant GPs were not faced with the logistical and system/organisational constraints that affect referral decisions in practice. As a result, the findings provide insight into the cognitive processes underlying GPs' decision-making when the variation in system and patient factors present in real life are removed.
There was some bias in the GP sample registering for the study in that GP participants' practices had higher cancer referrals than non-participating practices; so, they may be more ready than GPs nationally to investigate symptoms suggestive of cancer. However, there was no evidence to suggest participating GPs would have greater or smaller variation in decision-making than non-participants.
Another possible limitation is that the risk levels were based on positive predictive values from the CAPER symptom case–control dataset, which had wide and overlapping CIs (as shown in online supplementary data S1). Therefore, the PPVs alone are not sufficient to conclude that clinical risk and therefore decision-making should have varied by profile. However, even where the PPV point estimates are most disparate and CIs overlap minimally, GPs investigated similar proportions of patients. In addition, the risk profiles had additional information other than PPV which should have guided decision-making if GPs were acting in line with the latest available clinical guidance (eg, symptom duration). Furthermore, our three broad categories align well with the 2015 NICE guidance. These equate to: risk below 1%, safety netting; 1%–3%, test in primary care if possible; over 3%, refer for specialist testing.28
Conclusions and implications for research and practice
This study demonstrates that GPs were not more likely to initiate cancer investigations for ‘patients’ with higher risk symptoms. Furthermore, they do not investigate everyone with the same symptoms equally. It also indicates that insufficient data gathering could be responsible for diagnostic errors. It is not that GPs are doing a bad job: the average GP sees one patient with new lung cancer a year.16 Distinguishing symptoms indicating possible cancer from self-limiting illness that GPs see daily, therefore, is challenging. However, non-clinical variations in investigation could contribute to the sociodemographic inequalities in the timeliness of diagnosis and survival of lung cancer seen in the UK. It also marks a departure from the National Health Service commitment to promote equality through its services.43 The findings also have wider implications for quality and safety in healthcare internationally. According to the Institute of Medicine, diagnostic errors contribute to approximately 10% of patient deaths, and sufficient data gathering is an essential part of reaching a timely diagnosis.32
It is therefore incumbent on health systems to consider strategies that can be implemented in practice, such as clinician education,32 ,44 decision-support tools24 and the assessment of equity in clinical practice.
We are grateful to the 227 GPs that took part in this study and the GP and medical colleagues that piloted the study website in various stages of development. We also acknowledge the following for their essential input in the study:
– Dr Anjali Bajekal provided clinical advice throughout the study and advised on successive rounds of piloting.
– Dr Stephanie Meats recruited, followed up and supported GPs to participate in the study during a placement as a GP registrar.
– Ms Lucy McCann, Mr Barnaby Raine, Ms Anastasia Tillman, Mr Eddy Wax recruited, followed up and supported GPs to participate in the study, working as administrators at UCL.
– Ms Rachael Dodd, Mr Steven Marcos and Ms Jessie Porter coded participant data to support the analysis.
– Mr Dave Ardron and Mr Tom Haswell advised on patients’ experience and styles of reporting lung cancer symptoms to the GP.
– Athenaeum Educational Technologies programmed the tool.
– UCL Media filmed the patient actors.
– Colleagues in the Policy Research Unit and the Department of Applied Health Research provided valuable advice in the development and interpretation of the vignette results.
We acknowledge the support of the National Institute for Health Research, through the Primary Care Research Network for its effective recruitment of study participants.