Accuracy of telephone triage for predicting adverse outcomes in suspected COVID-19: an observational cohort study

Objective To assess accuracy of telephone triage in identifying need for emergency care among those with suspected COVID-19 infection and identify factors which affect triage accuracy.
Design Observational cohort study.
Setting Community telephone triage provided in the UK by Yorkshire Ambulance Service NHS Trust (YAS).
Participants 40 261 adults who contacted National Health Service (NHS) 111 telephone triage services provided by YAS between 18 March 2020 and 29 June 2020 with symptoms indicating COVID-19 infection were linked to Office for National Statistics death registrations and healthcare data collected by NHS Digital.
Outcome Accuracy of triage disposition was assessed in terms of death or need for organ support up to 30 days from first contact.
Results Callers had a 3% (1200/40 261) risk of serious adverse outcomes (death or organ support). Telephone triage recommended self-care or non-urgent assessment for 60% (24 335/40 261), with a 1.3% (310/24 335) risk of adverse outcomes. Telephone triage had 74.2% sensitivity (95% CI: 71.6% to 76.6%) and 61.5% specificity (95% CI: 61% to 62%) for the primary outcome. Multivariable analysis suggested respiratory comorbidities may be overappreciated, and diabetes underappreciated, as predictors of deterioration. Repeat contact with the triage service appears to be an important under-recognised predictor of deterioration, with 2 contacts (OR 1.77, 95% CI: 1.14 to 2.75) and 3 or more contacts (OR 4.02, 95% CI: 1.68 to 9.65) associated with false negative triage.
Conclusion Patients advised to self-care or receive non-urgent clinical assessment had a small but non-negligible risk of serious clinical deterioration. Repeat contact with telephone services needs recognition as an important predictor of subsequent adverse outcomes.


Key messages
What is already known on this topic
⇒ Telephone triage has been used to divert patients with suspected COVID-19 to self-care or for non-urgent clinical assessments, and thereby help mitigate the risk of health services being overwhelmed by patients who require no specific treatment.
⇒ Concerns have been raised that telephone triage may not be sufficiently accurate in identifying need for emergency care; however, no previous evaluation of accuracy of telephone triage in patients with suspected COVID-19 infection has been completed.

What this study adds
⇒ Patients advised to self-care or receive non-urgent clinical assessment had a small but non-negligible risk of deterioration and significant adverse outcomes.
⇒ Telephone triage has comparable performance to methods used to triage patient acuity in other emergency and urgent care settings.
⇒ Accuracy of triage may be improved by better recognition of multiple contact with services as a predictor of adverse outcomes.

BACKGROUND
During the COVID-19 pandemic, there was a risk that hospitals could be overwhelmed by patients who did not need specific treatment. UK government pandemic planning predicted that, in the event of an influenza or similar pandemic, there could be around 750 000 excess emergency department (ED) attendances in the UK. 1 2 Attendances were predicted to be largely for patients who would not require hospitalisation. 3 4 To reduce this risk, from 18 February 2020 onwards, NHS England advised patients with suspected infection to contact the National Health Service (NHS) 111 service instead of attending healthcare providers. 5 NHS 111 is a national, free-to-use 24-hour telephone triage service for urgent health problems. Initial triage is carried out by trained, non-clinical call advisors using the NHS Pathways clinical decision support software. The end point (disposition) is advice on what to do next, in terms of which service to access and the timeframe within which this access should occur. If appropriate, the call can be passed on to a clinician (usually a nurse or paramedic) for further assessment and, depending on local arrangements, callers can speak to other specialist clinicians or appointments can be made with relevant services, including general practitioners. Similar COVID-19 telephone triage 'hotlines' have been implemented in parts of the USA. 6 7

In the first 6 months of the COVID-19 pandemic, ED attendances in the UK decreased by approximately 25%, probably due, at least in part, to displacement of care. 8 Patients who did attend the ED with suspected COVID-19 infection were high acuity with a mortality rate of 15.5%, with lower acuity patients likely being managed via NHS 111. 9 Indeed, there were almost 3 million NHS 111 calls made across England in March 2020: a record number and double the number in March of the previous year. 10

To cope with the increase in call volume, a specific telephone triage pathway for patients with suspected COVID-19 infection was introduced in early February 2020, which underwent rapid updates as the pandemic progressed. Local NHS 111 services used interim triage methods while awaiting implementation of new telephone triage pathways and, due to excess demand, calls started to be diverted to a national centre on 4 March 2020.
Concerns have been raised that during this period of high demand and reconfiguration of services, telephone triage may have underappreciated the severity of some callers' illness, leading to delays in treatment and avoidable harm. 11 There have been calls for an inquiry into the effectiveness of NHS 111 telephone triage at identifying critically unwell patients and the Healthcare Safety Investigation Branch (HSIB) has started an investigation into NHS 111's response to callers with suspected COVID-19. 12 13 A specific concern raised by public and patient representatives affiliated with HSIB is: 'The NHS 111 telephone advice given did not fully respond to the severity of the reported symptoms'. 13 There has been no previous evaluation of the accuracy of the clinical risk-assessment performed by this service nor, to our knowledge, other telephone triage services for patients with suspected COVID-19 infection. Evaluating the accuracy of telephone triage and specifically estimating the risk of serious adverse outcome in those advised to self-care or wait for non-urgent assessment allows safety concerns regarding underappreciation of illness severity to be examined.
Our study aimed to: 1. assess how accurately NHS 111 telephone services identified those who suffered an adverse outcome needing an emergency response; 2. identify any factors that may have affected the accuracy of telephone triage.

Study design
The Pandemic Respiratory Infection Emergency System Triage (PRIEST) study was piloted as the Pandemic Influenza Triage in the Emergency Department (PAINTED) study, part of the National Institute for Health Research portfolio of studies to be activated in an influenza pandemic in England. 14

111 online triage services for suspected COVID-19 in June 2020, including scheduling of clinical assessments. 16 17 All patients within the English NHS are allocated a unique identification number, the NHS number. Records with no NHS number (<2%) were not provided as these records could not be associated with a traceable individual without manual review. The dataset consisted of patient identifiers, demographic data, call details and triage dispositions extracted from routinely collected electronic NHS 111 call records (online supplemental material 1). Patient identifiers were provided to NHS Digital for them to trace the identities of our cohort (ie, indicate different sets of identifiers belonging to the same patient) and to supply additional individual-level demographic, comorbidity and outcome data. NHS Digital manages national health and care data collections from a variety of settings and providers in England. 18 NHS Digital identified records in their collections belonging to patients in our cohort and provided data on patient demographics, limited COVID-related general practice (GP) records, ED attendances, hospital inpatient admissions, critical care periods and death registrations from the Office for National Statistics (online supplemental material 2).
Both YAS and NHS Digital removed records belonging to patients who had registered an NHS national data opt-out. The study team excluded patients who had opted out of any part of the PRIEST study and those with inconsistent records (eg, multiple deaths recorded or death before latest activity). Patient identifiers across all datasets were replaced with a consistent pseudo-identifier to enable the identification of records belonging to individual patients across datasets without revealing patient identifiers.
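The idea of a consistent pseudo-identifier can be illustrated with a keyed hash. This is a minimal sketch only: the text does not say how pseudo-identifiers were generated, so the use of HMAC-SHA256, the function name `pseudo_id`, the key and the example numbers are all assumptions made for illustration.

```python
import hashlib
import hmac

def pseudo_id(nhs_number: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudo-identifier (hypothetical scheme).

    The same input and key always give the same output, so records can be
    linked across datasets without exposing the original identifier.
    """
    normalised = nhs_number.replace(" ", "")  # ignore formatting differences
    digest = hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated only to keep the example readable

key = b"example-key-held-by-the-data-controller"  # illustrative value only
calls_record = pseudo_id("999 000 0001", key)      # made-up identifier
outcome_record = pseudo_id("9990000001", key)      # same identifier, other dataset
print(calls_record == outcome_record)
```

Because the mapping is deterministic for a given key, the two datasets can be joined on the pseudo-identifier alone.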

Inclusion criteria
Our final cohort consisted of all adult (aged 16+ years) patients at time of first call (index contact) within the YAS NHS 111 calls dataset who were traced by NHS Digital and for whom a final triage disposition, and therefore urgency of recommended triage, was recorded for their index contact.
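As a rough illustration of how an index contact and the number of contacts per patient could be derived from call records, the sketch below groups calls by pseudo-identifier. The record layout, the function names and the contact bands are assumptions for this example; the study's actual processing is not described at this level of detail.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple

def summarise_contacts(calls: Iterable[Tuple[str, date]]) -> Dict[str, Tuple[date, int]]:
    """Return {patient: (date of index contact, number of contacts)}."""
    dates_by_patient = defaultdict(list)
    for patient, call_date in calls:
        dates_by_patient[patient].append(call_date)
    return {p: (min(d), len(d)) for p, d in dates_by_patient.items()}

def contact_band(n_contacts: int) -> str:
    """Band contacts as 1, 2, or 3 or more."""
    if n_contacts < 1:
        raise ValueError("a patient in the cohort has at least one contact")
    return "1" if n_contacts == 1 else "2" if n_contacts == 2 else "3 or more"

def is_adult(date_of_birth: date, index_contact: date) -> bool:
    """Aged 16 years or over at the time of the index contact."""
    had_birthday = (index_contact.month, index_contact.day) >= (
        date_of_birth.month, date_of_birth.day)
    age = index_contact.year - date_of_birth.year - (0 if had_birthday else 1)
    return age >= 16

calls = [("a", date(2020, 4, 3)), ("a", date(2020, 4, 1)), ("b", date(2020, 5, 9))]
summary = summarise_contacts(calls)
print(summary["a"], contact_band(summary["a"][1]))
```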

Patient characteristics
Comorbidities recorded in the 12 months before the index contact with NHS 111 were extracted from electronic healthcare data provided by NHS Digital (online supplemental material 2). This is consistent with the timescale for inclusion of comorbidities used to calculate comorbidity indexes using other routine data sources. 19 20 Immunosuppressant drug use contributed to the immunosuppression comorbidity only if recorded in the 30 days before index contact. Pregnancy status was based on GP records recorded in the previous 9 months. Frailty in patients older than 65 years was derived from the latest recorded (if any) clinical frailty scale score present in the electronic GP records prior to index contact. 21 Smoking status was similarly derived from GP records based on the latest recorded (if any) smoking status prior to the index contact.
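The look-back windows described above can be expressed as a simple date check. This is a sketch under stated assumptions: months are approximated in days (12 months as 365 days, 9 months as 274 days), a record dated on the index date is counted, and the names `LOOKBACK_DAYS` and `within_lookback` are invented for the example.

```python
from datetime import date, timedelta

# Look-back windows in days; the month-to-day conversions are assumptions.
LOOKBACK_DAYS = {
    "comorbidity": 365,            # 12 months before index contact
    "immunosuppressant_drug": 30,  # 30 days before index contact
    "pregnancy": 274,              # previous 9 months
}

def within_lookback(record_date: date, index_date: date, kind: str) -> bool:
    """True if a record falls inside the window ending at the index contact."""
    window = timedelta(days=LOOKBACK_DAYS[kind])
    return index_date - window <= record_date <= index_date

index = date(2020, 4, 1)
print(within_lookback(date(2019, 6, 1), index, "comorbidity"))
print(within_lookback(date(2020, 2, 1), index, "immunosuppressant_drug"))
```

Records outside the window are treated as absent, which mirrors the assumption acknowledged in the limitations that unrecorded comorbidities were not present.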

Outcomes
The primary outcome was death or renal, respiratory or cardiovascular organ support (serious adverse outcomes) at 30 days from index contact (identified from death registrations and critical care data).
The secondary outcome was death or organ support at 3 and 7 days from index contact.
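The composite outcome at 3, 7 and 30 days can be sketched as a date comparison against the index contact. The function and argument names are hypothetical, and treating an event on day N as falling within the N-day window is an assumption of this example.

```python
from datetime import date
from typing import Optional

def adverse_outcome_within(index_contact: date,
                           death: Optional[date],
                           organ_support: Optional[date],
                           days: int = 30) -> bool:
    """True if death or organ support occurred within `days` of index contact."""
    events = [d for d in (death, organ_support) if d is not None]
    return any(0 <= (d - index_contact).days <= days for d in events)

index = date(2020, 4, 1)
# A death 19 days after index contact counts at 30 days but not at 7 days.
print(adverse_outcome_within(index, date(2020, 4, 20), None, days=30))
print(adverse_outcome_within(index, date(2020, 4, 20), None, days=7))
```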

Analysis
We first conducted a descriptive analysis of patient demographics, comorbidities and call disposition and used multivariable logistic regression modelling to confirm known patient characteristics associated with the primary adverse outcome in COVID-19 infection. The model included: age, gender, available comorbidities, smoking status, number of medications, clinical frailty scale, deprivation index and number of contacts with telephone triage. Ethnicity was excluded from analysis due to the high proportion of missing data (22.2%). Obesity was excluded due to an observed implausible protective association with the primary outcome which we believe to be an artefact of how these data were collected and recorded in the electronic GP dataset. For those under 65 years, a frailty scale score of 1 was assigned, since the score is not validated in this age group.
To assess how accurately NHS 111 identified patients with adverse outcomes, the call disposition categories of the index contact were divided into a binary classification of either: ambulance dispatched, or other urgent clinical assessment required; and self-care or non-urgent assessment (online supplemental material 3). Urgent clinical assessment included advice to self-present to the ED, or provision of a further clinical assessment either immediately or within 4 hours of the call. Advice and call disposition provided by NHS 111 can change over successive calls as a patient's condition changes. Therefore, to assess if deterioration was recognised over multiple calls, a sensitivity analysis was conducted in patients who had an adverse outcome in which the disposition of the call immediately before the adverse outcome was used for binary classification.
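The binary classification and the sensitivity analysis based on the last call before the outcome can be sketched as follows. The disposition labels are hypothetical stand-ins (the study's grouping is in its online supplemental material 3, which is not reproduced here), and the function names are invented for the example.

```python
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical disposition labels, grouped as the text describes.
URGENT = {"ambulance_dispatch", "self_present_to_ed", "clinical_assessment_within_4h"}
NON_URGENT = {"self_care", "non_urgent_assessment"}

def is_urgent(disposition: str) -> bool:
    if disposition in URGENT:
        return True
    if disposition in NON_URGENT:
        return False
    raise ValueError(f"unknown disposition: {disposition}")

def triage_result(disposition: str, had_outcome: bool) -> str:
    """Cross-classify the binary triage decision against the outcome."""
    urgent = is_urgent(disposition)
    if had_outcome:
        return "true positive" if urgent else "false negative"
    return "false positive" if urgent else "true negative"

def last_disposition_before(calls: List[Tuple[date, str]],
                            outcome_date: date) -> Optional[str]:
    """Disposition of the latest call on or before the outcome date."""
    earlier = [c for c in calls if c[0] <= outcome_date]
    return max(earlier, key=lambda c: c[0])[1] if earlier else None

calls = [(date(2020, 4, 1), "self_care"), (date(2020, 4, 3), "ambulance_dispatch")]
print(triage_result(calls[0][1], had_outcome=True))  # index contact
print(triage_result(last_disposition_before(calls, date(2020, 4, 4)), True))
```

In this made-up example the index contact would be counted as a false negative, while the sensitivity analysis, which uses the later call, would count the same patient as a true positive.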
We assessed the accuracy of the binary triage classification (ambulance dispatch/urgent clinical assessment vs self-care/non-urgent assessment) in terms of sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) for the primary outcome with 95% CIs. To assess whether the implementation of different COVID-19-related NHS Pathways affected the accuracy of triage, accuracy was estimated for the whole study period and in two distinct time periods. The first time period

Patient characteristics of false negatives (those advised to self-care/non-urgent assessment who experienced the primary outcome) and true positives (those provided with an ambulance/urgent assessment who experienced the primary outcome) were compared. Similarly, among those who did not experience the primary composite adverse outcome, we compared the characteristics of false positives (those provided with an ambulance/urgent assessment who were not conveyed to hospital and did not experience the primary outcome) and true negatives (those advised to self-care/non-urgent assessment). In patients with the adverse outcome, multivariable logistic regression was used to identify patient characteristics associated with false negative triage. We completed an equivalent analysis in those without the adverse outcome to identify factors which predicted false positive triage. The models included: age, gender, available comorbidities, smoking status, number of medications, deprivation index and number of contacts with telephone triage. Due to a low proportion of missing data in included variables, complete case analysis was conducted. As with the previous analysis, ethnicity and obesity were excluded. Frailty was additionally excluded from this modelling due to a high proportion of missing data (39.4% of false negatives).
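The accuracy measures named above follow directly from a 2x2 table. The sketch below uses counts derived from the totals reported in the abstract (1200 adverse outcomes among 40 261 callers; 24 335 advised self-care or non-urgent assessment, of whom 310 had an adverse outcome). Because reported totals are rounded to the nearest 5, the cell counts are approximate. The Wilson score interval is an assumption of this example; the text does not state which CI method was used.

```python
from math import sqrt

def wilson_ci(successes: int, total: int, z: float = 1.96):
    """Wilson score interval for a proportion (assumed method)."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half

def accuracy(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Point estimates and 95% CIs for the four accuracy measures."""
    cells = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {name: (k / n, wilson_ci(k, n)) for name, (k, n) in cells.items()}

# Cell counts derived from the rounded totals in the abstract.
fn = 310                       # non-urgent advice, adverse outcome
tp = 1200 - fn                 # urgent response, adverse outcome
tn = 24335 - fn                # non-urgent advice, no adverse outcome
fp = 40261 - 24335 - tp        # urgent response, no adverse outcome

for name, (estimate, (low, high)) in accuracy(tp, fn, fp, tn).items():
    print(f"{name}: {estimate:.1%} (95% CI {low:.1%} to {high:.1%})")
```

With these derived counts the sensitivity and specificity estimates agree with the reported 74.2% and 61.5%.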
The sample size was based on the number of NHS 111 calls for suspected COVID-19 that YAS received during the first wave of the pandemic. All multivariable logistic models included a sample size of >500 and >10 events (adverse clinical outcome, false positive or false negative triage) per predictor parameter. 22 23 All totals presented are rounded to the nearest 5, with small numbers suppressed to comply with NHS Digital data disclosure guidance.
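The disclosure rule and the events-per-parameter check can be sketched in a few lines. The suppression threshold (`SUPPRESS_BELOW`) and the number of model parameters used in the example are assumptions, since neither is stated in the text; only the rounding to the nearest 5 and the >10 events per parameter criterion come from the paper.

```python
from typing import Optional

SUPPRESS_BELOW = 8  # hypothetical threshold for "small numbers"

def disclose(count: int) -> Optional[int]:
    """Round a count to the nearest 5, or return None if it must be suppressed."""
    if 0 < count < SUPPRESS_BELOW:
        return None
    return 5 * round(count / 5)

def events_per_parameter(events: int, parameters: int) -> float:
    """Events available per predictor parameter in a logistic model."""
    return events / parameters

print(disclose(1198), disclose(3))
# 310 false negative events with an illustrative 20 predictor parameters.
print(events_per_parameter(310, 20) > 10)
```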

Patient and public involvement
The Sheffield Emergency Care Forum (SECF) is a public representative group interested in emergency care research. 24 Members of SECF advised on the development of the PRIEST study and two members joined the Study Steering Committee. A PRIEST study patient public involvement (PPI) group was created during the study which included patients who had been admitted to hospital with COVID-19 or their family members. Although not involved in conducting the analyses, both PPI groups were consulted regarding study design, particularly the ethical implications of using routine health data for research. All study findings were presented and discussed with the PPI groups. Members helped with interpretation of findings, particularly regarding acceptable risk of misclassification.

Study population
Figure 1 and table 1 summarise study cohort derivation and the characteristics of the 40 261 included individuals. In total, 1200 people (3%, 95% CI: 2.8% to 3.2%) experienced the primary outcome (death or organ support) within 30 days following first contact with telephone triage services and 670 (56%) of adverse outcomes occurred within 7 days of contact. In our study cohort, 8165 patients (20.3%, 95% CI: 19.9% to 20.7%) were conveyed or self-presented to the ED and 4490 (11.2%, 95% CI: 10.9% to 11.5%) were admitted as hospital inpatients within 30 days of index contact.

Prediction of false negative or false positive triage
Online supplemental material 7 compares the characteristics of patients who were correctly triaged as true positives or misclassified as false negatives. In both groups, approximately 50% of people experienced the primary adverse outcome within 7 days of first contact, although a higher proportion of true positives experienced the adverse outcome within 3 days of contact. Multivariable modelling showed that younger age, multiple contacts and diabetes were associated with increased risk of false negative triage (table 3). The effect estimates for multiple NHS 111 contacts were similar if the triage disposition of the last call before the primary outcome (two contacts, OR 1.96, 95% CI: 1.11 to 3.48 and three or more contacts, OR 7.78, 95% CI: 1.02 to 59.43) was used to classify true positives and false negatives.
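For readers less familiar with how odds ratios relate to logistic regression output, the sketch below shows the standard conversion between a log-odds coefficient with its standard error and an OR with a 95% CI. It uses the primary-analysis estimate for two contacts reported in the abstract (OR 1.77, 95% CI 1.14 to 2.75); the coefficient and standard error are back-calculated for illustration and are not values reported by the authors.

```python
from math import exp, log

def odds_ratio_ci(beta: float, se: float, z: float = 1.96):
    """OR and 95% CI from a logistic regression coefficient and its SE."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)

def coefficient_from_or(odds_ratio: float, low: float, high: float, z: float = 1.96):
    """Approximate coefficient and SE implied by a reported OR and CI."""
    return log(odds_ratio), (log(high) - log(low)) / (2 * z)

beta, se = coefficient_from_or(1.77, 1.14, 2.75)
or_, low, high = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f}, SE={se:.3f}, OR={or_:.2f} ({low:.2f} to {high:.2f})")
```

The CI is symmetric on the log-odds scale, which is why the reported interval is wider above the OR than below it.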
Online supplemental material 8 compares the characteristics of patients who received false positive or true negative triage classification; 24.9% of the cohort were false positives and table 4 presents the results of multivariable modelling to identify factors associated with being a false positive. Increased risk of being a false positive was associated with chronic renal impairment, immunosuppression and chronic respiratory disease (table 4). Other predictors included older age, smoking, increased medication use and female gender (table 4).

Summary
Our study showed that, during the study period, telephone triage achieved a sensitivity of 74.2% (95% CI: 71.6% to 76.6%) and specificity of 61.5% (95% CI: 61% to 62%) for the primary outcome. Telephone triage recommended self-care or non-urgent assessment for the majority (60%), with a very low but non-negligible risk of adverse outcome (1.3%). Sensitivity of telephone triage was higher for outcomes at 3 and 7 days (online supplemental material 6) than 30 days, and sensitivity appeared to be increased at the expense of specificity in the later period of clinical assessment pathway implementation (table 2). Users of the service who were identified with possible COVID-19 infection had a low (3%) risk of adverse outcome.
To identify factors which may affect accuracy of triage, we used multivariable analysis to identify predictors of false negative and false positive triage. The findings need cautious interpretation, given the limited information available during telephone triage, but suggest that some comorbidities (such as chronic respiratory disease) may be overappreciated as predictors of adverse outcome, while the association of diabetes with adverse outcome may be under-recognised. Perhaps most striking is that multiple contacts with NHS 111 in which possible COVID-19 infection was identified were associated with false negative assessment, suggesting that repeat contacts may require a more urgent response.

Comparison with previous literature
The available evidence assessing the accuracy of telephone triage for serious clinical outcomes, particularly Existing studies evaluating similar telephone triage 'hotlines' in the USA have described service use or acceptability. 6 7 The sensitivity and specificity of telephone triage found in our study for the composite primary outcome are similar to those reported for clinical tools used to triage patient acuity in the ED, at a point on the receiver operating characteristic curve with an equivalent balance of sensitivity and specificity. 25 28 29 However, a systematic review of accuracy of emergency medical service dispatch by call handlers found the most urgent ambulance dispatch priorities to have sensitivities ranging between 78% and 95.6% for time critical conditions and specificities ranging between 15.4% and 83.8%. Despite the reported sensitivities being higher than achieved by telephone triage in our study, the associated negative predictive values ranged from 95.4% to 96.9%, similar to that estimated in our study.
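The observation that negative predictive values can be similar despite different sensitivities follows from the dependence of NPV on outcome prevalence. The sketch below applies the standard formula to the sensitivity, specificity and 3% outcome risk reported in this study; the second prevalence value is purely illustrative and is not taken from the systematic review.

```python
def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value from sensitivity, specificity and prevalence."""
    true_negatives = specificity * (1 - prevalence)
    false_negatives = (1 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)

# This study: sensitivity 74.2%, specificity 61.5%, outcome risk about 3%.
print(f"{npv(0.742, 0.615, 0.03):.1%}")
# Same test characteristics at an illustrative, higher prevalence of 20%.
print(f"{npv(0.742, 0.615, 0.20):.1%}")
```

When the outcome is rare, NPV is high even at moderate sensitivity, so NPV alone says little about how many patients with adverse outcomes were missed.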

Strengths and limitations
Although telephone triage has been recommended and widely used during the pandemic in the UK and the USA to risk assess patients with suspected COVID-19 to limit potential spread of infection, this appears to be the first evaluation of accuracy. 6 30 We have used a large cohort of patients identified from routinely collected telephone triage records and linked this to nationally collected, patient-level healthcare records to provide robust outcome data. We have assessed performance in a cohort of patients with suspected infection which, in the absence of accurate universally available rapid COVID-19 diagnostic tests, reflects the population which urgent and emergency care services must clinically triage. Unrestricted community testing for those with symptoms suggestive of COVID-19 infection was only available from 18 May 2020 and therefore it is not possible to estimate the proportion of confirmed infections. 33

Due to the use of routinely collected data, there were high rates of missing data for some variables, for example, ethnicity and frailty, which prevented inclusion in some analyses. We have also assumed that if comorbidities were not recorded in the previous 12 months they were not present. The mechanism of how data are collected and recorded in the routine datasets used means that, as identified for obesity, there may be bias in the classification of patients. The estimated prevalence of obesity in our cohort is 15% (half that reported in the national health survey) and, as weight is not comprehensively and consistently measured by GPs, the observed protective association is likely to reflect unknown characteristics associated with a measurement being taken, rather than obesity itself. 34

We have evaluated the performance of NHS 111 telephone triage as implemented by YAS. Although NHS 111 Pathways software algorithms are developed nationally, there may be variability in local implementation which may affect accuracy. During the study period, calls were diverted between regions and to a national centre due to excess demand. The basis on which calls were selected for diversion is not transparent, but it is possible that patients with less complex healthcare needs were diverted to the national centre, potentially affecting the generalisability of our results. Our study period includes multiple pathway iterations but, due to how rapidly assessment pathways were updated, it was not possible to assess the accuracy of individual assessment pathways (online supplemental material 4). A national online assessment tool was implemented from the end of February 2020 and this may have affected the characteristics of the population using telephone triage services for advice. 35 However, it was not until June 2020 that the public were advised to use the NHS 111 online coronavirus service before calling NHS 111.

Implications
Telephone triage performed comparably to triage methods used for patient acuity in the ED and, given the limited information available, including a lack of physiological parameters, this may reflect the best accuracy that could be achieved. 25 36 It is difficult to accurately model the impact on emergency medical services if telephone triage had not been recommended for the initial assessment of patients with suspected COVID-19. However, in 2019, the estimated population of Yorkshire and the Humber was 5 502 967 (including children). 37 On the basis of the number of patients in our cohort and study period, not using telephone triage could have led to around 61 extra ambulances or urgent clinical assessments being provided each day per 1 000 000 population, without considering diversion to the national centre. YAS provided a face-to-face response to an estimated 298 incidents per day in March 2020. 38 NHS 111 telephone triage appears to have effectively helped to mitigate the risk of emergency healthcare services being overwhelmed by lower risk patients during the 'first wave' of the pandemic in England.
This must be weighed against the small but non-negligible risk of serious adverse outcomes in patients who were recommended to self-care or have a non-urgent clinical assessment. Early clinical guidelines for the risk stratification of patients with suspected COVID-19 infection, on the basis of previous influenza epidemics, emphasised the importance of respiratory comorbidities and may have underestimated the risk associated with gender and diabetes. 39 The results of our multivariable modelling reflect this, with the importance of smoking and chronic respiratory disease appearing to be overestimated and diabetes underestimated. Later clinical guidelines incorporated this evolving research base and emphasised the risk associated with diabetes. 40 However, the association we found between multiple NHS 111 COVID-19-related contacts and risk of undertriage does not appear to have been previously identified and may reflect that patients with repeat contacts represent an unrecognised high-risk group. Patients with early re-presentation after discharge from the ED are considered clinically high risk for adverse outcomes and misdiagnosis and this is likely to be reflected in patients who contact NHS 111. 41 This finding has been fed back to the telephone triage service provided by YAS and is likely to be applicable to telephone triage in different settings.
Telephone triage services for suspected COVID-19 and other conditions have rapidly expanded during the pandemic across different settings, with specific COVID-19 telephone triage 'hotlines' created in parts of the USA. 6 7 42 Different models for telephone triage in urgent and emergency care exist internationally. 26 43 44 Research is needed to determine the optimal configuration of such services in terms of accuracy and cost-effectiveness. 43 NHS 111's use of trained, non-clinical call advisors for initial assessment contrasts with other national triage services, where assessments are performed by nurses and other clinicians: this may impact accuracy, acceptability and cost. 44 The acceptable risk of deterioration following such triage is subjective and significant variation in risk tolerance between clinicians and public representatives has been demonstrated. 45 Research may be needed to support implementation of telephone triage methods and tailor triage to the resource constraints and risk tolerance of different healthcare settings. Within the context of the UK, future research could use our methods for national evaluations of NHS 111 performance, including the devolved nations, and to assess regional variations in triage, accuracy and safety.

CONCLUSIONS
We have conducted the first evaluation of accuracy of telephone triage for need for emergency treatment in patients with suspected COVID-19 infection. Telephone triage appears to have had an important role in managing lower-risk patients and potentially preventing many patients who required no specific treatment from attending hospitals or other care providers. This must be weighed against the small but non-negligible risk of serious adverse outcomes in patients advised to self-care or have a non-urgent clinical assessment. Repeat contact with triage services may need more recognition as an important predictor of subsequent deterioration. Future research is needed to determine acceptable risk of deterioration in patients advised to self-care and the optimal configuration of telephone triage services.

Figure 1
Figure 1 Strengthening the Reporting of Observational Studies in Epidemiology flow diagram of selection of study population. NHS, National Health Service; YAS, Yorkshire Ambulance Service.

Table 1
Population characteristics

to primary outcome from index contact-up to and including (n, %)
†Unrestricted community testing for suspected COVID-19 infection was only available from 18 May 2020. Confirmed diagnosis is based on inpatient PCR testing or clinical diagnosis in hospital. ‡Suppressed due to small numbers. ED, emergency department; NHS, National Health Service.

Table 2
Performance of binary NHS 111 triage (ambulance or urgent assessment 4 hours or less) for composite outcome (death or organ support)

Table 3
Multivariable model predicting false negatives

Table 4
Multivariable model predicting false positives