Prediction models hold tremendous promise as a way to improve patient outcomes and healthcare quality more generally by efficiently targeting interventions to those patients most likely to benefit. Most aspects of healthcare (ie, tests, medications and procedures) have associated risks and burdens. Accurate prediction of patients at higher risk for adverse outcomes from their underlying conditions allows for better targeting of interventions, so that only patients for whom the benefits of the intervention outweigh the risks of treatment are provided the intervention. Conversely, accurate prediction of patients at low risk for adverse outcomes (for whom the risks of the intervention outweigh the benefits) can help those patients avoid unnecessary and potentially harmful interventions. Thus, accurate prediction models play a pivotal role in the vision of personalised medicine, where all clinical decisions are tailored to each individual’s unique risk profile.1
In this issue of BMJ Quality & Safety, McAlister and van Walraven2 expand our knowledge of prediction for common, adverse hospitalisation outcomes (prolonged hospitalisation, 30-day mortality and readmission) in older adults. They use province-wide data from Ontario, Canada in 2004–2010 to compare how two previously published prediction models (Hospital Frailty Risk Score or HFRS3 and Hospital-patient One-year Mortality Risk or HOMR4) predict these outcomes in historical Ontario data. They found that the HFRS more accurately predicted prolonged hospitalisation, while the HOMR more accurately predicted 30-day mortality. Both HFRS and HOMR poorly predicted 30-day readmissions to hospital.
The authors should be commended for conducting a methodologically rigorous external validation of previously developed prediction models. Previous reviews have found that many prediction models are developed, but few are externally validated.5 Without external validation, providers face substantial uncertainty about whether a model is accurate enough to guide clinical decisions. Thus, methodologically rigorous validation studies such as this one constitute a critical but often-ignored component in the chain of evidence that starts with prediction model development and ends with prediction models being used to inform clinical care.
As the authors themselves note, their primary finding, that the HFRS better predicts prolonged hospitalisation while the HOMR better predicts 30-day mortality, comes as no surprise. The HFRS was developed and optimised to predict frailty and prolonged hospitalisation. In contrast, the HOMR was developed and optimised to predict 1-year mortality. This study’s results show that while adverse outcomes such as prolonged hospitalisation and mortality often cluster together, they are distinct: A prediction model optimised for 1-year mortality predicts 30-day mortality better than a prediction model optimised to predict frailty and prolonged hospitalisation. An optimistic interpretation of these results is that prediction models have reached a level of sophistication where related outcomes such as mortality and prolonged hospitalisation can be distinguished and distinct prediction models are needed for these related (same-same) but distinct (different) outcomes.
Three additional factors should be considered when interpreting the results of this study2: (1) differences between the UK and Ontario, (2) the importance of calibration as well as discrimination in the validation of prediction models, and (3) the importance of physical function in the prediction of outcomes for hospitalised older adults.
One striking result of this study concerns the differences between the UK and Ontario populations of hospitalised older adults. At baseline, hospitalisation occurred much more commonly in the UK, with 40.1% of UK patients experiencing multiple prior admissions in the previous 2 years compared with only 6.3% in Ontario. This profound difference in hospitalisation exposure likely led to more International Classification of Diseases, 10th Revision (ICD-10) codes being recorded in the UK, resulting in the UK cohort having higher Charlson comorbidity scores (2.9 in UK, 2.0 in Ontario) and a higher proportion of patients classified as frail (58% in UK, 26% in Ontario). These baseline differences in hospitalisation exposure may also have contributed to some of the unexpected results of this study. For example, patients with higher frailty risk scores (HFRS) in the UK were more likely to be readmitted; however, patients in Ontario with higher frailty risk scores were less likely to be readmitted. Future studies should explore potential sources of the differences in hospitalisation rates in the UK and Ontario, including the differential use of geriatric day hospitals or different coding practices. This information could be very helpful in interpreting some of these intriguing differences in both the rate of hospitalisation and the apparent differential effect of frailty on subsequent readmissions.
Calibration as well as discrimination
Future studies should focus on both discrimination and calibration as coequal components in prediction model validation.6 Historically, discrimination (usually measured by the c-statistic) has overshadowed calibration in prediction model validation. This is unfortunate. Calibration is as important as discrimination (if not more important) for prediction models that are going to be used to inform clinical decisions.7 8 Discrimination measures how well a model stratifies or orders patients by risk. However, patients and providers are less interested in whether a patient is higher (or lower) risk than others. The information that is most helpful in clinical decision making is absolute predicted risk, which is evaluated through calibration. For example, knowing that a patient is in the highest quintile of risk is often less important than knowing that the patient’s 1-year risk of the outcome is 50%. Thus, prediction model validation studies should prominently display calibration (predicted vs observed outcome rates) across risk groups so that readers can evaluate model calibration and calculate the predicted risk for an individual patient with a specific set of predictors.
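The distinction between discrimination and calibration can be illustrated with a minimal sketch. The data below are invented purely for illustration and do not come from the study; the function names `c_statistic` and `calibration_table` are hypothetical helpers, not part of any published model. The sketch assumes a binary outcome and compares one set of predicted risks with a second set in which every risk has been halved, so that the ordering of patients is unchanged.

```python
from typing import List, Sequence, Tuple


def c_statistic(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Discrimination: share of (event, non-event) pairs in which the
    patient with the event had the higher predicted risk (ties count 0.5)."""
    events = [p for p, y in zip(predicted, observed) if y == 1]
    non_events = [p for p, y in zip(predicted, observed) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def calibration_table(
    predicted: Sequence[float], observed: Sequence[int], groups: int = 5
) -> List[Tuple[int, int, float, float]]:
    """Calibration: for each risk group (ordered by predicted risk), return
    (group number, size, mean predicted risk, observed outcome rate)."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    rows = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(y for _, y in chunk) / len(chunk)
        rows.append((g + 1, len(chunk), mean_predicted, observed_rate))
    return rows


# Invented example data (assumption, for illustration only).
observed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
predicted = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.55, 0.70, 0.80, 0.90]
# Halving every risk keeps the ordering of patients but changes absolute risk.
halved = [p * 0.5 for p in predicted]

for label, risks in (("original", predicted), ("halved", halved)):
    print(label, "c-statistic:", round(c_statistic(risks, observed), 2))
    for group, size, mean_pred, obs_rate in calibration_table(risks, observed, groups=2):
        print(f"  group {group} (n={size}): predicted {mean_pred:.2f}, observed {obs_rate:.2f}")
```

Because halving preserves the rank order of patients, both sets of predictions yield the same c-statistic, yet the halved risks fall well below the observed outcome rates in each group. A validation report that showed only the c-statistic would not reveal this difference, which is why a predicted versus observed table across risk groups is useful.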
There is an extensive literature highlighting the importance of physical function as a predictor of outcomes for hospitalised older adults.9 However, neither the HFRS nor the HOMR considers functional status predictors. This omission likely reflects the fact that functional data are often unavailable in electronic and administrative databases. Thus, although the HFRS, the HOMR and the current validation study are methodologically sound, major advances in the prediction of hospitalisation outcomes for older adults will likely remain out of reach until we can access additional data on factors such as physical function. Given the intrinsic importance of physical function to patients and its additional value in predicting outcomes such as readmission and nursing home placement, I urge regional and national data systems to follow the lead of the US Department of Veterans Affairs in routinely collecting functional data.10
This is an exciting time for clinical prediction with several trends coalescing to make prediction easier, faster and more accurate. First, the increasingly widespread use of electronic medical records means that more and more data are becoming readily accessible for clinical prediction. Instead of relying solely on administrative data such as age and ICD-10 codes, clinical data such as laboratory results, radiology results and pharmacy data are increasingly being used to identify which patients are at highest risk.11 Second, there is increasing acceptance by clinicians and the public that ‘big data’ and ‘predictive analytics’ can make many things, including healthcare, better. While previous generations of clinicians viewed sophisticated ‘black-box’ prediction models with scepticism, newer generations of clinicians, growing up in an age where Google correctly guesses your search phrase after three letters and Netflix recommends a show that ends up being your favourite, are more comfortable using predictions from sophisticated models to inform clinical decisions. These trends suggest that clinical prediction models will play a larger role in healthcare in the future. Studies such as this one will be a critical component of the evidence base that ensures that clinical prediction models fulfil their promise of better, safer care.
Funding SJL was supported by the National Institute on Aging (R01AG047897 and R01AG057751), VA HSR&D (IIR 15-434) and VA Office of Academic Affiliations (VA Quality Scholars Program, Grant #AF-3Q-09-2019-C).
Competing interests None declared.
Patient consent for publication Not required.
Provenance and peer review Commissioned; internally peer reviewed.