Abstract
Background: The opportunity to improve care using computer reminders is one of the main incentives for implementing sophisticated clinical information systems. We conducted a systematic review to quantify the expected magnitude of improvements in processes of care from computer reminders delivered to clinicians during their routine activities.
Methods: We searched the MEDLINE, Embase and CINAHL databases (to July 2008) and scanned the bibliographies of retrieved articles. We included studies in our review if they used a randomized or quasi-randomized design to evaluate improvements in processes or outcomes of care from computer reminders delivered to physicians during routine electronic ordering or charting activities.
Results: Among the 28 trials (reporting 32 comparisons) included in our study, we found that computer reminders improved adherence to processes of care by a median of 4.2% (interquartile range [IQR] 0.8%–18.8%). Using the best outcome from each study, we found that the median improvement was 5.6% (IQR 2.0%–19.2%). A minority of studies reported larger effects; however, no study characteristic or reminder feature significantly predicted the magnitude of effect except in one institution, where a well-developed, “homegrown” clinical information system achieved larger improvements than in all other studies (median 16.8% [IQR 8.7%–26.0%] v. 3.0% [IQR 0.5%–11.5%]; p = 0.04). A trend toward larger improvements was seen for reminders that required users to enter a response (median 12.9% [IQR 2.7%–22.8%] v. 2.7% [IQR 0.6%–5.6%]; p = 0.09).
Interpretation: Computer reminders produced much smaller improvements than those generally expected from the implementation of computerized order entry and electronic medical record systems. Further research is required to identify features of reminder systems consistently associated with clinically worthwhile improvements.
Computerized systems for entering orders and electronic medical records represent two of the most widely recommended improvements in health care. 1 These systems offer the opportunity to improve practice by delivering reminders to clinicians at the point of care. Such reminders range from simple prescribing alerts to more sophisticated support for decision-making.
Previous reviews have classified all computer reminders together, including computer-generated paper reminders and email alerts sent to providers, along with reminders generated at the point of care. 2–5 They have also typically reported the proportion of studies with results that were on balance “positive.” 2–4 We conducted a systematic review to quantify the expected magnitude of improvements in processes of care from computer reminders delivered to physicians during their routine electronic ordering or charting activities.
Methods
Data sources
We searched the MEDLINE database (1950 to July 2008) using relevant Medical Subject Headings and combinations of text words such as “computer” or “electronic” with terms such as “reminder,” “prompt,” “alert” and “support.” A methodologic filter identified all potential clinical trials. We similarly searched the Embase and CINAHL databases (both to July 2008). We also retrieved all articles that mentioned computers, reminder systems or decision support from the Cochrane Effective Practice and Organisation of Care registry (www.epoc.cochrane.org/welcome), which covers multiple bibliographic databases. Finally, we scanned reference lists of all included studies and review articles. For non-English-language articles, we screened English translations of titles and abstracts, pursuing a full-text translation as needed to determine inclusion or exclusion of the study.
Study selection
Eligible studies evaluated the effects of computer reminders on processes or outcomes of care using a randomized or quasi-randomized controlled design (allocation on the basis of an arbitrary but not truly random process, such as even or odd patient identification numbers). We required that clinicians encounter the reminder during routine performance of the activities of interest, such as prescribing medications or documenting clinical information. Reminders that required clinicians to deviate from their usual activities (e.g., to use a special program without any prompt from the main clinical information system) were excluded because relying on users to remember to call up such resources undermined the core notion of a reminder.
Outcomes
We focused primarily on improvements in processes of care rather than on clinical outcomes, because we wished to determine the degree to which computer reminders achieved their main goal, namely changing provider behaviour. The degree to which such changes ultimately improve patient outcomes will vary depending on the strength of the relation between targeted processes and clinical outcomes. Consequently, if computer reminders do not improve patient outcomes, this may reflect inadequate connections between the targeted processes and outcomes of care rather than a failure to change physician behaviour. Nonetheless, we did capture clinical outcomes, including intermediate outcomes such as control of blood pressure. We excluded outcomes primarily related to resource use, such as length of hospital stay.
We standardized all outcomes so that increases always corresponded to improvements in care. For instance, if a study reported the proportion of patients who received inappropriate medications, we would record the complementary proportion of patients who received appropriate care.
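The standardization rule can be illustrated with a minimal sketch. The function names and the example proportions are hypothetical and are not taken from the included studies; the sketch only shows how a "lower is better" proportion is flipped to its complement before the absolute difference is taken.

```python
def standardize(proportion: float, higher_is_better: bool) -> float:
    """Orient a proportion so that larger values always mean better care."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    # For outcomes such as "received inappropriate medication",
    # record the complementary proportion instead.
    return proportion if higher_is_better else 1.0 - proportion


def absolute_improvement(intervention: float, control: float,
                         higher_is_better: bool) -> float:
    """Absolute difference (intervention minus control) after standardization."""
    return (standardize(intervention, higher_is_better)
            - standardize(control, higher_is_better))


# Hypothetical example: inappropriate prescribing falls from 30% to 20%,
# which is recorded as appropriate care rising from 70% to 80%.
gain = absolute_improvement(0.20, 0.30, higher_is_better=False)
print(round(gain, 3))
```

With this convention, a positive difference always denotes an improvement, whichever way the primary study framed its outcome.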
Data extraction
For any given article, two of three investigators (K.S., A.J. or A.M.) independently screened the citation for inclusion. They abstracted the following data from included articles: clinical setting, number of participants, methodologic details, characteristics of the computer reminder, the presence of cointerventions, and the results for eligible outcomes. Discrepancies between the two reviewers were resolved by discussion, involving the third reviewer if necessary to achieve consensus.
Statistical analysis
We anticipated that many studies would assign intervention status at the provider level but would not account for “cluster effects” when analyzing patient-level data. 6,7 Correcting for clustering effects can sometimes be achieved by estimating the intraclass correlation coefficients, especially if the primary studies all report the same outcome and a minority provide relevant data upon which to base imputations. 8 In this case, however, few studies contained the necessary data, and studies tended to report multiple outcomes, which required an additional assumption that correlations within clusters do not vary across different outcomes.
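To make the clustering problem concrete, the sketch below uses the standard design-effect formula for cluster trials, 1 + (m − 1) × ICC, where m is the mean cluster size and ICC is the intraclass correlation coefficient. This formula is general background rather than a calculation performed in this review, and the numbers are hypothetical.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Variance inflation factor for a cluster-allocated trial."""
    if mean_cluster_size < 1 or not 0.0 <= icc <= 1.0:
        raise ValueError("need cluster size >= 1 and ICC in [0, 1]")
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n_patients: int, mean_cluster_size: float,
                          icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return n_patients / design_effect(mean_cluster_size, icc)


# Hypothetical trial: 1000 patients, about 21 per physician, ICC of 0.05.
print(design_effect(21, 0.05))                # variance is roughly doubled
print(effective_sample_size(1000, 21, 0.05))  # roughly 500 independent patients
```

A patient-level analysis that ignores this inflation treats the 1000 patients as independent and therefore reports confidence intervals that are too narrow.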
To preserve the goal of quantifying the effects of computer reminders without resorting to numerous assumptions and conveying a misleading degree of precision, we focused on the median and interquartile range (IQR) for improvements reported by eligible studies. This method, first used in a large review of strategies for implementing guidelines, 9 has since been applied in Cochrane reviews of interventions to improve practice 10–14 and other systematic reviews of quality improvement interventions. 15–18
Quantifying the median improvement involves two distinct uses of “median.” First, to handle multiple outcomes within individual studies, we identified the median improvement across each study’s eligible outcomes. If a study reported 10 adherence-related outcomes, we calculated the median absolute difference in adherence between the intervention and control groups. With each study represented by its median outcome, we then calculated the median effect and IQR across all included studies. For the purposes of sensitivity analyses, we repeated this calculation using the best outcome from each study.
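The two-stage calculation can be sketched as follows. The study labels and percentage-point improvements are invented for illustration, and the quartile convention (Python's "inclusive" method) is an assumption, because the convention used for the IQR is not stated here.

```python
from statistics import median, quantiles

# Hypothetical absolute improvements (percentage points) for each
# eligible outcome reported by five made-up studies.
studies = {
    "study_a": [1.0, 4.0, 10.0],
    "study_b": [0.5, 2.0],
    "study_c": [12.0],
    "study_d": [-1.0, 3.0, 5.0, 20.0],
    "study_e": [6.0, 8.0, 30.0],
}


def summarize(per_study_values):
    """Median and IQR across studies, one representative value per study."""
    values = sorted(per_study_values)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q1, q3)


# Stage 1: represent each study by the median of its own outcomes.
median_per_study = [median(outcomes) for outcomes in studies.values()]
# Stage 2: take the median and IQR of those values across studies.
main_result = summarize(median_per_study)

# Sensitivity analysis: represent each study by its best outcome instead.
best_per_study = [max(outcomes) for outcomes in studies.values()]
best_result = summarize(best_per_study)

print(main_result)
print(best_result)
```

Every study contributes exactly one value to the second stage, so a study that reports many outcomes does not dominate the summary.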
The median and IQR convey the magnitudes of improvement achieved in the majority of studies. This method avoids skewing by a few outlying studies with highly positive results and 95% confidence intervals inappropriately narrowed by ignoring important clustering effects. It also permits nonparametric analyses of potential associations between study features and effect size in order to examine subgroups of studies with larger or smaller magnitudes of effect. For instance, we looked for associations between magnitude of effect and study size, markers of methodologic quality, features of the study context (e.g., ambulatory v. inpatient) and characteristics of the reminders (e.g., requiring users to enter a response before continuing with their work). We performed all such comparisons using a nonparametric Mann–Whitney rank-sum test.
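A subgroup comparison of this kind can be sketched in plain Python. The implementation below computes the Mann–Whitney U statistic directly from pairwise comparisons and derives a two-sided p value from the normal approximation with a tie correction. That approximation is an assumption of this sketch (statistical packages often use exact distributions for small samples), and the effect sizes are hypothetical.

```python
from collections import Counter
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p value).

    The p value uses the normal approximation with a tie correction and
    no continuity correction; exact methods are preferable for very
    small samples.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one observation")
    # U counts the pairs in which the x value exceeds the y value (ties 0.5).
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(list(x) + list(y)).values())
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return u, 1.0  # every value tied: no evidence of a difference
    z = (u - n1 * n2 / 2.0) / sqrt(variance)
    return u, erfc(abs(z) / sqrt(2.0))


# Hypothetical per-study median improvements (percentage points).
response_required = [9.0, 15.0, 21.0, 25.0]
no_response_required = [0.5, 1.0, 3.0, 4.0, 11.0]

u_stat, p_value = mann_whitney_u(response_required, no_response_required)
print(u_stat, round(p_value, 3))
```

Because the test uses only the ordering of study-level effects, it needs no standard errors and is unaffected by whether the primary studies adjusted for clustering.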
Results
Of 2036 citations identified, we excluded 1662 at the initial stage of screening and an additional 374 after review of the full-text articles. A total of 28 articles (reporting 32 comparisons) met all of our inclusion criteria (Figure 1). 19–46 The full review has recently been published in The Cochrane Library. 47
Of the 32 comparisons, 19 were in the United States and 8 occurred in inpatient settings (Table 1, located at the end of the article). Only six comparisons involved a quasi-randomized design, typically allocating intervention status on the basis of even or odd provider identification numbers. Twenty-six comparisons allocated intervention status to providers or provider groups (cluster trials); 12 of these comparisons accounted for clustering effects in the analysis. Seventeen trials reported a power calculation that included a target effect size. Twelve trials reported a target improvement in adherence to processes of care; 10 of these trials specified an absolute increase of at least 10% (Table 1).
Figure 2 displays the median improvements in adherence to processes of care for each included study (for details about the results from each study, see Appendix 1, available at www.cmaj.ca/cgi/content/full/cmaj.090578/DC1). Pooling data across studies (Table 2), we found that the median improvement in adherence associated with computer reminders was 4.2% (IQR 0.8%–18.8%). Prescribing behaviours improved by a median of 3.3% (IQR 0.5%–10.6% [21 trials]), adherence to target vaccinations by 3.8% (IQR 0.5%–6.6% [6 trials]) and test-ordering behaviours by 3.8% (IQR 0.4%–16.3% [13 trials]). Table 2 also shows the results obtained when we used the best outcome from each study instead of the median improvement.
Across eight comparisons that reported dichotomous clinical outcomes (e.g., achievement of target treatment goals), patients in the intervention groups experienced a median absolute improvement of 2.5% (IQR 1.3%–4.2%). For blood pressure control, the single most commonly reported outcome, patients in the intervention groups experienced a median reduction in systolic blood pressure of 1.0 mm Hg (IQR 2.3 mm Hg reduction to 2.0 mm Hg increase) and a median reduction in diastolic blood pressure of 0.2 mm Hg (IQR 0.8 mm Hg reduction to 1.0 mm Hg increase).
Study features and effect size
We found no significant correlation between effect size and the following study features: publication year, country (United States v. other), study design (randomized v. quasi-randomized) or sample size (whether calculated on the basis of patients or providers) (Figure 3). We considered that studies with high adherence rates in control groups (a marker for baseline adherence) might achieve smaller improvements in care, because they had smaller opportunities for improvement. Surprisingly, studies with control-group adherence rates that were higher than the median across all studies showed larger effect sizes (Figure 3). When we analyzed the potential impact of baseline adherence in various other ways (e.g., focusing on the highest and lowest quartiles of baseline adherence), we found no evidence that small improvements reflected high baseline quality of care.
We observed a trend toward larger improvements with inpatient interventions than with outpatient interventions (median 8.7% [IQR 2.7%–22.7%] v. 3.0% [IQR 0.6%–11.5%]; p = 0.34). All inpatient interventions occurred at two institutions that had well-developed, “homegrown” computerized systems for order entry by providers. Moreover, the recipients of computer reminders from these institutions consisted primarily of physician trainees.
Our grouping of studies on the basis of track records in clinical informatics did not result in significant differences, except that the studies from Brigham and Women’s Hospital in Boston, USA, reported a median improvement of 16.8% (IQR 8.7%–26.0%), 26,31,37,40,46 compared with 3.0% (IQR 0.5%–11.5%) for studies from the other institutions (p = 0.04).
Features of computer reminders and effect size
We analyzed a number of reminder characteristics to look for associations with effect size (Figure 4). Only the requirement for providers to enter a response to the reminder showed a trend toward larger improvements (median 12.9% [IQR 2.7%–22.7%] v. 2.7% [IQR 0.6%–5.6%] for no response required; p = 0.09). No trends toward larger effect sizes existed based on the type of targeted problem (underuse v. overuse of a targeted process of care), inclusion of patient-specific information, provision of an explanation for the alert, inclusion of a specific recommendation with the alert, development of the reminder by the study authors, or the type of system used to deliver the reminder (CPOE [computerized provider order entry] v. electronic medical records).
Reminders that were “pushed” onto users (i.e., users automatically received the reminder) did not achieve larger effects than reminders that required users to perform some action to receive them (i.e., users had to “pull” the reminders); only 4 of the 32 comparisons involved “pull” reminders. A three-armed cluster randomized controlled trial of reminders for screening and treatment of hyperlipidemia 45 directly compared these two modes of delivering reminders. Patients cared for at practices randomly assigned to deliver automatic alerts were more likely to undergo testing for hyperlipidemia and receive treatment than were patients at clinics where reminders were delivered to clinicians only “on demand.”
Sensitivity analyses
We re-analyzed the potential predictors of effect size (study features and characteristics of reminders) using a variety of choices for the representative outcome from each study, including the outcome with the middle value (rather than a calculated median) and the best outcome (the outcome associated with the largest improvement in adherence to the process). None of these analyses substantially altered the main findings.
Interpretation
Across the 32 comparisons, computer reminders achieved small to modest improvements in care, with a median improvement of 4.2% (IQR 0.8%–18.8%). Even using the best outcome from each trial, the median improvement was only 5.6% (IQR 2.0%–19.2%). These changes fall below the thresholds for clinically significant improvements specified in most trials, and they are certainly smaller than the improvements generally expected from computerized order entry and electronic medical record systems. Interestingly, these improvements are also no larger than those observed for paper-based reminders. 5,48
With the upper quartile of reported improvements beginning at an almost 20% increase in adherence to processes of care, some studies in our review clearly did show larger effects. However, we were unable to identify any study characteristic or reminder feature that predicted larger effect sizes, except for a statistically significant increase in magnitude of effect seen in studies involving a well-developed, homegrown computer order entry system at Brigham and Women’s Hospital. 26,31,37,40,46 A trend toward larger effects was also seen for reminders that required users to enter a response in order to proceed; however, this finding may have been confounded by the uneven distribution of studies from Brigham and Women’s Hospital. Thus, we do not know if the success of computer reminders at this institution reflects the design of reminders requiring user responses, other features of the computer system or perhaps institutional culture.
Included studies often provided limited descriptions of key features of the reminders and the systems through which they were delivered. We attempted to overcome this problem by abstracting basic features, such as whether user responses were required and whether the reminder displayed a justification for its content. But heterogeneity within even these apparently straightforward categories could mask important differences in effect. Important differences in effect may also reflect characteristics that we found difficult to operationalize (e.g., the “complexity” of the reminder) or that were inadequately reported. This problem of limited descriptive detail of complex interventions and the resulting potential for heterogeneity among included interventions in systematic reviews has been consistently encountered in the quality-improvement literature. 49,50
Conventional meta-analyses estimate mean effects and 95% confidence intervals by calculating weighted averages across study results. The individual weights derive from study precision such that larger studies contribute greater weight to the meta-analytic result. However, more than half of the studies included in our review reported spuriously high precision, and most of the studies did not report the data required to adjust for this problem. For example, of the 26 clustered trials, only 9 provided a single value for the intra-cluster correlation coefficient, and only 3 reported values for all outcomes. Because we could not accurately weight studies based on precision, we focused on the median and interquartile range for study effects, a method that has found increasing application in systematic reviews of interventions for quality improvement. 9,13–15,17,18,51
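The weighting problem can be illustrated with a minimal fixed-effect sketch. The effects, standard errors and design effect below are hypothetical, and inflating a standard error by the square root of the design effect is a standard approximation, not a calculation performed in this review.

```python
from math import sqrt


def inverse_variance_pool(effects, standard_errors):
    """Fixed-effect pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    return pooled, sqrt(1.0 / total)


def inflate_se(se: float, design_effect: float) -> float:
    """Widen a naive standard error to allow for clustering."""
    return se * sqrt(design_effect)


# Hypothetical improvements (percentage points) and reported standard errors.
effects = [2.0, 4.0, 18.0]
reported_se = [1.0, 2.0, 1.0]

# Suppose the third study ignored clustering and its true design effect is 4.
adjusted_se = [1.0, 2.0, inflate_se(1.0, 4.0)]

naive_pooled, naive_se = inverse_variance_pool(effects, reported_se)
adjusted_pooled, adjusted_pooled_se = inverse_variance_pool(effects, adjusted_se)

print(round(naive_pooled, 2), round(adjusted_pooled, 2))
```

In this made-up example, the study with spuriously high precision pulls the naive pooled estimate toward its own large effect; once its standard error is corrected, the pooled estimate falls and its uncertainty widens. Without the data needed for such corrections, precision-based weights cannot be trusted.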
The main potential drawback of this method is that we assigned equal weight to all of the studies. However, for our results to have substantially misrepresented the true impacts of computer reminders, the minority of studies with large magnitudes of effect would also have to be the larger studies (and thus deserving of greater weight in a meta-analysis). Not only is this unlikely in general, but we also specifically showed that study size bore no relation to effect size, using various definitions of study size and effect size.
Conclusion
Computer reminders typically increased adherence to target processes of care by amounts below thresholds for clinically significant improvements. A minority of studies showed more substantial improvements, consistent with the expectations of those who advocate widespread adoption of computerized order entry and electronic medical record systems. However, until further research identifies study design and reminder features that reliably predict clinically worthwhile improvements in care, implementing these costly technologies will constitute an expensive exercise in trial and error.
Footnotes
Previously published at www.cmaj.ca
See also research article by Villeneuve and colleagues
This article has been peer reviewed.
Competing interests: None declared.
Contributors: Kaveh Shojania and Jeremy Grimshaw conceived the study. All of the authors contributed to refinements of the study design and to the analysis and interpretation of the data. Kaveh Shojania drafted the initial manuscript, and all of the other authors provided critical revisions. All of the authors approved the final manuscript submitted for publication. Kaveh Shojania is the guarantor for this paper.
Funding: Kaveh Shojania and Jeremy Grimshaw received salary support from the Government of Canada Research Chairs Program. Craig Ramsay’s position in the Health Services Research Unit is funded in part by the Chief Scientist Office of the Scottish Government Health Department. Alain Mayhew receives salary support from the Canadian Institutes of Health Research. The views expressed are those of the authors and not the funding agencies.