Study objectives We aimed to investigate how different presentation formats influence comprehension and use of comparative performance information (CPI) among consumers.
Methods An experimental between-subjects and within-subjects design with manipulations of CPI presentation formats. We enrolled both consumers with lower socioeconomic status (SES)/cognitive skills and consumers with higher SES/cognitive skills, recruited through an online access panel. Respondents received fictitious CPI and completed questions about interpretation and information use. Between subjects, we tested (1) displaying an overall performance score (yes/no); (2) displaying a small number of quality indicators (5 vs 9); and (3) displaying different types of evaluative symbols (star ratings, coloured dots and word icons vs numbers and bar graphs). Within subjects, we tested the effect of a reduced number of healthcare providers (5 vs 20). Data were analysed using descriptive analysis, analyses of variance and paired-samples t tests.
Results A total of 902 (43%) respondents participated. Displaying an overall performance score and the use of coloured dots and word icons particularly enhanced consumer understanding. Importantly, respondents provided with coloured dots most often correctly selected the top three healthcare providers (84.3%), compared with word icons (76.6% correct), star ratings (70.6% correct), numbers (62.0%) and bars (54.2%) when viewing performance scores of 20 providers. Furthermore, a reduced number of healthcare providers appeared to support consumers: when provided with 20 providers, 69.5% correctly selected the top three, compared with 80.2% with five providers.
Discussion Particular presentation formats enhanced consumer understanding of CPI, most importantly the use of overall performance scores, word icons and coloured dots, and a reduced number of providers displayed. Public reporting efforts should use these formats to maximise impact on consumers.
- Performance measures
- Report cards
- Decision making
- Nursing homes
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Comparative performance information (CPI) about healthcare providers has been embraced in Western countries, both in efforts to empower patients and as part of competition-based healthcare reforms.1–4 These reports highlight variations in recommended processes of care and/or patient outcomes and experiences across healthcare providers. Web-based CPI reports in particular have become common in practice in the past 10–15 years. It has become clear, however, that CPI is only used by a small number of consumers.5–8 One reason is that an important precondition for effective use by consumers, namely the availability of adequate information, has not always been met.9 ,10 Another reason is that many people have difficulty understanding CPI.2 ,11–13 CPI usually consists of many quality indicators, described in medical or policy jargon,10 ,14 ,15 and often expressed in numerical terms.15 However, many consumers lack the basic health literacy and numeracy skills needed to comprehend such scores.16–18 Additionally, most consumers are not motivated to rigorously study a large amount of CPI, especially since they do not always recognise any variations between providers.19 The question of how larger groups of consumers, including those from vulnerable populations, can be supported in understanding this variability has emerged as a key issue.20–22
In the past few years, several presentation formats have been tested to improve comprehensibility of CPI. For example, it has been demonstrated that explanatory frameworks for healthcare quality and word icons can help consumers when using CPI, probably because these approaches foster easy interpretations.23–25 Furthermore, different evaluative icons have been shown to facilitate the understanding and use of CPI.11 ,23 ,24 ,26 Another line of research, focusing on information overload in health plan choices, suggests that consumers are aided by reducing the number of providers as well as the indicators displayed.27–31 Although this past work provides valuable suggestions for CPI design, it remains unclear which formats will help the largely unstudied group of consumers from vulnerable populations, such as those with lower cognitive skills. Some pilot work has been done among consumers from vulnerable groups,20 but this work did not look into information comprehension. Moreover, novel presentation formats have been largely investigated separately, while combinations, as is common in real CPI, have remained relatively unexplored.
This study aimed to investigate how combinations of novel presentation formats influence CPI comprehension and use among consumers, both from vulnerable populations (ie, those with relatively low educational level, health literacy, health numeracy and patient activation) and from non-vulnerable populations. We tested the effects of (1) reducing the number of healthcare providers and (2) reducing the number of quality indicators as ways to reduce cognitive effort for consumers. In addition, we assessed the effects of (3) displaying an overall performance score and (4) evaluative symbols as ways to foster easy interpretations. The effects were assessed on different outcomes related to information comprehension and use, most importantly the correct identification of the top three performing healthcare providers.
This study used an experimental between-subjects and within-subjects design, in which presentation formats of CPI as provided on the website ‘kiesBeter’ were manipulated. This website is the Dutch national government-run website for quality of care ratings. Within subjects, the effect of a small number of providers (five nursing homes) versus a larger number of providers (20 nursing homes) was tested. Between subjects, we tested the following manipulations: the display of an overall performance score (yes vs no); the display of a small number of quality indicators (5 vs 9); and the display of different types of evaluative symbols (star ratings, coloured dots, word icons, vs numbers and bar graphs), resulting in a 2×2×5 design.
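As an illustration of the between-subjects design described above, the following minimal sketch enumerates the 2×2×5 factorial cells. The level labels and variable names are illustrative assumptions, not taken from the study materials.

```python
from itertools import product

# Factor levels as described in the text; the label strings are illustrative.
overall_score = ["overall score shown", "no overall score"]
n_indicators = [5, 9]
symbols = ["star ratings", "coloured dots", "word icons", "numbers", "bar graphs"]

# Every combination of the three between-subjects factors is one condition.
conditions = list(product(overall_score, n_indicators, symbols))

for number, condition in enumerate(conditions, start=1):
    print(number, condition)
```

Each respondent would be assigned to exactly one of these cells, while the within-subjects factor (20 vs 5 providers) is seen by everyone.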
Respondents were recruited through an online access panel (FlyCatcher Internet Research; 20 000 panel members in total, ISO 20252- and ISO 26362-certified). Members of this panel are members of the general public who have signed up to participate in various types of surveys, to be rewarded with ‘monetary points’. Panel members received an email in which they were invited to participate. A total of 2124 panel members, representative of the Dutch population regarding age, gender and the geographic area in which they lived, were approached. People with lower educational level (ie, no education or only primary education) were oversampled (ie, approximately half of the sample had a low educational level), to enable us to assess the effects of our manipulations in respondents with relatively low socioeconomic status (SES) and low cognitive skills. In total, 902 panel members were included. This sample size was large enough to assess small to medium main effects and two-way and three-way interaction effects (ie, effect size of 0.10–0.25) with a statistical power of 0.80. Selected panel members were randomly assigned to one of the 20 experimental conditions, using a stratified randomisation process to ensure even distributions of gender and educational level over these conditions. Participants received a small monetary token of €1.67.
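The text does not specify how the stratified randomisation was implemented. The sketch below shows one common approach, under the assumption that strata are formed by gender and educational level and that respondents within each stratum are dealt out over the 20 conditions in shuffled order. The function name `stratified_assign` and the simulated panel are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_assign(participants, n_conditions=20, seed=0):
    """Assign participants to conditions, balanced within each stratum.

    participants: list of (participant_id, gender, education) tuples.
    Returns a dict mapping participant_id to a condition index.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for pid, gender, education in participants:
        strata[(gender, education)].append(pid)

    assignment = {}
    for members in strata.values():
        rng.shuffle(members)              # random order within the stratum
        order = list(range(n_conditions))
        rng.shuffle(order)                # leftovers do not always hit the same cells
        for k, pid in enumerate(members):
            assignment[pid] = order[k % n_conditions]
    return assignment

# Hypothetical panel of 902 respondents with simulated characteristics.
sim = random.Random(1)
participants = [
    (i, sim.choice(["female", "male"]), sim.choice(["low", "high"]))
    for i in range(902)
]
assignment = stratified_assign(participants)
print(Counter(assignment.values()))  # respondents per condition
```

Within each stratum, condition sizes differ by at most one respondent, which keeps gender and educational level evenly spread over the conditions.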
Respondents saw fictitious but realistic CPI concerning nursing homes with one absolute best provider, that is, one provider that was dominant on all indicators. We used existing CPI derived from measurements with the Dutch Consumer Quality Index (CQI) Long Term Care.32 The CQI scores varied on a scale from 1 to 4, where 1 indicated non-optimal care and 4 indicated optimal care according to patients. Based on the priority that patients gave to quality indicators in previous research,32 ,33 we selected 9 from the 14 available indicators. These indicators were safety of care, conduct of professionals, mental well-being, independency, privacy, cleaning service, meals, availability of personnel and competence of personnel. We selected nursing homes by rank ordering the 752 nursing homes with complete data, and a subsequent selection of nursing homes after every 39 cases. We ensured a selection of providers that varied on the quality of provided care. The actual scores of several nursing homes were slightly adjusted in order to derive the top three of the best-performing and the worst-performing providers. One nursing home outperformed the other nursing homes on all quality indicators. The two other nursing homes considered second and third best had more ‘better than average’ scores than the other nursing homes listed. The names of nursing homes were replaced by fictitious names and the order in which the nursing homes and indicators appeared was randomly chosen. The names of the 20 nursing homes in the first set were not included in the names in the second set of five nursing homes. For each condition, a screenshot was designed in the style of the government-run kiesBeter website. Respondents were able to access explanations of the indicators through mouse-overs. Online supplementary appendix 1 provides examples of those screenshots.
Respondents were provided with two screenshots: one with 20 nursing homes (the ‘realistic version’) and one with five nursing homes (the ‘reduced version’) and a sequence of questions followed. We asked respondents to imagine having to choose a nursing home for themselves or for their parents/grandparents. We instructed them to take their time in viewing information about quality of care in the different nursing homes. All participants first saw the information about 20 nursing homes, as this reflected a realistic number of providers as currently presented on Dutch websites. They were then provided with questions that assessed their comprehension and hypothetical choice. Next, respondents saw a screenshot with information about five nursing homes, and they were again asked to fill in answers to several questions. Respondents were not able to return to the page displaying CPI after they had answered our questions. Finally, respondents filled in answers to questions about sociodemographic background and cognitive skills.
Comprehension of the information
Our main outcome variable was the selection of the top three nursing homes, assessed by the following item: ‘According to you, what are the top 3 best performing nursing homes?’ It should be noted that what is seen as ‘best’ for the second and third nursing homes was no more than the sum of scores on all quality indicators. So the importance consumers might attach to the different indicators was not taken into account. Other items measuring comprehension were, ‘According to you, which nursing home performs best?’ (this variable had one absolute best provider) and ‘According to you, which nursing home performs worst?’ These three items all had a multiple choice response scale with all nursing homes from the screenshot listed; respondents could select the nursing home(s) from this scale. These questions were based on questions used in previous research testing CPI presentation.11 ,34 In addition, we also measured respondents’ more verbatim comprehension of the information displaying 20 nursing homes. These measures differed for the different experimental conditions and included questions such as ‘What is the score of “Zeezicht” on the item “independency”?’ and ‘How does “Oosterstraat” score on the item “professional personnel” compared with “De Zonnewijzer”?’ Using these questions, we assessed how consumers comprehended individual quality indicators, the overall performance score, relative performance of providers, and, if provided, specific numerical information. Three response options were developed so that items were formulated with a three-point multiple choice response scale. For example, for the item ‘What is the score of “Zeezicht” on the item “independency”?’ in the presentation format of numbers, the three response options were 3.96, 3.43 and 3.42. For this item in the star ratings format, the options were ‘better than average’, ‘average’, and ‘worse than average’.
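The scoring rule described above (ranking by the unweighted sum of indicator scores, with one provider dominant on all indicators) can be sketched as follows. The provider names and scores are invented for illustration, and treating a top-three answer as correct regardless of order is an assumption; the text does not state how order was handled.

```python
def rank_by_total(scores):
    """Rank providers by the unweighted sum of their indicator scores."""
    return sorted(scores, key=lambda provider: sum(scores[provider]), reverse=True)

def dominant_provider(scores):
    """Return the provider that scores higher than every other provider
    on every indicator, or None if no such provider exists."""
    for candidate, own in scores.items():
        others = [vals for name, vals in scores.items() if name != candidate]
        if all(a > b for vals in others for a, b in zip(own, vals)):
            return candidate
    return None

def top3_correct(answer, scores):
    """Assumption: an answer is correct if it names the three highest-ranked
    providers, in any order."""
    return set(answer) == set(rank_by_total(scores)[:3])

# Hypothetical scores on five indicators (1 = non-optimal, 4 = optimal).
scores = {
    "Home A": [3.9, 3.8, 3.7, 3.9, 3.8],
    "Home B": [3.5, 3.6, 3.2, 3.4, 3.6],
    "Home C": [3.4, 3.1, 3.5, 3.3, 3.2],
    "Home D": [2.9, 3.0, 2.8, 3.1, 2.7],
    "Home E": [2.5, 2.6, 2.9, 2.4, 2.8],
}

print(rank_by_total(scores)[:3])
print(dominant_provider(scores))
print(top3_correct(["Home B", "Home A", "Home C"], scores))
```

Because the sum weights all indicators equally, a respondent who values one indicator much more than the others could reasonably prefer a different second or third provider, which is the caveat the text raises.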
Choice of nursing home
We formulated one question that captured hypothetical choice: ‘If you had to choose a nursing home for yourself or for one of your parents/grandparents, which one would you choose?’
Evaluations of the information
Several questions were posed to assess respondents’ evaluations of (using) the information: ‘How easy or hard was it for you to make a choice between the nursing homes?’ response scale from 1 (very easy) to 5 (very hard) and ‘I would like to use this kind of information when choosing between nursing homes’, response scale from 1 (completely disagree) to 5 (completely agree).
Descriptive analysis was conducted to assess how the information was comprehended overall and used. Subsequently, we employed analyses of variance (ANOVA) to analyse the effects of our between-subjects manipulations. We examined the main effects and two-way and three-way interactions of the three manipulations, as well as the two-way interactions between the manipulations and educational level, subjective health literacy, health numeracy and patient activation. We used paired-samples t tests to assess the effects of our within-subjects manipulation of the number of nursing homes. All analyses were performed using IBM SPSS Statistics V.20.0 with a significance level of 0.05.
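The analyses were run in SPSS, so the following is not the authors' code. It is a minimal standard-library sketch of the paired-samples t statistic used for the within-subjects comparison, applied to made-up correct/incorrect (1/0) responses from ten hypothetical respondents.

```python
import math
from statistics import mean, stdev

def paired_t(x, y):
    """Paired-samples t statistic for two equal-length sequences.

    Returns (t, degrees_of_freedom), where t is the mean of the pairwise
    differences divided by its standard error.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical data: 1 = top three selected correctly, 0 = not.
reduced = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]    # five providers shown
realistic = [1, 0, 1, 0, 1, 0, 1, 1, 0, 1]  # twenty providers shown

t, df = paired_t(reduced, realistic)
print(f"t = {t:.2f}, df = {df}")
```

A positive t here means more correct answers with the reduced display. The p value would then be read from the t distribution with `df` degrees of freedom, which a statistics package normally handles.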
A total of 902 respondents (response of 43%) completed our questionnaire. Table 1 shows their background characteristics. Almost half of our respondents had low educational levels. Of the respondents, 15% had inadequate subjective health literacy and 61% answered one or more of the health numeracy questions incorrectly. Of our respondents, 39% were in the two lowest Patient Activation Measure (PAM) levels. Online supplementary appendix 2 shows the numbers of participants in each cell of the 2×2×5 design.
Table 2 shows the percentages of correct answers to our questions about CPI comprehension and use, both for the realistic screenshot and the reduced screenshot. Overall, respondents had most difficulty answering the question ‘How does “Oosterstraat” score on the item “professional personnel” compared with “De Zonnewijzer”?’ (percentage correct was 76.6% (question only asked for the realistic version)). Almost 70% correctly selected the top three nursing homes when provided with 20 nursing homes. Notably, the percentage of respondents correctly selecting the single best nursing home (86%) differed from the percentage that correctly selected the number 1 as the top performer in the complete top three (88%) and from the percentage that chose the best nursing home (83%).
Figures 1 and 2 present the percentages of respondents correctly selecting the top three nursing homes for the different between-subjects manipulations, after viewing the realistic screenshot (figure 1) and after viewing the reduced screenshot (figure 2). Overall, more respondents correctly selected the top three nursing homes when provided with the reduced screenshot (figure 2) than when provided with the realistic screenshot (figure 1).
The results of the ANOVAs for all outcome variables in the realistic version are presented in online supplementary appendix 3. Table 3 presents the main findings from these ANOVAs. The results are described below.
Reducing cognitive effort
For the selection of the top three nursing homes, a significant difference between the realistic version and the reduced version was found (χ2=90.39; p<0.001). When provided with the realistic version, 69.5% of respondents correctly selected the top three, whereas with the reduced version, 80.2% did so. We also found significant differences in the same direction for the other outcome variables, that is, for the selection of the best nursing home (t=9.12; p<0.001), the selection of the worst nursing home (t=5.15; p=0.001) and the choice of nursing home (t=8.42; p<0.001). Reducing the number of quality indicators overall had less influence compared with reducing the number of providers and no significant main effects on respondents’ comprehension and use of CPI in either the realistic version (see table 3) or the reduced version.
Fostering easy interpretations
In terms of the selection of the top three nursing homes in the realistic screenshot, displaying an overall performance score had a significant main effect on respondents’ answers (F=31.66; p<0.001; see table 3). Respondents who saw an overall performance score more often selected the complete top three (79.8% correct) than respondents who were not provided with an overall score (59.5% correct; figure 1). Results in the same direction were found for the selection of the best nursing home (F=4.10; p=0.043) and the selection of the worst nursing home (F=30.70; p<0.001). Displaying an overall performance score also significantly influenced the selection of the top three when viewing the reduced screenshot (F=21.24; p<0.001), with findings in the same direction as for the realistic screenshot, but not the other outcome variables. The use of evaluative symbols also had a significant main effect on the selection of the top three with the realistic screenshot (F=6.92; p<0.001; see table 3). Respondents provided with coloured dots most often correctly selected the top three (84.3%), compared with word icons (76.6% correct), star ratings (70.6% correct), numbers (62.0%) and bars (54.2%; figure 1). Findings were in the same direction for the selection of the best nursing home (F=3.24; p=0.012), the selection of the worst nursing home (F=2.96; p=0.019) and the nursing home choice (F=4.16; p=0.002; see table 3).
When provided with the reduced screenshot, the type of evaluative symbols significantly influenced respondents’ selection of the top three nursing homes only (F=4.19; p=0.002). We also found an interaction between the display of an overall performance score and the type of evaluative symbols on selection of the top three nursing homes, both for the realistic screenshot (F=13.17; p<0.001; see table 3) and the reduced screenshot (F=6.05; p<0.001); word icons and coloured dots only supported respondents in selecting the top three when no overall performance score was displayed. This interaction was also significant and in the same direction for the selection of the best nursing home in the realistic screenshot (F=2.87; p=0.022; see table 3) and for the selection of the worst nursing home in the reduced screenshot (F=3.27; p=0.011).
Interactions between formats aimed to reduce cognitive effort and formats aimed to foster easy interpretations
For the selection of the top three in the realistic screenshot, a significant interaction was found between the number of quality indicators and the type of evaluative symbols (F=3.64; p=0.006; see table 3). Word icons and coloured dots especially aided consumers with a large number of quality indicators. The three-way interaction between the manipulations was not significant for the top three (F=2.28; p=0.059; see table 3). However, we did find significant interactions between the three between-subjects manipulations for the selection of the best nursing home (F=2.48; p=0.042) and for nursing home choice (F=3.48; p=0.008) in the realistic version (see table 3). These effects indicated that the word icons and coloured dots aided consumers when there was no overall performance score or when there was an overall performance score but a large number of quality indicators. For the reduced screenshot, we further found a significant interaction between the display of an overall score and the number of quality indicators on selection of the best nursing home (F=4.71; p=0.03), indicating that a small number of quality indicators only helped people when there was also an overall score displayed. The three-way interaction effect on the choice of nursing home was also significant with the reduced screenshot (F=3.07; p=0.016).
Consumers’ vulnerability-related characteristics
Health numeracy showed a significant association with the selection of the top three nursing homes (F=18.25; p<0.001) as well as with the selection of the best nursing home (F=9.06; p<0.001), the selection of the worst nursing home (F=19.05; p<0.001) and the choice of nursing home (F=8.73; p<0.001) in the realistic screenshots (see table 3). Respondents with higher health numeracy more often adequately comprehended and used the provided CPI. We also found a significant interaction between displaying an overall performance score and health numeracy (F=3.50; p=0.015) for the selection of the worst nursing home in the realistic screenshot. The overall performance score helped people with lower health numeracy more in selecting the worst nursing home than people with higher health numeracy. Health numeracy was also significantly related to respondents’ answers when provided with the reduced screenshot, namely regarding the selection of the top three of best nursing homes (F=3.49; p<0.001), the selection of the best nursing home (F=4.89; p=0.002), the selection of the worst nursing home (F=2.81; p=0.039) and the nursing home choice (F=2.68; p=0.046).
In addition, patient activation was significantly related to respondents’ selection of the top three with the reduced screenshot (F=3.31; p=0.020); respondents with the highest PAM level less often selected the top three (namely 75.5%) compared with the other three levels (level 1, 82.5%; level 2, 82.0%; level 3, 82.9%). A significant association in the same direction was found for the selection of the best nursing home (F=5.60; p=0.001) and the choice of a nursing home (F=2.64; p=0.049). A closer inspection revealed that those respondents with high PAM who showed fewer correct responses also relatively often had low health numeracy. For example, of those 47 respondents with the highest PAM level who did not correctly select the top three, 19% had the lowest numeracy level, whereas of the 145 respondents with the highest PAM level who did correctly select the top three, 6% had the lowest numeracy level. We further found an interaction effect between the number of quality indicators and subjective health literacy for the reduced screenshot (F=5.77; p=0.017), indicating that reducing the number of quality indicators had a greater positive effect in people with high subjective health literacy than among people with lower subjective health literacy.
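As a quick arithmetic check, the counts reported above for the highest PAM level reproduce the stated percentage of correct top-three selections. The variable names are illustrative.

```python
# Counts reported in the text for respondents at the highest PAM level.
correct = 145    # selected the top three correctly (reduced screenshot)
incorrect = 47   # did not select the top three correctly

share_correct = correct / (correct + incorrect)
print(f"{share_correct:.1%}")
```

The result, 145 of 192 respondents, matches the 75.5% given for this group.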
This study investigated how different presentation formats—aimed at reducing cognitive effort and fostering easy interpretations—influence comprehension and the use of CPI in both consumers with lower SES/cognitive skills and consumers with higher SES/cognitive skills. We showed that especially the display of an overall performance score and the use of coloured dots and word icons helped consumers in comprehending and using CPI. Furthermore, a reduced number of healthcare providers displayed appeared to aid consumers. Overall, the influence of the presentation formats did not vary with vulnerability-related variables such as educational level or cognitive skills. However, there were two exceptions: (1) the display of an overall performance score appeared to particularly help people with lower health numeracy when they had to select the worst nursing home from 20 nursing homes listed; and (2) a reduced number of quality indicators appeared to only help people with relatively high subjective health literacy when they had to select the worst nursing home from five nursing homes listed.
In line with previous studies on CPI,11 ,23 ,25 ,27 a substantial proportion of consumers had difficulties in comprehending and using the displayed CPI. For example, almost one-third of respondents (31%) were not able to correctly select the top three of best providers out of 20 listed nursing homes. Even when the number of nursing homes was reduced to five, about 20% of respondents were still not able to correctly do so. Of course we should keep in mind that for identifying the second and third best provider in our study, subjective weighing of indicators probably played a role and influenced consumers’ answers. However, for the selection of the one best provider, for which we created an absolute best provider that was dominant on all indicators, 14% of respondents still failed to correctly identify the best provider. Previous studies investigating CPI found similar percentages of respondents ‘miscomprehending’ information.11 ,25 ,31 ,34 Studies into other types of numerical health information also found similar proportions of people not being able to correctly comprehend or use information.40 In light of these studies, combined with the fact that less numerate people were more likely to miscomprehend information, our findings suggest that miscomprehension occurred due to difficulties in grasping the numbers as demonstrated in previous studies.23 ,40–42 However, it should be noted that most of our experimental materials did not display numbers, but rather visual displays of numbers. Even then it may be hard to perform the numerical tasks, such as computations, needed to adequately comprehend CPI.
Our findings support the notion that presentation formats can impact how consumers understand and use CPI.6 Especially the display of an overall performance score appeared to facilitate comprehension and use of information, with a relatively large number of providers displayed as well as a relatively small number. Furthermore, using coloured dots or word icons greatly contributed to better comprehension and use of CPI. Previous studies identified approaches such as explanatory frameworks and evaluative symbols as being supportive of consumers’ comprehension and use of CPI.11 ,23–26 What our study adds to this literature is specific novel formats, such as coloured dots, that seem to be just as effective as word icons, as well as insight into the effects of combinations of these formats. Notably, displaying an overall performance score and using evaluative symbols seemed to compensate each other in providing meaning to CPI: coloured dots and word icons especially helped consumers when no overall performance score was displayed, and vice versa. In contrast, a small number of quality indicators sometimes helped people only when there was also an overall performance score displayed.
It is also interesting to note that reducing the number of healthcare providers seemed to have a greater effect compared with reducing the number of quality indicators displayed. For example, in selecting the best nursing home, reducing the number of providers displayed from 20 to 5 was associated with a 10% increase to 97% of people correctly doing so. Reducing the amount of information consumers have to process has been shown to positively affect people's comprehension and use of information in numerous studies both inside the health domain27 ,31 and outside it.43 It likely greatly affects the cognitive effort that consumers have to put in when weighing performance scores. For websites providing CPI, it can be difficult in practice to lower the number of providers displayed, as showing only a subset of providers might interfere with their task of creating full transparency in healthcare. A common way is to let consumers search for providers in their geographical area and to only display the ones that fall within a self-chosen distance. However, this can still leave consumers with a relatively high number of providers to compare. Showing only providers with the best performance on key indicators might be an additional option, at least as long as it is clearly communicated that such a selection has been made and that more providers can be reviewed through additional mouse clicks.
Our findings should be interpreted in light of several limitations. First, as is common with controlled experiments, this study used fictitious stimuli and hypothetical choices of consumers. Consumers’ responses might not reflect their responses in real decision situations in which real CPI is used. However, the materials used were realistic and combined several different presentation formats as also used on websites. Second, we focused on ‘correct’ responses of consumers without taking into account the differential importance that consumers might attach to different quality indicators. While this latter approach would probably reveal more about the extent to which consumers’ choices can be considered ‘informed’, the approach we used was an efficient way to test consumers’ comprehension of information. Finally, the response rate (43%) was not very high. As is common with this type of internet panel research, it might well be that those who were not interested in quality of care hastily decided not to enrol in our study. This might be problematic because one could argue that these consumers might benefit most from viewing CPI. However, PAM levels in our study population were similar to those of the general Dutch population, so it seems that we did not capture only the responses of highly motivated consumers.
That consumers do not adequately comprehend and use CPI is a tremendous missed opportunity. It means that informed consumer choices are not shaping provider behaviour in the intended direction, and that many consumers are choosing less optimal providers. Furthermore, the substantial investment in effort and resources that goes into the production of CPI is not being fully used. The evidence reported in this study can help guide more effective designs of CPI.
Eliane Poort designed the experimental materials. Dave Harmsen helped in designing the experimental materials and in the data collection. He also assisted in data cleaning and statistical analyses. Uriell Malanda advised on the study design and data collection.
Contributors OCD designed the study, developed the questionnaire and collected the data. She also performed the statistical analyses, interpreted the data and drafted the manuscript. ADJ contributed to the conception and design of the study, coordinated the design of the experimental materials and data collection and helped draft the manuscript. JH contributed to the conception and design of the study and to the draft versions of the manuscript. DRMT contributed to the conception and design of the study and to the draft versions of the manuscript.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.