Introduction

Health-related quality of life (HRQoL), or psychological, social, and physical functioning [1], has become an important outcome measure in medical care. Standardized assessment of HRQoL preceding each consultation may provide physicians with valuable information. Several studies have shown that physicians vary in their ability to elicit psychosocial information or that they underestimate patients’ HRQoL [2–5]. Furthermore, various studies have shown that when communication with the physician encompasses both physical and psychosocial issues, patients have better treatment compliance, are more satisfied with the consultation, and report fewer symptoms [6–8].

Nevertheless, relatively few studies have assessed the value of HRQoL measurement in clinical practice. Some have shown positive results with regard to acceptance by patients and physicians or a significant increase in the identification and/or discussion of HRQoL issues [9–14]. Less consistent and favorable results have been obtained with regard to the effectiveness of standardized HRQoL measurement in actually improving HRQoL or psychosocial outcomes. Even though decreased depression [15], improved overall and emotional functioning [10], improved mental health [16], and a decrease in disease-specific debilitating symptoms of patients undergoing chemotherapy [13] have been associated with HRQoL measurement in clinical practice, several other studies found no significant improvement in HRQoL or psychosocial outcomes [9, 17–20]. A possible explanation might be that the majority of existing studies assessing the effectiveness of HRQoL measurement in clinical practice with regard to patients’ psychosocial functioning or HRQoL have included oncological patients or patients from general practice. Oncological patients can be considered a special group due to the life-threatening nature of the disease. Patients from general practice, on the other hand, may be too diverse and often present with generally minor complaints, which may hamper the discovery of beneficial effects. Both groups impede generalization of results to other chronic patient populations.

Two important studies [9, 10] used designs in which physicians were part of both the control and the experimental group, either by using a crossover design (physicians were first assigned to one group, then crossed over to the other group halfway through the study) [9] or by assigning patients rather than physicians to the different groups [10]. This may have introduced bias, because physicians who had seen HRQoL data could carry a focus on HRQoL over to their consultations with control patients. Two systematic reviews have stressed the need for further research evaluating the effectiveness of repeated measurements of HRQoL in clinical practice [18, 20] and the need for further research to help health care professionals identify patients who would benefit most from such interventions [20].

The study reported here differs from previous studies by including a patient population with chronic liver disease (CLD) in order to study the effects of HRQoL use in clinical practice in a population that is more representative of other patients with a chronic disease. CLD is one of the most prevalent diseases in the world. The most common causes of CLD, hepatitis B virus (HBV) and hepatitis C virus (HCV), have been estimated to affect 360 million and 200 million people worldwide, respectively (http://www.epidemic.org, 4-12-2006). Alcohol is another main cause of end-stage liver disease worldwide and the second most common reason for liver transplantation in the United States [21]. CLD is a serious disease that is associated with significant physical and psychological symptoms such as impaired cognition, hepatic coma, fluid in the abdomen, abdominal pain, joint pain, fatigue, depression, and anxiety [22–28]. Not surprisingly, HRQoL in patients with CLD has been shown to be impaired [29, 30]. CLD is an appropriate example of a typical chronic disease, with patients experiencing substantial comorbidity and possibly mortality, as is the case in other chronic diseases such as kidney disease and chronic obstructive pulmonary disease.

Our study also differs from previous studies by assessing the benefits of HRQoL measurement for patients with different demographic characteristics (e.g., men and women, young and old), which is essential for determining which patients are most likely to benefit from HRQoL measurement in clinical practice, a point recently reiterated in a systematic review on this topic [20]. In addition, in our study, physicians rather than patients were assigned to the control or the experimental group. Assigning each physician to only one group prevents the bias that would arise if physicians who had become focused on discussing HRQoL also saw patients in the control group.

The aims of the study were twofold. The first was to assess, by means of a randomized trial with repeated measurements, the effectiveness of real-time computerized measurement of HRQoL in patients with CLD, combined with presentation of the results to physicians before the consultation, in terms of improvement in patient HRQoL, patient management, and patient satisfaction with the consultation. The second aim was to assess hepatologists’ experiences with the availability of real-time HRQoL patient data and to measure the possible effect(s) it had on their consultations.

Patients and methods

Patient recruitment

This study was performed at the Department of Gastroenterology and Hepatology of the Erasmus Medical Centre, Rotterdam, where regular HRQoL measurement was implemented for 1 year. All patients older than 17 years of age with CLD visiting the department between September 2004 and January 2005 were invited to participate. Written information about the study was sent to the patients 3 days before their consultation at the outpatient department. Patients interested in participating informed their physician, who subsequently directed them to the researcher for further explanation of the study and to sign the informed consent form. For this effectiveness study, we included all patients with two or more measurement moments. All physicians working at the Department of Hepatology participated. The protocol was in accordance with the ethical guidelines of the modified 1975 Declaration of Helsinki and approved by the Medical Ethics Committee of the Erasmus MC.

Study objectives

The primary aim of this study was to assess the effectiveness of computerized measurement of HRQoL in clinical practice. The primary outcome measures were patients’ generic HRQoL (physical and mental component score separately) and disease-specific HRQoL. Secondary outcome measures were patient satisfaction with the consultation and patient management. The secondary aim of this study was to assess hepatologists’ experiences with the availability of real-time HRQoL patient data.

Study design and intervention

Physicians

Physicians were randomly assigned to either the experimental or control group by means of a restricted randomization procedure called blocking. To keep the two groups as equal in size as possible, it was decided to include six physicians in the experimental group and five in the control group. We used a random sequence table to assign physicians to one of the conditions. Due to the nature of the intervention, it was impossible to blind physicians to group assignment.

Physicians in the experimental group were able to obtain an instant computerized graphical output of HRQoL patient data, which also included data from previous measurement moments so that changes in patients’ HRQoL could be monitored (Fig. 1). Prior to the study, physicians received instructions from a psychologist with expertise in the field of HRQoL measurement on how to interpret this output. First, physicians were shown the questionnaires in order to familiarize them with the content. Second, they were informed that the red line in the graph was the average score of patients with CLD on the Short Form-12 (SF-12) measuring generic HRQoL and that scores under this line were to be considered low. They were also told that the average score of healthy people on this questionnaire was 50. The physicians were instructed to interpret the disease-specific Liver Disease Symptom Index 2.0 (LDSI 2.0) at item level, with scores ranging from 1 (not at all) to 5 (to a large extent). The physicians were asked to use the HRQoL data in all consultations for 1 year. No recommendations for specific responses were given. Instead, they were instructed to use their clinical experience to choose an appropriate treatment. After seeing a participating patient, physicians in both groups completed a checklist about the content of the consultation. Physicians in the control group conducted their consultations as usual.

Fig. 1

Example of the graphical output of patients’ health-related quality of life as presented to physicians in the intervention group. A score of 50 is the average score of a healthy norm population. The dashed line represents the mean score for patients with chronic liver disease

Patients

Through the random assignment of physicians, patients were indirectly allocated to either group. Patients were initially blinded to the group assignment. For 1 year, before each consultation at the outpatient Department of Hepatology, all patients participating in the study completed a computerized generic and disease-specific HRQoL questionnaire and the first part of a pen-and-paper questionnaire on patient satisfaction with the consultation. They completed the second part of the satisfaction questionnaire after the consultation. More specific information on the content of the questionnaires is provided in “Study measures”. To ensure good questionnaire completion, a researcher was always available to answer questions about the computer and/or questionnaires at the patient’s request.

Study measures

HRQoL

Disease-specific HRQoL: This was assessed by means of the LDSI 2.0, which measures severity and hindrance of nine symptoms: itch, joint pain, pain in the right upper abdomen, decreased appetite, jaundice, fatigue, depressed mood, worries about family situation, and fear of complications [24]. Because of time constraints, only items measuring symptom severity were included in this study (n = 9). The physicians were instructed to interpret the questionnaire at item level, with scores ranging from 1 (not at all) to 5 (to a large extent). For data analysis, a total score, ranging from 9 to 45, was computed by summing the scores of each item. The reliability of the LDSI 2.0 is good (internal consistency α > 0.79), as is the construct validity [30].
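The total score described above can be illustrated with a minimal sketch. The function name and the input validation are illustrative assumptions and are not part of the LDSI 2.0 itself; only the nine items, the 1 to 5 item range, and the summation come from the text.

```python
def ldsi_severity_total(item_scores):
    """Sum the nine LDSI 2.0 severity items into a total score (9-45).

    Hypothetical helper: the name and the checks are illustrative only.
    """
    if len(item_scores) != 9:
        raise ValueError("expected exactly 9 severity items")
    if any(not 1 <= score <= 5 for score in item_scores):
        raise ValueError("each item must be scored from 1 to 5")
    # Higher totals indicate more severe symptoms, i.e., worse HRQoL.
    return sum(item_scores)
```

A patient answering “not at all” on every item would obtain the minimum of 9, and a patient answering “to a large extent” on every item the maximum of 45.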

Generic HRQoL: This was assessed by means of the Short Form-12 version 1 (SF-12). The SF-12 produces a Physical Component Summary (PCS) and Mental Component Summary (MCS), representing physical and emotional functioning, respectively. The mean score of the PCS and MCS in the general population is 50 [standard deviation (SD) 10], with higher scores representing better HRQoL. Mean scores and SDs of the PCS and MCS of CLD patients were calculated from a large database (n = 1,175) [29, 31] (PCS: mean 43.2, SD 10.7; MCS: mean 44.4, SD 12.8). These means were used as a reference point (red line) in the graphical representation for physicians so they could easily identify patients scoring below average within the CLD group. The SF-12 has been shown to be reliable between test and retest (MCS r = 0.76, PCS r = 0.89), and median relative validity estimates of 0.67 and 0.97 for the PCS and MCS, respectively, have been found [32].
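How the CLD reference means serve as a cutoff can be sketched as follows. This is only an illustration of the comparison, not the software used in the study; the function and variable names are assumptions, whereas the reference values are those reported above.

```python
# CLD reference means reported in the text (n = 1,175).
CLD_REFERENCE_MEANS = {"PCS": 43.2, "MCS": 44.4}

def below_cld_reference(scores):
    """Flag each SF-12 summary score that falls below the CLD reference mean.

    `scores` is a dict with keys "PCS" and "MCS" (hypothetical input format).
    """
    return {
        component: scores[component] < reference
        for component, reference in CLD_REFERENCE_MEANS.items()
    }
```

For example, a patient with a PCS of 40.0 and an MCS of 50.0 would be flagged on the physical component only.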

Patient satisfaction with the consultation

Patients’ satisfaction with the consultation was measured with the QUOTE-Liver, a newly developed questionnaire consisting of 20 items that assesses the discrepancy between patients’ needs/expectations (importance: measured before the consultation) and the actual care that they receive (performance: measured after the consultation). The internal consistency of the overall QUOTE-Liver was excellent (α = 0.90), as was the face validity: all patients (n = 152) in the validation study and three psychologists and a hepatologist agreed that the items of the QUOTE-Liver adequately reflected the most important aspects of care for CLD patients. Construct validity, as measured by the correlation between a visual analog scale (VAS) measuring overall satisfaction and the total score on the QUOTE-Liver, was good (r = 0.69; P < 0.01). Content validity was also good: none of the 152 patients in the validation study suggested new items to be included (Gutteling et al. 2006, unpublished). A reduced version consisting of the nine items ranked by patients as most important and the two liver-disease-specific items was used in our study. Using a formula applied for all QUOTE-Liver instruments (10 − importance × performance), a total satisfaction score can be computed ranging from 0 to 10, with 0 meaning not satisfied at all and 10 meaning completely satisfied [33].

Patient management

The effect of the intervention on patient management was measured by means of a checklist that physicians completed after each consultation with a study participant, including the question: “Have you changed your treatment in any way?” and a subquestion: “If so, what have you done?” followed by several options: “Prescription of antidepressants,” “Referral to psychosocial care,” “Altering the frequency of consultations,” and “Other.”

Physicians’ experiences

Experiences of physicians with the experimental condition were assessed through the checklists that they completed after each consultation with a study participant, asking the question: “Did you find the HRQoL information useful? Why?” with the answering options: “Yes, it provided new information,” “Yes, it saved time,” “Yes...,” “No, the patient is doing well,” “No, I know this patient well enough,” “No, the patient tells me a lot,” “No...”. Also, a semistructured interview was conducted 6 months into the study and at the end of the study. The interview included questions about the effort to request HRQoL information, the usefulness of the information, whether the availability of HRQoL information increased the duration of the consultation, and whether participating patients addressed HRQoL issues more often than patients who did not participate. Physicians were also asked whether there were certain subgroups of patients whose HRQoL information they found particularly useful. Opinions of physicians in the control group toward possible future availability of HRQoL information during the consultation were assessed by means of the same semistructured interview at 6 months only.

Statistical analysis

Sample size

A nonclustered power analysis based on a medium effect size (Cohen’s d = 0.50) with a 5% significance level and 80% power indicated that at least 64 patients were needed in each group to detect a statistically significant difference.
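The figure of 64 patients per group can be approximated with a short calculation. The sketch below assumes a two-sided, two-sample comparison of means and uses the normal approximation with a small-sample correction term (z²/4); the text does not state which software or formula was used, so this is one plausible reconstruction rather than the original computation.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-sample t test.

    Normal approximation plus a z^2/4 correction for using the t distribution.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    n_normal = 2 * (z_alpha + z_beta) ** 2 / d ** 2
    return math.ceil(n_normal + z_alpha ** 2 / 4)

print(n_per_group(0.50))
```

With d = 0.50 the uncorrected approximation gives about 62.8 and the corrected value about 63.8, which rounds up to 64 per group. Because physicians rather than patients were randomized, a clustered design would in principle require a larger sample; the calculation here, like the one in the text, ignores clustering.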

Data selection

Some patients had consultations with physicians from both the control group and the experimental group during the year of the study. For those who had been in one condition more often than in the other (n = 33), only data from the condition in which they had been seen most often were included. For patients who had been in both conditions equally often (n = 19), all data were excluded. The first measurement moment of all patients (T1) was considered a baseline measure, as no HRQoL data had yet been presented to the physicians.
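The selection rule for patients seen in both conditions can be expressed as a small sketch. The function name, the condition labels, and the input format (one label per consultation) are illustrative assumptions.

```python
from collections import Counter

def assign_condition(visit_conditions):
    """Return the condition a patient is analyzed in, or None if excluded.

    `visit_conditions` lists the condition ("experimental" or "control")
    of the physician seen at each consultation (hypothetical format).
    """
    counts = Counter(visit_conditions)
    n_experimental = counts.get("experimental", 0)
    n_control = counts.get("control", 0)
    if n_experimental == n_control:
        return None  # equally often in both conditions: exclude all data
    return "experimental" if n_experimental > n_control else "control"
```

Under this rule, only the consultations belonging to the returned condition would be retained for that patient.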

Data analysis

Differences on the variables gender, diagnosis, disease severity, and age between participants and nonparticipants were assessed by means of χ² tests or t tests. The same was done for assessing differences between patients in the control group and the intervention group. Scores of participating patients on measurement moments (T2−Ti) were summarized into one overall score per variable in the study. Univariate analyses of variance were performed in SPSS 11.0. Fixed factors were age, gender, disease severity, presentation of HRQoL data to the clinicians (feedback), and interactions between these variables. Differences in diagnoses between patients in both groups were controlled for by entering one propensity score of the variable diagnosis as a covariate in the analyses. Propensity scores are designed for situations in which study participants cannot be randomly assigned to groups and their characteristics are therefore not balanced among the groups. A propensity score is defined as the conditional probability of assignment to a certain treatment group given a set of observed pretreatment characteristics and is usually estimated by means of a logistic regression analysis [34]. In this way, the background characteristic(s), in this case diagnosis, is reduced to a single score, the propensity score. We calculated the propensity score by entering the different diagnoses (HBV, HCV, cholestatic liver disease, pretransplantation, posttransplantation, autoimmune hepatitis, and other) as dummy variables (M − 1) in a logistic regression analysis. The unstandardized logistic regression weights were then multiplied by the relevant dummy variable and summed, together with the constant. This score was used in the univariate analysis to adjust for baseline confounding.
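The construction of the propensity score from the dummy-coded diagnosis can be sketched as follows. The choice of “other” as the reference category and all numeric weights are hypothetical placeholders, since neither is reported in the text; only the M − 1 dummy coding and the summation of weights and constant follow the description above.

```python
import math

DIAGNOSES = [
    "HBV", "HCV", "cholestatic liver disease", "pretransplantation",
    "posttransplantation", "autoimmune hepatitis", "other",
]
REFERENCE = "other"  # assumption: the reference category is not stated in the text

def dummy_code(diagnosis):
    """Code one diagnosis as M - 1 indicator variables (reference omitted)."""
    if diagnosis not in DIAGNOSES:
        raise ValueError(f"unknown diagnosis: {diagnosis}")
    return {d: int(d == diagnosis) for d in DIAGNOSES if d != REFERENCE}

def linear_predictor(diagnosis, constant, weights):
    """Sum of weight x dummy over all dummies, plus the constant."""
    dummies = dummy_code(diagnosis)
    return constant + sum(weights[d] * value for d, value in dummies.items())

def to_probability(score):
    """Convert the linear predictor into a probability of group membership."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical unstandardized weights, for illustration only.
example_weights = {
    "HBV": 0.5, "HCV": -0.3, "cholestatic liver disease": 0.1,
    "pretransplantation": 0.8, "posttransplantation": -0.6,
    "autoimmune hepatitis": 0.2,
}
score = linear_predictor("HBV", constant=0.2, weights=example_weights)
print(round(score, 2), round(to_probability(score), 3))
```

With a single categorical covariate, every patient with the same diagnosis receives the same score, so entering it as a covariate adjusts for diagnosis while using only one degree of freedom.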

Univariate analyses of variance were performed for each outcome variable (disease-specific HRQoL and generic HRQoL MCS and PCS) separately. A forward technique was used in which the main effects of the fixed factors were assessed in the first block, and the interactions between feedback of HRQoL data and each of the other fixed factors (age, gender, severity of the disease) were explored in the second block. Differences between the two groups on patient management variables and satisfaction with the consultation were assessed by means of Mann–Whitney tests.

Hepatologists’ experiences with the availability of real-time patient HRQoL data were assessed by means of semistructured interviews and checklists. These data were of a descriptive nature and are presented as such.

Results

Characteristics of patients and physicians in the study

Of the 587 patients that agreed to participate in the study, 181 completed the questionnaires more than once. Of these, 19 were included in the experimental and control conditions equally often and were therefore excluded from the analyses. One hundred and sixty-two patients (control group n = 80, experimental group n = 82) were included (Fig. 2). Differences in age, gender, diagnosis, and disease severity between patients in the study and nonrespondents are presented in Table 1. Demographic characteristics of the 162 patients are presented in Table 2. Patients in the control and experimental groups were comparable, except for the variables diagnosis and disease severity (Table 2). In the analyses, these differences between conditions were controlled for. All physicians working at the Department of Hepatology (n = 11, ten men) agreed to participate. Their mean age was 39 (range 27–55) years, and their average working experience was 8.7 (range 0–27) years.

Fig. 2

Patients in the study

Table 1 Differences in age, gender, diagnosis, and disease severity between patients in the study and nonrespondents
Table 2 Characteristics of patients included in the data analysis

Descriptives

The number of times that patients in the control and experimental groups completed the questionnaires varied between two and 11 (Table 3). Mean scores of patients at T1 and T2−Ti on the outcome variables generic HRQoL and disease-specific HRQoL are presented in Table 4.

Table 3 Questionnaire completion rate of patients in the control and experimental groups
Table 4 Patients’ adjusted means and 95% confidence intervals at T1 and T2−Ti

Effects of the experimental condition on patients’ HRQoL and satisfaction with the consultation

Disease-specific HRQoL

There was no main effect for the experimental condition on disease-specific HRQoL. There was a statistically significant interaction effect for the variables age and feedback of HRQoL data on the outcome variable disease-specific HRQoL (Table 5): older patients (>48 years of age, as determined by the median split) in the experimental group had significantly lower total scores on the LDSI 2.0 (meanAdj = 18.1, 95% CI: 15.3–21.0) (F = 4.18; P < 0.05), indicating better disease-specific HRQoL, than other patients, especially older patients in the control group (meanAdj = 22.1, 95% CI: 19.9–24.3). This difference between older patients in the experimental group and the control group on disease-specific HRQoL is equivalent to a Cohen’s d of 0.51, reflecting a medium difference [35].

Table 5 Interaction effects between age, gender, disease severity, and feedback on the outcome variable disease-specific HRQoL, controlled for diagnosis

Generic HRQoL

Mental Component Summary score

No main effect for the experimental condition on mental HRQoL was found. However, a significant interaction effect for the variables age and feedback of HRQoL data was found. Older patients in the experimental group had higher scores on the SF-12 MCS (meanAdj = 45.9, 95% CI: 41.6–50.3) (F = 4.62; P < 0.05), reflecting better HRQoL, than other patients, especially older patients in the control group (meanAdj = 41.3, 95% CI: 37.8–44.7) (Table 6). Furthermore, a significant interaction effect was found for the variables gender and feedback of HRQoL data, with male patients in the experimental group showing higher scores on the SF-12 MCS (meanAdj = 46.7, 95% CI: 42.1–51.2) (F = 6.10; P < 0.05) than other patients, especially male patients in the control group (meanAdj = 41.2, 95% CI: 37.8–44.6) (Table 6).

Table 6 Univariate analysis of variance with the variables age, gender, disease severity, and feedback on the outcome variable mental generic HRQoL, controlled for diagnosis

Physical Component Summary score

No significant main effect or interaction effects were found for the variables feedback of HRQoL data and age, gender, and disease severity on the SF-12 PCS.

Patients’ satisfaction with the consultation

The scores on patient satisfaction did not differ significantly between the experimental and control groups (z = −1.20, P = 0.23). Also, no interaction effects of age, gender, and/or disease severity were found on this outcome variable.

Effects of the experimental condition on the consultation and on patient management

Physicians in the experimental group requested the HRQoL information of their patients in 92% of consultations, and they discussed it with their patients in 58% of consultations. They indicated finding the HRQoL information useful in 45% of consultations, which is generally in accordance with the percentage of patients in the experimental group scoring below average on the MCS (39%) and PCS (42%). They mostly found the HRQoL information not useful when a patient was doing well. Physicians in the experimental group indicated significantly more often than physicians in the control group that they spent more time than usual discussing psychosocial issues (30.7% vs. 6.6% of consultations, z = −6.65; P < 0.001). Treatment policy was altered significantly more often in the experimental group (11% of consultations vs. 1% of consultations in the control group; z = −3.73, P < 0.001). Most commonly, frequency of consultations was increased (n = 5). Other alterations concerned prescription of medication (n = 3), increased attention for physical complaints (n = 4), referral to psychosocial care (n = 1) or occupational health physician (n = 1), and increased attention to explanations/reassurance (n = 2).

Physicians’ experiences with the availability of HRQoL information in clinical practice

Experiences of physicians in the experimental group at 6 months and at the end of the study did not differ. All physicians in the experimental condition found the HRQoL information useful, except for one older physician who claimed to know his patients very well. They indicated being better able to understand some of their patients through the extra information that was provided by the questionnaires. These physicians did not perceive requesting the information as an extra effort on their part. Furthermore, they did not think that using the information lengthened their consultations. All physicians in the experimental group indicated that they wanted to continue using the HRQoL information in the future. Physicians in the control group were similarly positive toward the possible availability of HRQoL information during their consultations in the future, on the condition that it would not be time consuming. Physicians considered the HRQoL information particularly useful for certain subgroups: patients awaiting liver transplantation, patients with hepatitis C, and nonnative speakers (mostly patients with hepatitis B).

Discussion

Computerized, real-time measurement of HRQoL at our busy outpatient Department of Hepatology and presentation of the results to physicians before each consultation did not show a main effect on patients’ overall HRQoL. However, secondary analyses showed that the HRQoL measurements positively affected disease-specific HRQoL and generic mental HRQoL of older patients (>48 years of age) with CLD and also generic mental HRQoL of male CLD patients. The results of our study are among the first to show a beneficial effect of presenting HRQoL data to physicians in clinical practice. Most other studies have failed to show evidence for the actual improvement in HRQoL or psychosocial outcomes [9, 17–20]. Of the studies that did find a beneficial effect, one showed a decrease in disease-specific debilitating symptoms [13], and another showed improved emotional functioning [10], which is in line with findings of our study. It should be noted that due to the cross-sectional data analyses, a causal relationship between intervention and HRQoL could not be demonstrated. Future studies should address this in further detail.

Our study found no differences between patients in the experimental and control groups with regard to satisfaction with the consultation, which is in line with findings from previous studies [9, 36, 37]. The lack of observed differences between the study groups may have been due to high levels of satisfaction, resulting in a ceiling effect.

This study was among the first to show a significant difference in patient management between experimental and control groups, with physicians in the experimental group most commonly reporting an increase in the frequency of consultations. Our findings are in accordance with the findings of a systematic review [20] and underscore the increasingly acknowledged importance of using HRQoL information for the improvement of physician consultations [38]. However, it should be noted that even though the differences in patient management between the control and experimental groups were statistically significant, the absolute numbers were small. Therefore, the results should be interpreted cautiously, and further studies using more elaborate methods of data collection (for instance, monitoring patients’ medical records or administering more detailed checklists) are recommended.

Physicians’ experiences with using HRQoL information during the consultation were generally positive; requesting the information was not considered an extra effort on their part, and they found the information especially useful for certain groups of patients, such as those awaiting liver transplantation, those with hepatitis C, and nonnative speakers. All physicians but one found the information useful for at least some (45%) of their patients. Physicians indicated finding the information least useful when patients were doing well in terms of HRQoL or when they knew the patient well. These generally positive experiences are in accordance with findings from previous studies [9–14], which assessed oncologists’ attitudes toward using HRQoL information in clinical practice. The confirmation of these results in hepatologists suggests that HRQoL information may also be well accepted by physicians treating patients with other chronic conditions. Another result of our study was that when HRQoL information was available, more time was spent discussing psychosocial issues and more treatments were altered. Interview and checklist data were contradictory regarding the duration of consultations when HRQoL information was available. In a previous study in which the duration of consultations was timed, no increase in consultation time was found [14]. Future studies should shed more light on whether the availability of HRQoL information increases the length of consultations in hepatology.

The strength of our study lies in the analyses performed, where benefits for specific groups of liver patients were explored by entering interactions between gender, age, disease severity, and feedback of HRQoL data, rather than solely investigating main effects between the intervention and control groups. Also, this study included patients with CLD rather than patients with cancer or patients from general practice, making it especially relevant to a more general population of patients with a chronic illness.

We are aware of several limitations of this study. First, physicians rather than patients were randomly assigned to either the intervention or control group. Randomization is a complicated issue in these kinds of implementation studies, and both methods are subject to limitations. An important advantage of the randomization of physicians is that the control group was not biased toward mentioning HRQoL topics more often than usual. Future studies using the same design but including more physicians are needed to further explore possible main effects of HRQoL measurement on patients’ overall HRQoL. A second limitation was the high number of nonparticipants. Part of the explanation may lie in the fact that patients were responsible for contacting their physician if they were interested in participating in the study. In addition, the number of non-Dutch-speaking patients visiting the department is relatively large (hepatitis B, for example, is most common among people from North Africa). These patients were also invited to participate but were less likely to respond. The relatively large number of patients who completed the questionnaires only once may be explained by the small window of opportunity to complete the questionnaires before each consultation. In addition, for such implementation endeavors, cooperation of all staff members is essential, and future research should explore this further. A last limitation of this study was that the checklists used to assess consultation content were not very detailed. This was done on purpose, as longer inventories would have compromised physician participation. However, considering the positive outcomes of this study, it is advisable that future studies consider ways to obtain a more detailed view of how the HRQoL information affects consultation content, for example, by recording consultations.

In conclusion, although a main effect of the intervention was not found, this study showed a beneficial effect of implementation of HRQoL measurement in clinical practice on the HRQoL of older and male patients with CLD and on patient management. Nevertheless, the study had several shortcomings, and further studies are needed to substantiate these findings. Physicians’ experiences with the availability of HRQoL information were positive, especially for patients awaiting liver transplantation, patients with hepatitis C, and nonnative speakers. They expressed an interest in continued use of HRQoL information. These results support the continued measurement of HRQoL in clinical hepatology practice, with particular attention to including older patients and male patients, who appeared to benefit most from such a procedure.