Brief report
Measuring depression outcome with a brief self-report instrument: sensitivity to change of the Patient Health Questionnaire (PHQ-9)
Introduction
Many depression screening instruments have been developed during the past 40 years. However, the “Patient Health Questionnaire (PHQ)” (Spitzer et al., 1999) was the first self-report questionnaire designed for use in primary care that actually diagnoses specific disorders using criteria from the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) (American Psychiatric Association, 2000). Its nine-item depression module, the PHQ-9, is increasingly being used as a brief diagnostic and severity measure in research and clinical practice (Kroenke et al., 2001). The superior criterion validity of the PHQ-9 compared to two other established depression screening questionnaires has recently been demonstrated with respect to the diagnosis of ‘major depressive disorder’ made with a standard interview for assessing psychiatric disorders (Löwe et al., in press, a). Similarly, superior operating characteristics were found for the panic module of the PHQ (Löwe et al., in press, b).
Apart from criterion validity, the most important characteristic of an outcome measure is its ability to appropriately reflect change over time, for instance, in response to a specific treatment. However, the sensitivity to change of the PHQ-9 has not yet been evaluated. If it does prove sensitive to change, the PHQ-9 would be a practical option for assessing depression outcomes in research and clinical practice.
To examine the sensitivity to change of the PHQ-9, we prospectively followed cohorts of medical outpatients with major depressive disorder, other depressive disorders, or no depressive disorder. The PHQ-9 and a standard diagnostic interview were completed at baseline and follow-up. Using the diagnostic interview as the external criterion standard, the patients were assigned to subgroups of improved, unchanged, or deteriorated depression status. With regard to sensitivity to change, we hypothesised that PHQ-9 change scores would differ significantly between these three patient subgroups.
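The subgroup assignment and change-score comparison described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The severity ranking of the three diagnostic categories, the field names, and the four patient records are all assumptions invented for illustration; none of the numbers are study data. The change score is taken here as follow-up minus baseline, so a negative value means a lower PHQ-9 score at follow-up.

```python
from statistics import mean

# Assumed ordinal ranking of diagnostic status (illustrative, not from the paper):
# no depressive disorder < other depressive disorders < major depressive disorder.
SEVERITY = {"none": 0, "other": 1, "major": 2}


def change_status(baseline, follow_up):
    """Classify the change in diagnostic status between two assessments."""
    diff = SEVERITY[follow_up] - SEVERITY[baseline]
    if diff < 0:
        return "improved"
    if diff > 0:
        return "deteriorated"
    return "unchanged"


def mean_change_by_group(patients):
    """Return the mean PHQ-9 change score (follow-up minus baseline) per subgroup."""
    groups = {}
    for p in patients:
        status = change_status(p["dx_baseline"], p["dx_follow_up"])
        change = p["phq9_follow_up"] - p["phq9_baseline"]
        groups.setdefault(status, []).append(change)
    return {status: mean(changes) for status, changes in groups.items()}


# Hypothetical records, invented for illustration only; not study data.
patients = [
    {"dx_baseline": "major", "dx_follow_up": "none", "phq9_baseline": 18, "phq9_follow_up": 6},
    {"dx_baseline": "major", "dx_follow_up": "other", "phq9_baseline": 16, "phq9_follow_up": 10},
    {"dx_baseline": "other", "dx_follow_up": "other", "phq9_baseline": 9, "phq9_follow_up": 8},
    {"dx_baseline": "none", "dx_follow_up": "major", "phq9_baseline": 4, "phq9_follow_up": 15},
]

result = mean_change_by_group(patients)
print(result)
```

If the instrument is sensitive to change, the mean change scores of the three subgroups should separate clearly, with the improved group showing the largest decrease. A formal test of the group difference would be a further step that this sketch does not cover.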
Subjects
From the baseline sample of the German PHQ validation study (Löwe et al., in press, a), the following patients were contacted for the follow-up assessment after a mean of 12.3±3.0 months: (i) any patient with ‘major depressive disorder’ diagnosed using the
Results
At follow-up, ‘major depressive disorder’, according to SCID, was present in 40 patients (24.0%), ‘other depressive disorders’ in 43 patients (25.7%), and ‘no depressive disorder’ in 84 patients (50.3%). As a result, a total of 52 patients (31.1%) improved in SCID depression diagnostic status from their baseline to their follow-up assessment, 91 patients (54.5%) remained stable, and 24 patients (14.4%) deteriorated. The corresponding PHQ-9 scores and sensitivity to change indices of the patient
Discussion
This study was performed with the primary objective of evaluating the sensitivity to change of a brief depression diagnostic and severity instrument. We found that changes in the PHQ-9 score correspond with changes in depression diagnostic status over time, providing preliminary evidence that the PHQ-9 can be used for longitudinal as well as for cross-sectional studies. The patient groups were well matched and had excellent follow-up rates.
Our study has several limitations. First, the PHQ-9 and
Acknowledgments
This study was supported by unrestricted research grants from Pfizer, Germany and from the medical faculty of the University of Heidelberg, Germany (project 121/2000). There are no conflicts of interest.
We would like to thank our patients for their participation at the baseline and the 1-year follow-up assessment. Moreover, we would also like to express our thanks to Dipl.-Psych. Beate Wild and to Dipl.-Psych. Dieter Schellberg for their helpful suggestions regarding data analyses and to cand.
References
- et al. Measuring change over time: assessing the usefulness of evaluative instruments. J. Chronic. Dis. (1987)
- et al. Telephone assessment of depression severity. J. Psychiatr. Res. (1993)
- et al. Agreement between face-to-face and telephone-administered versions of the depression section of the NIMH Diagnostic Interview Schedule. J. Psychiatr. Res. (1988)
- Diagnostic and Statistical Manual of Mental Disorders DSM-IV-TR (2000)
- et al. Measuring depression in the community: a comparison of telephone and personal interviews. Public Opin. Q. (1982)
- Statistical Power Analysis for the Behavioral Sciences (1988)
- et al. Reproducibility and responsiveness of health status measures. Statistics and strategies for evaluation. Control Clin. Trials (1991)
- et al. Structured Clinical Interview for DSM-IV (SCID) (1995)
- A rating scale for depression. J. Neurol. Neurosurg. Psychiatry (1960)