Original article
The meaning of kappa: Probabilistic concepts of reliability and validity revisited

https://doi.org/10.1016/0895-4356(96)00011-X

Abstract

A framework—the “agreement concept”—is developed to study the use of Cohen's kappa as well as alternative measures of chance-corrected agreement in a unified manner. Focusing on intrarater consistency, it is demonstrated that for 2 × 2 tables an adequate choice between different measures of chance-corrected agreement can be made only if the characteristics of the observational setting are taken into account. In particular, a naive use of Cohen's kappa may lead to strikingly overoptimistic estimates of chance-corrected agreement. Such bias can be overcome by more elaborate study designs that allow for an unrestricted estimation of the probabilities at issue. When Cohen's kappa is appropriately applied as a measure of chance-corrected agreement, its values prove to be a linear—and not a parabolic—function of true prevalence. It is further shown how the validity of ratings is influenced by lack of consistency. Depending on the design of a validity study, this may lead, on purely formal grounds, to prevalence-dependent estimates of sensitivity and specificity. Proposed formulas for “chance-corrected” validity indexes fail to adjust for this phenomenon.
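Since the abstract centers on Cohen's kappa as a measure of chance-corrected agreement for 2 × 2 tables, a minimal sketch of the standard kappa computation may help fix ideas. The helper function and the example counts below are illustrative assumptions, not taken from the paper: kappa is computed as (p_o − p_e)/(1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from the marginal rating frequencies.

```python
# Minimal sketch: Cohen's kappa for a 2x2 agreement table.
# Function name and example counts are illustrative, not from the paper.

def cohens_kappa(table):
    """Cohen's kappa from a 2x2 table of joint rating counts.

    table[i][j] = number of subjects rated category i by rater A
    and category j by rater B (or in replicate 1 vs. replicate 2
    for intrarater consistency).
    """
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of subjects on the main diagonal.
    p_o = sum(table[i][i] for i in range(2)) / n
    # Chance-expected agreement from the marginal rating frequencies.
    row_marg = [sum(row) / n for row in table]
    col_marg = [sum(table[i][j] for i in range(2)) / n for j in range(2)]
    p_e = sum(row_marg[k] * col_marg[k] for k in range(2))
    return (p_o - p_e) / (1 - p_e)

# Illustrative table: 40 pos/pos, 5 pos/neg, 10 neg/pos, 45 neg/neg.
example = [[40, 5], [10, 45]]
print(round(cohens_kappa(example), 3))  # 0.7
```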


    This work was supported in part by BMFT Grant 07 PHF 01.
