Reliability of diagnoses coding with ICD-10
Introduction
The use of classified and coded medical entities for reimbursement, quality management, and health care policy has increased enormously in the last 30 years. The usefulness of these data depends on the same entity being coded identically, regardless of who codes it and when. Thirty years ago the Institute of Medicine (IOM) analyzed the reliability of diagnoses coding from hospital discharge abstracts with the 8th Revision of the International Classification of Diseases (ICD) [1]. An independent re-coding of the principal diagnoses confirmed 65.2% of the original codes. Since then, various studies have raised issues such as whether hospitals systematically use wrong codes to increase reimbursement [2] or whether administrative data include the elements necessary for quality management [3]. Many studies have been published on the validity of coded data [4], [5]. But it is still not clear whether diagnoses coding with ICD is more than a matter of chance.
Some established problems raise concerns about the present reliability of diagnoses coding with ICD:
- The ICD includes ambiguities and inconsistencies [6].
- Coding of abstracts and medical reports is influenced by different conclusions about existing diagnoses [7].
- Refinement of ICD for reimbursement and a high number of rules constitute a complex coding system, which is quite difficult to understand, even for coding experts.
Coding of medical entities with classifications is a hot topic in Germany. The codes are used for reimbursement and for the system design of the German Diagnosis Related Groups (G-DRGs), which became mandatory for hospitals in 2004. Obligatory public quality reports from hospitals include performance statistics based on these codes. The first of these reports were published in 2005, covering 2004. A system for risk compensation is under development, and health insurance companies will establish morbidity scores derived from coded data.
We conducted an investigation on the reliability of diagnoses coding from discharge letters with the German modification of the ICD-10 for health care financing (ICD-10-GM) [8]. The ICD-10-GM succeeds a merger of an earlier German adaptation of WHO's 10th revision with the ICD-10 Australian Modification (ICD-10-AM) Version 1. Because the Australian Refined DRGs (AR-DRGs) were adopted in 2003, compatibility with the ICD-10-AM was required. ICD-10-GM is revised each year according to requirements from the G-DRGs. Procedures are coded with a national classification, abbreviated as OPS, which is based on WHO's International Classification of Procedures in Medicine (ICPM) and was likewise adapted to the Australian DRGs. The ICD-10-GM 2004 included 12,983 terminal codes.
We aimed to calculate the reliability of diagnoses coding. Reliability measures the agreement of different persons coding the same case (inter-rater reliability) or the agreement of one person coding the same case at different times (intra-rater reliability). Reliability is different from validity, which measures the agreement with a gold standard. On the one hand, it is possible to have high reliability but weak validity, if all raters agree in their wrong decisions. On the other hand, low reliability can have two explanations. It can be the consequence of insufficient education and training, and of inadequate standardization of the coders and the coding scenario. But it can also indicate the weaknesses, mentioned above, in the classification used for coding. In the latter case, low reliability indicates poor quality of a coding system and should lead to a major revision!
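The distinction between reliability and validity can be made concrete with a small sketch. The Python code below computes raw agreement between two coders, a chance-corrected agreement (Cohen's kappa, used here as one common choice; that it matches the statistic of this study is an assumption), and the agreement of each coder with a gold standard. The function names, the restriction to one code per case, and all code lists are hypothetical and purely illustrative; they are not data from the study.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of cases in which two code lists assign the identical code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("code lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Agreement expected by chance if both raters assigned codes
    # independently with their own observed code frequencies.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used one and the same code throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical principal diagnoses for five discharge letters.
rater_a = ["I10", "E66.0", "N18.0", "I10", "E11.9"]
rater_b = ["I10", "E66.9", "N18.0", "I10", "E11.9"]
# Hypothetical gold standard for the same five letters.
gold = ["I11.9", "E66.0", "N18.0", "I11.9", "E11.9"]

reliability = percent_agreement(rater_a, rater_b)  # inter-rater agreement
kappa = cohens_kappa(rater_a, rater_b)             # chance-corrected
validity_a = percent_agreement(rater_a, gold)      # rater A vs. gold standard
validity_b = percent_agreement(rater_b, gold)      # rater B vs. gold standard

print(f"agreement={reliability:.2f} kappa={kappa:.2f} "
      f"validity A={validity_a:.2f} B={validity_b:.2f}")
```

In this made-up example the two coders agree with each other more often than either agrees with the gold standard, because both assign the same code in cases where the gold standard differs. This is the "high reliability but weak validity" situation described above. A real analysis would also have to handle several codes per case, which this sketch does not attempt.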
The investigation was split into three studies: medical students coding diagnoses lists from discharge letters, physicians working in medical management in hospitals coding from discharge letters, and specialists in medical documentation also coding from discharge letters. Results from the first study with medical students were published previously [9]. The objectives of our study were to learn about the ICD-10, to find arguments for the discussion of who should code, and to obtain information on the quality of data coded in routine care.
Materials and methods
Discharge letters were used as the basis for coding. The letters originated from a department of internal medicine of a medium-sized municipal hospital and had been written by one physician in the early 1990s. They cover a full range of medical problems with special emphasis on nephrology. Personal data had been deleted, including any datestamps concerning infrequent events, rare diseases, or pathognomonic information. The length of the letters ranged from 1 to 4 pages (cf. Fig. 1). Participants were
Results
Table 2 gives an overview of the study groups. One hundred and eighteen student forms from 15 discharge letters include 516 codes with a mean of 4.4 codes per form. The most frequent code was I10 “essential (primary) hypertension” (38 forms). One hundred and eighteen different codes were used. One hundred and thirty-five manager forms include 751 codes with a mean of 5.6 codes per form. The most frequent code was E66.0 “Obesity due to excess calories” (23 forms). Three hundred and twelve
Discussion
The coding of diagnoses with ICD-10-GM is of great importance for hospitals in Germany today. Their revenue depends mainly on the coding of the diagnoses and procedures that define the DRGs. Appropriateness of care is systematically monitored through timely communication with health insurance companies using the same codes. In questionable cases an assessment of the correct coding, the appropriateness of admissions and the appropriateness of medical decisions is carried out analyzing the
Conclusions
We argue that the stated fair reliability is caused by the extensive refinement of the ICD-10 in Germany, accompanied by the introduction of complex and numerous coding rules. It is obvious to all coding experts that it is impossible to obtain reliable data on such a basis. It is surprising that re-coding studies such as the one presented by Dixon et al. [15] did not recognize the role of the classification itself, even though they conclude a “low level of agreement between coders over main diagnosis and
Acknowledgements
We are very grateful to all the voluntary participants in our study, who received no additional fee for coding.
References (17)
- Questions on validity of International Classification of Diseases-coded diagnoses. Int. J. Med. Inform. (1999)
- Institute of Medicine, Reliability of hospital discharge abstracts, Report of a study, National Academy of Sciences,...
- et al. Accuracy of diagnostic coding for Medicare patients under the prospective-payment system. N. Engl. J. Med. (1988)
- Assessing quality using administrative data. Ann. Intern. Med. (1997)
- et al. Assessing data quality: from concordance, through correctness and completeness, to valid manipulatable representations. J. Am. Med. Inform. Assoc. (2000)
- et al. Accuracy of data in computer-based patient records. J. Am. Med. Inform. Assoc. (1997)
- et al. Exploring the boundaries of plausibility: empirical study of a key problem in the design of computer-based clinical simulations
- Deutsches Institut für Medizinische Dokumentation und Information (Hrsg.) ICD-10-GM Systematisches Verzeichnis. Version...