- Sensitivity (true-positive rate) = True positives / (True positives + False negatives)
- Specificity (true-negative rate) = True negatives / (True negatives + False positives)
- Positive predictive
A Hierarchical Outcomes Approach to Test Assessment☆,☆☆,★
INTRODUCTION
Many authors and readers focus on sensitivity and specificity to assess the merits of a diagnostic test. Although these familiar parameters are important to consider, they ignore other aspects of a test’s value. In this review a classification system for diagnostic technology assessments, reported by Fryback and Thornbury1 in 1991, is used as the basis for a discussion of other ways of evaluating diagnostic tests, their efficacy, and their ultimate value in emergency medicine.
The model used
TECHNICAL EFFICACY
Technical efficacy refers to a test’s ability to produce usable information.1, 2 This includes factors affecting test interpretation and implementation. The concept can be elaborated to include consideration of several related issues.2, 3, 4, 5
A recently developed whole blood rapid troponin T (TT) assay, the cardiospecific troponin T immunoassay (cTnT), serves as an example where technical efficacy is relevant to diagnostic testing in emergency medicine.6, 7, 8 The test is performed at the
DIAGNOSTIC ACCURACY EFFICACY
This category includes measurements of a test’s ability to detect or exclude disease compared with a criterion standard. In addition to sensitivity, specificity and predictive values (Figure 2), likelihood ratios and receiver operating characteristic (ROC) curves18 are other important measurements to consider.
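The accuracy measures named above can be illustrated with a minimal Python sketch. It assumes a 2×2 table of test results against a criterion standard; the class name `TwoByTwo`, the helper `post_test_probability`, and the counts are hypothetical and are not taken from the article. The likelihood ratio formulas are the standard definitions, LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing a test against a criterion standard.

    Hypothetical helper for illustration; zero denominators are not handled.
    """
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    def sensitivity(self) -> float:
        # True-positive rate: TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # True-negative rate: TN / (TN + FP)
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Positive predictive value: TP / (TP + FP)
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Negative predictive value: TN / (TN + FN)
        return self.tn / (self.tn + self.fn)

    def lr_positive(self) -> float:
        # Positive likelihood ratio: sensitivity / (1 - specificity)
        return self.sensitivity() / (1 - self.specificity())

    def lr_negative(self) -> float:
        # Negative likelihood ratio: (1 - sensitivity) / specificity
        return (1 - self.sensitivity()) / self.specificity()


def post_test_probability(pretest: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest / (1 - pretest)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
table = TwoByTwo(tp=90, fp=20, fn=10, tn=180)
print(f"sensitivity = {table.sensitivity():.3f}")
print(f"specificity = {table.specificity():.3f}")
print(f"PPV = {table.ppv():.3f}, NPV = {table.npv():.3f}")
print(f"LR+ = {table.lr_positive():.2f}, LR- = {table.lr_negative():.3f}")
print(f"post-test probability = {post_test_probability(0.20, table.lr_positive()):.3f}")
```

Unlike the predictive values, the likelihood ratios do not depend on disease prevalence in the study sample, which is why the sketch applies LR+ to a separately chosen pretest probability.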
DIAGNOSTIC THINKING EFFICACY
This level of analysis is concerned with assessment of the effect of test information on diagnostic reasoning and disease categorization.1 Studies of diagnostic thinking serve as a proxy for estimating the effect of a test on patient care.
Diagnostic thinking assessments are based on measuring clinicians’ subjective impression of disease status. Most designs require clinicians to prospectively record their clinical diagnosis or differential diagnosis, to report their subjective pretest
THERAPEUTIC EFFICACY
Investigations at this important level seek to determine the effect of testing on patient management. Unfortunately, these are rarely performed before a test diffuses into general practice.2, 22 After widespread diffusion of the test, it is difficult to conduct a randomized, controlled trial of the test. Therefore investigators often resort to prospective case-series designs that assess management plans before and after testing, as demonstrated by the following example.
Weissman et al29
CLINICAL OUTCOME EFFICACY
The value of a therapeutic intervention lies in its ability to improve outcome or to provide an equivalent outcome at reduced cost. The same should be true of a diagnostic test. However, in most situations clinical outcome is temporally remote from testing, and the relationship between the two is not always apparent.30, 31 Consequently, there is a paucity of outcome-level studies of diagnostic tests.30 Decision analysis provides an alternative for estimating the effects of diagnostic testing
SOCIETAL EFFICACY
Studies on the societal value of diagnostic tests conducted in the ED have not been reported. In this category the emphasis would shift from measuring the benefit accrued to patients to measuring the benefit provided to society as a whole.1, 39
The methods for conducting this type of study are complex. All direct and indirect costs and benefits associated with the diagnostic process must be included. An assessment of societal benefit includes considerations such as resource utilization, worker productivity,
DISCUSSION
There is a growing awareness of the need for more appropriate utilization of diagnostic tests.3, 12, 43 Although the sensitivity and specificity of a test are frequently known, there are few hard data demonstrating the clinical efficacy of many tests. This is particularly true for screening tests in the ambulatory setting and “routine” admission tests.3
In response to this awareness, several disciplines have emerged, including medical technology assessment. Medical technology assessment refers
References (49)
- et al.
Rapid bedside whole blood cardiospecific troponin T immunoassay for the diagnosis of acute myocardial infarction
Am J Cardiol
(1995)
- et al.
Evaluation of the clinical impact of endoscopic ultrasonography in gastrointestinal disease
Gastrointest Endosc
(1996)
- et al.
The efficacy of diagnostic imaging
Med Decis Making
(1991)
- et al.
Disease, level of impact, and quality of research methods: Three dimensions of clinical efficacy assessment applied to magnetic resonance imaging
Invest Radiol
(1992)
- et al.
Conceptual framework for evaluating laboratory tests: Case-finding in ambulatory patients
Clin Chem
(1994)
memorandum for the evaluation of diagnostic measures
J Clin Chem Clin Biochem
(1990)
- et al.
Guidelines for the assessment of new diagnostic tests
Invest Radiol
(1995)
- et al.
Development and characterization of a rapid assay for bedside determinations of cardiac troponin T
Circulation
(1995)
- et al.
Multicentre evaluation of an immunological rapid test for the detection of troponin T in whole blood samples
Eur J Clin Chem Clin Biochem
(1996)
- et al.
Laboratory Medicine: The Selection and Interpretation of Clinical Laboratory Studies
Troponin T: A diagnostic marker for myocardial infarction and minor cardiac cell damage
Eur Heart J
Cardiac troponin T levels for risk stratification in acute myocardial ischemia
N Engl J Med
Use of methodological standards in diagnostic test research–Getting better but still not good
JAMA
Clinical Diagnosis and Management by Laboratory Methods
Learning how to differ: Agreement and reliability statistics in psychiatry
Can J Psychiatry
The Logic of Laboratory Medicine
Statistical methodology: II. Reliability and validity assessment in study design, part A
Acad Emerg Med
Statistical methods for assessing agreement between two methods of clinical measurement
Lancet
Medical Decision Making, Newton
Early diagnostic efficiency of cardiac troponin I and troponin T for acute myocardial infarction
Acad Emerg Med
The clearing “haze”: A view from my window
Med Decis Making
Ruling out acute myocardial infarction—A prospective multicenter validation of a 12-hour strategy for patients at low risk
N Engl J Med
Technology assessment: Scientific challenges
AJR Am J Roentgenol
A method of comparing the areas under receiver operating characteristic curves derived from the same cases
Radiology
☆ Address for reprints: William Pearl, MD, Department of Surgery, Division of Emergency Medicine, Emory University School of Medicine, 69 Butler Street SE, Atlanta, GA 30303; 404-616-4620, fax 404-659-6012.
☆☆ 0196-0644/99/$8.00 + 0
★ 47/1/94610