Testing process errors and their harms and consequences reported from family medicine practices: a study of the American Academy of Family Physicians National Research Network
J Hickner (1), D G Graham (2), N C Elder (3), E Brandt (2), C B Emsermann (4), S Dovey (5), R Phillips (6)

  1. Department of Family Medicine, The University of Chicago Pritzker School of Medicine, Chicago, Illinois, USA
  2. American Academy of Family Physicians National Research Network, Leawood, Kansas, USA
  3. University of Cincinnati Department of Family Medicine, Cincinnati, Ohio, USA
  4. Department of Family Medicine, The University of Colorado Health Sciences Center, Aurora, Colorado, USA
  5. Royal New Zealand College of General Practitioners Research Unit, Dunedin, New Zealand
  6. The Robert Graham Center: Policy Studies in Family Medicine and Primary Care, Washington, DC, USA

Correspondence to: J Hickner, Department of Family Medicine, The University of Chicago Pritzker School of Medicine, 5841 S. Maryland Ave, MC 7110, Suite M-156, Chicago, IL 60637-1470, USA; jhickner{at}


Context: Little is known about the types and outcomes of testing process errors that occur in primary care.

Objective: To describe types, predictors and outcomes of testing errors reported by family physicians and office staff.

Design: Events were reported anonymously. Each office completed a survey describing their testing processes prior to event reporting.

Setting and participants: 243 clinicians and office staff of eight family medicine offices.

Main outcome measures: Distribution of error types, associations with potential predictors; predictors of harm and consequences of the errors.

Results: Participants submitted 590 event reports with 966 testing process errors. Errors occurred in ordering tests (12.9%), implementing tests (17.9%), reporting results to clinicians (24.6%), clinicians responding to results (6.6%), notifying patient of results (6.8%), general administration (17.6%), communication (5.7%) and other categories (7.8%). Charting or filing errors accounted for 14.5% of errors. Significant associations (p<0.05) existed between error types and type of reporter (clinician or staff), number of labs used by the practice, absence of a results follow-up system and patients’ race/ethnicity. Adverse consequences included time lost and financial consequences (22%), delays in care (24%), pain/suffering (11%) and adverse clinical consequence (2%). Patients were unharmed in 54% of events; 18% resulted in some harm, and harm status was unknown for 28%. Using multilevel logistic regression analyses, adverse consequences or harm were more common in events that were clinician-reported, involved patients aged 45–64 years and involved test implementation errors. Minority patients were more likely than white, non-Hispanic patients to suffer adverse consequences or harm.

Conclusions: Errors occur throughout the testing process, most commonly involving test implementation and reporting results to clinicians. While significant physical harm was rare, adverse consequences for patients were common. The higher prevalence of harm and adverse consequences for minority patients is a troubling disparity needing further investigation.

Testing errors are common in primary care practice and may lead to patient harm and malpractice claims.1–5 The testing process is traditionally divided into preanalytic, analytic and postanalytic phases. In each phase, clinicians, patients, office staff and lab staff perform a series of tasks. These tasks can be grouped into the following expanded categories: ordering the test, implementing the test, performing the test, reporting results to the clinician, clinician responding to the results, notifying the patient of the results and following up to ensure the patient took appropriate action based on the test results (fig 1). The complexity of the testing process, frequent separation of lab testing from clinic location and general lack of quality control systems in primary care offices make testing in primary care error-prone.

Figure 1 Conceptual framework of the testing process.

Systematic efforts to describe, understand and eliminate problems that occur in the management of the testing process in office practice lag behind those of hospitals. The first study of testing process errors reported from US primary care practices found a rate of 1.1 lab-related problems per 1000 office visits and suggested that one in four of these errors affected patient care.1 In a more recent office-based study, 15 family physicians self-reported errors immediately after patient encounters.2 The investigators noted an error or preventable adverse event in 24% of patient encounters. Fourteen per cent of those errors/adverse events involved the testing process, a rate of one testing process error for every 30 office visits. In three other primary care patient safety studies, 15% to 54% of errors reported by physicians were related to the testing process.3,6,7

This study is the largest study to date of testing process errors reported from family physician offices in the United States. We sought to determine the distribution of error types, to describe the outcomes of these errors and to perform an exploratory analysis of predictors of error types and adverse outcomes. The aims of this study were to gain a deeper understanding of testing process errors in primary care practice and to lay a foundation for developing interventions to improve the safety of the testing process.


Methods

This study took place in eight family physician offices of the American Academy of Family Physicians National Research Network, including four private practices and four family medicine residency clinics. The offices were purposefully selected from a list of 58 volunteers to maximise practice diversity. The offices were a mix of small and large practices. One residency clinic was a federally qualified community health centre. The practices were in seven states throughout the US. Four were rural, three were urban, and one was in a suburban location.

Physicians, residents, nurse practitioners, physician assistants and office staff submitted anonymous reports of errors they recognised or experienced during the course of their work day. We asked participants to report only errors related to the testing process, including lab tests, diagnostic imaging and other tests such as pulmonary function tests and electrocardiograms. We asked participants to report anything about the testing process that they observed or committed “that should not have happened and that you don’t want to happen again.”

Physicians and staff were given the option of reporting via the internet to a secure website or by mail. The reporting tool is consistent with the Institute of Medicine’s recommendations for error reporting, which are based on research on human performance measurement in industry.8 The fields of this error-reporting form are based on the conceptual framework of the Australian Incident Monitoring System and have been refined over the course of several primary care error-reporting studies (error-reporting form available on request).6,7,9 A physician and a designated study coordinator (usually a nurse or manager) from each of the participating sites attended a mandatory 1½-day group training session; they then trained other clinicians and staff and implemented the study at their respective offices.

Reports were submitted anonymously but included practice codes and reporter-type codes for four categories of reporters: physicians, nurse practitioners and physician assistants, office staff (including non-clinician nurses) and residents. Participants reported errors for 32 weeks in 2004. Each practice was assigned four intensive reporting weeks (1 week out of every 8 weeks) and 28 routine weeks. During the intensive reporting weeks, participants were asked to report every testing process error they identified, while during routine weeks, they were asked to report errors that they wanted us to know about, with particular attention to errors that had the potential to cause or actually did cause harm.

Before reporting began, a knowledgeable person in each office, generally the office manager, completed a survey describing the practice’s clinical testing processes.


Office lab survey

We summarised the responses to the office lab survey with frequency distributions. We dichotomised the offices into those with high-quality testing processes (offices that reported a specific system to ensure testing follow-up and believed that more than 95% of patients are directly notified of significant abnormal lab results) and those with low-quality testing processes (offices that reported no specific follow-up system and believed that less than 95% of patients are directly notified of significant abnormal lab results).

Coding the reports and errors

We studied the narrative sections of each report and assigned codes to each error and to the contributing factors, harms and consequences identified by the reporter. We coded these elements using the January 2005 version of the International Taxonomy of Medical Errors in Primary Care (ITME-PC—Version 2).10 We expanded the testing process errors section of the taxonomy to allow more detailed classification. The main categories of testing process errors correspond to the boxes in fig 1 but do not include analytical errors. If a report identified multiple errors, they were numbered chronologically in order of occurrence.

DG, EB, SD and JH coded the event reports. Coding rules were logged, and when necessary, we recoded previously coded reports to maintain consistency. Any coding discrepancies were resolved by consensus among at least three coders. This process resulted in 100% agreement for coding to the first three levels of the taxonomy. The kappa coefficient for coding error types at the highest level the error was coded, generally at the fourth digit level, was 0.338 (95% CI: 0.24, 0.45; p<0.0001) on a 10% sample of error reports, which indicates good general agreement among raters.
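The kappa statistic quoted above corrects raw agreement for the agreement expected by chance. The sketch below shows the calculation for the simplest case of two raters (Cohen's kappa). The paper does not say which kappa variant was used for its four coders, so the two-rater form, the function name `cohen_kappa` and the example codes are all assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two raters who each coded the same items."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(codes_a)
    # Observed agreement: share of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: product of each rater's marginal share, summed over codes.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical error-type codes for six reports (not study data).
rater_1 = ["order", "implement", "report", "report", "admin", "order"]
rater_2 = ["order", "report", "report", "report", "admin", "implement"]
kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the raters agree on four of six reports (67%), but kappa is lower than that because some of the agreement would be expected by chance alone.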


We coded negative outcomes as “adverse consequences” and “harm.” The categories for adverse consequences are: no consequence, time, financial, delay in care, pain, suffering, clinical consequence and unknown. Because of the small sample sizes for two categories (time and financial), we grouped these consequences into one category for this analysis. The coders classified harm according to the United States Pharmacopoeia MedMARx Error Outcome Categories (8, p. 293), which are based on the risk-assessment index developed by the National Coordinating Council for Medication Error Reporting and Prevention.

Descriptive analysis

Descriptive analyses were performed using frequencies and percentages to summarise characteristics of the event reports, patients involved in the events, error types, contributing factors, and outcomes. Chi-square tests were performed to test associations between demographic and event report variables and the types of error reported.
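As an illustration of the chi-square tests mentioned above, the following sketch computes the Pearson chi-square statistic and its degrees of freedom for a contingency table of counts. It is not the authors' SAS code. The function name and the counts are hypothetical, and the sketch assumes every row and column total is positive. It stops at the statistic because the Python standard library has no chi-square distribution function for the p value.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows are reporter type (clinician, staff),
# columns are error type (test ordering, all other types).
counts = [[30, 70],
          [50, 50]]
statistic, dof = chi_square_statistic(counts)
print(f"chi-square = {statistic:.2f} on {dof} degree(s) of freedom")
```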

Exploratory univariate and multivariate analyses of predictors of outcomes

Exploratory unadjusted and adjusted analyses were performed to determine associations between adverse consequences and harm and a defined set of 16 characteristics, derived from the event reports and the practice lab surveys, that the literature and our own research and clinical experience suggest might be associated with these outcomes. These include characteristics of the event reports (reporter type, intensive versus non-intensive reporting period), the errors (type and occurrence location), the patients (gender, age, race/ethnicity, presence of a chronic medical condition) and the practices (residency versus non-residency, number of labs used, presence or absence of an EMR, testing processes quality rating (high/low), and four general characteristics of good test tracking processes (present or absent)).

For the univariate and multivariate analyses, we dichotomised the outcomes to harm/no harm and adverse consequence/no adverse consequence. Event reports for which harm or consequences could not be determined were excluded from these analyses. Univariate logistic regression analyses were first used to determine unadjusted associations between the event outcome and each practice, patient, error and lab test characteristic. To determine adjusted associations, multivariate logistic regression analyses were performed, adjusting for any practice, patient, error or lab characteristic that was associated with harm or adverse consequences at the alpha level of 0.20 from the univariate analyses. Because the intraclass correlation coefficient was 0.9% for consequences and 2.0% for harm, multilevel analyses were performed. Fifty-one event reports did not identify a specific patient involved in the error. Therefore, any analysis that used patient characteristics excluded these 51 event reports, leaving 539 (n = 539).
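For readers unfamiliar with the intraclass correlation coefficient (ICC) in this setting, one common definition for a two-level logistic model with a random intercept for practice is given below. The paper does not state which estimator was used, so this latent-variable form is an assumption, and the symbol $\sigma_u^2$ (the between-practice variance of the random intercept) is introduced here only for illustration.

```latex
\mathrm{ICC} = \frac{\sigma_u^2}{\sigma_u^2 + \pi^2/3}
```

Here $\pi^2/3 \approx 3.29$ is the variance of the standard logistic distribution. Under this definition, ICCs of 0.009 and 0.020 would correspond to between-practice variances of roughly 0.03 and 0.07 on the log-odds scale.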

Assignment of error type to each event report

For the report-level analysis of associations between type of error and the other 15 characteristics described above, it was necessary to assign a single error type to each event report. For the 288 event reports (49%) that contained more than one error, we tested three methods for assigning error type to these multi-error event reports. The first method selected the first error that occurred; the second method selected the last error that occurred; and the third method randomly selected one of the errors. The distributions of error types defined by the three methods were similar for the 590 event reports. Because the method of random selection eliminated bias toward the order of the reported error, we used this method for assignment of error type to the multi-error event reports. For the analyses, all errors were aggregated into the 10 mutually exclusive major categories listed in table 1.
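The three assignment methods described above can be sketched as follows. This is a minimal illustration, not the study's SAS code: the function name, the error labels and the example reports are hypothetical, and each report is assumed to list its errors in chronological order.

```python
import random
from collections import Counter


def assign_error_type(errors, method, rng=None):
    """Pick a single error type from a chronologically ordered list of errors."""
    if not errors:
        raise ValueError("a report must contain at least one error")
    if method == "first":
        return errors[0]
    if method == "last":
        return errors[-1]
    if method == "random":
        # Random selection does not favour any position in the sequence.
        return (rng or random).choice(errors)
    raise ValueError(f"unknown method: {method}")


# Hypothetical event reports (not study data).
reports = [
    ["ordering", "implementation"],
    ["reporting"],
    ["implementation", "reporting", "notification"],
]
rng = random.Random(0)  # seeded so the illustration is repeatable
distributions = {
    method: Counter(assign_error_type(report, method, rng) for report in reports)
    for method in ("first", "last", "random")
}
for method, counts in distributions.items():
    print(method, dict(counts))
```

Comparing the three resulting distributions, as the authors did, shows whether the choice of method materially changes the mix of error types.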

Table 1 Types of 966 testing process errors reported by family physicians and their office staffs in 590 event reports

All analyses were performed using SAS 9.1. For unadjusted and adjusted logistic regression analyses, the glimmix macro was used. This study was reviewed and approved by the University of Missouri-Kansas City Social Sciences Institutional Review Board and by individual site institutional review boards as required.


Results

Office testing process survey results

Four of the eight practices had an electronic medical record. Two practices used only one lab for testing, and six used two or three labs. Six offices reported having a system for tracking abnormal results or results needing action until the action was completed. Five practices said they informed patients about normal labs at least 75% of the time, and four practices informed patients about clinically insignificant abnormal results at least 75% of the time. Five of the eight practices claimed to follow up on abnormal labs more than 95% of the time.

Event reports

During the 32-week reporting period, the 243 participants submitted 661 event reports. Of these, 590 events had 966 errors related to the testing process. Staff submitted 51% of the reports, physicians submitted 39%, nurse practitioners and physician assistants submitted 8%, and residents submitted 2%. Reporting volume varied from a high of 25.8 reports per reporter in one practice to 1.5 reports per reporter in another, with a mean of 4.1 reports per reporter across all practices. Other characteristics of the event reports are given in table 2.

Table 2 Descriptors of 590 testing process event reports submitted by family physicians and their office staffs

Approximately one-third of the event reports were submitted during the four intensive reporting weeks. During the routine and intensive reporting periods, there were 1.7 and 6.4 event reports per practice per week, respectively. Nearly half of the reports (49%) contained more than one error, and 90% of these reports had “error cascades” in which one error led directly to the next.11
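The weekly rates above can be checked against the reporting schedule given in the methods (eight practices, 28 routine weeks and four intensive weeks each). The short calculation below is only a consistency check using the rounded rates from the text, so the totals it produces are approximate.

```python
practices = 8
routine_weeks = 28        # per practice
intensive_weeks = 4       # per practice
routine_rate = 1.7        # event reports per practice per week
intensive_rate = 6.4      # event reports per practice per week

routine_reports = practices * routine_weeks * routine_rate
intensive_reports = practices * intensive_weeks * intensive_rate
total_reports = routine_reports + intensive_reports
intensive_share = intensive_reports / total_reports

print(f"routine weeks:   about {routine_reports:.0f} reports")
print(f"intensive weeks: about {intensive_reports:.0f} reports")
print(f"total:           about {total_reports:.0f} reports")
print(f"share submitted in intensive weeks: {intensive_share:.0%}")
```

The implied total of about 586 reports is close to the 590 testing process event reports, and the intensive-week share of about 35% matches the "approximately one-third" stated in the text.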

The demographic characteristics reported for the 539 patients involved in the events are summarised in table 3.

Table 3 Characteristics of 539 patients involved in testing process errors reported by family physicians and their office staffs*

Among patients with available data, 64% were female, 70% were 18–64 years old, and 57% were white; 68% were reported to have chronic health problems, and 53% had complex health problems.

Error types and associations

Table 1 shows the distribution of the types of 966 reported testing process errors. The most common types of errors were related to reporting results to the clinician (24.6%), implementing tests (17.9%), general administrative errors such as filing and chart availability (17.6%) and test ordering (12.9%). There was considerable variability in the distribution of error types reported from each practice site (table 4).

Table 4 Summary of testing process errors by practice site (column %)

Clinicians reported a greater proportion of administrative errors (18% vs 13%) and errors in responding to test results (8% vs 3%), and office staff reported more test ordering errors (18% vs 11%) (χ2 = 25.4, p = 0.001). Participants from the residency practices reported a greater proportion of administrative errors (19% vs 11%) and errors related to reporting results to the clinician (32% vs 24%), while participants from the non-residency practices reported more errors in test ordering (19% vs 12%) and test implementation (24% vs 18%) (χ2 = 21.5, p = 0.003). Compared with routine reporting, intensive reporting periods had a higher proportion of test ordering (19% vs 13%) and test implementation (27% vs 17%) errors and a slightly lower proportion of errors reported in all other categories (χ2 = 18.0, p = 0.012).

Among the 416 event reports in which race/ethnicity was reported, minority patients were more likely to have errors of test implementation (32% vs 18%) and less likely to have administrative errors (8% vs 17%) (χ2 = 16.8, p = 0.018). There was no difference in error-type distribution by patients’ age.

Error types also varied depending on lab survey responses. Practices that used more than one laboratory were much more likely to experience errors in test ordering (17% vs 5%), but much less likely to report errors related to test implementation (18% vs 30%) (χ2 = 21.68, p = 0.003). Practices with a specific system to monitor the testing process reported a much lower proportion of errors related to test implementation (15% vs 33%) (χ2 = 29.2, p<0.001). We found no statistical associations between having an electronic medical record and error types.


Event outcomes are displayed in table 5. A quarter of the errors resulted in delays in care, and 13% caused pain, suffering, or a definite adverse clinical consequence. Harm resulted from 18% of the events, and the coders could not determine if there was patient harm in 28%.

Table 5 Adverse consequences and harms of 590 testing process events reported by family physicians and their staffs

Predictors of outcomes

Adverse consequences

Race/ethnicity, age, and error type were associated with adverse consequence at the univariate level. Minority patients had higher odds of experiencing an adverse consequence compared with white, non-Hispanic patients (OR: 2.74, 95% CI (1.45, 5.18); p = 0.017), as did patients aged 45–64 years compared with 0- to 17-year-olds (OR: 2.97, 95% CI (1.30, 6.80); p = 0.047). Patients with errors of test implementation had higher odds of adverse consequences compared with patients with errors of test ordering (OR: 3.27, 95% CI (1.55, 6.87); p = 0.001). In the adjusted model (table 6), minority race/ethnicity status (OR 3.07) and implementation errors (OR 4.18) remained significant predictors of adverse consequences.
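To show how an odds ratio and its 95% confidence interval of the kind reported above are formed, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2×2 table. The counts and the function name are hypothetical. The paper's estimates came from multilevel logistic regression in SAS, so this simpler calculation would not reproduce them exactly.

```python
import math


def odds_ratio_with_ci(exposed_events, exposed_nonevents,
                       unexposed_events, unexposed_nonevents, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table."""
    cells = (exposed_events, exposed_nonevents,
             unexposed_events, unexposed_nonevents)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's method).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not study data): adverse consequence yes/no
# for implementation errors versus ordering errors.
or_value, ci_lower, ci_upper = odds_ratio_with_ci(40, 60, 20, 80)
print(f"OR = {or_value:.2f}, 95% CI ({ci_lower:.2f}, {ci_upper:.2f})")
```

An interval that excludes 1, as in this made-up example, indicates that the difference in odds is unlikely to be due to chance alone at the 5% level.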

Table 6 Multivariable analysis: adverse consequence versus no adverse consequence (365 reports were included in this analysis)


Harm

Race/ethnicity and error type were associated with harm at the univariate level. The odds of harm from errors in implementing tests were higher compared with errors in test ordering (OR: 5.06, 95% CI (2.39, 10.75); p<0.001), and minority patients had higher odds of experiencing harm compared with white, non-Hispanic patients (OR: 2.42, 95% CI (1.40, 4.19); p = 0.016). In the adjusted analysis (table 7), errors in implementing tests remained a significant predictor of harm (OR: 5.32, 95% CI (2.22, 12.76); p<0.001), and there was a trend toward increased harm among minority patients (OR: 2.27, 95% CI (1.09, 4.73); p = 0.066).

Table 7 Multivariable analysis: harm versus no harm (312 event reports are included in this analysis)


Discussion

We have examined errors in the testing process reported from eight diverse family medicine offices from across the United States. Family physicians and their office staffs were willing and able to identify and report in reasonable detail a wide variety of errors in the testing process. Although one cannot determine true error rates from error reporting studies, testing process errors appear to be common. Since many errors are undetected or unreported, we can assume that the number of errors reported here is an extreme lower bound, and we believe the volume of errors reported during the intensive reporting weeks supports this supposition.

Each practice reported errors across the spectrum of the testing process, regardless of the level of sophistication of their testing process management systems. About one-third of the errors related to ordering tests and getting them done. One-quarter of the errors were related to getting the results back to the ordering clinician in a timely fashion. One-fifth of the errors were general administrative errors such as misfiling. These administrative errors cut across the spectrum of the testing process and could not be attributed to a specific step in the testing process.

The small proportion of errors we classified as communication errors (6%) needs comment. Because we chose to use a classification system with mutually exclusive categories, it does not accurately represent the contribution of communication to patient safety events. Communication difficulties are inherent in many, perhaps most, errors in physician offices.11 In another error-reporting study from family physician offices that used a multi-axial coding system, a communication problem was present in 71% of event reports.3 No analytical errors were reported. Analytic errors represent less than 10% of testing process errors and are unlikely to be observed in primary care offices.12,13 Of note, there were no reported errors of failure to monitor patient response to an action based on a test result, the last step of the testing process (fig 1). For example, a patient’s Coumadin dose might be adjusted based on an INR result, and the physician might then fail to repeat the INR at the appropriate interval after the adjustment. Absence of reported errors from this last testing process step could reflect that clinicians and staff do not consider this a testing process step but rather a treatment or patient adherence error.

The error-type distribution was not associated with most of the measured variables related to event report, practice or patient. There was minor variation in the types of errors reported by clinicians and staff and from residency and non-residency practices, but these seem to have little practical importance. “Intensive” reporting appears to pick up more “front end” problems. Forty-six per cent of the errors during the intensive reporting periods were errors of ordering and implementation compared with 30% during the non-intensive periods.

There was considerable variation, however, in the types of errors reported from each practice. For example, test ordering errors comprised 28% of the errors reported from one practice but only 4% from another. This suggests that each practice must examine its own testing process to discover the weak links. For example, practices lacking a specific system to monitor the testing process were twice as likely to report errors related to test implementation. Curiously, having an EMR appeared to have no effect on the type of errors reported. The lack of effect of an EMR is consistent with recent studies showing that an EMR can actually lower quality if it is not configured to appropriately support care.14

The most important association with error type and adverse outcomes is the race/ethnicity of the patient involved. The proportion of test implementation errors was nearly twice as high for minority patients as for non-Hispanic whites (32% vs 18%). Minority patients were also more likely to experience adverse consequences. The odds of a minority patient suffering an adverse consequence from a testing process error were three times those of a white non-Hispanic patient, even after adjusting for the association between implementation errors and race/ethnicity. Minority patients also had about twice the odds of experiencing harm compared with white, non-Hispanic patients, although this association did not reach statistical significance in the adjusted analysis (p = 0.066). Coders were blinded to racial/ethnic status during the coding procedures, so these findings are unlikely to be due to coder bias. Further investigation in a more representative sample of practices is required.

A significant portion of testing process errors result in adverse outcomes. We classified outcomes in two ways, and depending on which of these outcome categories one uses, adverse consequences or harm occurred in 18% to 59% of the events. What could be labelled as inefficiency and inconvenience (lost time, greater cost, delays in care) occurred in nearly half of the events. Thirteen per cent of events resulted in adverse clinical outcomes to patients, and 18% resulted in physical or emotional harm. Our analysis of predictors of adverse outcomes suggests that clinicians and staff should focus improvement efforts on the implementation and notification steps of the testing process, and researchers should focus on race/ethnicity.
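The percentages in this paragraph can be traced back to the consequence categories reported in the abstract. The short calculation below adds them up. It assumes, as the reported totals imply, that each event was assigned to a single consequence category, and the dictionary keys are labels chosen for this illustration.

```python
# Percentages of the 590 events, as reported in the abstract.
consequence_pct = {
    "time_or_financial": 22,
    "delay_in_care": 24,
    "pain_or_suffering": 11,
    "adverse_clinical": 2,
}
harm_pct = 18  # events coded as resulting in some harm

any_adverse_consequence = sum(consequence_pct.values())
inefficiency = consequence_pct["time_or_financial"] + consequence_pct["delay_in_care"]
clinical = consequence_pct["pain_or_suffering"] + consequence_pct["adverse_clinical"]

print(f"any adverse consequence: {any_adverse_consequence}%")  # upper end of the range
print(f"inefficiency and inconvenience: {inefficiency}%")      # "nearly half"
print(f"adverse clinical outcomes: {clinical}%")
print(f"harm: {harm_pct}%")                                    # lower end of the range
```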

Study limitations

There are several limitations to this study. First, we intentionally did not use a standard definition of medical error. The definition we used, anything you observe “that should not have happened and that you don’t want to happen again,” has been highly successful in eliciting error reports in three of our prior studies of errors reported by family physicians. Second, the study has all of the usual limitations of error-reporting studies, which cannot give a true distribution or frequency of all errors that occur.15 For example, the lab and radiology facilities used by the offices would have observed other errors that were not reported by the office staff and physicians. Chart review would have provided yet another set of errors. While the true distribution of testing process errors in family physician offices may be different, our data lead to some testable hypotheses and point to areas needing exploration. Third, the sample of practices is small and may not be representative, although we selected practices purposefully to improve generalisability. Fourth, the demographic and outcomes data from reporters were incomplete (racial/ethnic status was not reported for 23% of patients), so the results of the analysis of patient variables and outcomes may have been different with a complete data set. We used an anonymous system, rather than a confidential system, so we could not follow up with the reporter to ask questions and get additional information. Fifth, the reports are from physicians and office staff. Lab staff and patients may have identified other errors, and the frequency distributions of errors they observe and report could be much different. Sixth, despite this being the largest study to date of testing process errors reported from primary-care offices, the sample size was relatively small for testing associations and predictors of adverse consequences and harm. Finally, the classification system we used provides for mutually exclusive classification of errors, which gives a broad-brush picture of error encounters and is, therefore, best for research projects, but it lacks the texture of multi-axial coding systems, which are best used for case-by-case analysis in active quality improvement systems.

We believe that, taken as a whole, this study in conjunction with others (including the prior American Academy of Family Physicians and Elder studies and Applied Strategies for Improving Patient Safety) strongly supports the need for office-by-office improvements in the overall testing process within primary care. Even the lower-bound estimates of frequency and harm provided by this report are unacceptable. Given the volume of lab and imaging studies performed or ordered through the primary-care system, the extent of harm, inconvenience and waste caused by errors is significant. Clinicians, office administrators, and office staff need to assess their internal systems in these areas in an ongoing fashion and address discovered weaknesses in a timely manner. Further work to develop solutions at both the monitoring and improvement stages is needed.


JH had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. We wish to acknowledge the staff and physicians of the eight family medicine offices who submitted event reports, W Pace for his critique, and E Staton and K Hitchcock for manuscript preparation and review.



  • Funding: The study was funded in part by federal grants R21 HS13554-01 and P20 HS11584-02 from the Agency for Health Care Research and Quality.

  • Competing interests: None.

  • Ethics approval: Ethics approval was obtained.