John Williamson and the terrifying results of the Medical Practice Information Demonstration Project
Here is how Avedis Donabedian described John Williamson’s contributions in a letter written in July 2000. “Many years ago, much before the assessment of outcomes became so fashionable a slogan, Williamson perceived that ‘outcomes’ should indeed be the focus of attention in assessing the quality of care . . .” Williamson’s concept of quality is “achievable benefit not achieved”. Donabedian said that Williamson’s 1978 book “Assessing and improving health care outcomes”1 “deserves to be one of the classics in our field”. “Williamson insisted that assessment of quality should involve not only physicians but a team of health workers cognizant of the issues at hand. I should add that Williamson can be considered the founder of a school of quality studies”, influencing the work of Robert Brook, Evert Reerink and Alison Kitson.*
John is now retired and living in San Diego. He is ready to talk and laugh and seems more bemused than angry over the consequence of his 1976–9 study of the evidence in support of quality of care. How could such a person get into trouble in pursuit of better quality?
THE NEED FOR EVIDENCE ABOUT BEST PRACTICE
In 1975 William Munier, MD, was the national director of the Professional Standards Review Organizations (PSROs), working for Theodore Cooper, MD, who was Assistant Secretary for Health in the US Department of Health, Education, and Welfare (HEW) from 1975 to 1977.2,3 The PSROs were created at the state level of government to review the use of services and quality of care related to the Federal Medicare program.4,5 PSRO staff did examine length of hospital stays, but were mostly mystified over how to set standards for evaluating quality of care. Williamson and his colleague Peter Goldschmidt, MD (publisher of “Report on medical guidelines and outcomes research”, Medical Care Management Corporation, and co-author of “Quality management in health care”) proposed to Munier that this gap might be addressed by inviting leading experts to assess the state of knowledge about diseases and their treatment. The experts would be chosen by the appropriate branch of the National Institutes of Health (NIH). They would be asked specific questions about outcomes of care and asked to support their answers with the best available evidence. Thus, best practices and quality could be defined and the conclusions used to guide PSROs in evaluating quality.
Munier took this idea to Cooper who told him it was very “important”. Cooper asked Munier to talk about it with Donald Fredrickson, MD, the distinguished head of NIH from 1975 to 1981. Munier recalls Fredrickson being very polite at their meeting. Later Fredrickson and Cooper talked about it by telephone. Munier, many years later, recalls Cooper’s description of the conversation. Cooper said Fredrickson was not in favour of the proposal. Cooper told Fredrickson he had six months to come up with his own plan if he did not like this one. Fredrickson did not do so, and in 1976 Williamson, along with Peter Goldschmidt and Irene Jillson, was given a contract through their company, Policy Research Incorporated, to carry out the Medical Practice Information Demonstration Project (MPIDP).6 Susan Horn, PhD, and Theodore Colton, PhD, helped to organize a panel of leading biostatisticians to assess the quality of the literature.7 Munier was the government director for the project and he contacted the heads of several NIH institutes to invite them to participate. Those who agreed were asked to select a health condition within their domain where research had contributed substantially to improving patient outcomes. The National Cancer Institute chose malignant melanoma, the National Heart, Lung, and Blood Institute picked chronic rheumatic heart disease, and the National Institute of Mental Health chose bipolar disorder. Each of these institutes selected 20–40 experts across the country to help evaluate the evidence related to care. Experts were chosen in the areas of epidemiology, diagnosis, treatment and economics. To guide the decision making, Williamson’s group developed workbooks with specific questions to be answered on topics such as: incidence and prevalence of the condition, diagnostic test sensitivity and specificity, treatment efficacy (under ideal conditions) and effectiveness (in usual clinical practice), and economic costs to the nation. Critical information being sought included:
the magnitude of the problem, incidence, prevalence, natural history (without health care) and social (including economic) costs;
achievable benefits from current health care technology if optimally applied;
extent of achievable benefits actually being achieved at present;
cost effectiveness of reasonable improvement action (Williamson J, Goldschmidt P, Jillson I, et al. “The Medical Practice Information Demonstration Project”. January 1984, unpublished, page 5).
The literature was searched by finding all citations for the three topics listed by Medline, the main computerized bibliography of the National Library of Medicine (NLM). This yielded 2696 citations. A random sample of 365 of these was drawn and analyzed independently by two biostatisticians, at least one of whom was a specialist on the topic. Of these, 37% were “conceivably applicable in some way to practice” and only 4% “met criteria of information maturity for either quality assurance or clinical practice.”8,† Workbooks containing the best evidence were sent to the expert panel. These experts were asked to provide the best available estimates for data points such as: the expected health outcome for a 25-year-old white woman with stage I malignant melanoma of the head or neck, given current treatment. The experts were asked to provide the research citations that best documented their answer, and to state whether their answer was based on research, extrapolation, or assumption.
THE RESULTS OF THEIR RESEARCH
Across all three topics a total of 872 such data points were requested. Only 130 (15%) were supported by a direct research study. The statistical team for the overall study found that, of these 130, only 30 of the data points were actually contained in the cited study. Only six of the 872 data points were “even moderately substantiated by empirical research”. The experts were also asked 298 times to name the best articles. For 245 of these requests at least one expert said there was “no valid data or acceptable article”.
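To put these counts on a common scale, the short sketch below converts them to percentages. It uses only the figures quoted above; the variable and function names are illustrative and do not come from the MPIDP report.

```python
# Counts reported in the text for the MPIDP expert workbooks.
TOTAL_DATA_POINTS = 872

# Successively stricter tests of how well a data point was supported.
funnel = {
    "cited a direct research study": 130,
    "actually contained in the cited study": 30,
    "moderately substantiated by empirical research": 6,
}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Each level expressed as a share of all 872 requested data points.
funnel_shares = {label: share(n, TOTAL_DATA_POINTS) for label, n in funnel.items()}

# Of the 130 data points that cited a study, how many were really in that study.
confirmed_share_of_cited = share(30, 130)

# Requests for best articles where at least one expert reported no valid data.
no_valid_data_share = share(245, 298)

for label, pct in funnel_shares.items():
    print(f"{label}: {pct}% of {TOTAL_DATA_POINTS}")
print(f"cited data points confirmed in the cited study: {confirmed_share_of_cited}%")
print(f"requests with a 'no valid data' reply: {no_valid_data_share}%")
```

On these figures, fewer than 1 in 100 of the requested data points passed the strictest test, and roughly 4 in 5 requests for a best article drew at least one “no valid data” reply.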
According to Williamson: “On reading the comments provided by the expert teams in response to our post-study project evaluation questionnaire, the following comments were typical of their reactions. One stated that it was depressing to see that the state of the art is so primitive. Another stated that it is frightening to think that such uncertain material can serve as the basis of policy; indeed terrifying.”8
Apart from the soundness of the information, Williamson found serious problems retrieving the relevant research information. For example, the experts in bipolar disorder listed 124 key articles. When their list was compared with the 1640 articles on bipolar disorder in Medline only 13 overlapped. The National Institute of Mental Health bibliographic file had 294 articles on this topic of which only eight overlapped with the experts’ list. “Only a single article could be referenced through all three sources.”8 The leading textbook (by team consensus) for bipolar disorder had 859 citations, only 46 of which could be found on Medline. In short, there was no common body of literature on best practice, which was another worrisome finding.
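The retrieval problem can be expressed as simple overlap rates. The sketch below uses only the counts given above; the labels and names are illustrative, and each rate is the share of a reference list that could be found in the comparison source.

```python
# Overlap counts reported in the text for the bipolar disorder literature.
# Each entry: (size of the reference list, items from it found in the other source).
overlaps = {
    "experts' key articles found in Medline": (124, 13),
    "experts' key articles found in the NIMH file": (124, 8),
    "textbook citations found in Medline": (859, 46),
}

# Percentage of each reference list that the comparison source could retrieve.
overlap_rates = {
    label: round(100 * found / listed, 1)
    for label, (listed, found) in overlaps.items()
}

for label, pct in overlap_rates.items():
    print(f"{label}: {pct}%")
```

On these figures, neither bibliographic source retrieved more than about one in ten of the articles that the experts considered key.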
According to Goldschmidt, Munier and Williamson, long term senior staff at NIH were not happy about this project. This negative reaction is understandable since the project was imposed on NIH by the Assistant Secretary for Health. The findings posed a potential political threat to the NIH, which justified its government funding on the basis of improvements in care. The NLM was supposed to be giving practitioners the published evidence they needed to practice appropriately. Dr Martin Cummings, Director of the NLM and a recipient of multiple honorary degrees,9 wrote a letter to the powerful US Representative Henry Waxman attacking Williamson’s study.
By that time key government backers of the study were gone. On 20 January 1977 at 12 noon Ted Cooper was relieved of his position by the new Secretary of HEW, Joseph Califano. Afterwards Dr Munier said he was “cornered by the Carter people” and left government service two years later.
The contract came to an end in 1979 with a report to the Assistant Secretary for Health. NIH staff took the very unusual step of retrospectively peer reviewing Williamson’s report. These peers were critical and said “This research is leading to no useful result”. Williamson was told that he could publish no reports from his study. Munier thinks that from then on Williamson was blackballed by the NIH. Today Williamson says he was told, off the record, that future grant proposals from him to the NIH would not be looked upon favorably.‡ His report remains unpublished to this day.
Williamson spent the years 1981–1983 as a Visiting Professor in the Department of Biostatistics at Harvard University’s School of Public Health. He resigned his Johns Hopkins Professorship in 1984 to take a senior position at the Veterans Administration Medical Center in Salt Lake City, the place of his birth, where he continued his research on quality of care. Today Munier thinks that the trouble at NIH led to Williamson’s move to Salt Lake City. According to Horn: “I can remember the pain it caused him. He always wanted to be discreet about the situation . . . He was a true visionary, but few wanted to listen” (S Horn, personal communication, 8 April 2002). Munier (personal communication, 5 April 2002) also thinks that “so much could have been accomplished if the report had been received in an intellectually honest way and acted upon”. He thinks that this study influenced the later work of the NIH Office of Medical Applications of Research (OMAR) which, since the 1980s, has produced a long series of consensus panel reports on best medical practice. A later director of the NLM, Dr Donald Lindberg, put much effort into practitioner-friendly computerized bibliographies. The US National Guideline Clearinghouse has also been established.
With the benefit of 25 years of hindsight, Williamson’s findings are not surprising. The evidence based medicine movement is working to fill this gap in the evidence. Today the Cochrane Collaboration has collected information from 250 000 randomized clinical trials,10 and we could use many more. NIH and NLM continue to prosper and are highly regarded. In short, we now realize that John Williamson and his colleagues were right in 1979, but some were not ready for their message then. Transparency and an honest willingness to evaluate what we do and know are worthy goals. They will be essential to our efforts to reduce error and improve quality.
John W Williamson, MD, was born in 1931 in Salt Lake City. He gained a BA degree in 1953 from the University of California, Berkeley and an MD degree in 1956 from the University of California, San Francisco. After residency in internal medicine he did a fellowship at the Office of Research in Medical Education at the University of Illinois in Chicago from 1963 to 1965. From 1966 to 1984 he was on the faculty of the School of Public Health at Johns Hopkins University. From 1984 to 1999 he held senior administrative positions in research and education at the Veterans Affairs Medical Center in Salt Lake City. He is now retired and living in San Diego. In 2000 he was given the individual Ernest Amory Codman award of the Joint Commission on Accreditation of Healthcare Organizations for his lifetime of scholarship and other contributions related to quality of health care.
* Other books by Williamson include “Improving medical practice and health care. A bibliographic guide to information management in quality assurance and continuing education”, Cambridge, Mass: Ballinger, 1977, 1035 pp (containing 3500 abstracts) and Williamson JW, Weir CR, Turner CW, et al. “Health care informatics and information synthesis”, Thousand Oaks, CA: Sage, 2002.
† This paper briefly summarizes Williamson’s MPIDP project and was in print before the embargo was imposed (Williamson, personal communication, February 2002).
‡ There is no written documentation in support of Munier and Williamson’s perception. Such prejudgement would be completely inconsistent with official NIH policy (Williamson, personal communication, 30 July 2002). Note, however, that Williamson’s subsequent career moves can be understood in light of this perception.