Design of high reliability organizations in health care
  1. J S Carroll (1)
  2. J W Rudolph (2)

  (1) MIT Sloan School of Management, Cambridge, MA, USA
  (2) VA Boston Healthcare System and Boston University School of Public Health, Boston, MA, USA

Correspondence to: Professor J S Carroll, MIT Sloan School of Management, 50 Memorial Drive, Cambridge, MA 02142, USA; jcarroll{at}


To improve safety performance, many healthcare organizations have sought to emulate high reliability organizations from industries such as nuclear power, chemical processing, and military operations. We outline high reliability design principles for healthcare organizations including both the formal structures and the informal practices that complement those structures. A stage model of organizational structures and practices, moving from local autonomy to formal controls to open inquiry to deep self-understanding, is used to illustrate typical challenges and design possibilities at each stage. We suggest how organizations can use the concepts and examples presented to increase their capacity to self-design for safety and reliability.

  • high reliability organization
  • design
  • patient safety

Following the IOM reports,1,2 healthcare organizations have sought ways to improve their safety performance. Many seek to emulate high reliability organizations (HROs) from industries such as nuclear power and air traffic control that are considered to operate in hazardous settings with reliability and safety.3,4 The purpose of this paper is to examine the design of HROs and to ask how healthcare organizations can design safer and more effective operations.


Organizations such as hospitals, nuclear power plants, and air traffic control agencies seek to ensure high levels of safety for customers, employees, and the public. Safety is one of many organizational goals that can be pursued with high reliability. In contrast to other goals such as low clinic waiting times, fast and accurate laboratory results, or shareholder return on investment, safety is a particularly challenging goal for several reasons:

  • Organizations have primary service and production goals that compete or may be perceived as competing with safety.5–8

  • Primary service and production activities usually have immediate bottom line effects measured with quantitative precision whereas safety is a “dynamic non-event”9 produced by people making continual and often invisible small adjustments that may be difficult to define and measure.

  • As safety increases, the decreasing frequency of problems may lead to complacency and diversion of safety focus and resources.4,7,10

  • The combined challenges of managing frequent but routine interruptions of daily work along with occasional novel problems require skills and techniques that may conflict with each other.11

  • The champions of safety are often external organizations (regulators, citizens’ groups, media, public) or unfamiliar safety specialists who may be seen as interfering with the legitimate service and production work of the organization.12


Organizational design typically refers to the decomposition of the organization into subparts and the processes that integrate the subparts to support the strategy and achieve organizational goals.13 The formal organizational design includes separating what the organization will do itself rather than buy from others, dividing sub-tasks and assigning roles, choosing or developing technology, and establishing and enforcing policies and procedures. A hospital, for example, may be public or private, be part of a larger healthcare system, include an emergency department, build research capacity, affiliate with a medical school, do its own billing or outsource to a contractor, or make various other strategic choices. Expertise can be grouped in various ways—for example, as functional departments (medicine, nursing, emergency), service lines,14 geographical teams (buildings or clinics), or customer teams (insured v private pay, inpatient v outpatient). Interdependent groups that must work together effectively and share resources efficiently must be linked and aligned by hierarchy, task forces, information systems, meetings, incentive systems, socialization practices, leadership vision, training, and so forth.15,16

Although most concepts of organizational design focus on formal structures, our expanded concept of design includes the design of policies, procedures, and practices that turn structure into action. The same formal structure can produce very different behaviors if different processes are used. For example, surgical teams learning the same minimally invasive techniques produced very different results depending on their leadership and communication practices.17 Thus, the effectiveness and sustainability of the formal structures and processes are linked to the informal organization of people, politics, and culture. Organization theorists are focused increasingly on designing aspects of the informal organization including the learning organization,18 informal networks,19 improvisation,20 and safety culture.5


HROs have been defined in terms of their results—namely, highly predictable and effective operations in the face of hazards that can harm hundreds or thousands of people at a time.3,4 In many organizations, reliability is achieved by simplifying and standardizing operational tasks and by anticipating and defending against organizational disruptions.5,21 Industries with more complex, interdependent, unpredictable, and unforgiving technologies, whose frontline experts know more about their work than do their supervisors, cannot rely solely on a factory model of “divide and monitor”.3,21 This is the situation in much of health care which is challenged by variability of individual patients, incomplete evidence bases, rapidly evolving technologies, and shifting financial and regulatory climates.1,2,22

HROs are characterized by a constant wariness by employees at every level, a willingness to shift decision making to knowledgeable experts including frontline employees who know the immediate situation and need to respond promptly, a reluctance to simplify or explain away problems, sensitivity to operational personnel and details, and willingness to make investments in training to strengthen the ability of employees to improvise and learn from experience.3,23 HRO theory offers design principles such as training and giving discretion to frontline employees, avoiding hierarchy and formalization that inhibits flexibility, and maintaining slack resources, but more specificity is given to cultural values and practices such as mutual respect, heedfulness, collective mind, learning from experience, improvisation, sensemaking, and maintenance of doubt.3,24 Yet such prescriptions for the informal organization are difficult for managers and professionals to understand and implement: they rightly criticize researchers for second guessing them after organizational failures without specifying actions to avoid future failures.


We approach organizational design with the assumption that organizations, like organisms, are continually growing and evolving in a changing environment, and therefore designing for high reliability is an ongoing self-design process.25 Dualities must be managed in tension—such as standardization with flexibility, conformity with initiative, accountability with learning, anticipation with resilience,21 and cost reduction with safety.26 However, as new demands arise from more diverse or aging populations, new competitors, new technologies, or new financial pressures, the organization responds with various changes that upset the internal balances; various adjustments occur over time to stabilize the new “way we do things around here” until that new way, too, gradually becomes out of step with changing demands.27 It may even be impossible to match internal and external needs with a stable design, and organizations may experience cyclical changes such as centralization to establish shared focus and functional expertise, followed by decentralization to strengthen commitment and flexibility, followed by centralization, and so forth.28

One unfortunate consequence of this is that there will not be a single design that is “safe” or “best” for all organizations and all times. However, in our work with several high hazard organizations29 we have found a typical developmental sequence in which organizations start with a local and decentralized knowledge structure and then move toward a more formalized and standardized design best suited to establishing control. This “control” form is highly attractive to managers, engineers, regulators, and others who desire reliability and safety. However, not everything can be anticipated and controlled, and therefore organizations that have stalled in their improvement efforts attempt to open their boundaries (for example, try to learn from other organizations) and achieve increased flexibility and innovation which is often at odds with practices and beliefs around control (but not always30). If organizations can move beyond openness to a deep and systemic understanding of their operations, they stand a better chance of sustaining the structures and culture that can integrate or maintain productive tension between control and flexibility or learning.30

In this brief discussion of design by stage we first describe some of the typical challenges at each stage and then select high reliability design elements which organizations in each stage can use to move forward. Table 1 illustrates the design challenges of the four stages and offers suggestions for design possibilities from classic HRO theory and from our own research. Throughout this discussion it is important to realize that many organizations are not uniformly “in a stage” but rather that different parts of the organization may be at different stages. Moving to a next stage does not mean giving up the knowledge and skills of the previous stage, but rather adding to and integrating new capabilities along with those already functioning. Furthermore, there is no guarantee of uniform or upward movement through stages—for example, organizations can move backwards when financial pressures are managed by cutting “unnecessary” budgets for travel, training, hiring, or innovation.

Table 1

 High reliability design in different organizational stages

Local stage

For most of human history, organizations were small and local—for example, farms and craft workshops. Most work in organizations is still local, as individuals or groups perfect their skills and cope with the constraints and costs of dealing with other groups or “the system”. Local learning is often hard to verbalize, closely tied to the details of the work, and difficult to transfer (often requiring apprenticeships or moving people).31 Even the best hospitals have local variability and ad hoc work practices that vary from department to department. Departments within local stage organizations or the organizations themselves may have difficulty communicating across professional boundaries. As a result, it is common to reinvent practices rather than build on industry benchmarks that have been established and tested elsewhere. In many hospitals, doctors are not even employees of the hospital but rather individual or small group practitioners who have privileges to work at one or more hospitals. These artifacts of work organization are accompanied by cultural values around individual autonomy and independence articulated in medical school training, certification processes, awards for individual excellence, and so forth.32

While local autonomy has strengths—fostering specialization, innovation, and improvisation—it can also be a weakness. For example, in thinking that this is not a team endeavour, physicians deny or avoid confronting their own vulnerability and assume they have to be “iron men” who can do everything themselves, learn everything themselves, and work long hours without sleep.33 Personal accountability easily shades into “shame and blame” of healthcare professionals who are seen as not up to the challenge.32 Doctors rarely ask for help and do not take kindly to someone telling them how to practice medicine.34 It is challenging to standardize practices that vary by region, hospital, and physician, up to 85% of which have not been tested empirically.35 Each professional and each hospital wants to be known for inventing something new, not copying others. Not surprisingly, healthcare administrators may respond with elements of design that increase standardization and compliance with best practices, as discussed in the next section.

From the local stage viewpoint, designs for safety and quality emerge from professional specialization, expertise, and experience which foster the skills to support within-domain improvisation and innovation in the face of safety challenges. From an HRO perspective, healthcare professionals have superb resources on which to draw in their willingness to make decisions under pressure, to improvise fluidly, to appreciate new knowledge, and to place their patients’ well being even above their own. These practices are based on deeply held assumptions developed over centuries when individual doctors had to be responsible for their patients under extreme and uncertain conditions.36

Control stage

Much of classic management theory identifies managers as planners and controllers.13,15,37 Senior executives are commonly viewed as the architects of the organization who set strategy and communicate the vision while middle managers and supervisors assign roles and tasks and measure and reward performance.15 HROs such as nuclear power plants and aircraft carriers exhibit many strong control characteristics in lengthy procedure manuals, strict rules, extensive training, and strong hierarchy. In an effort to suppress the variability and inefficiency of local innovation, it is natural to view the healthcare system as needing more risk analysis and planning, more standardization, more rules, more performance indicators, more scientific evidence for clinical practices, and more management authority to organize and direct healthcare professionals.38 When competition in an industry drives down profit or operating margins, calls for standardization and efficiency become more urgent, further reducing the resources for innovation, reflection, and quality improvement.

In the control stage, mastery of routine and standardized clinical service may be accompanied by characteristics “which are designed to repress or forget confusing or contradictory qualities”.39 One challenge is how to retain the benefits of control yet also to address novel problems or opportunities that do not fit existing procedures instead of dismissing these as anomalies.11 Design features that handle normal situations may conceal or exacerbate ambiguous situations, such as the use of redundancy to increase reliability that may also increase complexity and invisibility and decrease personal accountability.40

Designs for reliability in the control stage thrive in those aspects of health care amenable to a factory model of formal or bureaucratic control that have helped other industries move toward ultra-safe operations.38 For example, nosocomial infections can be reduced greatly by strict hand washing and gowning practices that can be counted, monitored, rewarded, and punished.41 Spear and colleagues have successfully applied the Toyota Production System and other TQM practices to health care, more successfully with nurses than with doctors.42 Wrong site surgery can be avoided by “sign-your-site” procedures, but compliance is not uniform.43 In response to drug prescription and administration errors, hospitals have developed medication reconciliation strategies that mean stronger bureaucratic controls over who can prescribe, more double checks by people and computers, increased training, color coding, and single dose technologies.44

However, because the control stage can make organizations ignore or discount information not consistent with their current procedures and mental models, mastery of control can bring a false and rigid sense of reliability and safety. The capabilities and practices of the open and deep stages help to make the rigid and sometimes fragile reliability of the control stage more robust.

Open stage

In the open stage, organizations design opportunities for diverse viewpoints to engage in conversations through cross-functional teams and task forces, exchanges of personnel, benchmarking visits, encouragement of participation regardless of hierarchical position, and experimentation with new practices. There are three challenges to reliability in this stage:

  • Costs of experimentation with new procedures: performance in well established streamlined processes developed in the control phase may decline while the organization tries new approaches.

  • It may be difficult to maintain existing controls or know how to adapt them as the organization experiments.

  • There can be clashes between units of the organization that are adopting aspects of a more open culture—for example, cross-functional, cross-hierarchy communication and external benchmarking—and the control culture of other parts of the organization. The new open HRO design structures and practices that include new cross-hierarchical or cross-functional openness may conflict with existing control phase design structures that have been successful for mastering routine operations. In addition, some people may resist new procedures adapted from leading industry practices (benchmarks) because they are “not invented here”.

Units operating at the control phase and those operating in the open phase often regard each other’s motives, incentives, and processes as unproductive and even illegitimate. For example, the Cardiac Surgery Program at Concord Hospital (New Hampshire, USA) restructured clinical teamwork into a patient centred model using a communications protocol adapted from human factors science. The entire cross-hierarchical, cross-service team met at the same time each day, along with family members as active participants to develop care plans for each patient.45 The mortality rate in cardiac surgery patients declined by more than half from expected rates and satisfaction scores for open heart surgery patients were consistently in the 97–99th percentile nationally.45 After the hospital won the Eisenberg Patient Safety Award, key members of the team who had developed and implemented the collaborative rounds left the hospital. Such conflicts are indicative of inconsistent organizational evolution and the precariousness of the open stage in the midst of a culture of control.

Elements of design that support reliability at the open stage often come from processes and structures of “heedful interrelating”.46 Heedful interrelating consists of talk and action by which people heed (attend to) each other’s concerns and ideas, drawing out and pulling together people’s specialized knowledge and the unique perspective each person has in specific situations. Processes of heedful interrelating are supported by values acknowledging multiple legitimate perspectives, new emphasis on the ability to acknowledge and manage emotions and conflicts, and attempts to increase trust across levels and functions.29

Among all of the skills for improvement, the most crucial one may be the skill to cooperate across traditional boundaries.47 For example, a study examining the adoption of new CT scanners in two hospitals in the 1980s found that a crucial step in the integration and safe reliable use of the new technology was the willingness of radiologists to allow themselves to be instructed and informed by CT technicians about using the new equipment and reading its images.48 In a study of 16 hospitals implementing new minimally invasive cardiac surgery technology, more rapid and successful learning depended on high status surgeons empowering lower status operating room team members to contribute to the learning process: “The ability of the surgeon to allow himself to become a partner, rather than a dictator, is critical [for creating a] free and open environment with input from everybody.”17

Progress toward the deep stage of systems understanding helps to integrate local, control, and open practices.

Deep stage

Designing for reliability reaches the deep stage when an organization links positive aspects of the local, control, and open stages to systems thinking. Systems thinking is a discipline and framework that helps organizations to perceive interrelationships underlying situations or events and to identify short and long term patterns of change rather than static “snapshots”. It includes specific tools and techniques for mapping causal relationships, noting and accounting for time delays, and finding points of leverage for system change that are usually hidden.18

Deep stage organizations tend to face the following challenges to reliability. Firstly, complex interdependencies among service or production functions mean the origins of problems are often obscure and that obvious solutions rarely address these problems. Inaccurate mental models of problems often lead people to well intentioned actions that help in the short run but create other unintended delayed problems.18,49 Another challenge to reliability at this stage arises if the organization is successful in improving safety and reliability. Often this success is a result of maintaining some slack and personnel resources available to reflect on and investigate current standard operating procedures. As reliability improves, cost pressures tempt organizations to reduce those resources.

Effective high reliability design elements to address such problems require systems theory, task analysis, cybernetic and system dynamics models, hierarchical control structures, and other ways of seeing and discussing systemic interdependencies, leverage points, temporal delays, and underlying assumptions.25,29,49–51 This is what the Institute of Medicine report means by saying: “Trying harder will not work. Changing systems of care will.”1 Using the open stage emphasis on valuing multiple perspectives, deep stage organizations mobilize local expertise to continually redesign and refine standard operating procedures. Systems thinking skills allow organizations to link the rational planning and risk analysis skills of the control stage with the emphasis of the open stage on heedful interrelating and of the local stage on local expertise. There is an emphasis on developing comprehensive shared representations (such as process maps and root cause analyses that include physical structures, organizational processes, and individual mental models about these structures and processes). Deep stage organizations use these shared representations to enhance attention to interrelationships and improvement possibilities. Organizations operating at this stage have come to understand that latent failures, vulnerabilities, and system problems are difficult to perceive at a local level and thus difficult to act upon.5,11,30 These organizations create opportunities for people to examine problems cross-functionally in a way that reduces lapses in safety and reliability generated by unconnected local perspectives and initiatives.

Paradoxically, insights that allow organizations to take complex interdependencies and system-wide linkages into account are often generated by very narrow, qualitative, focused analysis motivated by learning and understanding rather than by finding immediate fixes to problems.29,30,51 Consider the case of a New England hospital in which many staff members were frustrated about phlebotomy delays that were leading to delayed discharges, higher costs, and lost revenue. Like the proverbial blind men and the elephant, physicians, nurses, phlebotomists, residents, and laboratory managers each defined the problem based on their local perceptions and blamed others for not seeing the obvious solution. The hospital created a task force to investigate the problem and provide recommendations for improvement, assisted by two neutral outside consultants.52 Pulling data together from across specialties showed that phlebotomists’ average time per blood draw was significantly better than industry standards and benchmark comparisons. The real problem involved subtle interactions in the timing of multiple tasks, such as when rounds were conducted, when bloods were drawn, and the distribution (but not the overall number) of phlebotomists across different daily draws. It was only when the task force noted the interdepartmental and temporal interdependencies and how their own fixed mental models exacerbated the problem that they were able to initiate changes in staffing patterns and task timing that contributed to a steady decline in length of stay and increased revenues.


Designing for reliability and safety requires balancing and integrating different and often conflicting goals and behaviors. In describing each stage, we have suggested practical design elements appropriate for managing the conflicting tensions of that stage. Our research in the nuclear power and chemical processing industries suggests that it is desirable that complex organizations such as hospitals, healthcare systems, and multiple provider practices should move towards the deep stage to enhance reliability. To improve their reliability, healthcare organizations will need to tolerate and manage the tension between formal bureaucratic controls and continual improvisational adjustments. Control is exercised partly by rules and hierarchy but partly by professional culture and local leadership.25,53 An appropriate “balance” or compromise (how much of one thing or another) among the many control and innovation mechanisms is only a step towards an integrated approach based on deep system knowledge.25 We argue for designing with “and” rather than “or”: centralized control and decentralized expertise, standardized practices and improvised adaptations that improve practices, managing routine and novelty.

Key messages

  • Many healthcare organizations aspire to be high reliability organizations with highly predictable, safe, and effective operations.

  • Organizational design—including formal structures, procedures, incentives, and informal culture and communication—is crucial to developing and sustaining high reliability healthcare organizations.

  • There is no single design that is “safe” or “best” for all organizations and all times; rather, clinicians and managers must design and redesign for their organizations at different organizational stages.

  • Designing to achieve goals that appear to be in conflict, such as safety and cost, can be opportunities to understand more deeply and improve systems of care.

“Dualities should be viewed not as threats to consistency and coherence, but as opportunities for creative organization development, learning, and renewal.”54 Organizations face a choice—they can stay in a stage and work to perfect the skills of that stage or they can move towards the deep stage, maintaining the skills of the other stages as well. We believe that organizations can use the concepts and examples presented here to increase their capacity to self-design for safety and reliability without placing those goals and capabilities in competition with production, efficiency, or innovation.



  • JWR’s work on this article was supported by a Merit Review Entry Program career development grant from the US Department of Veterans Affairs.

  • Competing interests: none declared.