Hospital quality improvement in context: a multilevel analysis of staff job evaluations
  1. U Krogstad1,
  2. D Hofoss2,3,
  3. M Veenstra3,
  4. P Gulbrandsen4,
  5. P Hjortdahl5
  1. 1Norwegian Health Services Research Centre, Oslo, Norway
  2. 2Institute of Community Medicine, University of Tromsø, Norway
  3. 3Rikshospitalet University Hospital, Oslo, Norway
  4. 4Akershus University Hospital, University of Oslo, Norway
  5. 5Institute of Public Health and Community Medicine, University of Oslo, Norway
Correspondence to:
Dr U Krogstad
Norwegian Health Services Research Centre, P O Box 7004, 0130 Oslo, Norway; unni.krogstad{at}


Objective: To investigate how much of the variance in data on nurse evaluation of different aspects of hospital work can be attributed to individual, ward, department and hospital levels, and to discuss the implications of the findings for quality improvement strategies.

Design and method: National survey data of work experiences were collected from hospital nurses working at 124 hospital wards in 36 departments in 15 hospitals across Norway during the autumn of 1998. The multilevel structure of the variation of nine indices of job satisfaction was explored by fitting four-level random intercept models (nurse, ward, department and hospital).

Results: A total of 2606 nurses (66%) responded. The indices showed varying clustering to organizational units. Intraclass correlations (ICCs) varied from 0.05 to 0.38, representing considerable higher level variation. The ward level was the dominating level for the clustering of nurses’ job aspect evaluations.

Conclusion: Multilevel modelling of staff work experiences may identify which improvement goals can be addressed at which organizational level. Improvement efforts should be directed specifically towards each aspect of work and at its most relevant organizational level. Strategies aimed at the micro-organizational level (ward management) rather than the individual level or the macro level (hospital top management) might prove worthwhile.

  • multilevel analysis
  • organization of care
  • quality improvement
  • work experiences

The organisation of hospital care is not yet evidence based.1,2 Over the last couple of decades, however, we have witnessed a substantial number of reports on the improvement of hospital structure,3–6 work processes,7–9 and outcomes.10–13 This paper discusses how survey data on staff job evaluation can provide useful information in quality improvement work. Variation in staff evaluation of work clusters differently at the ward, department and hospital levels. This may indicate which types of quality problems belong to, and can therefore be addressed at, which organizational level.

Staff work experiences have previously been seen as reflecting hospital organization and “how hospitals work” at the macro level14 as well as the micro level. Adams and Bond15 reported that nurse job satisfaction was predicted by the cohesiveness and organisation of local work. Aiken et al16 showed strong associations between leadership, cooperation and continuity on the one hand and nurse job satisfaction and burnout on the other. Variation between hospitals has been interpreted as an indication that satisfaction with different domains of work might reflect quality of hospital care. However, the question remains whether staff work experiences can be considered information about hospital organisations at all, or whether they only reflect idiosyncratic staff attitudes.

Persons working in the same ward, department, or hospital share a number of experiences with all hospital employees. But they also have experiences particular to their special unit, which those working in other units (wards, departments, and hospitals) do not share. We can therefore regard significant variation in staff job experiences at organization levels (between wards, between departments, between hospitals) as reflecting organizational characteristics. Differences in the relative amounts of variation at higher levels across domains of work may indicate which kinds of quality problems should be addressed at which administrative level.

During the last few years, staff survey data have been subjected to multilevel analyses.17 Multilevel modelling was first introduced to reduce the problem of type 1 errors by taking account of the hierarchical clustering of the data. By this means, one avoids underestimating the true variance and artificially deflating the standard errors of the estimates.18,19 It has, however, also been used to study the structure of the variance in the data,20 which is the main purpose of the present study.
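The deflation of standard errors mentioned above can be illustrated with the standard design effect for clustered samples. This is a minimal sketch and not part of the study's analysis: the function names are hypothetical, and the formula assumes equally sized clusters and a single level of clustering.

```python
import math


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Design effect for equally sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def naive_se_understatement(mean_cluster_size: float, icc: float) -> float:
    """Factor by which a standard error that ignores clustering is too small."""
    return math.sqrt(design_effect(mean_cluster_size, icc))
```

With the mean of 21 responders per ward and the lowest ICC reported in this paper (0.05), `design_effect(21, 0.05)` is about 2, so a standard error for a mean that ignored ward clustering would be too small by a factor of roughly 1.4.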

This study was undertaken to investigate how much of the variance in data on nurse evaluation of different domains of hospital work can be attributed to individual, ward, department, and hospital levels. The implications of the findings for quality improvement strategies are discussed.



Methods

Data were derived from a large 1998 survey program of the experiences of hospital staff in a selection of 15 Norwegian somatic hospitals representing all geographical regions and all types of hospitals (table 1). Eleven of the 14 hospitals stratified as a national sample accepted our invitation to participate and four other hospitals asked to be included in the study. All medical and surgical departments in these hospitals were included. The survey was approved by the National Board of Health and the Norwegian Data Inspectorate.

Table 1

 Description of nursing staff responders (N = 2606)

All doctors, registered nurses (RNs), and auxiliary nurses working in 124 wards in 36 departments (general internal medicine, heart, lung, general surgery, orthopaedic surgery, neurosurgery) in the 15 hospitals were included in the study. Staff names, addresses, and unit of work were drawn from the hospitals’ administrative databases. Non-responders received one reminder after 2–3 weeks. After the reminder, all personal identifications were deleted. This paper analyses responses from RNs and auxiliary nurses only, as these are the groups that can be assigned to wards (doctors in this material can only be grouped by department).

The mean number of responders per ward was 21 (range 7–43), and the mean number of wards per department was 3.4 (range 1–11; six departments were represented by only one ward). The mean number of departments per hospital was 2.4 (range 1–5; one hospital was represented by only one department).

The questionnaire

Staff work experiences were measured by the Work Research and Quality Improvement Questionnaire (WORQUA) developed by the Foundation for Health Services Research’s hospital staff surveys.21,22 The questionnaire included questions on general job satisfaction and psychosocial working conditions, workload, quality of leadership, competence of colleagues, continuity of patient care, system continuity (familiarity with procedures and work patterns), clearness of tasks, quality of cooperation, and physical layout of workplace. Most questions had a 5-point Likert scale format, but some were measured on 10-point scales with unique anchoring statements. Exploratory factor analysis was used to investigate the empirical pattern of meaning in developing the WORQUA questionnaire. For the purpose of this article, we used factor analysis for data reduction; 29 items covering various domains of work were selected from the questionnaire. By principal axis factoring with promax rotation, we reduced the items to nine multi-item indices. The indices of workload, competence, and physical layout represented the structural domain of hospital organisation. The indices of leadership, cooperation, communication, clarity of tasks, patient continuity, and system continuity reflected processual and cultural aspects of work. Indices and items are listed in Appendix 1. Each index was transformed into a 0–100 score, higher scores indicating more positive evaluation. If at least 50% of the items in a specific index were answered, the mean of the answered items was substituted for the missing responses (the half rule).23 The nine indices reflect areas of central as well as local levels of responsibility and decision. We therefore expected them to cluster at different organizational levels.
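The scoring rule described above (half rule followed by a 0–100 transformation) might be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `index_score`, the linear rescaling, and the `reverse` flag are hypothetical, since the text does not state exactly how raw item scales were rescaled or reversed.

```python
from typing import Optional, Sequence


def index_score(items: Sequence[Optional[float]], lo: float, hi: float,
                reverse: bool = False) -> Optional[float]:
    """Return a 0-100 index score, or None if fewer than half the items were answered."""
    answered = [x for x in items if x is not None]
    # Half rule: at least 50% of the items in the index must be answered.
    if not items or len(answered) * 2 < len(items):
        return None
    # Substituting the mean of the answered items for each missing item
    # leaves the index mean equal to the mean of the answered items.
    mean = sum(answered) / len(answered)
    # Assumed linear rescaling of the raw scale [lo, hi] to [0, 100].
    score = (mean - lo) / (hi - lo) * 100.0
    # Reverse when the low end of the raw scale is the positive end
    # (for example 1 = fits completely), so that higher scores are more positive.
    return 100.0 - score if reverse else score
```

For example, `index_score([1, 2, None], lo=1, hi=5, reverse=True)` scores a three-item index with one missing answer, while `index_score([1, None, None], lo=1, hi=5)` returns `None` because fewer than half of the items were answered.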

Statistical analysis

Differences between responders and non-responders were tested for statistical significance by χ2 tests. The internal consistency of the indices was assessed by Cronbach’s coefficient alpha.24,25
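For readers unfamiliar with the internal consistency measure, Cronbach's alpha can be computed from the item variances and the variance of the summed score. The sketch below assumes complete responses (no missing items) and uses a hypothetical function name; it is not the software used in the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; each row holds one respondent's answers to the k items."""
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Alpha approaches 1 when the items of an index vary together across respondents and falls as the items become less correlated.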

MLwiN18 was used to test whether staff work experiences were hierarchically clustered. The partitioning of the response variance by organizational level was identified by fitting random intercept models with four levels: nurse, ward, department, and hospital. For each domain of work experience we first applied the full four level model. Levels at which variances appeared non-significant by the crude criterion of having a variance less than twice its standard error (usually the hospital and department levels) were then removed from the model and the change in the model’s −2LL value was inspected for significance. The removed levels were then re-entered separately and accepted where their re-inclusion produced a significant reduction (p<0.05) in −2LL.
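The two decision rules in this procedure can be sketched as below. The function names are hypothetical, and the use of a chi-square reference distribution with one degree of freedom for the change in −2LL is an assumption: the text states only that a significant reduction (p<0.05) was required.

```python
import math


def passes_crude_screen(variance: float, std_error: float) -> bool:
    """Crude criterion: a level is retained if its variance is at least twice its standard error."""
    return variance >= 2.0 * std_error


def lr_test_p(neg2ll_reduced: float, neg2ll_full: float) -> float:
    """p value for the reduction in -2LL when one variance term is re-entered (assumed 1 df)."""
    change = max(neg2ll_reduced - neg2ll_full, 0.0)
    # For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(change / 2.0))
```

Under this assumption, a level would be accepted back into the model when the reduction in −2LL exceeds about 3.84, the point at which `lr_test_p` falls below 0.05.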

The degree of clustering was measured by the size of the intraclass correlation coefficient (ICC), calculated as the variance at the organizational levels (ward, department, and hospital) divided by the total variance in each index.19
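The ICC calculation described above can be sketched from a set of estimated variance components. The function names and the level labels used as dictionary keys are hypothetical, and any variance components passed in for illustration are made up and are not values from this study.

```python
from typing import Dict


def variance_shares(components: Dict[str, float]) -> Dict[str, float]:
    """Proportion of the total variance found at each level."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {level: value / total for level, value in components.items()}


def icc(components: Dict[str, float], individual_level: str = "nurse") -> float:
    """ICC: variance at all organizational levels divided by the total variance."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    # Everything above the individual level counts as organizational variance.
    return (total - components[individual_level]) / total
```

With made-up components of 300 (nurse), 60 (ward), 20 (department), and 20 (hospital), `icc` gives 0.25 and `variance_shares` shows that 15% of the total variance lies at ward level.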


Results

A total of 2606 questionnaires from nurses were returned (response rate 66%). The number of responders was about 10% of the occupationally active nurses and auxiliary nurses in Norway in the year of study.26 The distribution of respondents across hospital types, fraction of part time employed, and percentage of nurses and auxiliaries corresponded very well with the national corps of nursing staff.

The response rate was lower for nurses working in university hospitals (62%) than in county or local hospitals (67%; p<0.001). The characteristics of responders are presented in table 1. There were no significant differences between responders and non-responders with regard to type of hospital or department. As shown in table 2, the index means (observed ranges 0–100 for all indices) differed considerably. Two domains of work stood out as areas of less satisfaction: physical layout of workplace and workload. Cronbach’s alpha was lowest for continuity of care (0.57) and communication (0.68) and highest for leadership (0.87) and clearness of tasks (0.79).

Table 2

 Properties of each index of work experience (scale: 0–100)

The proportions of variance at each organizational level and the ICCs for each index are shown in table 3. ICCs ranged from 0.05 for quality of communication and for system continuity to 0.38 for physical layout of workplace and 0.28 for quality of leadership.

Table 3

 Mean scores and intraclass correlation coefficients (ICCs) by job aspect and level of variation

One domain of job evaluation, physical layout of workplace, had much of its variation at hospital level. All other domains of job evaluation had most of their organization level variance at ward level. This was particularly true for quality of leadership, workload, and clearness of tasks.


Discussion

In the four-level random intercept analyses, staff job experiences varied significantly across organizational units. Of the nine indices, four had ICCs above 0.20 and none were below 0.05. ICCs between 0.05 and 0.20 are considered to indicate strong clustering of scores, and ICCs above 0.20 are interpreted as high.19,27

Trying to catch the flow of local work, we asked nurses about their experiences at “their” ward. The ward is the organizational, professional, social, and cultural frame of work for nursing staff.28,29 Even though the questions directed attention to this level, ward experiences might still have varied mainly between hospitals or departments. The ward was, however, the dominating organizational level for the clustering of the variation in nurses’ job assessment. Other studies also confirm that, even in the smallest hospitals, wards specialise in diagnoses and in patient groups and run a selected repertoire of procedures. Nurses working in different wards therefore experience nursing jobs under different work conditions. This differentiation also contributes to the building of shared experiences, values, and attitudes, and shapes ward specific local cultures.

The relatively strong clustering of our data probably reflects the fact that our responders were a relatively stable workforce: more than 40% of them had worked “at this ward” for more than 5 years and two thirds had worked “at this hospital” for more than 5 years.

Ward level variation was particularly strong for nurse evaluation of leadership, workload, clearness of tasks, and physical layout of workplace. For these indices, the grouping of responses by ward identified important similarities between the evaluations of nurses in the same wards. Yet one should not jump to the conclusion that efforts to improve the quality of these four aspects of work should be made at ward level only. Discussion of the level at which an intervention should be directed points directly to a lack of information that links quality problems to the right level of decision.30 Much of what is observed and evaluated at the ward level may be decided at higher levels in the hospital hierarchy, or even outside it. Nurse workloads reflect the way work is organized and led at the wards, but are probably better explained by external factors such as population health status and hospital capacities in each hospital’s catchment area, the referral patterns of the local doctors, and the emergency admittance profile of each unit. (In 1998, Norwegian hospitals had an overall bed occupancy rate of 85–90% and the hospitals in our study had a mean emergency admittance rate of 80%.) In spite of the variance between individual nurses, our analyses show clear similarities within wards, departments and hospitals. Parts of the local variation in the perceived quality of leadership and clearness of responsibilities must definitely be within reach of lower level improvement efforts. Our study suggests that a larger part of the quality improvement effort should take place at the lower levels of the hierarchy, at the patient and staff interface.

Little variation was identified at the department level, although some variation across departments in evaluation of leadership and workload was identified. Some departments were perceived to be better led than others, and some were definitely busier than others. This seemingly limited importance of the department level may reflect the indices we used. If we had explored other aspects of work, we might have found a larger proportion of the variation at department level. The importance of not neglecting the department level is underlined by the fact that workload, one of the two job aspects with sizeable variation at the department level, had low satisfaction scores.

Aiken et al16 found variation in nurse job satisfaction between hospitals related to differences in staffing and support. In our study, variation in workload at the hospital level was low. However, one aspect of job evaluation—the physical layout of the workplace—varied quite a lot from hospital to hospital in 1998, with some being seen as more functional and better suited to their tasks than others. Closer inspection of our data (not shown) confirmed that the highest rated hospitals were the newest and most modern (and vice versa).

Key messages

  • National strategies for improving quality of hospital care are strikingly similar across national borders; they have mainly been directed towards the organisational macro level.

  • Improvement strategies need to be directed to specific organizational levels of decision making.

  • Multilevel modelling of staff work experiences may identify which improvement goals can be addressed at which organizational level.

  • The work experiences of nurses provide substantial information about improvement opportunities at the ward level.

Our results reflect evaluations by nurses and auxiliary nurses. Their viewpoint is the ward. They are excellently located to assess patient care at the sharp end of the system, but may have a less precise view of the advantages and problems at other organizational levels.

Finally, one should take care not to read low ICCs and/or low variance at any particular organizational level as an indication that no quality improvement effort should be taken at that level. Little variation in a bad score may signal that the situation is equally problematic in all units at that level, and one cannot rule out the possibility that the roots of the problem and the key to its solution may be found at just that level.

Major restructurings of hospital care at the macro level have taken place all over Europe. Few of the reforms, however, have been based on evidence of quality improvement, or have produced such evidence. It has even been suggested that the concentration on the macro level is more a part of the problem than its solution.16 Our study demonstrates substantial variation in how nurses evaluate different domains of work. It also shows that a significant part of the variance was clustered at the ward level, indicating that this level is an important focus for quality improvement efforts. We suggest that quality improvement leverage points should be sought by multilevel analyses and ICCs.

Organising for improvement of hospital services has been a major topic in health services all over the western world for a long time. Many of the problems documented are strikingly similar across national borders.7,11,31,32 We suggest that national governments should not only instigate macro reforms of the hospital system (such as changes in ownership and financing systems), but also stimulate internal hospital improvement work at the department and ward levels. After all, that’s where patients are.



Appendix 1

  • How do you rate your physical workload this autumn?

  • How do you rate your mental workload this autumn?

Original item scale: 1 = very light; 10 = intolerably heavy


  • How well do these statements fit?

  • “This hospital has a very functional layout”

  • “At this department we experience a major lack of space”

  • “This ward is built exactly to suit its tasks”

Original item scale: 1 = fits completely; 5 = does not fit at all


  • How would you rate the competence of the nurses at this ward?

  • How would you rate the competence of the doctors at this ward?

Original item scale: 1 = very insufficient; 10 = fully sufficient


  • How well do these statements fit?

  • “My immediate superior always speaks clearly”

  • “My immediate superior knows my job situation well”

  • “My immediate superior has clear goals for the future development of this unit”

  • “My immediate superior provides feedback so that I know whether I’m doing a good job”

  • “My immediate superior provides continuous information about goals and results”

  • “The department management knows the job situation at the wards”

  • “The head nurse knows the daily capacity and professional challenges at this ward”

Original item scale: 1 = fits completely; 5 = does not fit at all


  • How well do these statements fit?

  • “At this hospital, interdepartmental cooperation is very good”

  • “Information about patients gets to the right place at the right time”

  • “At this ward, interprofessional cooperation is very good”

  • “At this ward, all professions have common goals for the patient’s stay”

  • “At this ward, members of the other professions ‘know’ their patients”

Original item scale: 1 = fits completely; 5 = does not fit at all


  • How well do these statements fit?

  • “Expectations from my superiors are clear”

  • “Tasks are clearly defined”

  • “New employees are being supervised and followed up”

Original item scale: 1 = very good; 10 = quite unsystematic


  • How well do these statements fit?

  • “I discuss patient information with other professions many times a day”

  • “Patient information from other professions is vital to my work”

Original item scale: 1 = fits completely; 5 = does not fit at all


  • How well do these statements fit?

  • “I know this hospital inside out”

  • “I am thoroughly familiar with the job routines of this ward”

  • “I have firm knowledge of the most typical patient groups at this ward”

Original item scale: 1 = fits completely; 5 = does not fit at all


  • How well do these statements fit?

  • “By and large I observe the patients that I am responsible for every day during their stay at this ward”

  • “Our work organization implies that we actually follow up the same patients”

Original item scale: 1 = fits completely; 5 = does not fit at all


  • Competing interests: none.
