Use of a safety climate questionnaire in UK health care: factor structure, reliability and usability
A Hutchinson1, K L Cooper1, J E Dean1, A McIntosh1, M Patterson2, C B Stride2, B E Laurence1, C M Smith1

1 Section of Public Health, ScHARR, University of Sheffield, Sheffield, UK
2 Institute of Work Psychology, University of Sheffield, Sheffield, UK

Correspondence to: Professor A Hutchinson, Section of Public Health, ScHARR, Sheffield S1 4DA, UK; Allen.Hutchinson{at}


Aim: To explore the factor structure, reliability, and potential usefulness of a patient safety climate questionnaire in UK health care.

Setting: Four acute hospital trusts and nine primary care trusts in England.

Methods: The questionnaire used was the 27 item Teamwork and Safety Climate Survey. Thirty three healthcare staff commented on the wording and relevance. The questionnaire was then sent to 3650 staff within the 13 NHS trusts, seeking to achieve at least 600 responses as the basis for the factor analysis. 1307 questionnaires were returned (36% response). Factor analyses and reliability analyses were carried out on 897 responses from staff involved in direct patient care, to explore how consistently the questions measured the underlying constructs of safety climate and teamwork.

Results: Some questionnaire items related to multiple factors or did not relate strongly to any factor. Five items were discarded. Two teamwork factors were derived from the remaining 11 teamwork items and three safety climate factors were derived from the remaining 11 safety items. Internal consistency reliabilities were satisfactory to good (Cronbach’s alpha ⩾0.69 for all five factors).

Conclusions: This is one of the few studies to undertake a detailed evaluation of a patient safety climate questionnaire in UK health care and possibly the first to do so in primary as well as secondary care. The results indicate that a 22 item version of this safety climate questionnaire is useable as a research instrument in both settings, but also demonstrate a more general need for thorough validation of safety climate questionnaires before widespread usage.

  • CFI, comparative fit index
  • RMSEA, root mean square error of approximation
  • SRMR, standardised root mean square residual
  • TLI, Tucker-Lewis index
  • patient safety
  • safety climate questionnaire
  • safety culture
  • teamwork

As health care comes to be seen as a potentially high risk environment, there is increasing pressure to assess the safety culture of healthcare organisations. Some authors promote the use of semi-structured or qualitative approaches to assessing culture,1 while others suggest that culture can be assessed using a questionnaire approach.2,3

There is a real debate over how effectively culture can be assessed using climate questionnaires. As described by Denison,4 culture refers to the “deep structure of organisations” (values, beliefs and behaviours) and is traditionally assessed through interviews or observation, whereas climate mainly concerns individuals’ perceptions of their work environment (policies, practices and procedures) at a particular point in time5–8 and is amenable to measurement by questionnaire.9 Despite this debate, safety climate questionnaires have been used to assess safety culture in many safety critical industries such as nuclear power, aviation10 and petrochemicals.11 Some questionnaires for use in health care have been derived from work in other industries such as aviation12 or have included several items validated in other settings—for example, the armed forces.13 Most of the available healthcare safety climate questionnaires have been developed in the United States.

It has been suggested that safety climate questionnaire data may be used as an indicator of aspects of the underlying safety culture,14 while others have promoted use of these data to evaluate safety programmes or track changes over time in a healthcare setting.15,16 A number of US developed safety climate questionnaires can now be accessed via the internet1,16,17 and are beginning to be used by US and UK healthcare organisations.

Although there is a slowly building research literature on the scientific basis for measuring safety culture in health care, there has been a warning that “the enthusiasm for measuring culture may be outpacing the science”.2 This may particularly be the case where there has been limited testing of how consistently the questionnaires measure specific domains of safety, and also of whether questionnaires developed in one environment are valid for use in another healthcare setting. Such environmental differences might be at the national level—for example, whether a US developed questionnaire has the same meaning in a UK setting—or at the organisational level—for example, whether a questionnaire designed for acute hospital care can also be used in primary care/general practice.

The aim of this study was to explore these questions by investigating the factor structure, internal reliability, and usefulness of a US developed safety climate questionnaire in a UK healthcare setting. An additional aim was to explore the relevance of the same questionnaire in both primary and secondary care. The study sought to achieve enough responses from each of the two settings to be able to undertake an analysis of the underlying factor structure of the questionnaire items (rather than attempting to assess the actual safety culture of the individual organisations which would require a response rate of at least 50–60%).


Methods

Selection of instrument

A review of available patient safety climate questionnaires was undertaken in late 2003. Selection criteria were that (1) there was some public domain evidence base concerning development, (2) the instrument measured safety climate at the clinical team or directorate level, and (3) the instrument was short enough for use by busy health professionals. Using these criteria there was a limited selection from which to choose, some still being in the final development stage13,15,17 and others having limited peer reviewed data available, including the “family” of the approximately 65 item Safety Attitude Questionnaire (SAQ),2,12,18 the 19 item Safety Climate Survey,3,16,19 and the 27 item Teamwork and Safety Climate Survey.18

Using our selection criteria, we identified two of the shorter instruments (the Stanford University Patient Safety Climate in Healthcare Organizations questionnaire13 and the Teamwork and Safety Climate Survey18) as possibly useful among frontline NHS staff.

Thirty three healthcare professionals, 16 from primary care and 17 from acute hospital care, were asked to complete these two questionnaires and comment on their understanding of each item in the manner of a “thinking aloud” protocol.20,21 On the basis of this work, we concluded that the 27 item Teamwork and Safety Climate Survey contained a greater number of items that were applicable to frontline clinical teams; this instrument was therefore taken forward for further study. The original one page format was retained and UK related demographic questions added.

The sample

The purpose of using the questionnaire in a single round study was to provide sufficient responses on which to carry out factor analysis and reliability analysis of the questionnaire structure. At least 300 responses were required for the factor analysis of each of the healthcare staff populations (acute care and primary care), a minimum of 600 in all. The sample size for the study was calculated accordingly, with the anticipation that the response rate may be as low as 20%.
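The arithmetic behind this planning step can be sketched as follows. This is a minimal illustration, not the authors' actual calculation: the function name is hypothetical, and the inputs are simply the figures quoted above (300 responses per setting, 600 overall, a pessimistic 20% response rate).

```python
def questionnaires_needed(target_responses: int, response_rate_pct: int) -> int:
    """Smallest mailing size whose expected number of returns reaches the target.

    The response rate is passed as a whole-number percentage so that the
    calculation stays in exact integer arithmetic.
    """
    if not 0 < response_rate_pct <= 100:
        raise ValueError("response_rate_pct must be between 1 and 100")
    # Ceiling division: -(-a // b) rounds a / b up to the next integer.
    return -(-target_responses * 100 // response_rate_pct)


per_setting = questionnaires_needed(300, 20)  # acute care or primary care alone
overall = questionnaires_needed(600, 20)      # both settings combined
print(per_setting, overall)
```

Under these assumptions at least 1500 questionnaires per setting (3000 overall) would be needed, which the 3650 questionnaires actually sent comfortably exceeds.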

Staff from 13 healthcare organisations were invited to take part. In the four acute hospital trusts a random sample of 1900 recipients was drawn from staff involved in direct patient care, management, clinical support services, or patient contact administration roles. The Teamwork and Safety Climate questionnaires were distributed through the hospital postal systems and returned in a reply paid envelope direct to the study team.

In the nine primary care trusts the questionnaire was sent to 1750 staff including general practitioners, practice nurses and practice managers, and a sample of other staff (community nurses, health visitors, school nurses and allied health professionals).

Analysis of data

Factor analysis explores the extent to which individual items in a questionnaire can be grouped together according to the correlations between the responses to them, hence reducing the dimensionality of the data. If a questionnaire is to have construct validity, the items should measure key underlying concepts (or factors) in a coherent way; items successfully measuring the same underlying factor should consistently generate similar responses to each other. The resulting groups of items can then be examined to interpret the meaning of the factors.

The questionnaire contained two sections: “teamwork” and “safety climate”.18,25 An initial exploratory factor analysis on all 27 items showed that the teamwork questions factored out separately from the safety climate questions, but that there were multiple factors within each section. Exploratory factor analysis was therefore undertaken separately on each of the two domains of the questionnaire: teamwork (14 items) and safety climate (13 items). This was carried out on a random 50% sample of respondents (the “construction” half of the data) using principal components extraction. The number of factors to be extracted was determined using the Kaiser criterion (eigenvalues >1) in conjunction with assessment of scree plots (the former method has a tendency to over-extract but in this case the two methods suggested identical solutions). Oblique rotation was used to aid interpretation of the resulting factor loadings. An optimal factor structure was derived and the internal consistency reliabilities of the resulting factors were assessed using Cronbach’s alpha. Analysis was performed in Mplus using maximum likelihood estimation.
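To make the Kaiser criterion concrete, the sketch below counts the eigenvalues greater than 1 of an item correlation matrix. It is illustrative only and is not the analysis reported here: the correlation matrix is synthetic (six hypothetical items in two blocks of three), the helper names are invented for this example, and the eigenvalues are obtained with a plain Jacobi rotation routine so that only the Python standard library is needed.

```python
import math


def symmetric_eigenvalues(matrix, max_sweeps=100, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]  # work on a copy
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:  # off-diagonal part is numerically zero
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Angle that zeroes a[p][q], kept within [-pi/4, pi/4].
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def block_correlation(n_per_block=3, within=0.5, between=0.1):
    """Synthetic correlation matrix: two item blocks, stronger within than between."""
    n = 2 * n_per_block
    return [
        [1.0 if i == j
         else within if i // n_per_block == j // n_per_block
         else between
         for j in range(n)]
        for i in range(n)
    ]


corr = block_correlation()
eigenvalues = symmetric_eigenvalues(corr)
# Kaiser criterion: retain as many factors as there are eigenvalues above 1.
n_factors = sum(1 for ev in eigenvalues if ev > 1.0)
print([round(ev, 3) for ev in eigenvalues], n_factors)
```

For this synthetic matrix the eigenvalues can be derived by hand as 2.3, 1.7 and 0.5 (four times), so the criterion retains two factors. In practice the count would be checked against a scree plot, as described above.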

Confirmatory factor analysis was then undertaken to assess the fit of the proposed factor structure to the remaining 50% of the dataset. This was examined using a number of fit indices—CFI (comparative fit index), TLI (Tucker-Lewis index), RMSEA (root mean square error of approximation) and SRMR (standardised root mean square residual)—in addition to the model χ2 statistic. Missing items were listwise deleted and items were treated as continuous variables.
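For readers unfamiliar with these indices, the sketch below shows the commonly quoted textbook formulas for RMSEA, CFI and TLI in terms of the model and baseline (independence model) χ2 statistics. The input values are hypothetical and are not taken from this study; software packages differ in small details (for example, some use N rather than N − 1 in the RMSEA denominator), so this should be read as an approximation of what a package such as Mplus reports. SRMR is omitted because it requires the observed and model implied correlation matrices.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to the baseline model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index (can fall outside the 0 to 1 range)."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


# Hypothetical inputs, chosen only to illustrate the calculation.
fit = {
    "RMSEA": rmsea(chi2=150.0, df=43, n=400),
    "CFI": cfi(150.0, 43, 1600.0, 55),
    "TLI": tli(150.0, 43, 1600.0, 55),
}
print({name: round(value, 3) for name, value in fit.items()})
```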

To assess how well the factor model separately fitted each of the primary care and hospital datasets, and also whether there was a difference in the level of reported teamwork and safety climate between the two groups, a test of factorial invariance and population homogeneity was carried out. This consisted of performing a series of confirmatory factor analyses within which successive model parameters (interfactor correlations, scale means, variances and factor loadings) were allowed to vary between the primary and secondary care groups. At each stage the results were examined to determine whether allowing the parameters to differ between the groups had improved the fit of the model (which would suggest a difference between the groups either in the factor structure or in the level of teamwork/safety climate). In addition, a confirmatory factor analysis was carried out on each of the primary and secondary care groups separately.

Research governance

External scientific review was provided by the Sheffield Health and Social Research Consortium. Ethical review was provided by North Sheffield Research Ethics Committee and research governance approval sought from each participating organisation.


Results

Face validity

As a result of the “thinking aloud” exercise, minor adaptations were made to the questionnaire wording before it was used in the survey. For example, “institution” was changed to “organisation” and “physicians” to “doctors”. However, care was taken not to alter the underlying meaning of the items and, for this reason, some wording was left unchanged—for example, the term “briefings” (which was unfamiliar to a number of respondents) and “medical error” (which several respondents associated only with doctors/medical interventions).

Survey response rates

Since the aim of the study was primarily to examine the factor structure, a single round posting was used (with no follow up requests to non-responders). 1307 responses were received, the overall response rate being 36% (33% for primary care trusts and 38% for acute hospital trusts). The number of responses and associated response rates were felt to be sufficient for factor analysis since non-responders were unlikely to differ from respondents in terms of the pattern of relationship between their responses to different questions (that is, the factor structure).

Applicability of items to staff groups

A number of responses were obtained from staff not involved in direct patient care, many of whom answered “not applicable” to several questions. Factor analysis was therefore carried out only on responses from the 897 staff involved in direct patient care, which included 237 hospital nurses, 187 primary care nurses, 51 hospital doctors, and 152 GPs.

Many questions, particularly in the teamwork domain, exhibited very weak discrimination with a high proportion of respondents answering “agree” or “strongly agree”. Item responses are shown in tables 1–3 and the skew of the distributions in table 4.

Table 1

 Teamwork factors and % responses to items

Exploratory factor analysis

Exploratory factor analysis initially suggested a three factor solution for the teamwork domain and a three factor solution for the safety climate domain. However, the four negatively worded items consistently formed a separate factor. This occurred when analysing the teamwork and safety climate domains separately and when analysing all 27 items together (see table S1, available online). Other authors22,23 have suggested that artificial factors can occur as a result of respondents misreading negative items. Some of the staff we spoke to also commented that they had almost misread these items, especially since only a small proportion of items were reverse worded. These items were therefore removed from the analysis at this stage (table 3).

Exploratory factor analysis was then carried out on the remaining items in each of the teamwork and safety climate domains. The optimal solution for the teamwork domain contained two factors which together explained 50% of the variance of the 11 items (table 1). This suggested two underlying themes which were interpreted as representing: (1) input into decisions and collaboration with other staff, and (2) information handover.

The best solution for the safety climate domain gave three factors explaining 61% of the variance (table 2). From their item loadings, these were interpreted as representing: (1) attitudes to safety within own team and capacity to learn from errors, (2) overall confidence in safety of the organisation, and (3) perceptions of management’s attitudes to safety. The factor loadings (a measure of how strongly each item relates to each factor) are shown in tables 5 and 6.

Table 2

 Safety climate factors and % responses to items

Table 3

 Responses (%) to items omitted from final factor analysis

Table 4

 Mean score for all items in each factor (scale)

Table 5

 Optimum factor loadings of teamwork items*†

Table 6

 Optimum factor loadings of safety climate items*†

Internal consistency reliabilities

Internal consistency reliabilities (how clearly a set of items measures a single theme) were satisfactory to good, with Cronbach’s alpha 0.69 or above for all five factors (tables 1 and 2). Removing a further item from the initial five items forming teamwork factor 2 improved the internal consistency reliability of this factor. This item (“Briefing staff on handovers between shifts is important for patient safety”) appears to relate to opinion rather than what actually happens.
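As a reminder of what this statistic measures, the sketch below computes Cronbach's alpha from a respondents by items table using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The response data are invented for illustration (six hypothetical respondents answering four items on a 1 to 5 scale) and have no connection to the survey data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows (one score per item)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented responses: rows are respondents, columns are items in one factor.
example = [
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
    [3, 4, 3, 3],
]
alpha = cronbach_alpha(example)
print(round(alpha, 3))
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why dropping an item that behaves differently from the rest of its factor can raise the value.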

Confirmatory factor analysis

Confirmatory factor analysis on the remaining 50% (the “validation half”) of the dataset indicated an almost adequate fit of the model to the data under the widely applied fit indices criteria.24 The CFI and RMSEA took values of 0.93 and 0.08 for teamwork, and 0.94 and 0.07 for safety climate, respectively (see table S2, available online). These can be compared with the cut off values for a good model, estimated as a CFI >0.95 and an RMSEA <0.06.24

Use of the questionnaire in different care settings

To assess how well the factor model fitted each of the primary care and hospital datasets and also to explore whether there was a difference in teamwork or safety climate between the two groups, a series of confirmatory factor analyses was performed, allowing successive model parameters to vary between the groups. Separate confirmatory factor analyses were also carried out on each of the primary and secondary care groups (see table S3, available online).

There was some difference in reported teamwork climate between the groups; the factor model showed a slight but significant improvement when scale means were allowed to vary, and mean scores on both teamwork factors were lower for secondary care than for primary care. There also appeared to be some difference in optimal factor structure between the groups, since allowing the factor loadings to vary resulted in a significant improvement in fit (difference in χ2 = 39 on 9 df, p<0.05). Factor loadings for two items (“It is easy for staff here to ask questions when there is something that they do not understand” and “I have the support I need from other staff to care for patients”) were relatively low for the primary care sample (0.713 and 0.562), reflected in the low percentage of variance explained for these items for the primary care sample (R2 = 0.249 and R2 = 0.195 compared with R2 = 0.411 and R2 = 0.512 for secondary care). Separate confirmatory factor analyses on the two datasets indicated that the model fitted the secondary care data substantially better than the primary care data (CFI = 0.938 and 0.858, respectively).

For safety climate the results of the multigroup analyses suggested that the optimal factor model was similar across both primary and secondary care, since improvement in fit was not statistically significant when factor loadings were allowed to vary between the groups. The separate confirmatory factor analyses on the two datasets indicated that the model offered a better fit to the hospital data, although the difference was smaller than for the teamwork climate model. However, there was evidence of a difference in level of safety climate between the groups, with the model fit greatly improved by allowing variation in the factor means (difference in χ2 = 43 on 3 df, p<0.05) and variances (difference in χ2 = 15 on 3 df, p<0.05). The mean scores on each of the three safety climate factors were significantly lower for the secondary care subsample, and their variances were greater.
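The nested model comparisons in this section rest on χ2 difference tests: the drop in χ2 when parameters are freed is referred to a χ2 distribution whose degrees of freedom equal the number of freed parameters. The sketch below computes the upper tail probability for the three differences quoted in this section. It is a standard library approximation of what statistical software would report, the function name is hypothetical, and the text itself reports only p<0.05 rather than exact values.

```python
import math


def chi2_upper_tail(x, df):
    """P(X >= x) for a chi-square variable with integer df >= 1.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    for the regularised upper incomplete gamma function, with y = x / 2.
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x < 0:
        raise ValueError("x must be non-negative")
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))     # closed form for df = 1
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# (chi-square difference, df) pairs quoted in the text.
reported = {
    "teamwork factor loadings": (39.0, 9),
    "safety climate factor means": (43.0, 3),
    "safety climate variances": (15.0, 3),
}
p_values = {name: chi2_upper_tail(x, df) for name, (x, df) in reported.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.2g}")
```

All three tail probabilities fall well below 0.05, in line with the significance levels stated in the text.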


Discussion

The purpose of this study was twofold: (1) to test whether the questionnaire met conventional scientific criteria for internal reliability and factor structure, and (2) to determine whether there was any difference in the factor model when compared between primary care and secondary care responses.

Factor structure

One key consideration when using questionnaires to assess safety climate is the need for a well grounded item content (reflecting the topics to be covered by the questionnaire), together with a clearly defined factor structure relating groups of items (questions) to specific themes.

Interpretation of the factor structure of this questionnaire was not clear cut since some items were found to relate to more than one factor. This may cause difficulties in interpreting results if the questionnaire were to be used to evaluate safety programmes or track changes over time.15 Conversely, the factors within a topic such as safety climate are likely to be interrelated to some extent, which may partially account for the cross-loading of several items to more than one factor (tables 5 and 6).

Removing five items from the questionnaire improved the internal reliabilities of both domains. The three factor safety climate solution was the more satisfactory, explaining 61% of the variance of the 11 items, whereas the two factor teamwork scale only explained 50%. Furthermore, our teamwork factor 1 and safety climate factor 1 correspond reasonably well with the “teamwork climate” and “safety climate” factors in the 60-item Safety Attitudes Questionnaire from which the Teamwork and Safety Climate Survey questionnaire was originally developed25 and which has recently been proposed for hospital-wide use.2 Interestingly, our three safety climate factors also agree well with three key safety climate dimensions identified in reviews by Flin et al14 and Wiegmann et al.26 These relate to (1) employees’ own attitudes to risk and safety, (2) organisational commitment and safety system, and (3) management’s attitudes to safety.

Negatively worded items

In common with many other survey instruments, the Teamwork and Safety Climate Survey questionnaire contained some items that were negatively worded (table 3). Although suggested as a means of reducing response bias,27 negatively worded items have often been found to factor out separately.22,28,29 Schmitt and Stults23 showed that an artificial factor can be produced when as few as 10% of respondents fail to recognise the reversal of the wording. Since the four negatively worded items exhibited this clustering behaviour, they were removed from the final stage of the factor analysis.

Use of the questionnaire in primary and secondary care

We have also been able to show, we believe for the first time, that a safety climate questionnaire can be used across a whole health community, both in primary and in secondary care. The final overall factor model appeared to fit both groups reasonably well, although the fit was slightly better for the hospital data than for primary care, which is not surprising given that the questionnaire was designed for a hospital setting. This suggests that, after further refinement (our interview data suggest that some questions may not be as relevant for primary care), comparisons using this type of questionnaire could be made between primary and secondary care.

However, a further cautionary note is that staff whose main role was not direct patient care answered “not applicable” to many questions, which has implications for the use of this type of questionnaire across whole healthcare organisations.

Limitations of the study

The aim of this study was to evaluate the factor structure—and hence usefulness—of this safety climate questionnaire in UK health care. However, we did not aim to assess from first principles the key dimensions of safety climate in an NHS setting. This would have been the best strategy had we been setting out to create a new questionnaire, but it would have been much more resource intensive.

At this stage of the analysis we did not set out to undertake a multilevel confirmatory factor analysis to explore the influence that caregiver type (for example, doctor or nurse) might have had on the results.


Conclusions

The refined 22 item questionnaire provides a measure of safety climate in both primary and secondary UK health care, meeting some of the criteria on factor structure and internal reliability. It might, for example, prove useful in research studies that sought associations between safety culture and health outcomes, seeking also to determine the predictive validity of the instrument. However, there are enough cautionary points arising from the item content and factor analysis of this questionnaire to suggest that there is more to be done in exploring the properties of safety climate instruments, even those recently released,17 before proceeding wholesale into measuring safety climate across health services, at least those in the UK.

As Pronovost and Sexton2 have recently pointed out, there is still much work required before we are able to understand the full value of using climate questionnaires in health care, including the meaningfulness of the resulting data. Until it is possible to derive evidence of predictive validity, such as whether positive culture data predict measurably safer health care, the costs of routinely using safety climate questionnaires may not be justified.


Acknowledgements

The authors thank the members of staff in all 13 sites who assisted with the organisation of the surveys. Particular thanks are due to Eric Thomas and Bryan Sexton for their support in using the Teamwork and Safety Climate questionnaire. Karen Beck acted as project administrator and her support was invaluable.



  • Funding was provided by the Sheffield Health and Social Care Research Consortium and by the devolved funds programme at the University of Sheffield.

  • Competing interests: none declared.
