Psychometric properties of the Hospital Survey on Patient Safety Culture: findings from the UK
- 1Department of Human Sciences, Loughborough University, Loughborough, UK
- 2Institute of Work Psychology, University of Sheffield, Sheffield, UK
- 3Ham Associates Ltd., London, UK
- 4Healthcare and Patient Safety Research Unit, Department of Human Sciences, Loughborough University, Loughborough, UK
- Correspondence to Dr Patrick Waterson, Department of Human Sciences, Loughborough University, Loughborough LE11 3TU, UK
- Accepted 5 March 2009
- Published Online First 8 March 2010
Background Patient safety culture is measured using a range of survey tools. Many provide limited data on psychometric properties and few report findings outside of the US healthcare context. This study reports an assessment of the psychometric properties and suitability of the American Hospital Survey on Patient Safety Culture for use within the UK.
Methods A questionnaire survey of three hospitals within a large UK Acute NHS Trust. 1437 questionnaires were completed (37% response rate). Exploratory factor analysis, confirmatory factor analysis and reliability analyses were carried out to assess the psychometric performance of this survey instrument and to explore potential improvements.
Results Reliability analysis of the items within each proposed scale showed that more than half failed to achieve satisfactory internal consistency (Cronbach's α<0.7). Furthermore, a confirmatory factor analysis carried out on the UK data set achieved a poor fit when compared with the original American model. An optimal measurement model was then constructed via exploratory and confirmatory factor analyses with split-half sample validation and consisted of nine dimensions compared with the original 12 in the American model.
Conclusion This is one of the few studies to provide an evaluation of an American patient safety culture survey using data from the UK. The results indicate that there is a need for caution in using the Hospital Survey on Patient Safety Culture in the UK and underline the importance of appropriate validation of safety culture surveys before extending their usage to populations outside of the specific geographical and healthcare contexts in which they were developed.
The measurement of patient safety culture is a growing industry among researchers and healthcare professionals.1–6 In the UK, at least a third of NHS Trusts are taking part in some form of culture assessment.7 Measurement methods range from more generic “toolkits” to methods designed for specific healthcare contexts (eg, primary care).8,9 Questionnaire surveys are frequently used to measure, for example, team working, attitudes towards errors and general perceptions of safety. However, it has been suggested that many questionnaires lack explicit theoretical underpinning and fail to report the full psychometric properties of measures,10,11 raising the possibility that they neither consistently measure specific aspects of patient safety nor generalise across different national and healthcare-specific environments.2 This has particular relevance for the assessment of patient safety culture in the UK because a number of the surveys currently being used within NHS Trusts were developed in the USA.7,4,12 In this paper, we report the use within the UK of the American Agency for Healthcare Research and Quality–sponsored Hospital Survey on Patient Safety Culture (HSOPC) questionnaire. We focus on the psychometric properties of the HSOPC and its suitability for use within a UK context.
Hospital Survey on Patient Safety Culture
The HSOPC questionnaire is based on a set of pilot studies carried out in 21 different hospitals involving 1461 hospital staff across the USA.12 As a result of a series of item and content analyses, reliability analysis, and exploratory and confirmatory factor analyses, it consists of 42 items that group into 12 dimensions: two outcome dimensions and 10 safety dimensions. For each item there are five possible response categories, the labelling of which varies across dimensions. Of the 42 items, 17 are asked from a “negative” viewpoint and are subsequently reverse-scored. The confirmatory factor analysis carried out during the development of the questionnaire indicated that the proposed 12-factor model had an adequate level of fit to the data using established criteria,13 specifically with comparative fit index (CFI)=0.94, non-normed fit index (NNFI)=0.93, root mean square error of approximation (RMSEA)=0.04 and root mean square residual (RMR)=0.04.12 Very few published accounts of the use of the survey are available; however, the Agency for Healthcare Research and Quality has made available a database that facilitates the benchmarking of findings from other users of the survey. The database for 2008, for example, consists of data drawn from 160 176 respondents across 519 hospitals in the USA.14 Comparable data from the UK and Europe are not available, although there is evidence that the survey is being used within UK Trusts.7
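As an illustration of the reverse-scoring step mentioned above, the following is a minimal sketch that assumes the five response categories are coded 1 to 5. That coding, and the function name `reverse_score`, are assumptions made for the example and are not taken from the survey documentation.

```python
import numpy as np

def reverse_score(responses, scale_min=1, scale_max=5):
    """Flip Likert responses so that a high value is always the favourable one.

    Assumes responses are coded as integers from scale_min to scale_max.
    """
    responses = np.asarray(responses)
    return scale_min + scale_max - responses

# Hypothetical responses to one negatively worded item
negatively_worded = np.array([1, 2, 3, 4, 5])
print(reverse_score(negatively_worded))  # [5 4 3 2 1]
```

After this step, all items within a dimension point in the same direction, which is a precondition for the reliability and factor analyses.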
The HSOPC questionnaire was distributed to three hospitals within a large NHS Acute Trust in the East Midlands between May and June 2006. Questionnaires were distributed by key staff working in wards and other specialist areas across the three hospitals. Clinical and non-clinical staff could freely and anonymously fill in the questionnaire and return their responses by post in an envelope provided. The project was reviewed and approved as an audit by both the chair of the local ethics research committee and the research and development department.
Changes made to the questionnaire
As a result of presurvey group discussions with staff members, a number of changes were made. These included adjustments to the wording of individual items with respect to terminology used within the UK. The words “area” and “unit” were changed to “ward” and “department” (affecting questions A28, A1, A7, A20, A12, F4, F13, F2, F7, F3, F9) and the term “adverse outcome” was substituted for “error” and “mistake” (questions D1, D2, D3, C7, C9). The words “over and over” in question B4 were replaced by “repeatedly”. In addition, after discussions with hospital management, one item (question A19) in the “Non-punitive responses to error” dimension was removed from the questionnaire. Finally, because of a proofreading error, the meaning of one item (question F1) in the “Hospital management support for patient safety” dimension was altered. This item was subsequently discarded because of this change of meaning, resulting in 40 items used in our data analyses compared with 42 from the original HSOPC survey (table 1). The survey also collected a small amount of background information, specifically on respondents' hospital, job type and tenure.
Survey response and sample properties
Four thousand questionnaires were distributed, of which 1461 were returned (a 37% response rate representing 12% of the total employees in the Trust). Within these cases, 1017 respondents had given valid responses to the 40 HSOPC items subsequently analysed. Sixty per cent of the sample were nursing staff (trained and untrained), followed by allied healthcare professionals (21%), management and administrative staff (11%) and medical staff (8%); just less than half the sample (45%) had been working in their current hospital for at least 5 years.
Analysis of data
We first examined the responses to each item within the 12 HSOPC dimensions and assessed the original 12-dimension model against our sample, both in terms of the internal consistency reliability of each dimensional grouping of items and, for the model as a whole, using confirmatory factor analysis to assess the overall level of fit.
We then constructed the optimal measurement model for our sample to see if and how this differed from the original model. Our sample was split randomly into two halves; on one “construction” half, exploratory factor analysis (EFA) was used to construct a measurement model for the items; the other “validation” half of the data was then used to test this model via confirmatory factor analysis (CFA). Having finalised our optimal model, we then performed reliability analysis on the sets of items in each resulting dimension using the whole sample. Table 2 provides a glossary that explains some of the common terms used when carrying out EFA and CFA statistical analyses.
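The random split into “construction” and “validation” halves can be sketched as follows. This is an illustrative sketch only: the seed, the function name `split_half` and the use of NumPy are assumptions, and the sample size of 1017 is the number of respondents with valid responses reported above.

```python
import numpy as np

def split_half(n_respondents, seed=0):
    """Randomly split respondent indices into two non-overlapping halves."""
    rng = np.random.default_rng(seed)          # fixed seed so the split is reproducible
    shuffled = rng.permutation(n_respondents)  # random ordering of row indices
    half = n_respondents // 2
    return shuffled[:half], shuffled[half:]

construction_idx, validation_idx = split_half(1017)
print(len(construction_idx), len(validation_idx))  # 508 509
```

The EFA would then be run only on the rows indexed by `construction_idx`, and the CFA only on those indexed by `validation_idx`, so that the model is never tested on the data used to build it.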
With the exception of two factors (ie, hospital handover handoffs and transitions), the main findings were positive with regard to the type of safety culture within the Trust as a whole. Online Appendix A shows the percentage responses in each category reported for each item used in the survey.
Testing the original model
The results of a reliability analysis on the original dimensions are presented in table 3. Of the 12 groupings of items, 7 (Overall perceptions of safety, Supervisor/manager expectations, Organisational learning—continuous improvement, Communication openness, Non-punitive responses to error, Staffing, Hospital management support) fell short of an adequate level of internal consistency (Cronbach's α<0.7), with Staffing exhibiting an extremely poor level of reliability (α=0.58). Only two of the dimensions achieved α values >0.80 (Frequency of error reporting, Feedback and communication about error).
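For readers unfamiliar with the statistic, Cronbach's α for a scale of k items is k/(k−1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below computes it with NumPy on a small made-up data set; the data and the function name `cronbach_alpha` are purely illustrative and are not the survey responses.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a respondents-by-items array of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items in the scale
    item_variances = items.var(axis=0, ddof=1)       # sample variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical two-item scale answered by four respondents
scores = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(scores), 2))  # 0.75
```

Values below the conventional 0.7 threshold, as found for seven of the original dimensions, suggest that the items in a scale do not vary together closely enough to be treated as a single measure.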
A CFA of the original model was then run (χ2=1907, 674 df); the full range of fit indices suggested a level of fit with marginal adequacy; specifically CFI=0.91, NNFI=0.89, RMSEA=0.04, standardised root mean square residual=0.05. Of the 40 items, 4 (A12, A13, B4 and B7) had <20% of their variability explained by the model, and a further 7 items had <30% of variability explained. In addition, of the 40 standardised path coefficients, 8 dropped below the widely applied 0.5 cutoff.
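The reported RMSEA can be related to the χ2 statistic and its degrees of freedom using one common formula, RMSEA = sqrt(max(χ2 − df, 0) / (df × (N − 1))). The sketch below assumes that the CFA was run on the 1017 respondents with valid responses; that sample size is an assumption for illustration, and some software divides by N rather than N − 1.

```python
import math

def rmsea(chi_square, df, n):
    """Root mean square error of approximation from a model chi-square.

    Uses the (n - 1) convention; n is the number of respondents.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

# Chi-square and df as reported for the original 12-factor model;
# n = 1017 is an assumed analysis sample size.
print(round(rmsea(1907, 674, 1017), 2))  # 0.04
```

Under that assumption the result agrees with the RMSEA of 0.04 reported above.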
Constructing an optimal model
Having found that the original model did not fit the UK data satisfactorily, we then carried out a robust construction of the optimal measurement model for the 40 HSOPC items in the UK survey. On one randomly selected “construction” half of the data, we performed an EFA, using principal axis factoring as the extraction method and assessing the number of factors to be extracted by a combination of Kaiser's criterion and Cattell's scree plot method.15 An oblique rotation was carried out to aid interpretation of the resulting factors. Having examined a series of possible models, and having gradually removed 13 items that were either severely cross-loaded or had very low loadings and communalities, we found that the evidence pointed most strongly towards a nine-factor model for the remaining 27 items. This accounted for 66.8% of their total variance and is given with the factor loadings in online appendix B.
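Kaiser's criterion retains as many factors as there are eigenvalues of the item correlation matrix greater than 1. The sketch below shows only that counting rule, applied to synthetic data generated from two latent factors. It is not the principal axis factoring or oblique rotation used in the analysis, and the data, loadings and function name `kaiser_factor_count` are assumptions made for illustration.

```python
import numpy as np

def kaiser_factor_count(data):
    """Count eigenvalues of the item correlation matrix that exceed 1."""
    corr = np.corrcoef(data, rowvar=False)        # items are columns
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted largest first
    return int(np.sum(eigenvalues > 1.0)), eigenvalues

# Synthetic illustration (not survey data): six items, three per latent factor
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
noise = rng.normal(size=(500, 6))
items = 0.8 * latent[:, [0, 0, 0, 1, 1, 1]] + 0.6 * noise

n_factors, eigenvalues = kaiser_factor_count(items)
print(n_factors)  # 2
```

Plotting `eigenvalues` against their rank gives the scree plot, in which the number of factors is read from the point where the curve levels off.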
We then tested the fit of this model to the other “validation” half of the data set using CFA (χ2=588, 288 df). The fit indices suggested an adequate fit to the data, with CFI=0.95, Tucker–Lewis Index=0.93, RMSEA=0.04, standardised root mean square residual=0.04. Furthermore, the model accounted for at least 20% of the variance of each item and greater than 30% of the variance for all but two items. All but one of the factor loadings from the EFA and all 27 standardised path coefficients from the CFA were >0.5.
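The degrees of freedom reported for the two CFA models can be reproduced by counting parameters. The sketch below assumes a simple-structure model in which each item loads on exactly one factor, factor variances are fixed to 1, all factors are allowed to correlate and there are no correlated errors. These specification details are assumptions, not statements taken from the analysis.

```python
def cfa_degrees_of_freedom(n_items, n_factors):
    """Degrees of freedom for a simple-structure CFA with correlated factors."""
    # Unique variances and covariances among the observed items
    observed_moments = n_items * (n_items + 1) // 2
    # One loading and one error variance per item, plus factor covariances
    free_parameters = 2 * n_items + n_factors * (n_factors - 1) // 2
    return observed_moments - free_parameters

print(cfa_degrees_of_freedom(27, 9))   # 288
print(cfa_degrees_of_freedom(40, 12))  # 674
```

Both values match the degrees of freedom reported for the nine-factor and the original 12-factor models, which is consistent with the assumed specification.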
The interpretations of the dimensions resulting from the optimal measurement model constructed and tested on the UK data were similar to those from the original model. Indeed, there still existed dimensions for “Communication openness”, “Feedback and communication about error”, “Frequency of event reporting”, “Non-punitive responses to error” and “Hospital handoffs and transitions”, which all formed as before. The dimensions for “Teamwork across units” and “Teamwork within units” both dropped a single item, and the “Supervisor/manager expectations and actions promoting patient safety” dimension dropped two items. The most noticeable differences were the absence of “Organisational learning—continuous improvement” and “Hospital management support”, and the grouping of a subset of the items that previously formed the “Overall perceptions of safety” and “Staffing” dimensions into a single dimension.
Finally, using the whole sample, reliability analyses were performed for each of the groups of items defined by this factor structure. These generally indicated suitable internal consistency, with Cronbach's α>0.7 for seven of the nine dimensions. Of the two dimensions that fell below this level, one was a two-item scale, and both were among the five dimensions to survive unchanged from the original model (ie, the weak reliability was not due to the form of our revised model). None of the scales gained improved consistency by dropping further items.
Discussion and conclusions
Our findings differ from the results obtained within the USA. Although we might have expected the changes made to the UK questionnaire to have resulted in some differences, they are unlikely by themselves to explain the findings. The results from the split-half EFA and CFA indicate that the questionnaire may be measuring different constructs, or aspects of patient safety, within the UK as compared to the USA. For example, the optimal model derived from the UK data resulted in a dimension that linked “Overall perceptions of safety” and “Staffing”. This may have come about because of an increased tendency to associate staffing levels with safety within the UK as compared to the USA. Similarly, it is possible that the items in the dimensions “Organisational learning—continuous improvement” and “Hospital management support for patient safety” may have been interpreted differently within a UK sample. Our findings indicate that national and healthcare-specific differences may limit the extent to which the HSOPC survey is applicable outside of the USA. We would also point to the lack of cross-validation (EFA followed by CFA) in the USA data set as indicating another potential flaw in the design and validation of the HSOPC questionnaire. The relatively higher values for the CFA fit indices achieved in the original study from which the HSOPC scales were constructed may be partially explained by their use of the same sample for the EFA and CFA. Split-half validation was not undertaken, and testing the model using the same data from which it was constructed would most likely result in an overestimate of the degree of fit.
The measurement of safety culture and climate in healthcare is still in a relatively immature stage of development as compared to other domains (eg, offshore installations, manufacturing).16,17 Other researchers3 have warned about the dangers of too readily generalising about safety culture and climate across industries with widely differing characteristics, forms of hierarchy and work practices. This is especially the case within healthcare, where hospitals, for example, may vary greatly according to norms and operating procedures, even within the same Trust. Our findings add further weight to the argument that there is a need to further develop and construct theoretical models that are sensitive to the context-specific nature of healthcare environments including hospitals.18 Without such work, researchers run the risk of adopting a “broad brush” approach to safety culture and overgeneralising their findings. Our advice to healthcare managers considering a survey of patient safety culture within their organisations is twofold: first, to examine carefully the extent to which survey tools and instruments provide extensive details of their psychometric properties; and second, to consider the degree to which potential surveys have undergone validation in other contexts, either within their own country or within other healthcare systems that are similar or comparable. A number of surveys fulfil these criteria, and psychometric and validation exercises have been carried out with other patient safety tools and surveys.4,6,9 With regard to our future work, we plan to compare our findings using the HSOPC with similar studies that we understand to be ongoing within the UK, other European countries and elsewhere.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.