Objective To examine the reliability and predictive validity of two patient safety culture surveys—Safety Attitudes Questionnaire (SAQ) and Hospital Survey on Patient Safety Culture (HSOPS)—when administered to the same participants. Also to determine the ability to convert HSOPS scores to SAQ scores.
Method Employees working in intensive care units in 12 hospitals within a large hospital system in the southern United States were invited to anonymously complete both safety culture surveys electronically.
Results All safety culture dimensions from both surveys (with the exception of HSOPS's Staffing) had adequate levels of reliability. Three of HSOPS's outcomes—frequency of event reporting, overall perceptions of patient safety, and overall patient safety grade—were significantly correlated with SAQ and HSOPS dimensions of culture at the individual level, with correlations ranging from r=0.41 to 0.65 for the SAQ dimensions and from r=0.22 to 0.72 for the HSOPS dimensions. Neither the SAQ dimensions nor the HSOPS dimensions predicted the fourth HSOPS outcome—number of events reported within the last 12 months. Regression analyses indicated that HSOPS safety culture dimensions were the best predictors of frequency of event reporting and overall perceptions of patient safety while SAQ and HSOPS dimensions both predicted patient safety grade. Unit-level analyses were not conducted because indices did not indicate that aggregation was appropriate. Scores were converted between the surveys, although much variance remained unexplained.
Conclusions Given that the SAQ and HSOPS had similar reliability and predictive validity, investigators and quality and safety leaders should consider survey length, content, sensitivity to change and the ability to benchmark when selecting a patient safety culture survey.
- Patient safety
- health services research
- safety culture
- social sciences
- organisational theory
- healthcare quality improvement
Hospitals are increasingly concerned with measuring patient safety culture given Joint Commission requirements (Standard LD.03.01.01) and the need to identify issues that impact safety. While many safety culture surveys exist, only a few have been used extensively. The Safety Attitudes Questionnaire (SAQ) and the Agency for Healthcare Research and Quality (AHRQ) Hospital Survey on Patient Safety Culture (HSOPS) are two that have been used the most1 and much psychometric evidence exists for both surveys.2 3 Despite use of these surveys during the past 10 years, much is still not understood about how useful they are to hospitals or about their relative strengths and weaknesses. Our goals were to examine two aspects of their measurement utility—reliability and predictive validity—and to provide a method to convert scores from one survey to the other.
Predictive validity4 refers to whether HSOPS and SAQ are significantly associated with outcomes. The 12 predictors in this study are the culture dimensions measured in HSOPS (10 dimensions) and SAQ (two dimensions). While the SAQ traditionally measures several dimensions related to the workplace, we focused on the two most commonly used and thoroughly studied domains of the SAQ—safety culture and teamwork culture. The four outcomes in our study are self-reported measures from HSOPS: frequency of events reported; overall perceptions of patient safety; number of events reported within the last 12 months; and patient safety grade. By comparing patient safety culture surveys regarding their reliability and ability to predict outcomes when the surveys are administered to the same participants, we address a research gap that has not been examined previously. To provide results consistent with previous safety culture work,2 3 we examined individual-level and unit-level results.
A secondary aim was to quantitatively explain how a culture dimension from one survey (HSOPS) can be converted to a score on a second survey (SAQ). Converting scores can be accomplished when surveys purport to measure similar constructs, as is the case with the HSOPS and SAQ (ie, both surveys have teamwork scales). Score conversion can be helpful to researchers and hospitals who want to change the patient safety culture survey they are using but still want to be able to draw conclusions about their data with respect to previous data they collected.
In sum, we used an innovative approach—administering both surveys to the same participants—to explore three main aspects of these surveys—reliability, predictive validity and score conversion. Given that we have no a priori reason to believe that one safety culture survey is a better predictor of outcomes, we hypothesised that: individual-level responses to AHRQ's HSOPS dimensions and SAQ dimensions (ie, predictors) will be related to individual-level outcomes measured by the HSOPS survey (hypothesis 1); unit-level responses to AHRQ's HSOPS dimensions and SAQ dimensions (ie, predictors) will be related to unit-level outcomes measured by the HSOPS survey (hypothesis 2).
To examine the ability to convert scores, we formulated the following research question: what are the models that allow one to convert: HSOPS dimension scores into a SAQ teamwork score; and HSOPS dimension scores into a SAQ safety score?
We invited 1055 employees working in intensive care units (ICUs) in 12 hospitals within a large hospital system in the southern United States to complete an anonymous electronic survey that contained teamwork climate and safety climate survey items from the SAQ and all survey items from the HSOPS. The hospitals and units included a large tertiary care teaching hospital with medical, coronary care, shock trauma, burn, transplant, paediatric and neonatal ICUs; three small community hospitals with four ICUs in total; and five large urban hospitals with eight ICUs in total. This study was conducted as part of the system's annual safety culture survey of all employees.
A total of 567 non-physician employees (54% response rate) completed at least a portion of the surveys. After cleaning the data and removing participants who worked in ICUs that had fewer than five participants (n=11 from the burn, coronary care, and medical ICUs from the large tertiary care teaching hospital) and those who did not answer all questions (n=325), there were 220 participants who completed all of the safety culture items from both surveys (21% usable response rate). We conducted the analysis on responses from these 220 participants. These participants worked in 15 different ICUs in 12 hospitals. We counterbalanced the order of these two surveys (ie, half of the sample completed the SAQ items first followed by the HSOPS items while the other half received the HSOPS items first followed by the SAQ items). Institutional review board approval was obtained prior to conducting the survey. Each hospital offered a lottery/drawing as an incentive, with prizes ranging from a pizza party to a $50 gift certificate.
The SAQ dimensions of teamwork culture (six items) and safety culture (seven items) were measured on a five-point Likert-type scale (1 = disagree strongly to 5 = agree strongly). All of the survey items (culture dimensions and outcomes) from AHRQ's HSOPS were included and measured with the scales they recommended using in their survey user guide.5
We computed descriptive statistics for the sample and each of the culture dimensions. We examined the reliability of each dimension of culture with Cronbach's α and also examined the correlations among the culture dimensions. To examine hypothesis 1, we conducted stepwise regression models for the culture predictors and outcomes at the individual level. Prior to conducting the same analysis to test hypothesis 2, we needed to justify unit-level analyses by demonstrating that the data should be aggregated at the unit level. To the extent that aggregation indices support such a justification, unit-level scores for the culture dimensions can be created and tested via regression. We used aggregation indices (ie, rwg(j), intraclass correlation coefficient 1 (ICC(1)), ICC(2), one-way analysis of variance (ANOVA)) to justify examining unit-level analyses (hypothesis 2). Guidelines for interpreting these indices as evidence that aggregation is appropriate are: rwg(j) should be 0.70 or greater,6 ICC(1) values are typically between 0.05 and 0.30,7 ICC(2) values should be 0.70 or greater,7 and the one-way ANOVA should be significant.7 Bivariate correlations were computed between all of the SAQ and HSOPS dimensions, including HSOPS outcomes. SAQ and HSOPS culture dimensions were entered as predictors in a linear regression model, with each HSOPS outcome being predicted separately.
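The aggregation indices named above can be illustrated with a minimal sketch. It assumes the commonly used one-way ANOVA formulas for ICC(1) and ICC(2) and the uniform-null form of rwg(j); the text does not say which software or exact variants were used, so the function names (`icc_from_groups`, `rwg_j`) and the use of average unit size for k are illustrative assumptions only.

```python
import numpy as np

def icc_from_groups(groups):
    """Return (ICC(1), ICC(2)) from a one-way ANOVA on unit membership.

    groups: list of 1-D sequences, one per unit, holding individual scores.
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    n_groups = len(groups)
    sizes = np.array([len(g) for g in groups])
    n_total = sizes.sum()
    grand_mean = np.concatenate(groups).mean()

    # Between-unit and within-unit sums of squares
    ss_between = sum(n * (g.mean() - grand_mean) ** 2 for n, g in zip(sizes, groups))
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)

    k = sizes.mean()  # assumption: average unit size stands in for k
    icc1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc2 = (ms_between - ms_within) / ms_between
    return icc1, icc2

def rwg_j(unit_items, n_options=5):
    """Within-unit agreement for a J-item scale against a uniform null.

    unit_items: 2-D array for ONE unit, rows = respondents, columns = items.
    n_options: number of response options (5 for a five-point scale).
    """
    x = np.asarray(unit_items, dtype=float)
    j = x.shape[1]
    mean_item_var = x.var(axis=0, ddof=1).mean()
    expected_var = (n_options ** 2 - 1) / 12.0  # variance of a uniform null
    ratio = mean_item_var / expected_var
    return j * (1 - ratio) / (j * (1 - ratio) + ratio)
```

Under this sketch, a dimension would count as supporting aggregation when `rwg_j` and ICC(2) reach 0.70 or greater, ICC(1) falls in the typical 0.05 to 0.30 range, and the one-way ANOVA is significant.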
To examine our research question, we used regression modelling to determine the extent to which we could convert HSOPS scores into SAQ scores. We chose regression modelling because it allows the use of multiple predictor variables. We used stepwise regression because it is an appropriate technique when conducting exploratory research.
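As a rough illustration of stepwise selection, the sketch below adds one predictor at a time by ordinary least squares. It is a simplified forward-selection variant that stops when the gain in R² falls below a threshold; the entry and removal criteria actually used in the study are not reported, so `min_gain` and the function names are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def forward_select(X, y, min_gain=0.01):
    """Greedy forward selection of predictor columns.

    min_gain is an assumed stopping rule (minimum increase in R^2),
    not the significance-based criterion a statistics package would use.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        # R^2 gain from adding each remaining column to the current model
        gains = {c: r_squared(X[:, selected + [c]], y) - best_r2 for c in remaining}
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        best_r2 += gains[best]
    return selected, best_r2
```

Here the columns of `X` would hold the culture dimension scores and `y` a single outcome, with one call per outcome.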
The sample consisted primarily of women and nurses. Most (81%) of the participants had three or more years in their specialty and the majority of participants worked 70 or more hours per 2-week pay period (table 1).
Descriptive statistics, estimates of reliability and bivariate correlations between individual-level predictors (SAQ and HSOPS dimensions of culture) and individual-level outcomes from the HSOPS are contained in table 2. All of the correlations between the HSOPS and SAQ dimensions (not including HSOPS outcomes) were significant and most indices of reliability were adequate. Reliability estimates for the culture dimensions ranged from 0.62 (staffing from HSOPS) to 0.87 (teamwork within hospitals from HSOPS), with staffing being the only culture dimension with a Cronbach's α <0.70.
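For readers unfamiliar with the reliability estimate, Cronbach's α for a single scale can be computed from item-level responses as sketched below. This uses the standard formula; the function name and data layout are assumptions for illustration, not the authors' code.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: 2-D array, rows = respondents, columns = items of the scale.
    """
    x = np.asarray(items, dtype=float)
    k = x.shape[1]                               # number of items
    item_variances = x.var(axis=0, ddof=1)       # variance of each item
    total_variance = x.sum(axis=1).var(ddof=1)   # variance of summed scores
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)
```

A value of 0.70 or greater is the adequacy threshold applied in the text.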
The HSOPS outcomes we measured were: frequency of event reporting, measured with three items asking the participant how often mistakes of varying severity are reported; overall perceptions of patient safety, measured with four items focused on the importance of patient safety in the unit; number of events reported within the last 12 months, based on one item that measures the number of event reports completed and submitted in the last year; and overall patient safety grade, measured with one item asking about how well the unit performs with respect to patient safety. SAQ teamwork and SAQ safety were positively correlated with frequency of event reporting (r=0.41 and r=0.44, respectively), overall perceptions of patient safety (r=0.59 and r=0.63, respectively), and overall patient safety grade (r=0.64 and r=0.65, respectively). Of the 10 HSOPS culture dimensions, 9 (teamwork within hospital units being the exception) were positively correlated with frequency of event reporting (ranging from r=0.22 to 0.62). All 10 HSOPS dimensions were positively correlated with overall perceptions of patient safety (ranging from r=0.39 to 0.72) and overall patient safety grade (ranging from r=0.40 to 0.64). Neither the SAQ dimensions nor the HSOPS dimensions were significantly correlated with number of events reported within the last 12 months. This finding is consistent with previous attempts at predicting this outcome.3
Table 3 contains agreement indices to determine whether aggregation of data at the unit level is appropriate. Previous researchers have recommended that rwg(j) and ICC(2) values should be 0.70 or greater,6 7 ICC(1) values should be between 0.05 and 0.30,7 and the one-way ANOVA should be significant.7 When these aggregation indices are satisfied, one has confidence that agreement within a unit is greater than agreement between units. Unanimous support for aggregating the data at the unit level did not exist because none of the culture dimensions have all four acceptable aggregation indices and only one culture dimension has three acceptable aggregation indices (table 3). We still report correlations (table 4) at the unit level so interested readers can see differences that existed between the individual-level and unit-level correlations. Not all unit-level culture dimensions predicted the outcomes measured by the HSOPS (table 4). SAQ teamwork did not predict any of the four HSOPS outcomes while SAQ safety was positively correlated with three of the HSOPS outcomes. At least two HSOPS dimensions predicted each of the HSOPS outcomes. Overall, though, there were fewer significant culture dimension predictors of the HSOPS outcomes at the unit level than at the individual level. In sum, the lack of justification for aggregating our results at the unit level, coupled with the many non-significant unit-level correlations observed, gives us confidence that for the sample we studied, it is more appropriate to examine individual-level data.
To test the first hypothesis, safety culture dimensions that were significantly correlated with the outcomes at the individual level were entered into a stepwise regression for each outcome separately. Of these dimensions, we report only the significant predictors from the regression analysis. The HSOPS culture dimension of feedback significantly predicted frequency of event reporting and overall perceptions of patient safety (table 5). Two other HSOPS dimensions—organisational learning and continuous improvement and management support—each predicted overall perceptions of patient safety and overall patient safety grade. The SAQ dimensions predicted patient safety grade but not frequency of event reporting or overall perceptions of patient safety. The second hypothesis was not tested via stepwise regression because the unit-level agreement indices did not support aggregating the results at the unit level.
To address the research question, we reviewed the HSOPS and SAQ items to determine which items seemed most similar to each other in terms of content. Based on our review, it was decided that the teamwork dimension of the SAQ was most similar to two of the HSOPS dimensions—teamwork within hospital units and communication openness. The safety dimension of the SAQ was judged to be most similar to the organisational learning–continuous improvement dimension of HSOPS. Tables 6 and 7 contain the items for teamwork and safety from both scales, respectively. The regression model for teamwork revealed that estimated teamwork SAQ dimension score = 0.83+0.34*teamwork within hospital units score +0.51*communication openness score. This model showed that 54% of the variance in the teamwork SAQ dimension was accounted for by these two HSOPS dimensions. The regression model for safety revealed that estimated safety SAQ dimension score = 1.63+0.65*organisational learning/continuous improvement score. This model showed that 42% of the variance in the safety SAQ dimension was accounted for by this HSOPS dimension.
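The two conversion equations reported above can be written as small functions. The coefficients come directly from the text; the function and parameter names are illustrative, and the sketch assumes the HSOPS dimension scores are on the same scale as those used to fit the models. Because the models explained only 54% and 42% of the variance, any converted score is a rough estimate.

```python
def hsops_to_saq_teamwork(teamwork_within_units, communication_openness):
    """Estimated SAQ teamwork score from two HSOPS dimension scores."""
    return 0.83 + 0.34 * teamwork_within_units + 0.51 * communication_openness

def hsops_to_saq_safety(organisational_learning):
    """Estimated SAQ safety score from the HSOPS organisational
    learning/continuous improvement dimension score."""
    return 1.63 + 0.65 * organisational_learning
```

For example, HSOPS scores of 4.0 for teamwork within hospital units and 3.5 for communication openness give an estimated SAQ teamwork score of 0.83 + 1.36 + 1.785 = 3.975.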
We tested and compared the psychometric qualities of two safety culture surveys—HSOPS and SAQ—that were completed by the same caregivers. In general, both surveys had good reliability and both had evidence of predictive validity. While we were able to statistically convert scores from the HSOPS to the SAQ safety and teamwork dimensions, the conversion was not very strong due to differences in the content between the surveys, resulting in substantial variance left unexplained in our regression models.
All culture dimensions except staffing from HSOPS had adequate reliability (Cronbach's α ≥0.70). At the individual provider level, HSOPS's feedback and communication about error dimension was a significant predictor of frequency of event reporting and overall perceptions of patient safety. HSOPS's hospital management support for patient safety dimension was the strongest predictor of overall perceptions of patient safety, while the SAQ's safety climate dimension was the strongest predictor of patient safety grade based on the R2 accounted for by each predictor. Given that our results and those from previous studies were unable to predict the number of events reported within the last 12 months, researchers should test the effectiveness of predictors other than culture dimensions. It might be the case that reporting an event during the last 12 months is better predicted by specific attitudes about event reporting.
We were able to convert scores from the HSOPS into dimension scores for the safety and teamwork dimensions of the SAQ based on individual-level data. However, the per cent of variance explained was moderate. Furthermore, it is not readily apparent that the scales precisely match each other. For example, although SAQ safety and HSOPS organisational learning are the closest match, there are items in SAQ safety (eg, I would feel safe being treated here as a patient) that actually are more similar to HSOPS items in other HSOPS scales. Future research might find ways to convert scores, but our initial impression is that the surveys cannot be converted.
Consistent with previous researchers who have conceptualised culture as shared beliefs about the work environment, we intended to examine culture as a unit-level phenomenon. Unfortunately, we were unable to do so because of the poor levels we found for the aggregation indices. One possible explanation is that this reflects reality and that the units do not share a common view of culture as strongly as units examined in other studies. Another plausible, and we believe more likely, explanation is that the lack of variability between units resulted in poor agreement indices. Such lack of variability might have occurred because all units came from the same hospital system and perhaps are similarly impacted by a culture at the hospital and system level. This explanation is also consistent with Schneider et al's8 thinking that, over time, organisations hire and retain organisational members with similar thoughts and personalities. Similarity in thoughts and personalities might result in homogenous perceptions of culture across units, which is consistent with our results. Additionally, this hospital system has developed standard ways of measuring and providing feedback about quality and safety across all ICUs. Such standardised measurement and feedback might result in similar perceptions of culture across units.
Three primary limitations affected our study. First, while our response rate was adequate, the number of usable responses was disappointing. Given the length of the combined HSOPS and SAQ survey, it is not surprising that many participants did not answer all questions, especially because the only financial incentive was a hospital lottery/drawing with only a few ‘winners’. Perhaps individual incentives would have helped increase the usable response rate. Second, our data focused on employees and units from the same hospital system. Collecting data from multiple hospital systems would have allowed us to test whether the homogeneity in unit-level responses was unique to this particular hospital system. Third, we focused on employees working in ICUs only. It is possible we would have found different results for employees working in other areas of the hospitals.
Our finding that individual attitudes about safety culture predicted different self-reported outcomes has practical implications. One needs to carefully select safety culture dimensions depending on the outcomes one is trying to predict. This might mean that, as opposed to having different standalone safety culture surveys, researchers can look across all current safety culture surveys and focus on the specific safety culture dimensions that are expected to predict their outcome of interest, then try to improve that aspect of safety culture in the hope of also improving the outcome. As a result, researchers might use some safety culture dimensions from the SAQ, some from HSOPS, and some from other safety culture surveys when trying to predict and prevent certain outcomes. For example, based on our results, the hospital system we surveyed might want to examine both SAQ dimensions and HSOPS's management support dimension and organisational learning dimension if trying to predict patient safety grade at the individual level.
While researchers have compared patient safety culture surveys via literature reviews,1 our study is the first to empirically compare two different patient safety culture surveys by administering them to the same participants. This data collection approach, while resulting in a longer survey that has potential implications for response rates, provides psychometric and predictive validity comparisons unavailable previously to safety culture researchers. We found that our estimates of reliability and predictive validity are consistent with previous researchers who collected data on only one of the safety culture surveys.2 3 9
Many safety climate surveys exist and this study is the first to directly compare the ability of these two surveys to predict self-reported safety outcomes and to test whether the survey scores can be converted. More research is needed with additional outcomes and additional safety culture surveys to fully elucidate the strengths and weaknesses of HSOPS and SAQ. We found that domains of both SAQ and HSOPS are correlated with some outcome items of HSOPS, and that it is unlikely that converting scores from one survey to another will be useful. When selecting between the SAQ and HSOPS, researchers and practitioners might look at considerations other than reliability and predictive validity. These factors include shorter survey length of SAQ, easier ability to benchmark HSOPS results, slightly different content, and whether or not the surveys are sensitive to change.9 10 If researchers continue to use the HSOPS and SAQ surveys in their current form, they need to be aware that the HSOPS measures many more dimensions than the SAQ. This difference means that each survey has distinct advantages. For example, the HSOPS survey is focused on unit-level and institutional-level results across a large number of culture domains, which helps hospitals prioritise quality improvement activities. Conversely, the SAQ is shorter and allows hospitals to efficiently trend data over time, benchmark and examine relationships with outcomes.
The authors thank M. Michael Shabot, M.D., Juan Jose Inurria, FACHE, FABC, CPHQ, and Kathryn VandeVoorde, Pharm.D., M.Ed. for assisting with survey administration.
Funding Funding for the first author was provided by a K02 award from the Agency for Healthcare Research and Quality (1K02HS017145-02).
Competing interests None.
Ethics approval University of Texas CPHS.
Provenance and peer review Not commissioned; externally peer reviewed.