
Smart agent system for insulin infusion protocol management: a simulation-based human factors evaluation study
  1. Michael A Rosen1,2,
  2. Mark Romig3,
  3. Zoe Demko1,
  4. Noah Barasch1,
  5. Cynthia Dwyer1,
  6. Peter J Pronovost4,5,
  7. Adam Sapirstein3
  1. 1Armstrong Institute for Patient Safety and Quality, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
  2. 2Department of Health Policy and Management, Bloomberg School of Public Health, School of Nursing; Institute for Clinical and Translational Research, Baltimore, Maryland, USA
  3. 3Anesthesiology and Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
  4. 4University Hospitals of Cleveland, Shaker Heights, Ohio, USA
  5. 5Anesthesiology and Critical Care Medicine, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
  1. Correspondence to Dr Michael A Rosen, Armstrong Institute for Patient Safety and Quality, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA; mrosen44{at}


Objective To compare the insulin infusion management of critically ill patients by nurses using either a common standard (ie, human completion of insulin infusion protocol steps) or smart agent (SA) system that integrates the electronic health record and infusion pump and automates insulin dose selection.

Design A within subjects design where participants completed 12 simulation scenarios, in 4 blocks of 3 scenarios each. Each block was performed with either the manual standard or the SA system. The initial starting condition was randomised to manual standard or SA and alternated thereafter.

Setting A simulation-based human factors evaluation conducted at a large academic medical centre.

Subjects Twenty critical care nurses.

Interventions A systems engineering intervention, the SA, for insulin infusion management.

Measurements The primary study outcomes were error rates and task completion times. Secondary study outcomes were perceived workload, trust in automation and system usability, all measured with previously validated scales.

Main results Manual calculation produced dose errors in 17% (n=20) of scenarios, whereas the SA system produced none (p<0.001). Participants completed the protocol significantly faster using the SA system (p<0.001). Overall ratings of workload for the SA system were significantly lower than with the manual system (p<0.001). For trust ratings, there was a significant interaction between time (first or second exposure) and the system used, such that after their second exposure to the two systems, participants had significantly more trust in the SA system. Participants rated the usability of the SA system significantly higher than the manual system (p<0.001).

Conclusions A systems engineering approach jointly optimised safety, efficiency and workload considerations.

  • human error
  • human factors
  • medication safety
  • nurses
  • patient safety

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Introduction



Glucose management improves outcomes for critically ill patients.1 Current management practices entail bedside insulin infusion protocols,2 3 but protocols are work intensive for nurses4 and error prone. To complete a protocol, nurses must retrieve multiple data points from one or multiple systems, make manual calculations to determine any dose rate change and manually enter that data into a different system or device. In the intensive care unit (ICU), insulin administration errors are common,5 reaching up to 185.9 events per 1000 patient days of insulin treatment.6 One safety recommendation is the independent human double check.7 However, this check does not guarantee errors will be caught and corrected8 9 and introduces further work for nurses.10 Compliance with the nurse double check is also low.10

To help manage blood glucose (BG), our hospital uses a modified version of the Yale protocol.11 Once initiated, the process requires a nurse to retrieve the patient’s current and previous BG levels from the electronic health record (EHR) and manually calculate an hourly rate change. Instructions are then found in a lookup table using the current BG level and rate of change. If an increase or decrease of infusion rate is recommended, the current infusion rate and a multiplier found in the first lookup table are used in a second lookup table to find the change in infusion rate. This change is then added or subtracted from the current infusion rate to determine new pump settings. A second nurse double checks this process before it is documented and changes made to the pump. This process requires manual retrieval and manipulation of discrete pieces of information and consequently is susceptible to errors arising from limitations of human memory and attention. The objective of this study was to compare insulin infusion management of critically ill patients by nurses using either the current manual protocol system (manual system) or a smart agent (SA) system. The SA integrates the EHR and infusion pump and automates data retrieval, calculation and insulin dose selection. The SA was developed using a systems engineering approach.12 We sought to explore the degree to which the SA system could improve safety and efficiency of insulin infusion practices.
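The manual steps described above can be sketched as a small data flow. This is a minimal illustration only: the function names, the `hours_elapsed` parameter and the shape of the two lookup tables are assumptions, and the hospital's actual lookup tables are not reproduced here, so they are passed in as hypothetical callables.

```python
from typing import Callable, Tuple

# Hypothetical type: (direction, multiplier) as read from the first lookup table.
# direction is one of "increase", "decrease" or "no_change".
Instruction = Tuple[str, float]


def hourly_bg_change(current_bg: float, previous_bg: float, hours_elapsed: float) -> float:
    """Hourly rate of change in blood glucose between two readings."""
    if hours_elapsed <= 0:
        raise ValueError("hours_elapsed must be positive")
    return (current_bg - previous_bg) / hours_elapsed


def new_infusion_rate(
    current_bg: float,
    previous_bg: float,
    hours_elapsed: float,
    current_rate: float,
    instruction_table: Callable[[float, float], Instruction],
    delta_table: Callable[[float, float], float],
) -> float:
    """Follow the two-table lookup described in the text (tables are assumed inputs)."""
    change = hourly_bg_change(current_bg, previous_bg, hours_elapsed)
    # First lookup: current BG level and its rate of change give an instruction.
    direction, multiplier = instruction_table(current_bg, change)
    if direction == "no_change":
        return current_rate
    # Second lookup: current infusion rate and multiplier give the size of the change.
    delta = delta_table(current_rate, multiplier)
    if direction == "increase":
        return current_rate + delta
    if direction == "decrease":
        return current_rate - delta
    raise ValueError(f"unknown direction: {direction!r}")
```

Each step in this chain is a point where a manual slip can occur, which is the motivation for automating retrieval and calculation.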


Methods

Study design

This mixed methods study compared the manual and SA insulin management systems in a simulation-based, human factors evaluation. We used a within subjects design to evaluate the two systems across 12 simulation scenarios completed in blocks of 3. Participants were randomised to an initial system exposure and alternated the system used (manual or SA) in each block thereafter. Surveys quantitatively assessed workload and user trust after each block of scenarios, and usability once at the end of the session resulting in a 2 (manual vs SA system) by 4 (scenario block) design for workload and trust, and a 2 (manual vs SA system) group comparison for usability. Similarly, a two group (manual versus SA system) comparison was used for error rates. Efficiency was measured at the scenario level, resulting in a 2 (manual versus SA) by 12 (scenario) design. We chose this design to increase study power, by collecting repeated measures data, and to decrease the impact of sequence effects of first exposure to the same system. A qualitative debrief captured participant experiences and perceived barriers to use. The study design and flow of evaluation sessions is illustrated in the online supplemental material. The Johns Hopkins School of Medicine Institutional Review Board approved the study with informed consent required; data collection occurred May through July 2018.


Participants and recruitment

All critical care nurses working in an ICU or step down unit within the Department of Surgery at an urban academic institution who had used the manual insulin infusion protocol were eligible for inclusion in the study (approximately 200 nurses at the time of recruitment). We recruited nurses through email notifications and flyers posted on the inpatient units. The emails and flyers indicated that the project aimed to evaluate the potential safety and efficiency of continuous drug infusions using automated dose adjustments and integration of the medication infusion pump with the medical record. Participants had no prior exposure to the SA system. Nurses volunteered and were compensated based on the high needs pay rate for critical care nurses ($50 per hour).

Intervention procedure

Consented nurses completed a brief background survey requesting professional history, unit type and surgical specialty and experience managing patients on insulin infusion protocols. A coin toss randomised the nurse to their initial system exposure. A simulation session comprised four blocks (three scenario evaluations per block) and nurses alternated systems by block, completing six scenarios per system (figure 1). A session facilitator (a research coordinator, ZD or NB; or a research nurse, CD) oriented the nurse participant to the simulation session. This included the Epic EHR test environment with simulated patient records, the Hospira 360 medication infusion pump, paper protocol for the manual system and the SA system. The pump used in this study was not in clinical use and not linked to drug libraries or dose error reduction software. The facilitator demonstrated completion of a training scenario with the manual protocol system and the same scenario with the SA system. Nurses were encouraged to ask questions throughout the orientation and training.
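The allocation scheme described above (a coin toss for the first system, then alternation by block) can be sketched as follows. The function and variable names are illustrative assumptions, not part of the study materials.

```python
import random

SYSTEMS = ("manual", "SA")


def coin_toss(rng=None):
    """Pick the initial system at random (stand-in for the physical coin toss)."""
    rng = rng or random.Random()
    return rng.choice(SYSTEMS)


def block_schedule(first_system, n_blocks=4, scenarios_per_block=3):
    """Return (block, scenario, system) tuples, alternating systems by block."""
    if first_system not in SYSTEMS:
        raise ValueError(f"unknown system: {first_system!r}")
    other = SYSTEMS[1] if first_system == SYSTEMS[0] else SYSTEMS[0]
    schedule = []
    scenario = 1
    for block in range(n_blocks):
        system = first_system if block % 2 == 0 else other
        for _ in range(scenarios_per_block):
            schedule.append((block + 1, scenario, system))
            scenario += 1
    return schedule


schedule = block_schedule(coin_toss())
```

With the default arguments, the schedule contains 12 scenarios, six for each system, with the scenario order fixed and only the system assignment varying.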

Figure 1

The primary study outcomes of safety and efficiency compared by manual system versus smart agent system. Panel A is the total error counts across nurses by exposure (safety measure). Panel B is the average time to complete a dose change by exposure. A trial is one of the six scenarios completed using one of the two systems (six trials of each of the two systems).

Simulation scenarios

Each scenario contained the required EHR values (current BG level, past BG level and current infusion rate) for the insulin infusion protocol. Nurses were given a name associated with a patient record in the EHR test environment for each scenario. For the manual protocol system, nurses searched the EHR for the three values, used the protocol to calculate the correct insulin dose change and adjusted the pump settings as needed. For the SA workflow, nurses started the SA in the EHR, selected the glucose value for calculations, viewed the dose calculation result, could accept or modify the dose and confirmed the rate at the infusion pump. Technical and clinical process details of the SA system have been previously published.12 13 The online supplemental appendix includes full information for each scenario. The order of scenarios remained constant across participants. We only included scenarios within the range of the protocol that would not require pausing or stopping the infusion or require going off the protocol (ie, the hyperglycaemic and hypoglycaemic ranges were avoided). We excluded the nurse double check from this study because of the added logistic complexity (eg, recruiting additional participants or a confederate) and our belief that this would not vary by system. An observer (ZD) used a custom Android tablet application to time completion of each scenario and logged the final rate change entered by the nurse.

Surveys and debriefing

We administered three surveys during the simulation session to elicit nurse perceptions of each system. The workload and trust surveys were repeated after each block of scenarios and the usability scale was done after the session for each system. We randomised the order of system usability ratings to minimise any sequence effects in ratings. The facilitator engaged each nurse in a structured debriefing after the session. Surveys and the debriefing protocol are included in the online supplemental appendix.

Study outcomes

The primary study outcomes were error rates (safety) and task completion times (efficiency). Secondary outcomes were perceived workload,14 trust in automation15 and system usability,16 all measured with previously validated scales.

Measures and data sources

Safety was measured as the rate of errors. Specifically, an error was defined as an infusion rate entered into and confirmed at the pump that differed from the infusion rate recommended by the protocol, regardless of the magnitude of deviation. Errors were captured dichotomously for each scenario (ie, an error did or did not occur) by comparing the insulin dose entered into the infusion pump by the nurse to the correct dose rate. This approach did not allow differentiation between errors made during the calculation of changes in glucose, calculations of infusion rates or slips (eg, keystroke entry error) or mistakes (eg, memory failures after calculation and before pump entry). Much of the calculation work is done internally and not directly observable. We did not prompt for intermediate calculation results to avoid altering the participants’ work processes and efficiency.
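The dichotomous error definition above can be expressed in a few lines. This is a sketch under stated assumptions: the function names are illustrative, and the small tolerance exists only to make floating-point comparison safe, not to excuse small dosing deviations.

```python
def is_dose_error(entered_rate, recommended_rate, tol=1e-9):
    """True when the confirmed pump rate differs from the protocol rate at all."""
    return abs(entered_rate - recommended_rate) > tol


def error_rate(entered, recommended):
    """Proportion of scenarios in which an error occurred."""
    if len(entered) != len(recommended) or not entered:
        raise ValueError("need two non-empty sequences of equal length")
    errors = sum(is_dose_error(e, r) for e, r in zip(entered, recommended))
    return errors / len(entered)
```

Because only the final pump entry is compared, this measure cannot say where in the process an error arose, as the paragraph above notes.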

Efficiency was measured as time (seconds) to complete each scenario, starting when the nurse was given the patient name and ending when pump programming was complete. We also timed specific tasks: information retrieval in EHR, performing manual dose calculation (eg, on paper, using calculator), performing dose calculation using EHR and changing infusion pump settings.

Workload was measured using the NASA Task Load Index (TLX).14 The index involves two steps: (1) participants rate six distinct dimensions of workload on a 21-point scale (higher numbers indicate more workload) and (2) participants complete pairwise comparisons of the six dimensions, indicating which workload dimension was more important to their overall performance. Nurses completed step 1 after each block of scenarios and step 2 at the end of the session. The pairwise comparisons were used to create a total weighted workload score, ranging from 0 to 100 (higher numbers indicated higher levels of workload).
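A minimal sketch of the two-step weighting may help. It follows the standard NASA-TLX procedure (15 pairwise comparisons, weights summing to 15) and assumes that the 21-point ratings are recorded as ticks 0 to 20 and rescaled to 0 to 100 by multiplying by 5; that rescaling and the function names are assumptions, not details taken from the study.

```python
from itertools import combinations

DIMENSIONS = (
    "mental demand", "physical demand", "temporal demand",
    "performance", "effort", "frustration",
)


def tlx_weights(winners):
    """Count how often each dimension was chosen across the 15 pairs.

    winners maps each pair from combinations(DIMENSIONS, 2) to the chosen dimension.
    """
    pairs = list(combinations(DIMENSIONS, 2))
    if set(winners) != set(pairs):
        raise ValueError("expected one winner for each of the 15 pairs")
    weights = dict.fromkeys(DIMENSIONS, 0)
    for pair, winner in winners.items():
        if winner not in pair:
            raise ValueError(f"{winner!r} is not a member of {pair!r}")
        weights[winner] += 1
    return weights


def weighted_tlx(tick_ratings, weights):
    """Weighted workload on a 0 to 100 scale from tick ratings (assumed 0 to 20)."""
    total_weight = sum(weights.values())
    return sum(weights[d] * tick_ratings[d] * 5 for d in DIMENSIONS) / total_weight
```

Dimensions a participant never chooses in the pairwise step receive a weight of zero and so do not contribute to that participant's total.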

User trust and usability. Trust in the system was measured using a previously validated 12-item instrument on human trust in automation.15 An 8-point response scale (0=not at all, to 7=extremely) measures level of agreement with statements. Five negatively worded items are reverse scored. A sum of the item scores (potential range, 0–96, higher scores indicated higher levels of trust) was used in analysis. The Systems Usability Scale is a 10-item, widely used survey in research and commercial product development.16 A 5-point response scale (0=strongly disagree, to 4=strongly agree) measures agreement with statements. Five negatively worded items are reverse scored. We counted item scores of 3 or 4 as a positive response indicating high levels of perceived usability. A composite score was calculated as the percentage of positive responses across the ten items (potential range, 1–100) and was used in analysis.
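The scoring rules above can be sketched as follows. Which items are negatively worded is not stated in the text, so the `reversed_items` indices are a hypothetical parameter, and the function names are illustrative.

```python
def reverse_score(responses, reversed_items, scale_max):
    """Flip negatively worded items so that higher always means more favourable."""
    for r in responses:
        if not 0 <= r <= scale_max:
            raise ValueError(f"response {r} outside 0..{scale_max}")
    return [scale_max - r if i in reversed_items else r
            for i, r in enumerate(responses)]


def trust_total(responses, reversed_items, scale_max=7):
    """Sum of item scores after reverse scoring (responses coded 0..7)."""
    return sum(reverse_score(responses, reversed_items, scale_max))


def usability_percent_positive(responses, reversed_items, scale_max=4, cutoff=3):
    """Percentage of items scored at or above the cutoff after reverse scoring."""
    scored = reverse_score(responses, reversed_items, scale_max)
    positives = sum(1 for s in scored if s >= cutoff)
    return 100.0 * positives / len(scored)
```

Reverse scoring before summing or counting keeps a strongly negative answer to a negatively worded item from being read as low trust or low usability.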

Qualitative debriefing comments. Data were collected during a postsession debriefing with each nurse. The debriefing was semistructured and asked about the utility of the SA system, and perceptions of ease of use, efficiency and safety relative to the manual system. Open-ended questions elicited nurse opinions of the SA system, barriers to use and uptake and suggestions for improvement. Debrief sessions were not audio recorded, but notes were taken using a structured data collection form. Data were analysed and presented by first categorising specific statements by the focus of the comment (either technology-focused or workflow-focused) and subsequently by their valence (positive, neutral or negative sentiments), perceived barriers to use or suggestions for improvements. This structure for reporting comments aligns closely with specific questions asked during debriefing.

Statistical analysis


All statistical tests for previously described study designs were conducted in R (V.3.5.1; R Foundation for Statistical Computing, Vienna, Austria). Using the G*Power (V. programme,17 we performed an a priori power analysis using a large (Cohen’s f of 0.40) effect size heuristic method. There were no prior studies available to provide an effect size estimate for this type of intervention. Based on informal piloting of the SA and practical constraints for recruitment, a large effect size was chosen for sample size calculations. The χ² test for independence tested for differences in error rates between the SA and manual system conditions. Power analysis for the two by four analysis of variance for workload and trust included study design parameters (two levels of insulin infusion systems and four measurement periods) and a Bonferroni correction for four main tests to maintain a familywise alpha of 0.05 (an alpha of 0.0125 for each test). Sample size requirements were 20 nurses (for an actual power of 0.96) for a large effect size. A paired samples t-test was conducted to determine differences in perceived usability between the manual and SA systems. An a priori power analysis using a large (Cohen’s dz of 0.8) effect size estimate yielded a required sample size of 19 for an actual power of 0.96.

Results


Twenty critical care nurses completed the study. Professional experience ranged from <1 year to 37 years of nursing (mean (SD) of 5.03 (8.85)) and from <1 year to 35 years of critical care nursing (mean 4.48 (8.60)). Fifty per cent (n=10) of nurses reported once per month use of the manual insulin infusion protocol, 35% (n=7) reported once per week and 15% (n=3) more than once per week.

Error rates

Insulin dose accuracy with the SA was significantly better than with a single manual calculation (χ²=19.70, df=1, p<0.001). Across 120 scenarios per system, nurses made zero calculation errors using the SA system and 20 calculation errors (16.7%) using the manual system. Figure 1A compares the total errors for SA and manual systems by exposure.
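As a check on the arithmetic, the test statistic can be recomputed from the counts above (20 errors in 120 manual scenarios, 0 errors in 120 SA scenarios). The sketch below is plain Python rather than the authors' R code, and the use of Yates's continuity correction is an assumption; with the correction, the result is close to the reported value.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                # Continuity correction: shrink each deviation by 0.5, not below zero.
                diff = max(0.0, diff - 0.5)
            stat += diff ** 2 / expected
    return stat


# Rows: manual, SA. Columns: scenarios with an error, scenarios without.
corrected = chi_square_2x2(20, 100, 0, 120)
uncorrected = chi_square_2x2(20, 100, 0, 120, yates=False)
print(round(corrected, 2), round(uncorrected, 2))  # 19.69 21.82
```

The corrected statistic has one degree of freedom, matching the df reported in the text.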

Description of error types

Observed errors were classified in terms of how they differed from recommended pump settings (ie, changes that differ in magnitude or direction, unnecessary changes or omissions of recommended changes). Table 1 describes the wide range of error types observed with the manual system. Of the 20 total errors, 30% (n=6) were an incorrect change in magnitude, 30% (n=6) were an incorrect change in direction, 25% (n=5) were failures to make a change when one was required, 10% (n=2) were inappropriate stopping or pauses of the protocol and 5% (n=1) were unnecessary changes (ie, pump adjustments made when the protocol recommended none). Infusions were set too high in 8 scenarios (with discrepancies ranging from 0.5 to 2 units of insulin/hour) and too low in 12 scenarios (ranging from −4 to −0.5 units of insulin/hour).

Table 1

Descriptive information for each observed error (units of insulin per hour)


Efficiency

Nurses completed the protocol significantly faster using the SA system (mean 53.57 (20.30) s per scenario) than using the manual system (mean 82.62 (24.71) s), F(1, 19)=68.45, p<0.001 (figure 1B). On average, the SA system reached protocol finalisation 29.0 s faster than the manual system. Tasks consuming the most time were retrieving information in the EHR and performing manual calculations (figure 2). Compared with the manual system, the SA system performed significantly better for both of these tasks. Additionally, the SA system outperformed the manual system for changing infusion pump settings, but not for calculations performed within the EHR.
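The time saving and the relative reduction quoted later in the discussion (35.2%) follow directly from the two reported means. A short sketch of the arithmetic, with an illustrative function name:

```python
def time_saving(manual_mean_s, sa_mean_s):
    """Return (absolute saving in seconds, saving as a fraction of manual time)."""
    saving = manual_mean_s - sa_mean_s
    return saving, saving / manual_mean_s


saving_s, relative = time_saving(82.62, 53.57)
print(f"{saving_s:.2f} s saved, {relative:.1%} reduction")  # 29.05 s saved, 35.2% reduction
```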

Figure 2

The time spent completing four (panels A through D) specific tasks involved in an insulin infusion rate change by trial and manual system versus smart agent system. A trial is one of the six scenarios completed using one of the two systems (six trials of each of the two systems). EHR, electronic health record.


Workload

Nurses rated overall workload of the SA system (mean 6.33 (2.38)) significantly lower than that of the manual system (mean 10.23 (2.74)); F(1, 19)=37.32, p<0.001 (figure 3A).

Figure 3

Nurses’ self-reported ratings of workload, trust and usability compared by manual system and smart agent system. (A) Workload was measured after each block of three scenarios using the NASA-TLX scale composite score and compared by exposure (first or second) to each system. (B) Trust was measured after each block using the trust in automation scale total score and also compared by exposure. (C) Perceived usability was measured once at the end of the session using the System Usability Scale total score.

User trust and usability

Trust ratings were similar between the systems after the first exposure. However, trust in the SA system increased significantly relative to the manual system at the second exposure, F(1, 19)=8.41, p=0.009 (figure 3B). Nurses rated usability of the SA system (mean 86.88 (10.73)) significantly higher than the manual system (mean 52.00 (17.97)); t(19)=−6.23, p<0.001 (figure 3C).

Qualitative debrief comments

In general, study participants had positive remarks about the SA system. All 20 nurses thought the SA would be helpful and found it more efficient than the manual system, and 18 (90%) found it easier to use. Fifteen (75%) nurses believed the SA system was safer than the manual system, while 5 (25%) were unsure or believed both systems were about as safe. These results are consistent with those obtained from the workload, usability and trust surveys.

Table 2 summarises the qualitative comments on the SA system. Perceived barriers to use included interruption or connectivity failures to the SA and other electronic tools, and differences in some nurses' level of comfort with and willingness to use the system. Suggestions for improvement mainly focused on the SA interface (improving access and reorganising the display) and optimising its connectivity with other components of the insulin infusion workflow.

Table 2

SA debrief comments

Differences in outcomes by protocol use frequency

Descriptive data for study outcomes by self-reported frequency of protocol use are provided in the online supplemental appendix. Outcomes are generally consistent across groups of more and less frequent protocol users with three potential exceptions. First, there appears to be a trend for higher error rates for participants with more frequent protocol use (0.22 for participants reporting protocol use more than once per week; 0.15 for those reporting use once per month). Second, the largest differences in perceptions of usability between the SA and manual systems occur for very high (more than once per week) and very low (once per month) protocol users. Third, frequent protocol users (more than once per week) report the highest levels of trust in the SA system and the largest difference in trust between SA and manual systems.

Discussion


This study demonstrated the potential of automated tools enabled by systems engineering and integration to reduce human error in the administration of insulin by protocol, in a critical care setting. When compared in a simulated work environment to the current manual system, the SA system was safer, more efficient, less workload intensive and perceived by nurses as more usable and similarly trustworthy. Most striking, the SA system eliminated errors in insulin dosing changes while concurrently reducing protocol completion time by 35.2%, or an average of 29 s per scenario.

We excluded a double check in this study, which would presumably reduce human errors and also increase task completion time. However, nurse double checks in a busy ICU are challenging to coordinate, time consuming and may have poor integrity.18 19 Our finding of zero errors with the SA system suggests that clinical implementation of such a system could obviate the need for a human double check. Thus, SA implementation in a new workflow could provide workload relief and efficiency gains without compromising safety. We also found that nursing staff had greater trust after using the SA system and more positive perceptions of system usability. Trust and ease of use are important considerations in the adoption and use of technology.20 21

The current SA system is limited to insulin infusion management, but the approach employed here could be expanded to other medication administration tasks where a structured protocol exists. Heparin infusion is another example in which dose adjustments are often made by a protocol that is based on laboratory results.22 As more automation tools are developed, further design and evaluation work will be required to understand how to integrate these agents effectively to support nurse monitoring and management of the automated agents. For complex patients on multiple infusions, not only will the design of interfaces for nurses become more challenging, but the interactions between multiple algorithms driving the administration of medications may also become harder to predict. Complex (and potentially risky) behaviour emerges from systems whose component agents follow simple rules.23 Therefore, caution is warranted and rigorous software, system and human performance evaluation studies will be critical.

While reactions to the SA system were generally positive, participants did surface issues to address before this or similar automation agents based on systems integration could be fielded effectively. Some potential barriers were common dispositional factors influencing technology adoption (ie, a general dislike or distrust of new technology), but many focused on the interplay of technology and workflow. Specifically, there was discomfort among some nurses with fully removing the second nurse from the double checking process (ie, a workflow change), particularly as there was a set of related concerns about the reliability of the ecosystem of devices (ie, smart pump, glucometer) and systems (ie, EHR, hospital network) required to function properly for the SA system to work. These are valid concerns that would require risk mitigation (eg, redundant systems) and management strategies similar to contingency plans developed for other forms of infrastructure failure (eg, reversion to paper charting when EHRs go offline).

The error summary in table 1 provides valuable insights into the clinical significance of errors that may be made using the manual system. In general, the protocol attempts to minimise the risk of intravenous insulin administration by requiring hourly glucose checks. This frequent safety check protects against both administration errors and variability in patients’ responses to insulin. Thus, errors of between 0.5 and 1 unit of insulin/hour are unlikely to cause severe hypoglycaemia or hyperglycaemia over the course of 1 hour. However, errors from incorrect direction of change and inappropriate stopping of the infusion lead to much higher differences from the correct infusion rates. These errors accounted for 40% of the errors sampled and it is very likely that they would significantly alter the serum glucose values. The magnitude of such glucose changes and their impact on patients cannot be generalised but are subjects for subsequent study.

Complacency with or over-reliance on automation is a well-documented phenomenon in the human performance literature,24 and a concern raised by participants as well. Over-reliance on automation occurs when novice or expert users manage multiple concurrent tasks, and manual tasks compete with automation for attentional resources.25 However, the tasks automated by the SA system are information management and calculation, two types of tasks known to be error prone for humans even outside of multitasking demands.26 The ultimate decision on whether or not to accept protocol recommendation remains with the end user as some action is required to accept and implement any changes. This study was not designed to assess the impact of automation on overall situational awareness of a patient’s trajectory. Additionally, complacency is not a problem unique to automation, but general to any work system. Experience can lead to overconfidence in performance (eg, in this study, more frequent protocol users tended to have higher error rates) or in the ability of other system components (eg, a double check) to catch and correct an error.

There are a number of limitations with this study. First, the study design isolated components of insulin infusion management that differed between the SA and manual systems and ignored aspects that did not change (nurse double check). We were surprised by the nearly 17% rate of errors, given nurses knew we were measuring their performance (which likely prompted maximum vs typical levels of performance27) and the simulation environment was free from distractions and competing work demands. This and other data indicating an overall insulin administration error rate over 18% suggest that evaluation in a real clinical setting may demonstrate an even higher primary error rate before double check. Second, we did not evaluate the effect of the nurse double check on error elimination. Therefore, we cannot say with certainty that the SA system is safer than the manual nurse-managed system with human double check. However, recent reviews and primary studies continue to raise doubts about the efficacy of the human double check for medication administrations.28 29 Third, the SA was explicitly designed to work with our hospital’s insulin management protocol and was evaluated using study participants who worked in this system and with this protocol. Thus, the SA system needs further testing in other hospital settings. Fourth, this simulation evaluation focused on insulin infusion administration. It did not represent the complexity of task management in a real clinical setting. A better estimate of the value of SA could be determined by performing in situ field studies or higher fidelity simulation-based evaluations. Finally, this study used one type of EHR and one smart pump. The functionality and interoperability of different EHRs and smart pumps varies widely, and the ability to replicate this type of system with different technologies is technically challenging.

Conclusion


Systems engineering approaches that integrate EHRs, medical devices and work systems can enable the types of automated systems evaluated in this study and benefit care delivery. This study illustrated the potential of this approach to jointly improve safety and efficiency and decrease workload by automating information management and calculation tasks. The SA eliminated errors and reduced task completion time, providing safety alongside efficiency in the ICU. Integrated management and automation systems such as the SA should be pursued to further enhance systems-level improvements in safety and clinical workflow.


We thank Christine Holzmueller for edits to an early version of this manuscript.


Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.


  • Contributors MAR, PJP, MR and AS conceived of the study. MAR designed the study. ZD, NB and CD collected data. MAR and ZD analysed the data and drafted the manuscript. All authors revised the manuscript.

  • Funding This study was funded by Agency for Healthcare Research and Quality (Grant number: P30HS023553).

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval Johns Hopkins University School of Medicine Institutional Review Board: IRB00158348.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data are available by request from the corresponding author.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
