Technology-Mediated Data, its Integration and its Impact on Intensive Care Cognitive Work
by
Ying Ling Lin
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Institute of Biomaterials and Biomedical Engineering University of Toronto
© Copyright by Ying Ling Lin 2018
Technology-Mediated Data, its Integration and its Impact on
Intensive Care Cognitive Work
Ying Ling Lin
Doctor of Philosophy
Institute of Biomaterials and Biomedical Engineering University of Toronto
2018
Abstract
Intensive care clinicians face an ever-increasing burden of continuous data from monitoring and
therapeutic technologies. Under typically hurried and stressful conditions, these continuous
arrays of high-resolution data make interpretation even more challenging. Data integration
technologies that organize and visually communicate meaning may improve team decision making but have yet to show compelling evidence of benefit to either individual or team performance. Poorly understood facets of decision making include the role of contemporary intensive care technologies in decision making, the cognitive processes these technologies mediate, and the effects of dense, multi-parametric visualizations on data retrieval, integration and interpretation tasks. Therefore, this thesis
investigates these facets of decision making in the contemporary intensive care unit from the
perspective of physicians, nurses and respiratory therapists.
This focus on clinicians in a particular sociotechnical setting is the domain of human factors, an area of research that seeks to understand the interaction between humans and technologies and to optimize overall system performance. Through the lens of these three types of clinicians, we
inform the design of data integration technologies, specifically T3™, a state-of-the-art data
integration and visualization technology. It enables tasks related to Tracking of physiologic
signals, displaying Trajectory, and Triggering decisions. This thesis consists of a systematic
review of literature related to data integration and visualization technology for intensive care
decision-making and three experimental phases.
First, the systematic review was conducted to identify studies that examined decision-making processes involving technological sources and the facilitation of these processes by decision support tools. The review identified qualitative studies which described physicians’
and nurses’ cognitive processes during clinical tasks and quantitative studies which measured
differences in human performance in terms of time, accuracy of decisions, and cognitive load.
Collectively, the most mature technologies had been developed over decades and were informed
by both qualitative and quantitative studies. A meta-analysis, or aggregation of data from
multiple studies, found that perceived mental and temporal demands were lower, and
performance was better with new data visualizations compared to traditional paper-based
systems.
Second, the cognitive processes of physicians, nurses and respiratory therapists were analyzed
using the macrocognition framework, a taxonomy for cognitive processes occurring in complex,
real-world settings. The framework was used to analyze interview data of critical decision-
making and the role of technology-mediated sources. Among ten macrocognitive processes,
Sensemaking was heavily informed and influenced by technology. For Sensemaking, physicians
utilized all sources available and compartmentalized the data sets according to different
physiological systems. Nurses were the most active in their manipulation of technology and
devoted much of their cognition to communicating information to physicians and respiratory
therapists. Respiratory therapists made sense of data specific to the respiratory system and had
in-depth knowledge of respiratory support data. These findings suggest that to improve team care, it is essential that data integration technologies be designed for nurse usability and that support for Sensemaking be tailored to each type of clinician.
Third, a heuristic evaluation method, a low-cost method to test interface compliance with
usability design principles, was conducted on T3™. Evaluation by a team of two clinicians and two human factors specialists found 50 usability issues associated with 194 heuristic violations. Issues included (1) difficulty with choosing the time period of the patient data signals, (2) difficulty distinguishing between several patient signals and (3) imperceptible changes in physiological values; these issues could lead nurses to misinterpret the timing and/or the physiological status of the patient (e.g., time of shock and exact value of vitals). Timescale manipulation and rapid
visualization of out-of-range signals were identified as catastrophic issues that should be
addressed.
Fourth, usability testing identified interface facilitators and barriers to the use of T3™ by
physicians, nurses and respiratory therapists. The current interface facilitated simple tracking and
trajectory tasks when a small set of parameters was displayed simultaneously. The barriers
included: (1) difficulty with acquiring multiple parameter data from data-dense visualizations
and perceiving out-of-target data and (2) limited clinical context of integrated continuous data
due to separate clinical notes (e.g., in the electronic medical record). Though T3™ integrated and condensed large amounts of data, visual pattern overload and poor data recall obfuscated the raw data and thus hindered data interpretation. While this study tested T3™, the findings and design recommendations may apply generally to technologies that display data in a similar format, or with the same degree of integration, as the T3™ version studied.
Overall, this thesis contributes to the understanding of how fractured clinical data and
information systems and their integration impact intensive care cognitive work.
Acknowledgments
This dissertation is dedicated to Steven, Lucien and my parents, Yen Chi Liu and Jiunn Long
Lin. Steven, your steady support carried me throughout these last years, and I thank you for
always being my sounding board. Lucien, you are my most important motivating factor. Mom
and Dad, I will always appreciate your unwavering support and I love you both very much.
My deepest gratitude to Patricia Trbovich and Anne-Marie Guerguerian, who are my role
models, my mentors and together provided much invaluable support and inspiration. Patricia,
your work ethic, humility and insightfulness make you remarkable and I was very lucky to have
your guidance. Anne-Marie, your enthusiasm, foresight and high macrocognitive process
switching have impressed me every day. I thank you both for giving me the chance to carry out
this amazing project.
I am grateful to my committee members Tony Easty and Kim Vicente, for their support, time and
patience. I feel lucky to have had your guidance throughout these years.
I also thank Peter Laussen for welcoming me to the intensive care team and bringing T3™ to
SickKids research. I thank my SickKids scientific committee members Alex Floh and Briseida
Mema for keeping diligent eyes on the quality of my research at SickKids. I am indebted to my
study participants, the clinical managers for promoting my studies and staff at SickKids’
Department of Critical Care Medicine.
I would like also to thank members of the Guerguerian lab at SickKids and HumanEra at
University Health Network for their help in experimental design and insightful discussions.
A huge thanks to Etiometry’s Dimitar, Mike, Evan and their engineers who provided the IT
infrastructure to carry out my most important experiment.
I thank also several insightful research students: Lauren Kolodzy, Jessica Tomasi, Bojan
Gavrilovic, Kevin Yang, Daniel Diethei, and Katja Heunig, and visiting research fellow, Ana
Almeida. Your data collection and analysis were invaluable to the work contained in this
dissertation. A grateful thank you to my readers Josianne Lefebvre and Antigona Ulndreaj. I am
also grateful to Cheri Nickel for her kindness and high quality research standards. I would also
like to thank Mathias Görges, Shilo Anders and Grace Dal Sasso, who generously shared their
raw data on cognitive load measurements and made possible the meta-analysis of Section
2.3.3.1.6.
Table of Contents
Acknowledgments .......................................................................................................................... vi
Table of Contents ......................................................................................................................... viii
List of Tables ................................................................................................................................ xii
List of Figures .............................................................................................................................. xiv
List of Abbreviations .................................................................................................................. xvii
List of Appendices ..................................................................................................................... xviii
Chapter 1 Introduction .................................................................................................................... 1
1.1 Rationale ............................................................................................................................. 1
1.2 Thesis Overview ................................................................................................................. 4
Chapter 2 Systematic Review of the Human Factors Literature on Data Integration Technologies for Intensive Care ................................................................................................ 8
2.1 Introduction ......................................................................................................................... 8
2.2 Methods ............................................................................................................................... 9
2.2.1 Systematic Search Strategy ................................................................................... 10
2.2.2 Inclusion and Exclusion Criteria ........................................................................... 11
2.2.3 Review Process ..................................................................................................... 12
2.2.4 Separation Criteria ................................................................................................ 13
2.2.5 Data Extraction ..................................................................................................... 14
2.2.6 Data Analysis ........................................................................................................ 14
2.3 Results and Discussion ..................................................................................................... 17
2.3.1 Study Selection ..................................................................................................... 17
2.3.2 Review of Qualitative studies ............................................................................... 17
2.3.3 Review of Quantitative studies ............................................................................. 32
2.3.4 Research Gaps ....................................................................................................... 51
2.3.5 Longitudinal Studies with Qualitative and Quantitative Components ................. 52
2.4 Conclusions ....................................................................................................................... 54
Chapter 3 Technology-Mediated Macrocognition of Intensive Care Teams: Investigating How Physicians, Nurses, and Respiratory Therapists Make Critical Decisions ...................... 56
3.1 Abstract ............................................................................................................................. 56
3.2 Background ....................................................................................................................... 58
3.3 Methods ............................................................................................................................. 59
3.3.1 Study Design ......................................................................................................... 59
3.3.2 Setting ................................................................................................................... 59
3.3.3 Participants ............................................................................................................ 59
3.3.4 Procedure .............................................................................................................. 59
3.3.5 Data Analysis ........................................................................................................ 60
3.4 Results ............................................................................................................................... 62
3.4.1 Study Participants ................................................................................................. 62
3.4.2 Inter-Rater Reliability ........................................................................................... 63
3.4.3 Macrocognition Processes .................................................................................... 63
3.4.4 Sources of Data and Information .......................................................................... 70
3.4.5 Macrocognitive Processes as a Function of Sources of Data and Information ...... 73
3.4.6 Compound Macrocognitive Processes .................................................................. 81
3.5 Discussion ......................................................................................................................... 83
3.5.1 Macrocognition of Individual and Team Decision-Making ................................. 84
3.5.2 Expert Macrocognition and Pattern Recognition .................................................. 85
3.5.3 Implications for Team Macrocognition ................................................................ 86
3.6 Limitations ........................................................................................................................ 87
3.7 Conclusion ........................................................................................................................ 88
Chapter 4 Heuristic Assessment of Continuous Data Integration and Visualization Software .... 89
4.1 Abstract ............................................................................................................................. 89
4.2 Introduction ....................................................................................................................... 90
4.3 Materials and Methods ...................................................................................................... 93
4.4 Setting ............................................................................................................................... 93
4.4.1 Data Integrating and Visualization Software ........................................................ 93
4.4.2 Heuristic Evaluation: Applying Usability Heuristics for Medical Devices .......... 94
4.5 Results ............................................................................................................................... 95
4.5.1 Example #1 - Catastrophic Problem ..................................................................... 96
4.5.2 Example #2 - Major Usability Issue ..................................................................... 98
4.5.3 Example #3 - Major Usability Issue ..................................................................... 99
4.5.4 Example #4 - Minor Usability Issue ..................................................................... 99
4.5.5 Example #5 - Positive Features ........................................................................... 100
4.5.6 Example #6 - Positive Features ........................................................................... 100
4.6 Discussion ....................................................................................................................... 100
4.7 Limitations ...................................................................................................................... 102
4.8 Conclusion ...................................................................................................................... 103
Chapter 5 Usability of Continuous Data Integration and Visualization Software ...................... 104
5.1 Abstract ........................................................................................................................... 104
5.2 Background ..................................................................................................................... 105
5.2.1 Data Integration and Visualization Software ...................................................... 106
5.2.2 Overview of Project Phases ................................................................................ 107
5.3 Method ............................................................................................................................ 109
5.3.1 Study Design ....................................................................................................... 109
5.3.2 Setting ................................................................................................................. 109
5.3.3 Software .............................................................................................................. 110
5.3.4 Scenarios and Tasks ............................................................................................ 111
5.3.5 Participants .......................................................................................................... 112
5.3.6 Procedure ............................................................................................................ 112
5.3.7 Data Analysis ...................................................................................................... 113
5.3.8 Usability Issue Severity Level ............................................................................ 114
5.4 Results ............................................................................................................................. 114
5.4.1 Participants .......................................................................................................... 114
5.4.2 Interrater Reliability ............................................................................................ 115
5.4.3 Software Strengths (Aid to Task Completion) and Usability Issues (Hindrance to Task Completion) ........................................................................................... 115
5.5 Discussion ....................................................................................................................... 127
5.5.1 Transforming Numerical Point Data to Long-Term, Time-Scaled Visualizations ...................................................................................................... 127
5.5.2 Integrating Data Trends: Visual Pattern Overload .............................................. 128
5.5.3 Data Trustworthiness .......................................................................................... 130
5.5.4 Usability Testing with Diverse Clinician Groups ............................................... 131
5.5.5 Proposed Iteration and Improvements ................................................................ 131
5.5.6 Improvements Over Existing Work .................................................................... 134
5.5.7 Limitations .......................................................................................................... 134
5.6 Conclusions ..................................................................................................................... 135
Chapter 6 Conclusions ................................................................................................................ 137
6.1 Key Findings ................................................................................................................... 137
6.2 Contributions to the Field ............................................................................................... 138
6.3 Future Work .................................................................................................................... 138
References ................................................................................................................................... 141
List of Tables
Table 1. Extraction items .............................................................................................................. 14
Table 2. Evidence table for qualitative studies on intensive care clinician use of information
sources. .......................................................................................................................................... 18
Table 3. Summary of qualitative studies’ information sources used for intensive care ............... 20
Table 4. Second-order constructs and conceptual categories of clinician use of information in the
intensive care. ............................................................................................................................... 25
Table 5. Evidence table of study data integration and visualization technology .......................... 34
Table 6. Metrics used in quantitative human factors studies of data integration and visualization
technologies .................................................................................................................................. 40
Table 7. Summary of study realism, scenario description, and types of tasks performed. ........... 42
Table 8. Study technology realism, comparator and temporal representation .............................. 44
Table 9. Study completeness, quality and performance analysis. Studies in bold have a positive
added performance where the study quality is higher than the projected quality based on study
completeness. ................................................................................................................................ 48
Table 10. Lifespan of Quantitative Testing for a Given Data Integration and Visualization
Technology ................................................................................................................................... 53
Table 11. Macrocognition process codes, adapted from Klein et al. and Schubert et al.134,135,
*new process ................................................................................................................................. 60
Table 12. Source codes ................................................................................................................. 61
Table 13. Demographics, years of experience, specialization ...................................................... 63
Table 14. Macrocognitive processes and associated sources of information or data, *main data
sources are presented as the proportion of clinicians and the number of references .................... 74
Table 15. List of selected patient signals viewable on the data integrating and visualization
software ......................................................................................................................................... 93
Table 16. The 14 usability heuristics for medical devices as defined by Zhang et al.125 ............. 95
Table 17. Severity rating as defined by Zhang et al.125 ................................................................ 95
Table 18. Description, parameters available, key data features of three scenarios, based on real
patients, used to test the T3™ software functions. ..................................................................... 112
Table 19. Use error rating definitions, shown as nominal and numerical codes ........................ 113
Table 20. Demographics, clinician specialization, training, current use, and awareness of data
integration software. * CCCU: Cardiac Critical Care Unit; ** PICU: Pediatric Intensive Care Unit
..................................................................................................................................................... 115
Table 21. Usability tasks tested with severity levels and use error ratings. ................................ 116
Table 22. Practical Improvement Suggestions for Data Integration and Visualization Software
..................................................................................................................................................... 132
Table 23. Bowling’s completeness checklist for healthcare research ......................................... 161
Table 24. Summary of major usability issues ............................................................................. 186
Table 25. Summary of minor usability issues ............................................................... 188
Table 26. List of usability tasks tested and representative questions posed to the participants.
Checked box indicated a pass rate of less than 50%. .................................................................. 190
Table 27. Usability tasks tested with pass rates as percentage and fraction of total users. ........ 193
List of Figures
Figure 1. User-centered design cycle for interactive systems specified by ISO 9241-210 ............ 4
Figure 2. Summary of thesis objectives, research questions and related chapters. *O: objective;
RQ: research question ..................................................................................................................... 7
Figure 3. Flowchart of systematic search and inclusion for review ............................................. 13
Figure 4. Phases of the meta-ethnography research process. ........................................................ 15
Figure 5. The six dimensions of cognitive load (mental demand, physical demand, temporal
demand, performance, effort, and frustration) for four display conditions: paper control (Dal
Sasso 2015),115 electronic controls (Görges 2011 and 2012, Anders 2012, Dal Sasso 2015),
tabular or bar graphs (Anders 2012, Görges 2011 and 2012) and novel visualizations (Anders
2012, Görges 2011 and 2012). *Pcontrol: paper control; Econtrol: electronic control; TabBar:
tabular or bar graph visualization; NewVis: integrated visualization with clock and infusion
representations or integrated visualization .................................................................................... 47
Figure 6. Distribution by number of verbal references and percentages, within specialties, of
macrocognitive processes. ............................................................................................................ 64
Figure 7. Distribution of sources of data and information among all technological sources for
each specialty ................................................................................................................................ 72
Figure 8. Distribution of technological data sources among macrocognitive processes, within
specialties. ..................................................................................................................................... 80
Figure 9. Relationships between macrocognitive processes in intensive care for physicians,
nurses and respiratory therapists with strength of relationships indicated by the number on the
double arrows ................................................................................................................................ 81
Figure 10. The four main screens of the integrating software ...................................................... 92
Figure 11. Frequency of heuristic violations of the data integration and visualization software . 96
Figure 12. Screenshot of single patient view showing last two-week trend; the ovals show “pull-
in” or “pull-out” functionality used to select the time window .................................................... 97
Figure 13. Screenshot of patient signals with shading to indicate out-of-range patient vitals.
Graph 1 shows overlapping out-of-range signals ......................................................................... 98
Figure 14. User-Centered Design and Evaluation Process of an Existing Data Integration and
Visualization Platform in Accordance with the ISO 9241-210 Standard ................................... 108
Figure 15. Representation of time series fictitious data and triggering visual aids: 1) shading, 2)
sparklines, and 3) bar graph of single indicator IDO2 algorithm ................................................ 111
Figure 16. Variation of use error ratings across clinician disciplines for all tasks related to
tracking, trajectory, and triggering as well as other software functions ..................................... 118
Figure 17. Usability issue of time manipulation interface .......................................................... 120
Figure 18. Time series data visualization of multiple physiological signals and therapeutic
interventions ................................................................................................................................ 123
Figure 19. Usability issue of auto-fit scaling resulting in misinterpretation of when the medical
infusion ceased ............................................................................................................................ 124
Figure 20. Proportion of qualitative and quantitative studies from 2001 to 2018. ..................... 163
Figure 21. Physician common macrocognitive process, based on normalized frequency of coded
references, with upper 50% bold and underlined. Non-paired processes have a cell value of 0. 180
Figure 22. Nurse common macrocognitive process, based on normalized frequency of coded
references, with upper 50% bold and underlined. Non-paired processes have a cell value of 0. 181
Figure 23. Respiratory therapist common macrocognitive process, based on normalized
frequency of coded references, with upper 50% bold and underlined. Non-paired processes have
a cell value of 0. .......................................................................................................................... 182
Figure 24. Charts showing distribution of macrocognitive processes within specialties. .......... 183
Figure 25. Charts showing distribution of technologies according to macrocognitive processes,
within specialties. ........................................................................................................................ 184
Figure 26. Distribution of technological information sources, within each specialty .............. 185
List of Abbreviations
ABP: Arterial Blood Pressure
AH: Abstraction Hierarchy
CIT: Clinical Information Technology
DIVT: Data Integration and Visualization Technology
EEG: Electroencephalography
EtCO2: End Tidal CO2
FiO2: Fraction of inspired oxygen
HF or HFE: Human Factors Engineering
HR: Heart Rate
ICU: Intensive Care Unit
LOS: Length of stay
NIRS: Near Infra-Red Spectroscopy
pCO2: Partial pressure of carbon dioxide
RCT: Randomized controlled trial
REB: Research Ethics Board
SpO2: Oxygen saturation by pulse oximetry
T3™: Tracking, Triggering and Trajectory
UCD: User-centered design
List of Appendices
Appendix A: Systematic Search Strategies ................................................................................. 157
Appendix B: Qualitative Study Assessment Tools ..................................................................... 161
Appendix C: Quantitative Study Assessment Tools ................................................................... 163
Appendix D: Cognitive Load Assessment Tool and Statistical Analysis ................................... 166
Appendix E: Critical Decision Method Sample Questions ......................................................... 179
Appendix F: Summary of major and minor usability problems ................................................. 185
Appendix G: Usability Tasks, Checklist and Detailed Data ....................................................... 189
Chapter 1 Introduction
1.1 Rationale
Intensive care operates in a complex sociotechnical environment tightly bound to highly
specialized human work and advanced monitoring and intervention technologies. These multiple
monitoring and automated therapeutic devices increasingly overtake the bedside space, a crowding that has
been described as “congestion.”1-4 Interdisciplinary teams of specialized clinicians use the
abundance of continuous, high-resolution data, often supplied by a fragmented technological
infrastructure, to make critical decisions. Physicians, nurses and respiratory therapists must
communicate and make decisions as a team but individually, mentally integrate continuous data
from at least eight continuous parameters in intensive care, and thousands of pieces of
documented clinical information.5-7 Research suggests that in this setting clinicians experience
data overload and mental fatigue, and make errors, in part due to these unharmonized, disparate
technologies.4,6,8-10 These consequences could explain the high turnover and burnout experienced
by intensive care clinicians.11-13 The mental integration of these multiple, continuous data
streams is beyond human capabilities. Thus, clinicians may always feel they are missing data,
and must therefore exercise judgement under uncertainty.14 Technologies that integrate and display
dense clinical data are seen as a boon to critical decision making because they address the issue
of seemingly perpetually incomplete data.
At the individual clinician level, two important benefits of data integration through software are:
1) reduction of the number of physical monitors present at the bedside and 2) decrease in the
mental burden of gathering and processing data and information from multiple sources. Since
13.7% of common errors responsible for system failures were due to reporting or communicating
information15 and 37% of preventable errors occurred during verbal communication of
information,4 intensive care teamwork may be improved by systems-based information technologies
(IT) that support data communication. Passing dense streams of data and accumulated
information through rotating shifts of clinicians is no longer efficient. The use of technology and
improved information accessibility has been recommended as a key strategy to prevent medical
errors and related adverse events.16 However, implementing technology does not guarantee
improved information communication. Clinicians must be able to efficiently navigate IT
interfaces to complete their tasks. Rapidly developing technologies require clear descriptions of
user needs and thought processes to be well designed for cognitive work.
Visual analytics, defined as “discovery, interpretation, and communication of meaningful
patterns in data,” may support clinicians with complex decision making but current versions in
healthcare have shown poor usability.17-20 Research in cognitive psychology has repeatedly
indicated that our working memory can hold between five and nine chunks of information21,22
that we can effectively perceive relationships between no more than two parameters.23,24 For
these reasons the ICU environment in particular has been described as “cognitively complex.”8,25
Therefore, to better design data integration and visualization technologies, or DIVTs,
manufacturers must be in tune with the needs, capabilities and limitations of the clinician.26,27
This thesis project centered on T3™, a DIVT which was deployed, for the first time, in the
pediatric medical surgical and cardiac critical care units of the largest Canadian pediatric
hospital. It dynamically displays continuous patient data and its primary functions are to Track
physiologic signals, display Trajectory, and Trigger decisions, by highlighting data or estimating
risk of patient instability. The recent T3™ version displays all parameters from the physiological
monitor, including vital signs, end-tidal CO2 and intracranial pressure and, more recently, parameters
from mechanical ventilators and infusion pumps. Findings from this thesis may be used to
inform design of similar DIVT for intensive care.
To support clinicians with their cognitive work, we use a Human Factors (HF) approach, a
scientific discipline that studies how users interact with tools, technologies, processes and
environment. According to the Clinical Human Factors Group, “Human factors, […] in a work
context, are the environmental, organisational and job factors, and individual characteristics
which influence behaviour at work.”28 Human factors research, applied to the clinical setting,
aims to design ICU technologies for improved usability, effectiveness, efficiency, satisfaction
and enhance clinical performance.28 The characterization of HF issues related to DIVTs may
increase the awareness of risks with regards to clinician decision-making and lead to design
improvements. It is widely accepted that the user-centered design (UCD) framework, initiated
early, leads to lower costs, successful implementation, and prevention of medical errors.29-42
Adapting technology to the human component of sociotechnical systems can be achieved
through a widely-accepted UCD framework. The International Organization for Standardization’s UCD
standard, ISO 9241-210, is illustrated in Figure 1. The UCD process is iterative and driven by
testing the technology with the intended users until it meets their needs. T3™, version 1.6, was
the DIVT deployed at this study’s main clinical site, and the work described in this thesis is one
cycle within the UCD process.
[Figure 1 box labels: Plan the human-centred design process; Understand and specify the context of use (ICU, MMM, patients); Specify the user requirements (ICU team and their tasks); Produce design solutions to meet user requirements (prototypes of data integration technology); Evaluate the design against requirements (human factors studies); Iterate, where appropriate; Designed solution meets user requirements.]
Figure 1. User-centered design cycle for interactive systems specified by ISO 9241-210
The ISO 9241-210 standard,43,44 “user-centered design (UCD) for interactive systems”, consists of six phases: 1) planning the UCD process, 2) understanding the context of use, 3) understanding the user needs, 4) designing solutions as prototypes, 5) evaluating the prototypes and finally, 6) designed solution meets user needs. Additional iterations are triggered by the results from design evaluation and are indicated by the dashed arrows. Components related to the integration of multimodal monitoring (MMM) data, by clinicians, in the intensive care unit (ICU) are specified at the bottom of each box.
1.2 Thesis Overview
This thesis describes how explicit and discrete technological sources of data and information
impact the decision-making process. Furthermore, findings from the evaluation of an existing,
commercial DIVT that aims to comprehensively integrate the multiple technologies may inform
the future design directions of this and similar technologies. This thesis consists of three
components: 1) a description of the cognitive processes clinicians use when making data-driven
decisions using monitoring and DIVTs (systematic review); 2) an investigation of the nature of
individual and team cognitive work within the constraints of a disparate information environment
(cognitive task analysis); and 3) a heuristic assessment and a usability evaluation of the
DIVT T3™.
The overall objective of this thesis is to advance the design of DIVTs as decision-support for
intensive care, using T3™ as a test case, by improving its intuitiveness, practicality, and ease-of-
use. The three secondary objectives and the associated research questions are:
Objective 1. To describe current human factors research of data and integration technologies for
intensive care (O1).
Research Question 1a. In critical care, what and how does data and information from
continuous monitoring technology impact clinician decision making?
Research Question 1b. In critical care, what is the measured impact of DIVT on
clinician performance?
Objective 2. To describe the cognitive process of critical decision making and the role of
technological sources of information (O2).
Research Question 2. How does the distributed technological ICU environment impact
physicians’, nurses’ and respiratory therapists’ critical decision making?
Objective 3. To evaluate the interface of a state-of-the art DIVT, T3™, using Heuristic
Evaluation (O3a) and usability testing, with clinicians (O3b).
Research Question 3a. How well does the software interface adhere to accepted design
principles?
Research Question 3b. How can the software interface be improved to support
physicians, nurses and respiratory therapists with tracking of the patient state, communicating its
trajectory, and triggering data-informed decisions?
Chapters 2 to 5 of this thesis represent the phases of the research project. Chapter 2 provides a
current understanding of how technological information sources are used in the ICU and their
impact on clinician decision making. Chapter 3 describes the cognitive processes involved in
expert decision making. Chapters 4 and 5 are two evaluation phases of T3™, both with
clinicians, at different levels of technological data integration. As some of these chapters are
reproduced from stand-alone entities, certain information in the introduction and methods
sections may be redundant. Chapters 4 and 5 of this thesis are reproduced verbatim from
manuscripts. Chapter 6 of this thesis summarizes the key findings and major original
contributions of this work. A graphical depiction of the relationship between the research
objectives, methods to address research questions, and chapters of this thesis is shown in Figure
2.
[Figure 2 content — Methods used to address research questions, with corresponding thesis chapters: Systematic Search of Human Factors of ICU Data Integration Technologies (Chapter 2); [RQ1a] Systematic Review of Qualitative Studies (Chapter 2.1); [RQ1b] Systematic Review of Quantitative Studies (Chapter 2.2); [RQ2] Cognitive Task Analysis with Critical Decision Method (Chapter 3); [RQ3a] Heuristic Assessment (Chapter 4); [RQ3b] Usability Testing (Chapter 5); Future High-Fidelity Simulation or In-Situ Usability Testing (Chapter 6.3). Objectives: O1, to describe current human factors research of data and integration technologies for intensive care; O2, to describe the cognitive process of critical decision making and the role of technological sources of information; O3a, to evaluate the T3™ interface. Phases: Establishing need; Understanding the fragmented data context; Testing the data integration solution.]
Figure 2. Summary of thesis objectives, research questions and related chapters. *O: objective; RQ: research question
Chapter 2 Systematic Review of the Human Factors Literature on Data
Integration Technologies for Intensive Care
The objectives of this chapter are to summarize the published evidence on how data is used for
clinical decision making and the impact of data integration and visualization technologies
(DIVTs) on cognitive performance. Here, the term “impact” defines changes in performance
efficiency, cognitive strategies and any other related aspects of cognitive work. Findings from
qualitative and quantitative studies are aggregated using meta-ethnography and meta-analysis,
respectively. Qualitative and quantitative literature reviews were guided by the ENTREQ and the
PRISMA guidelines, respectively.
2.1 Introduction
The integration of continuous, intensive care data, for a given patient, is a priority for designers
of clinical information technology (CIT) systems. As multi-disciplinary intensive care teamwork
becomes more expansive, so too does the scope of data integration needed to meet individual and team
needs.45,46 Technologies need to be both comprehensive and customizable. “Integration” not only
includes data streams from multiple devices, but its condensation over the entire length of stay
(LOS) and onto a single screen for physical convenience. These requirements imply device
interconnectivity, large data storage capacity and visualizations that meaningfully display data
and information. Efforts to meet these challenging requirements have been underway for
decades.47
In 1992, Cunningham et al. introduced MARY™, an interactive computerized trend monitoring
system, an extension of the physiological monitor, into their neonatal ICU.48 The system
displayed multiple physiological data trends of 7 minutes to 3 days, on a single screen. In
comparison, modern bedside physiological monitors display continuous waveforms of
approximately 15 seconds. Clinicians reported believing that MARY™ would help manage
neonatal care and improve their understanding of patient physiology.48 Six years later, a
randomized control trial (RCT) with MARY™ as the technological intervention found no
improvement in patient outcomes.49 One explanation for the system’s ineffectiveness was the
“poor presentation of intensive care data [leading] to late or poor interpretation of developing
pathology.”49 In 1994, a study with respiratory therapists, on the visualization of respiratory data
using visual metaphors, found that decisions could be made twice as fast with the same level of
accuracy.50 Cunningham and Cole each explored physiological monitor and ventilator data but
not their integration with each other and its impact on decision-making. The proposed solutions
were to make data trends both visually appealing and flexible, that is to provide a customizable
and responsive interface with new visual representations.49
For decades, much of the research has focused on specialties outside of intensive care (e.g.,
anesthesiology) so it remains unclear if CIT systems improve intensive care and critically-ill
patient outcomes. New technologies offer denser data visualizations, more responsive interaction
and computationally-intensive algorithms which resolve issues of limited customization and
interaction with massive data streams but without proven benefits to clinicians.51 Studies by
Cunningham et al. collected clinician perceptions and impact on patient outcomes but did not
investigate how these technologies changed data perception and its use.48,49,52 To understand the
impact of technologies on clinicians’ decision-making, this review sought studies which centered
on the technology end-users and thus emphasized a human factors (HF) approach. This review
summarized the published HF literature on data and information used for intensive care and the
impact of data integration and visualization technologies (DIVTs) on clinician decision-making.
To this end, we looked at studies which combined the social attributes of intensive care and
details on the variety of technological data sources. Social attributes included multi-professional
intensive care teams, their use and sharing of information and their clinical tasks, while technical
attributes included use of multiple technologies for critical decision-making.37,45,53 From this
review, designers and technology procurers may gain an understanding of how data is currently
used by clinicians and what the ideal DIVT should do if it is to support clinical decision-making.
2.2 Methods
The systematic search of the HF literature found studies with either quantitative or qualitative
evidence, or a mix of both. Therefore, this methods section contains a common search strategy,
inclusion and exclusion criteria and data extraction categories. Methods for completeness,
quality and synthesis are separated. Results and discussion are also presented separately. Meta-
ethnography was used to aggregate and synthesize qualitative findings and meta-analysis was
used to aggregate raw data for the common metric of cognitive load. Finally, research gaps and
general conclusions are combined based on findings from both groups of studies.
The terms “data” and “information” describe the variety of clinical patient information. In this
review, “data” refers to numerical values and visual representations displayed by technologies
while “information” refers to the processing of data by clinicians for the continuation of
intensive care. In addition, the terms “data source” and “information source” imply all
technological sources that contain the clinical patient information and comprise the clinical
information technology (CIT) system.
This review used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses
(PRISMA) guideline, a standard framework in healthcare.54 To our knowledge there was
no PRISMA-guided review of HF research on data integration, by clinicians or mediated by
technologies.55-58 This review focused on clinical decision-making when clinicians mentally
integrate data from multiple technological sources (fragmented CIT systems) or when they use a
single technology to integrate data (integrated CIT system).
2.2.1 Systematic Search Strategy
The systematic search was carried out by a qualified librarian, trained in medical research. It
returned studies with three common themes: 1) having a viable DIVT, 2) having intensive care
clinicians as participants, and 3) relating to the intensive care setting. Results from this initial
search were one of two types, either qualitative, describing the impact on clinical decision-
making, or quantitative, measuring change in human performance. Studies using mixed methods
(i.e., generating qualitative and quantitative data) were categorized as quantitative studies.
De Georgia et al. provided a detailed history of computers in intensive care and traced their
wholesale introduction to 2003, at which point we assume there were concerted efforts to
integrate data from multiple technological sources.47 Consequently, the search spanned literature
published from 2004 onward. Searches were conducted in May 2014 and updated in January
2018, in five databases: MEDLINE, Embase, Cochrane Central Register of Controlled Trials,
PsycINFO and Web of Science. Numerous database specific subject headings were selected to
capture the concepts of “intensive care”, “data display” and “human factors.” The Boolean “OR”
was used to combine all intensive care terms, all data display terms, and all human factor terms.
These three sets of terms were combined together using “AND”, limited by publication date and
to English or French language articles only. In all databases, both truncation and adjacency
operators were used to capture variations of word stems and variant spellings. Database subject
headings were exploded, when applicable, to include narrower terms. Database “Used For” terms
were used to generate text word searches to combine with the selected database subject headings.
Two search strategies are included in Appendix A. Google Scholar was used to complement the
systematic search using the terms “human factors”, “data integration” and “intensive care.”
Existing reviews on human factors studies of displays, physiological monitors or data
representations were screened for additional studies relevant to this review.
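The Boolean combination logic described above can be sketched in a few lines. This is a hypothetical illustration only: the term lists below are placeholders, not the actual strategies reproduced in Appendix A.

```python
# Illustrative concept blocks (placeholder terms, not the Appendix A strategies).
icu_terms = ["intensive care", "critical care", "ICU"]
display_terms = ["data display", "data integration", "visualization"]
hf_terms = ["human factors", "ergonomics", "usability"]

def or_block(terms):
    """Combine one concept's synonyms with Boolean OR."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

# The three concept blocks are then intersected with AND.
query = " AND ".join(or_block(block) for block in (icu_terms, display_terms, hf_terms))
print(query)
```

Database-specific features such as truncation, adjacency operators and exploded subject headings would replace the plain quoted phrases in a real strategy.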
2.2.2 Inclusion and Exclusion Criteria
Since the concept of human factors is not well defined in the medical literature, inclusion and
exclusion criteria were used to identify veritable human factors studies. Inclusion criteria were
that studies be original research; were set in, simulated or had participants from the intensive
care unit; and that described the technology type and its functional capabilities as they related to
clinician work. Exclusion criteria were non-ICU applications, settings or participants; having no
prototype DIVT, no focus on explicit sources of data and information, or no integration of more
than one type of data parameter; and conference papers, editorials, opinion pieces or reviews.
Furthermore, studies must have had as a goal to develop or improve the design of a CIT that
integrated continuous and/or intermittent clinical data and explicitly defined those technologies.
Examples of excluded studies were those which focused on the development of the technology
without clinician participants (e.g. not human factors) or on the effect of technology on patient
outcomes (e.g., LOS or rates of infection).59-61 For relevance to real-world use,
studies must have had a tangible, interactive technology and described interface features which
explained the impact on clinician performance. Engineering studies, focused on the back-end
design of the technology, and imperceptible at the clinician interface, were excluded. Examples
of excluded studies were those focused on computational advances in DIVT (e.g. algorithm
development or validation),62 or involving medical device interconnectivity.63,64
2.2.3 Review Process
The reference management software EndNote™ was used to manage all citations (Clarivate
Analytics, PA, USA). Duplicates were removed using the software's duplicate-detection function and manually by the
main author (YL). Articles were screened by title, abstract and full-text. Full-text articles were
screened independently by two authors (YL and PT) who then decided on the final articles which
met inclusion criteria and relevance to each of the research questions. The references of these
final articles, obtained through the systematic search, were hand screened for additional articles.
If there was disagreement, articles were discussed for inclusion among four of the authors
(YL/LK/PT/AMG). References from known reviews of similar DIVT were hand screened by the
lead author (YL). Figure 3 shows the flow chart. The study was registered on Prospero (CRD#
42015020324).
[Figure 3 flow-chart content: Systematic search of databases (n=16,351), using MeSH terms for three concepts (1. ICU; 2. integrating displays; 3. human factors), plus records identified through other sources (n=311). Records remaining after duplicates removed (n=11,213); excluded on the basis of title (n=10,672); records screened by abstract (n=541), excluded on the basis of abstract (n=289); records screened by full-text (n=252), excluded on the basis of full-text (n=224). Reference screening of included records from the systematic search (SS) (n=1,104). Extraction/quality appraisal: SS (n=28), references (n=4). Research Question 1 (In critical care, what and how does data and information from continuous monitoring technology impact clinician decision-making?): qualitative studies (n=5+7, original and updated searches), analysed by qualitative extraction, completeness score (Bowling) and meta-ethnography. Research Question 2 (In critical care, what is the measured impact of data integration and visualization technology on clinicians?): quantitative studies (n=20), analysed by quantitative extraction, completeness score (Peute), quality assessment (QUASII) and meta-analysis for the validated tool (NASA-TLX).]
Figure 3. Flowchart of systematic search and inclusion for review
2.2.4 Separation Criteria
In this two-part systematic review qualitative and quantitative studies were separated using
Chaudhry et al.’s definitions for qualitative and quantitative studies.18 Qualitative studies focused
on explorative barriers and facilitators, while quantitative studies tested hypotheses by
comparing two groups or across time periods using statistical tests to find differences.18
Specifically, separation was made based on whether outcome measures were subjected to
statistical analysis. If a study contained both quantitative (e.g., time to complete task or number
of errors) data and qualitative data (e.g., post-simulation interview or open-ended survey
questions) the study would be categorized as quantitative. The final set of 25 articles included 5
qualitative studies and 20 quantitative studies. The wide-net search and final study selection
resulted in an inclusion rate of approximately 0.2% (25 of 11,213 deduplicated records).
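The separation rule above can be expressed as a toy classifier: any study whose outcome measures were subjected to statistical analysis counts as quantitative, even if it also collected qualitative data. This is an illustrative sketch only; the field names are invented.

```python
# Hypothetical sketch of the qualitative/quantitative separation criterion.
def classify(study):
    """A study with any statistically analysed outcomes is quantitative."""
    return "quantitative" if study.get("statistical_outcomes") else "qualitative"

mixed = {"statistical_outcomes": ["time to complete task"],
         "qualitative_data": ["post-simulation interview"]}
interview_only = {"qualitative_data": ["open-ended survey questions"]}

print(classify(mixed))           # mixed-methods studies are classed quantitative
print(classify(interview_only))
```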
2.2.5 Data Extraction
Two authors (YL and LK) extracted information, from all articles, for the 13 categories listed in
Table 1.
Table 1. Extraction items
Item # Title
1 Source (Author, Year)
2 Study Design
3 Technology description, extent of data integration, and clinical implementation
4 Theory(ies)/Framework(s) used
5 Methods/Procedure
6 Scenarios, Tasks
7 Comparator/control, if any
8 Dependent Variable
9 Setting, Country
10 Clinical specialty and sample size
11 Key Findings
12 Statistical Analysis
13 Validity Domain/Limitations
2.2.6 Data Analysis
2.2.6.1 Methods for Qualitative Studies
2.2.6.1.1 Qualitative Study Completeness
Two reviewers (YL/LK) applied Bowling’s 12-item checklist to assess study completeness,
Table 23 found in Appendix B.65,66 Discrepancies were discussed and the final completeness
score was reached through consensus. We did not appraise the quality of qualitative studies at the
risk of excluding valuable conceptual insights, as suggested by Toye et al.67
2.2.6.1.2 Qualitative Data Synthesis: Meta-Ethnography
We reported our findings according to the ENTREQ statement, a 21-item checklist, from the
perspective of health technology research. Specifically, we looked at how technological
interventions instead of health interventions, affect clinical staff instead of patients. The meta-
ethnography technique was used to synthesize qualitative findings because it was widely used
and suitable for small numbers of studies.68,69 Meta-ethnography is a seven-step procedure, first
proposed by Noblit and Hare, and used Schütz’s concepts of first, second and third order
constructs.69-71 The technique was useful for breaking down individual studies, validating study
author interpretations, aggregating findings and generating new conceptual insights from the
collection of studies.67,70,72 The general steps are shown in Figure 4.
Figure 4. Phases of the meta-ethnography research process.
Noblit and Hare’s seven-step meta-ethnography technique with the first, second and third order constructs. First order constructs are the participants’ own statements, often found in the results section of an original research study. Second order constructs are the author(s)’ interpretation of participant statements, found in the discussion and results sections of an original research study. Third order constructs, are the insights of the authors of a meta-ethnography and is their synthesis output.
For each study, Schütz’ first order constructs (participant statements) were extracted from the
results and discussion sections and second order constructs (study author interpretations) were
extracted mostly from the discussion and conclusion sections, by two of the review authors
(YL/LK).71 The identified second order constructs were organized into conceptual categories,
and rephrased for clarity, consistency and fidelity to the original study. To visualize the
frequency of themes each article was coded using NVivo 8 software and a concept code tree. The
process of coding and rereading the studies revealed other concepts which were rephrased for
clarity and added to the original tree nodes, organized under the original conceptual categories.
Third order constructs are the interpretations of the original studies’ authors’ interpretations
(second order constructs), or as Toye describes: “interpretation of an interpretation.”67 Third
order constructs are the insights from this meta-ethnography by the systematic review authors.
2.2.6.2 Methods for Quantitative Studies
2.2.6.2.1 Quantitative Study Completeness and Quality
Two authors (YL and LK) rated completeness of quantitative studies using Peute et al.’s 52-item
checklist for human factors studies of health information technologies. Discrepancies were
discussed and the final completeness score was reached through consensus. The original
checklist was developed through expert consensus and consisted of essential elements study
authors should report within defined sections of their article.73 Examples of essential study
elements included referencing results from previous human factors studies in the introduction,
including a screenshot of the technology, providing participants’ level of IT experience, etc. The
original checklist was applied to all studies and items absent or rarely present were removed.
Some examples were: release date, linguistic and culture background and potential disabilities.
Two items were added to increase the validity of human factors studies in clinical settings: 1) if
study received ethics approval and 2) if the Delphi or another expert consensus process was used.
The final checklist consisted of 47 items and resulted in a maximum completeness score of 47.
The list is found in Appendix C.
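Checklist-based completeness scoring of this kind reduces to counting reported items. The sketch below is a minimal illustration; the item names are placeholders, not the actual 47 Peute checklist items.

```python
# Illustrative checklist (placeholder items, not the Appendix C checklist).
CHECKLIST = [
    "cites prior human factors studies",
    "screenshot of the technology",
    "participants' level of IT experience",
    "research ethics approval",
]

def completeness_score(reported, checklist=CHECKLIST):
    """One point per checklist item the study explicitly reports."""
    return sum(1 for item in checklist if item in reported)

reported = {"screenshot of the technology", "research ethics approval"}
print(f"{completeness_score(reported)}/{len(CHECKLIST)}")
```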
Quality of human factors studies was assessed using a modified version of the QUality
ASessment Instrument (QUASII).74 This tool was originally designed for health informatics
research but, for human factors studies, three of the original 18 questions were modified.
Specifically, the phrase “implementation of the information system” was changed to “technology
implementation” in item 3; the term “patients” was changed to “clinicians” in item 7; and the
phrase “type of providers” was changed to “technology implementation” in item 8.
Discriminating between points on the original 7-point scale was found to be challenging due to
both the scale and the inappropriateness of the anchor statements to the human factors domain.
With a view to diminish perceived subjectivity, the scale was reduced to 5-points. Guided by the
threats to validity, described by Shadish et al., anchor statements for each item were added at the
midpoint and two endpoints.75 Modifications to QUASII were finalized through author
consensus prior to the assessment of all quantitative articles. The maximum modified QUASII
score was 90. Low IRR for QUASII scores may be explained by its 5-point Likert scale which
requires matched ratings to achieve a higher IRR. Interclass correlation is better suited to
evaluate the match between raters who scored using the QUASII tool. The complete tool can be
found in Appendix C.
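The sensitivity of exact-match agreement on a 5-point scale can be illustrated with a toy example. The ratings below are invented; the point is only that a one-point tolerance (closer in spirit to how intraclass correlation credits near-misses) is far more forgiving than exact matching.

```python
# Invented ratings from two hypothetical raters on a 5-point scale.
rater_a = [4, 3, 5, 2, 4, 3]
rater_b = [4, 4, 5, 3, 3, 3]

# Proportion of items with identical ratings (harsh on a 5-point scale).
exact = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
# Proportion of items within one scale point of each other.
within_one = sum(abs(a - b) <= 1 for a, b in zip(rater_a, rater_b)) / len(rater_a)
print(exact, within_one)
```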
2.2.6.2.2 Meta-analysis of Quantitative Studies
We combined NASA-TLX cognitive load data from four studies using R V3.4.4
(http://www.R-project.org) statistics software and ggplot2, psych, pgirmess, pastecs and car
packages.
2.3 Results and Discussion
2.3.1 Study Selection
The searches returned a total of 9,508 articles from the May 2014 search and 6,843 articles from
the January 2018 search. Figure 3 is the flow chart depicting the study search and selection
process which resulted in 18 articles, from the original search, and 10 articles from the updated
search. Furthermore, hand screening references of reviews on human factors of displays,
physiological monitors or novel data representations resulted in 4 additional articles.47,55,56,58,76 In
this chapter, the 20 quantitative (17 from the orginal search and 3 from the updated search) and 5
qualitative articles (all from the original search) will be presented. The 7 qualitative articles from
the updated search will be integrated into a manuscript for submission to a journal and will be
based on this thesis chapter. Figure 20 in Appendix B shows the proportion of both types of
studies over time.
2.3.2 Review of Qualitative studies
2.3.2.1 Results
Study setting, participant sample population, methods, tasks, main findings and study
completeness are presented in Table 2. Study research focus, ICU data and information sources,
and theories discussed are presented in Table 3. The 21-item checklist of the Enhancing
Transparency in Reporting the synthesis of Qualitative research (ENTREQ) statement69 is
presented in Appendix B.
Table 2. Evidence table for qualitative studies on intensive care clinician use of information sources.

Alberdi (2001)77
Country, Setting: UK, Neonatal ICU, Simpson Maternity Hospital
Participant Sample: 34 physicians (interviews); 10 physicians (simulations)
Methods/Procedures: 1. Interviews (individual); 2. Observation (NICU, 8 sessions, 13.5 hours total) with think-aloud procedure; 3. Off-ward simulations with think-aloud protocol
Tasks: Diagnosis
Main Findings: The study technological system required additional information to be useful; expertise differences are not so much due to different processing skills as to differences in domain knowledge; experts are able to focus on relevant domain features better than less experienced subjects; experts' problem solving is opportunistic
Completeness Score: 9/12

Doig (2011)78
Country, Setting: USA, burn-trauma ICU, medical ICU, and surgical ICU
Participant Sample: 14 nurses
Methods/Procedures: 1. Semi-structured interviews; 2. Cognitive task analysis
Tasks: Hemodynamic monitoring
Main Findings: Nurses perform 4 types of cognitive tasks: 1) selective data acquisition, 2) data interpretation to develop mental models, 3) controlling hemodynamics with monitoring data, and 4) monitoring complex trends; trends should be related to treatment goals
Completeness Score: 10/12

Kannampallil (2013)79
Country, Setting: USA, medical ICU of a large urban hospital
Participant Sample: 8 physicians (mix of staff, fellows, and residents)
Methods/Procedures: Think-aloud protocol during the review of a patient case using each form of patient chart
Tasks: Information gathering
Main Findings: The information-seeking process was exploratory and iterative and driven by the contextual organization of information; greater relative information gain and retrieval from the EMR than from paper records
Completeness Score: 11/12

Koch (2012)80
Country, Setting: USA, three clinical practice settings
Participant Sample: 19 nurses
Methods/Procedures: 1. Observations of ICU nurses in clinical practice settings; 2. Iterative discussion and categorization of field notes; 3. Affinity diagram for observational data
Tasks: 1. Communication; 2. Medication management; 3. Patient awareness
Main Findings: Nurses perform five types of tasks, in order of highest to lowest frequency: 1) Communication, 2) Medication management, 3) Patient awareness, 4) Organization and 5) Direct patient care
Completeness Score: 9/12

Sharit (2006)81
Country, Setting: USA, surgical ICU and trauma ICU
Participant Sample: 11 nurses and 6 physicians
Methods/Procedures: 1. Semi-structured interviews; 2. Hierarchical task analysis (HTA); 3. Task simulation; 4. Verbal protocols; 5. Questionnaires; 6. Post-task interviews
Tasks: Information gathering for treatment plan (task simulation only)
Main Findings: Eliminate or reduce the tendency for a source to provide incomplete information; manually documented information sources were highly rated for completeness, nonredundancy, ease of access and organization by both nurses and physicians
Completeness Score: 6/12
Table 3. Summary of qualitative studies’ information sources used for intensive care

Alberdi (2001)77
Research Focus: Data used for diagnosis and hypothesis testing
Technological Context: MARY™ computerized physiological trend monitoring system
Information Sources: 1. Patient’s physical appearance; 2. Procedures conducted; 3. Settings of the machinery attached to the patient (ventilator and incubator settings); 4. Clinical tests and examinations (arterial blood samples and X-rays); 5. Changes to the computerized monitor display (changing the axis scale or requests to scroll back to previous data blocks); 6. Colleagues’ knowledge about the patient; 7. Calibration of probes or leads
Theoretical Frameworks (Supported +, Refuted -): + Opportunistic reasoning by experts; + Skills and processing differences between novices and experts; + Experts possess superior domain knowledge and domain representation; + Experts make more use of biomedical knowledge

Doig (2011)78
Research Focus: Design for four cognitive tasks of hemodynamic monitoring: 1. selective data acquisition; 2. data interpretation; 3. controlling hemodynamics; 4. monitoring complex trends
Technological Context: Existing hemodynamic monitoring displays (hemodynamic variables from the main monitor with data from the electrocardiogram, pulse oximeter, and ventilator display)
Information Sources: 1. Pulmonary artery (PA) catheter; 2. Physical and cognitive assessment findings; 3. Urine output; 4. Laboratory data
Theoretical Frameworks (Supported +, Refuted -): - Abstraction hierarchy and Ecological Interface Design; - Information processing approach to behavior modelling; + Functional approach to behavior modelling

Kannampallil (2013)79
Research Focus: Comparing sources of information and enriching integrated electronic health record systems
Technological Context: Paper and electronic medical records (with greater detail and 24h trends)
Information Sources: 1. Paper charts; 2. Electronic medical charts; 3. Bedside physiological monitors; 4. Support personnel
Theoretical Frameworks (Supported +, Refuted -): + Information foraging theory

Koch (2012)80
Research Focus: Determining the most common nursing tasks and the data required to support those tasks
Technological Context: Information displays at the bedside (undisclosed commercial developer)
Information Sources: 1. Electronic health record; 2. Electronic medication administration record; 3. Vital signs monitor; 4. Intravenous pumps; 5. Ventilator
Theoretical Frameworks (Supported +, Refuted -): + Three-level situation awareness

Sharit (2006)81
Research Focus: Perceived usefulness of all types of data and information available in the ICU, ranked by participants using a 5-point Likert scale
Technological Context: The use of various paper-based and electronic information sources in the ICU
Information Sources: 1. Bedside flowsheet; 2. Kardex; 3. Patient chart; 4. Hospital health information system; 5. Clinical administrative research and education system; 6. Telephones, pagers, face-to-face communication, bedside physiological monitors
Theoretical Frameworks (Supported +, Refuted -): + Hammond’s model of using both intuitive processing and analytical processing
2.3.2.1.1 General Description of Qualitative Studies
There was very good agreement between two reviewers (YL/LK) with no more than a 2-point
difference in completeness scores and an interrater reliability of 83.3%. Four out of the 5 studies
had a completeness score of at least 9 out of 12.
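A percent-agreement statistic of this kind can be computed as follows; this Python sketch uses hypothetical item-level ratings from two raters, not the actual review data.

```python
def percent_agreement(rater_a, rater_b):
    """Proportion (as a percentage) of items on which two raters agree exactly."""
    if len(rater_a) != len(rater_b):
        raise ValueError("rating lists must be the same length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)

# Hypothetical item-level completeness ratings (0/1) from two raters over 12 items
a = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1]
b = [1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1]
print(percent_agreement(a, b))  # two disagreements out of 12
```

Percent agreement is simple but does not correct for chance agreement; statistics such as Cohen’s kappa or the intraclass correlation address that limitation.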
Geographically, 4 studies were set in the USA and 1 in the UK. All studies were conducted at
universities or university-affiliated hospitals and 1 study included a community hospital among
its three sites.80 Study participants in 4 studies were either physicians or nurses,77-80 while 1 study
included a mix of both.81 In 1 study, respiratory therapists were mentioned as exclusive receivers of
blood gas data, to the disadvantage of nurses’ patient awareness.80 Respiratory therapists typically
use this type of physiological data to drive their decision-making regarding, for example, mechanical
ventilation. The difficulties with current CIT systems experienced by both
physicians and nurses, and possibly respiratory therapists, could signal inefficiencies in team
care. No study compared how these obstacles affected the overall efficiency of the team.
Four studies used multiple methods to triangulate data.77,78,80,81 The number of methods ranged
from one to six. The most common method was interviews,77,78,81 followed by simulations with
think-aloud protocol,77,79,81 and observations.77,80
2.3.2.1.2 Description of Qualitative Study Technology
The aim of all qualitative studies was to inform the design of CIT for intensive care decision-
making. While all studies described commercial CIT, only 1 study described the systems in
enough detail to relate interface features that hindered or supported task completion.77 Table 3
provides details of study technologies, information sources and theoretical bases. From 2000 to
2014, there was a wide variety of DIVTs.
The CIT systems described were fragmented, and even essential vital signs were sometimes
unobtainable. In all studies, clinicians requested more data than was available, suggesting the
current systems did not integrate sufficient information sources. Clinicians requested information
including patient appearance, historical procedures, ventilator settings, arterial blood gas results
and changes to the physiological monitor.77 For the task of hemodynamic monitoring, nurses
always required blood pressure, heart rate, and cardiac output data.78 Thus, clinicians could not
easily access and perceive pertinent information in the fragmented data environment.
Due to the continuous care provided to patients in the ICU, DIVTs should span time horizons for
the entire LOS to be useful to all clinicians. Of the 5 study technologies, 1 was a dedicated
physiological monitor that integrated data from the entire LOS, while 2 were EMR systems
with trend functions, presumably for the entire LOS.78,79 The other 2 studies described
fragmented data systems in which physicians required the past 10 days’ data and nurses the past
few hours of data.80,81 Thus, the literature established some baseline requirements for the
appropriate time horizons for physicians and nurses.
In 2006, physicians and nurses preferred paper records over other information technologies
because they were more complete, non-redundant, easier to access and better organized.81
Increased experience with information technologies and more technological integration was
expected to change this preference.81 In 2013, time spent using either paper or electronic medical
records was equivalent but more information was gained using the EMR owing to better
structuring of data and information.79 This suggests EMR systems have not yet been optimized to
phase out paper records, perhaps due to sub-optimal electronic interface interaction.
Technologies assessed in these studies did not offer the responsive, real-time interaction that is
now possible.60,82-84
Studies emphasized the indirect nature of clinical decision-making. For example, setting rigid
goals was inappropriate. Doig et al. proposed designing technologies that support processes
such as “achieving stability as efficiently as possible” instead of the single goal of “achieving
stability in the system.”78,85-87 Similarly, information gathering is a continuous process of
accumulating and discarding information. Physicians used a process of local optimization,
seeking sources which maximized their information gain.79 The study authors also outlined three
limitations of local optimization: 1) switching between sources, 2) the time and expertise
required to develop a successful search strategy, and 3) search strategies that varied across
physicians. Thus, the literature suggests the
process of information use is ongoing and that technologies should be designed for
individualized exploration of data; that is, they should be flexible.
2.3.2.2 Discussion of Meta-Ethnography of Qualitative Studies
The synthesis technique of meta-ethnography “does not aim to summarize the entire body of
available knowledge” but to generate conceptual insight and to develop ideas.67 By
deconstructing studies into first and second order constructs, translation of all studies into each
other was possible. In total, 58 references were associated with first order constructs (direct
quotes from clinicians) and 511 references were associated with second order constructs (the
study authors’ interpretations of first order constructs). Second order constructs were organized into four
conceptual categories: 1) information and data processing by clinicians (152 code references), 2)
features and factors of clinical decision-making (153 code references), 3) features of work and
environment (34 code references), and 4) technology design (172 code references). The four
conceptual categories are summarized in Table 4 and will be discussed in each of the following
sections.
Table 4. Second-order constructs and conceptual categories of clinician use of information in the intensive care.

Second-order constructs, grouped by conceptual category:
Information and Data Processing by Clinicians: Acquisition; Filtering; Prioritization; Integration; Interpretation
Features of Clinical Decision-Making: Expertise; Cognitive Processes; Hypotheses; Mental Models; Error & Uncertainty
Work and Environment: Clinician Tasks; Work Environment
Technology Design: Trends; Information Structure; Information Quality; Features & Functions

Number of the 16 second-order constructs addressed by each study:
1 Alberdi (2001)77: 13
2 Doig (2011)78: 14
3 Kannampallil (2013)79: 12
4 Koch (2012)80: 8
5 Sharit (2006)81: 14
2.3.2.2.1 Information and Data Processing by Clinicians
This section explored technology-mediated steps of information and data processing by
clinicians. Skilled clinicians use the following cognitive tools during critical decision-making:
information seeking, discriminating, analyzing, transforming knowledge, predicting, applying
standards, and logical reasoning.88,89 Meta-ethnography identified five technology-mediated data
processing steps performed by clinicians: 1) acquisition, 2) filtering, 3) prioritization, 4)
integration and 5) interpretation. Information seeking, discriminating and analyzing are reflected
in the five technology-mediated steps. Technology-mediated information retrieval and filtering
were time-consuming since physicians had to “sift through a large amount of unwanted data” from
the hospital information system (e.g., the CIT system)81 or iteratively develop their process of
local optimization (e.g., by accessing resources found to maximize information gain) to get
meaningful information with the least effort possible.79 Making data easily
accessible was an important technological feature to support this initial step of data acquisition.80
Data acquisition, filtering and prioritization can be off-loaded to DIVTs that collect multi-device
data streams and pre-select or highlight data streams based on clinician preferences or specialty.
Integration is a data processing step that involves the mental integration of continuous data
streams by the clinician. This process can be partially supported by the physical integration
of data into a single unifying DIVT. The first order construct terms associated with “integration”
were “data integration”, “big picture”, “long-term trends”, “trends generated by clinicians”,
“parametric interrelationships”, which resulted in 61 references from two studies.78,81 In particular,
fragmented technologies were limited in their ability to support clinicians in relating
parameters and may even have hindered clinician integration and decision-making. From Alberdi: “A
desirable feature of decision support would be the presentation of data in such a way that
relevant links amongst parameters are highlighted” (second order construct).77 From Doig:
“[Novice nurses] can look at a number, some can tell me, [expert nurse], the normal range, but
then how it relates to a patient’s normal physiology is a difficult concept” (first order
construct).78 Display format was one solution to support multiparametric data interpretation
tasks. Visual metaphors that represent the human physiological system by “depicting variables
as shapes that visually resemble and behave in a similar manner to the physiological system”
were suggested.78,90 Aesthetic features included familiar color schemes, better representation of
alarming values, increased font size of vital sign values, and provision of patient trends over the last
hours.80
The challenges with data integration were compounded by the difficulties with the interpretation
of data by clinicians, defined as the use of “biomedical knowledge to make sense of surface
physiological patterns that can be plausibly explained by more than one hypothesis.”77,91 EMR
system data, available as tables and graphs, “aid in easier interpretation and comprehension of
information.”79 However, foreseeable challenges involved the contextualization and
conceptualization of data.77,78,81 To contextualize data, senior physicians requested supplemental
information more frequently than junior physicians did.77 Also, continuous data that could
not be abstracted into concepts risked being ignored: “if a nurse cannot conceptualize the
meaning of a parameter or see the big picture, then the data will go unused.”78,79 Further
qualitative inquiry is necessary to understand how composite data from tables, graphs and novel
visualizations can support contextualization and conceptualization.
In sum, meta-ethnography revealed five technology-mediated steps of data processing found in
five qualitative studies. Clinicians can be supported by designing technology features to support
each step or a group of steps. First, the clinician’s data and information acquisition (Step 1) can
be supported by technologically integrating all data and information streams available to them
from multiple IT sources (e.g., from medical devices, paper charts and electronic medical
records). Second, the clinician’s filtering and prioritization (Steps 2 and 3) of data from the
environment can be supported by technological functions such as artefact-cleaning algorithms or
pre-defined data subsets. Integration of data (Step 4), may only be partially supported by
technology, to help clinicians “understand the big picture” through visual trending with clinician-
specific temporal windows and cognitive supports for contextualization and conceptualization of
the data. Finally, interpretation (Step 5), leading to action or a critical decision, remains
the responsibility of the clinician, as no study has explored shifting decision-making to an
artificial intelligence.
2.3.2.2.2 Features of Clinical Decision-Making
The literature suggests that technology-mediated clinical decision-making was affected by
differences in expertise, cognitive processes, use of hypotheses and mental models, and control
for error and uncertainty.
Expertise is a term encompassing types of clinical specialties, their domain knowledge, their
experience with technology, their seniority and any training they received. Comparisons of the
utility of technological information sources were between either nurses and physicians or
between junior and senior clinicians. In 2001, junior doctors interacted the least with the
computerized monitor compared to nurses and senior doctors.77 In 2006, physicians, compared
to nurses, viewed computer-based sources as more accurate than paper records.81
Most recently, in 2011, novice nurses were “get[ting] caught up in the numbers” and required
support to understand the “relationships between constrained hemodynamic variables, […] links
between treatments and the range of possible hemodynamic responses.”78 Over time, the
increasing amount of data available and the decreasing usability of technologies could explain
these observed differences, suggesting there may be a critical amount of data that can
meaningfully inform clinical decision-making.
In summary, the literature suggests an element of chance in the use of data in the fragmented CIT
system. Physician reasoning and problem-solving was opportunistic, meaning that physicians
exploited “whatever knowledge sources [were] available in the task”,77 and adaptive,
“synchronizing with the choices available in the environment.”79 These processes emerged naturally from sub-
optimal CIT systems and are reasons for the high rate of preventable errors.8 Efforts to address
these random characteristics of critical decision-making should be emphasized when designing
CIT systems.
2.3.2.2.3 Features of Clinical Work and Environment
The ICU is a time-pressured, data-intense, cognitively complex and interruption-laden
environment further complicated by the intensity of care and the severity of illness.8,92
Descriptions of the ICU data environment reiterated the constraints of time pressure78,81 and data
fragmentation.79 Physicians and nurses had difficulty correlating information between
information sources and differentiating between complete and incomplete information because of
the disparate sources and the persistent use of paper records.81 As a consequence of the physical
distances, an inefficient strategy of “iterative back-and-forth switching” between information
sources was described.79 These features affected the efficiency of physician diagnosis77,79 and
nurses’ patient monitoring and medication management.78,80
A limited number of clinician tasks was investigated in this pool of studies. In particular, Doig
focused on the general task of hemodynamic monitoring and sub-tasks of: 1) selective data
acquisition, 2) data interpretation, 3) controlling hemodynamics and 4) monitoring complex
trends.78 Koch described five types of nursing tasks: 1) Communication, 2) Medication
management, 3) Patient awareness, 4) Organization, and 5) Direct patient care.80 Both Koch and
Doig suggest that, to support nurses, information should be integrated based on their tasks.78,80 All
studies emphasized that their technology design recommendations were specific to the tasks
described.
2.3.2.2.4 Technology Design Recommendations from Qualitative Studies
Research on technologies to support continuous multimodal monitoring indicates a move towards
automated integration, new multi-parameter indicators and new visualizations of large arrays of
raw data.5,51,60,93,94 Qualitative studies recommended that ideal clinical information technologies
should provide trending functions, structured and organized data and information, artefact-free
data and support for the interpretation of multiple parameters.
2.3.2.2.4.1 Trending
Trending was a general term described in most studies. Physicians would integrate 10
days’ worth of flowsheets to understand patient trends81 while nurses preferred short-term
trends over a 12- to 24-hour period.78 The temporal condensation of data onto a single
platform requires interaction with a high degree of control to allow fluid data navigation.
Simply providing all data without aiding its interpretation has not worked.77 Algorithms to
highlight and condense multiple parameters were suggested in several studies.77,79 The literature
also indicated limitations of CIT systems in facilitating access to multiple data streams and long-
term time horizons.78,80,81 That is, data were physically and temporally fragmented and clinicians
had difficulty accessing them. The literature indicated that clinicians already integrated data
across multiple sources and that current technologies did not mirror their cognitive data
integration processes.
Suggestions were also made for new types of data displays.78 Automatically generated visual
trends were a prominent desirable feature because they supported clinicians’ “assessment of the data
and propitiate rapid and effective decision-making in emergency situations.”48,77-79 A study on
how clinicians used long-term time-series trending found that further technological advancement
was necessary.51 Specifically, their physiological monitoring system should be supplemented
with 1) algorithms to automatically identify and interpret relevant monitored patterns, 2)
“intelligent” alarms that use the system’s interpretation to warn staff, and 3) summaries of
monitored events over extended periods at a high level of abstraction.77 Nurses preferred their
selective, handwritten, and memory-based trending strategy over automated trending because it
showed interrelationships between physiological variables and interventions.78 This acquisition,
filtering, prioritization and integration should be replicated by DIVTs to support the
understanding of trends.
2.3.2.2.4.2 Information Structure and Quality
Paper flowsheets offered a higher rate of information gain (information gain/time spent) than the
EMR due to their single location and inclusion of nursing notes and flagged physiological data.
More unique information was gained from the EMR, suggesting lower redundancy of data.79
Integration of multiple physiological parameters offered single-source functionality, but
technological immaturity resulted in artefacts which rendered the data unreliable.77 These two
observations suggest that continuous data are made meaningful through the input of the various
clinical experts typically responsible for writing the clinical notes, that is, for “contextualizing”
the data. It follows that EMR systems containing comprehensive clinical notes (i.e., interpreted
data) must populate continuous data technologies for these data to be useful to clinicians.
To this end, another important aspect of technology design is the augmentation of commercial
CIT systems, such as EMR systems, with continuous physiological trends. The monopoly of
physiological monitors on the data they generate means that new technologies encounter obstacles
to the integration of patient data. Therefore, it is important to study commercial systems and
define the medical devices from which the data streams are combined. This knowledge gap in the
understanding of commercial DIVTs has been highlighted by Chaudhry and De Georgia.18,47
Commercial technologies were the focus of the qualitative studies, but no screenshots were
provided, making it difficult to understand more deeply the difficulties encountered by clinicians at
the CIT interface. Specifically, details of the types of technologies (e.g., advanced physiological
monitors, EMR systems with trend capabilities), the sources of information across medical devices,
and descriptions of the software packages were absent. Qualitative studies rich in technological
detail can thus supply a perspective on interface design which may explain effects on clinical
decision-making.
2.3.2.3 Limitations
All studies had limited generalizability since the observed patterns of information use for clinical
decision-making were highly dependent on the task and clinical specialty. Moreover, they
focused on routine decision-making and tasks, such as hemodynamic monitoring78 or physician
diagnosis under simulated scenarios.77 It may be useful to study technology use under novel,
unpredictable and dynamic situations. Meta-ethnography illustrated a wide variety of facets
emanating from a fragmented information environment. To explore expert decision-making in
novel situations, analysis techniques such as the critical decision method may be used to follow
the cognitive steps and the related critical data.
Expert clinicians must dedicate a greater proportion of their cognitive skills to the last step of
information processing (interpretation). DIVTs could support earlier information processing
steps. To understand how clinicians can better interpret data in novel situations, future qualitative
studies should center on technologies with a high level of data integration that contain both
physiological monitoring and intervention data, span specialty-specific time windows and are
able to contextualize data trends.
Due to the multiple problem facets of intensive care decision-making and the small number of
qualitative studies, meta-ethnographic synthesis was not amenable to refutational syntheses
(where findings contradict each other) or reciprocal syntheses (where findings are directly
comparable). Instead, findings were taken together and interpreted as a line of argument.67,70 As
clinical data streams converge on complete integration, further qualitative research and more
opportunities to refute or reciprocate findings are expected.
The flexibility of qualitative methods made extraction of first and second order constructs
difficult. When more than three methods were used, less data was presented, leading to unclear
relations between study author interpretations (second order constructs) and clinician-reported
perceptions (first order constructs).81 Therefore, when using multiple qualitative techniques, first
order clinician perceptions should be provided to enable triangulation of data and validation of
author interpretations.
2.3.2.4 Summary of Qualitative Findings
In sum, the strategies to complete the steps of data acquisition, filtering, prioritization,
integration and interpretation were opportunistic and time-consuming. To gather information and
understand trends, error-prone work-around strategies were used by both nurses and physicians.
While some CIT systems offered automated trending of data, these trends needed to be
contextualized to support conceptualization. In addition, qualitative studies by nature explore
human subjects’ perceptions of their work and environment and so place little importance
on technological features, to the disadvantage of design guidance. Therefore, meta-
ethnography on perceptions of information technologies requires that qualitative studies include
comprehensive descriptions of technologies and an intent to relate cognitive work to
technological functions (tasks performed by the technology).
2.3.3 Review of Quantitative studies
2.3.3.1 Results
This section describes quantitative studies that reported a measured impact on clinician
performance with greater technological detail than qualitative studies. The initial systematic
search from May 2014 resulted in 17 quantitative studies which met inclusion criteria and the
updated search in January 2018 resulted in 3 additional studies.
2.3.3.1.1 Study Completeness and Quality
For the 20 quantitative studies, study completeness ranged from 27 to 43, out of a maximum
score of 47. Study quality ranged from 46 to 79, out of a maximum score of 90. Both scores are
presented in Table 5. Interrater reliability between two raters was 69.5% for study
completeness. The lowest QUASII scores were associated with a low degree of technology
implementation (not implemented in the ICU), limited generalizability due to single-site settings
or participant populations, simple statistical analysis (reporting only averages and standard
deviations) and minimal discussion of confounders.
Table 5. Evidence table of study data integration and visualization technology
Study Country,
Setting
Sample Technological
Context
Study Design/Methods/Measures Completeness
and QUASII
scores
1 Ahmed
(2011)95
USA, Intensive Care Unit, in an academic tertiary referral center
6 attending physicians and 14 residents/fellows
Novel User Interface Electronic Environment - Task-specific user interfaces - EMR user interface showing
Design: Prospective, unblinded, randomized cross-over study, Methods: Low-fidelity simulation Measures: Cognitive load (NASA-TLX), Number of errors in cognition, Time to task completion (in seconds), Total quantity of data presented
38/47 78/90
2 Anders
(2012)96
USA Two University teaching hospitals intensive care units
32 ICU nurses, 16 at each site
Integrated graphical information display
Design: Repeated measures Simulations study Methods: Low-fidelity simulation Measures: Percent correct detection of abnormal patient variables, Nurse task load (NASA-TLX), Display usability rating
35/47 74/90
3 Drews
(2014)97
USA, Applied and Basic Cognition Laboratory
42 ICU nurses Configurable vital signs (CVS) display
Design: between-subjects experimental design Method: Low-fidelity simulation Measures: Response time, Accuracy of data interpretation, Workload (NASA-TLX), Clinical desirability of CVS display
37/47 74/90
4 Dziadzko (2016)98
USA, ICUs at two tertiary hospitals
246 before (existing EMR) and 115 after (existing EMR + novel EMR interface) surveys
AWARE, a novel .NET based application extracts and organizes relevant patient data using ranking and decision rules
Design: Before and after implementation survey Methods: Survey Measures: User satisfaction and usability
38/47 72/90
35
Study Country,
Setting
Sample Technological
Context
Study Design/Methods/Measures Completeness
and QUASII
scores
5 Effken
(2006)99
USA, Arizona Health Sciences Center
20 novice ICU nurses, 13 medical residents with ICU rotation, 3 expert ICU nurses 3 expert ICU physicians
Etiologic potentials display (EPD)
Design: Mixed design 2 (order) × 2 (display) × 4 (scenario) × 3 (level) Methods: High-fidelity simulation with interactive simulator Measures: Treatment initiation time, percentage of time patient variables were kept within a target range
36/47 70/90
6 Effken
(2008)100
USA, University of Arizona
32 adult ICU nurses Ecological display (ED) with variables ordered
Design: Mixed experimental design Methods: Written pretests and simulation Measures: Critical event recognition, Treatment efficiency, Cognitive workload, User satisfaction
40/47 76/90
7 Ellsworth (2014)101
USA, Neonatal ICU, Mayo Clinic
23 NICU respondents: 8 attending physicians, 2 fellows, 4 nurse practitioners, and 9 pediatric residents
98 unique data items available from the EMR
Design: Web-based survey Methods: Web-based survey Measures: Mean importance score for data items used in NICU routine clinical decision-making
38/47 70/90
8 Forsman
(2013)102
Sweden ICUs of a university hospital and two general hospitals
15 physicians (ethnographic study) and 8 physicians (usability testing)
Integrated information display for patient infection status to inform antibiotics use
Design: Ethnographic studies and participatory design Methods: Observations, Semi-structured interviews, Focus group, Usability testing with eye tracking Measures: Time
36/47 59/90
9 Görges
(2011)103
USA University of Utah Health Sciences Center, break room
16 medical ICU nurses
2 far-view physiological monitoring displays: - strip-chart/bar display - circular, clock-like display
Design: Repeated-measures within-subject experimental design Method: Low-fidelity simulation Measures: Decision time, Decision accuracy, Workload scores, Display preference
37/47 76/90
10 Görges
(2012)104
USA University of Utah Health Sciences Center, break room
15 ICU fellow physicians
2 far-view physiological monitoring displays: - strip-chart/bar display - circular, clock-like display
Design: Randomized repeated measures within-subject design Method: Low-fidelity simulation Measures: Decision time, Decision accuracy, Workload scores, Display preference
34/47 74/90
11 Koch
(2013)105
USA Nurses' break room, burn Trauma Intensive Care Unit (BTICU)
12 experienced BTICU nurses
Integrated information display with information used for comparable tasks was shown in spatial proximity
Design: Counter-balanced (on display order), repeated-measures design Method: Simulations requiring participants identify information about medication management, patient awareness, and team communication Measures: Situation awareness (accuracy of the participants’ answer) and Task completion time
43/47 74/90
12 Law
(2005)106
UK Neonatal Intensive Care Unit (NICU)
32 nurses and 16 physicians
displays presented in a research version of BADGER trend monitoring system
Design: Mixed design Method: Simulations requiring participants perform 8 types of actions: order chest X-ray, intubate or re-intubate, re-apply transcutaneous probe, start dopamine, treat with surfactant, put baby on High Frequency Oscillatory Ventilation (HFOV), start Continuous Positive Airway Pressure (CPAP), or No Action Measures: Speed of responses, quality/appropriateness of responses, reported preference
33/47 73/90
13 Liu (2004)107
Sweden ICU of University Hospital and Usability laboratory
Interviews: 6 ICU nurses Usability testing: 20 medical nursing students
Graphical circular display prototype of a ventilator machine display
Design: Within subject design Method: Interviews and usability testing
Measures: Detection time and error rates
32/47 70/90
14 Miller
(2009)108
Australia ICUs of two major metropolitan tertiary teaching hospitals
16 nurses and 12 physicians
Work domain analysis (WDA) based paper prototype (PP) and electronic prototype (EP) of a clinical information system
Design: Within-participants, 2 (control and prototype) x 4 (four patient data sets) counterbalanced design Methods: Simulated scenarios where nurses detected changes in patient parameters and physicians completed diagnostic tasks
Measures: Detection of patient change (nurses) and Physician diagnostic agreement
35/47 66/90
15 Peute
(2011)109
The Netherlands Clinical workspace
12 participants (unspecified specialty)
Web-based Data Query Tool of the Dutch National Intensive Care Evaluation (NICE)
Design: Pre-post design, think-aloud usability study Method: Usability testing Measures: Number of usability issues and task efficiency
34/47 46/90
16 Pickering
(2010)110
USA Remote testing facility
6 off-duty critical care fellows and residents
AWARE (Ambient Warning and Response Evaluation), program which extracts a subset of predefined cues based on relevance
Design: Prospective, randomized, cross over pilot study Methods: Simulated scenarios to extract a sequence of decision-making cues
Measures: Cognitive load (NASA TLX), number of medical error and efficiency (time and accuracy)
33/47 76/90
17 Pickering (2013)111
USA, 3 ICUs
1,277 physician-patient interactions and 925 questionnaires
Institutional EMR system integrating vital signs, microbiology, medications, laboratory results, fluids, nursing flow sheet items, and clinical notes
Design: Prospective observational study and retrospective chart review Methods: Observations and questionnaires
Measures: Frequency of data elements used per physician-patient interaction episode
39/47 79/90
18 Pickering (2015)112
USA, 4 ICUs 375 clinicians (physicians, nurses, respiratory therapists and pharmacists), 169 survey respondents
AWARE (Ambient Warning and Response Evaluation), program deployed in the ICUs compared to EMR system
Design: Step wedge cluster randomized control trial Methods: Direct observations and surveys Measures: Time to gather information
41/47 79/90
19 van der Meulen (2010)113
UK Neonatal Intensive Care Unit (NICU)
18 physicians and 17 nurses
Natural language generation (NLG) using BT-45 computer program, summarizing physiological data
Design: Mixed experimental design Methods: Simulations where participants must select appropriate actions Measures: Response time and scores (by expert consensus, three clinicians rated actions as appropriate, inappropriate or neutral)
30/47 64/90
20 Wachter (2005)114
USA, Medical Intensive Care Unit (MICU), University of Utah Hospital
32 clinicians (physicians, residents, nurses and respiratory therapists) attending to 2 ventilator-dependent patients
Pulmonary function graphical and numerical display with fraction of inspired oxygen (FiO2), end tidal carbon dioxide (EtCO2), tidal volume (VT) and anatomical representation of intrinsic positive end expiratory pressure (iPEEP)
Design: Observational study design Methods: Observations and questionnaires Measures: Number of glances at display, perceived usefulness, acceptance, desirability and accuracy of display
27/47 50/90
2.3.3.1.2 Study Settings, Participants or Patient Data Sets
Of the 20 studies included in this review, 14 were conducted in the USA (6 at the University of Utah and 6 at the Mayo Clinic), 3 in continental Europe, 2 in the UK and 1 in Australia. Four studies were conducted at off-site laboratories,97,100,107,110 11 in clinical spaces used as ad-hoc simulation rooms,95,96,99,102-106,108,109,113 and 2 at the bedside or on the unit.112,114 Six studies involved 2 or more sites;96,98,102,108,111,112 in 1 of these, one site served as the prototype development site and the other as the test site.102
Half of the studies focused on a single profession, either nurses (6)96,97,100,103,105,107 or physicians (4).95,102,104,110 Four studies involved both nurses and physicians,99,106,108,113 1 added respiratory therapists,114 and 1 involved the complete ICU bedside team.112 Participant sample size ranged from 6 to 375. When specified, 4 studies were set in adult,95,100,108,110 2 in neonatal,106,113 and 2 in burn trauma intensive care,102,105 and 1 in a mix of medical, surgical and trauma intensive care units.112
2.3.3.1.3 Study Design, Methods, Metrics and Overall Outcomes
Most studies used a prospective, repeated-measures design and simulation methods. Among the 20 studies, 10 types of metrics were used to measure the impact of the DIVT on clinician performance. Table 6 summarizes the metrics used by each quantitative study. The 4 most common metrics related to time (14 studies), either for task completion or for making a decision; quality of decision (11); cognitive load (7); and user preference between traditional systems and the new technology (13). The first 2 metrics were objective measures while the last 2 were subjective. Time was measured in the context of action (e.g., time to initiate a decision), waiting (e.g., time within a target range) or gathering information (time to complete data-gathering tasks). Quality of decision was typically evaluated against a scorecard devised by a team of experts familiar with the scenarios.
Table 6. Metrics used in quantitative human factors studies of data integration and visualization technologies
+ indicates positive impact; (ns) indicates not significant, – indicates negative impact; m indicates mixed impact or ranked; nc indicates not compared; *Calculated in two ways from number of correct responses and incorrect responses; ** comparison between types of groups or categories
Metric | Studies 1-20 | Total
Studies 1-20: (1) Ahmed (2011)95, (2) Anders (2012)96, (3) Drews (2014)97, (4) Dziadzko (2016)98, (5) Effken (2006)99, (6) Effken (2008)100, (7) Ellsworth (2014)101, (8) Forsman (2013)102, (9) Görges (2011)103, (10) Görges (2012)104, (11) Koch (2013)105, (12) Law (2005)106, (13) Liu (2004)107, (14) Miller (2009)108, (15) Peute (2011)109, (16) Pickering (2010)110, (17) Pickering (2013)111, (18) Pickering (2015)112, (19) van der Meulen (2010)113, (20) Wachter (2005)114
1 Task completion or Decision Time: ✓+ ✓+ ✓ns ✓ns ✓** ✓+ ✓+ ✓+ ✓ns ✓- ✓+ ✓+ ✓+ ✓ns (14)
2 Task completion rate: ✓** ✓+ ✓ns (3)
3 Time within target range (time and accuracy): ✓ns ✓ns (2)
4 NASA Task Load Index: ✓+ ✓ns ✓ns ✓ns ✓ns ✓+ ✓+ (7)
5 Accuracy (Appropriate actions or Errors): ✓+ ✓+ ✓+ ✓+ ✓+ ✓+ ✓- ✓- ✓+* ✓+ ✓- (11)
6 Quantity of data/information on screen or used: ✓+ ✓** (2)
7 Self-reported usability/Satisfaction/Preference (on scale): ✓+ ✓+ ✓** ✓+ ✓** ✓** ✓+ ✓+ ✓+ ✓+ ✓** ✓+ ✓ns (13)
8 Number of usability issues: ✓m (1)
9 Looking or accessing display: ✓ns ✓** ✓+ (3)
10 Questionnaire/Interviews (comments): ✓ ✓** ✓ ✓ (4)
Total number of types of metrics per study (studies 1-20): 4 3 5 1 2 4 1 5 4 4 2 3 4 1 3 3 2 2 3 3
Time, as a measure of efficiency, was used in 14 studies. In 8 of those studies, clinicians were more efficient with the new technology compared to a traditional or previous version of the DIVT.95,97,103-105,110,112 In 2 studies by Effken, the composite measure of time within target range captured both time and accuracy of decision.99,100 In 1 study, accuracy rates and completion times were compared at three levels of situation awareness: perception, comprehension and projection.105 Peute adjusted time to complete task using the system designer's known optimal, or fastest possible, completion time.109 Decision-making accuracy improved in 8 out of 11 studies with the new DIVT compared to traditional information systems95-97,103-105,110 or when electronic charts were compared to paper charts.108 Subjectivity in scoring decision quality was minimized through independent expert consultation; in 3 studies the Delphi process was used.96,100,111

Seven studies measured cognitive load using the NASA-TLX instrument95-97,100,103,104,110 and 10 reported usability or preference on a scale.96,97,100,102-104,106,107,112,114 Cognitive load improved in 3 out of 7 studies, while the remaining studies reported non-significant change. In 7 out of 9 studies, participants preferred the new technologies over existing systems.

Other measures included heat maps of eye gaze,102 number of glances at the display from the bedside,114 and number of usability issues.109 The mix of positive and non-significant overall outcomes suggested publication bias was not present in this pool of studies; see Table 6.
2.3.3.1.4 Study Scenarios, Tasks and Realism Fidelity
All studies used simulated clinical scenarios except 3, which used direct observations112,114 or live patient data feeds from the ICU.110 Of the 6 studies with scenario descriptions,95-97,99,100,108 5 described at least one scenario involving sepsis or septic shock.96,97,99,100,108 Table 7 summarizes
each study’s level of simulation realism, scenario characteristics and types of tasks.
Table 7. Summary of study realism, scenario description, and types of tasks performed.
Studies 1-20: (1) Ahmed (2011)95, (2) Anders (2012)96, (3) Drews (2014)97, (4) Dziadzko (2016)98, (5) Effken (2006)99, (6) Effken (2008)100, (7) Ellsworth (2014)101, (8) Forsman (2013)102, (9) Görges (2011)103, (10) Görges (2012)104, (11) Koch (2013)105, (12) Law (2005)106, (13) Liu (2004)107, (14) Miller (2009)108, (15) Peute (2011)109, (16) Pickering (2010)110, (17) Pickering (2013)111, (18) Pickering (2015)112, (19) van der Meulen (2010)113, (20) Wachter (2005)114
Study realism level
Simulated ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
In-situ/direct observations ✓ ✓ ✓
Questionnaire/Survey ✓ ✓ ✓
Study assessment of realism ✓ ✓
Scenario descriptions (if described)
Sepsis/septic shock ✓ ✓ ✓ ✓ ✓
Pulmonary embolus/edema ✓ ✓
Stable patient ✓
Actively bleeding ✓
Post-operation ✓ ✓ ✓
Acute respiratory distress syndrome ✓ ✓
Abnormal cardiac rhythm ✓ ✓
Mild pericardial tamponade ✓
Tasks
use of antibiotics ✓
mechanical ventilation ✓ ✓
database query ✓
continuous infusions ✓ ✓ ✓
2.3.3.1.5 Study Technology Characteristics
2.3.3.1.5.1 Study Technology Extent of Data Integration, Prototype Maturity and Temporal Representation
New DIVTs aim to integrate data efficiently across longer time spans and multiple discrete devices. In 2006, Effken noted the need for their study technology to display trends;99 studies after 2008 provided this functionality.96,97,102-104,108 Qualitative studies stress the need to link medical intervention "causes" to physiological response "effects." Study technologies in this review integrated intervention data from infusion pumps,103,104 antibiotic use records102 and mechanical ventilation107,114 with basic vital signs. When vital signs were reduced to averages alone, they became insufficient: physicians requested that "systolic and diastolic values be added on the [metaphor visualizations of] mean arterial blood pressure plot of both bar and clock displays."104 Nine studies compared their designed technological intervention to the traditional data source, and all but one showed improvement; see Table 8. A common requirement was the addition of multiple physiological data streams. These examples illustrate the contemporary progress of technological integration and a requirement that technologies provide a comprehensive, minimum set of parameters, beyond the traditional physiological monitor of a few waveforms and constantly refreshed numerical values. Systems must be able to show all available data while highlighting the basic vital signs.
Critically ill patients fall outside the normal patient population; their current state must be compared to their own previous, unique, stable state. Historical temporal data representations therefore enable clinicians to understand what is "normal" for each patient and to assess individual improvement or deterioration. Temporal data representation could be inferred when study authors used the terms "trending,"97 "trajectory"96 or "projection."105 Table 8 summarizes study technology realism levels, type of comparator and temporal representation.
Table 8. Study technology realism, comparator and temporal representation
Studies 1-20: (1) Ahmed (2011)95, (2) Anders (2012)96, (3) Drews (2014)97, (4) Dziadzko (2016)98, (5) Effken (2006)99, (6) Effken (2008)100, (7) Ellsworth (2014)101, (8) Forsman (2013)102, (9) Görges (2011)103, (10) Görges (2012)104, (11) Koch (2013)105, (12) Law (2005)106, (13) Liu (2004)107, (14) Miller (2009)108, (15) Peute (2011)109, (16) Pickering (2010)110, (17) Pickering (2013)111, (18) Pickering (2015)112, (19) van der Meulen (2010)113, (20) Wachter (2005)114
Technology realism
Paper ✓
Computer static ✓ ✓
Paper slide deck
Computer slide deck, with some interaction ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Fully-interactive, dynamic ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Comparator
Traditional data source ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Other display ✓ ✓
Other specialty ✓ ✓
Temporal representation
Current ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Historical (explicit): ✓ all stay; ✓ 12h; ✓ 12h; ✓ 30m, 53m; ✓ 5d; ✓ 45m
Historical (implicit) ✓ ✓ ✓ ✓
2.3.3.1.5.2 Study Technology Visualization
Nine studies compared integrated displays with new visualizations. As more parameters and their continuous data streams are integrated onto a single DIVT, the risk of visual pattern overload increases. Solutions include selective text summaries and visual metaphors. Computerized natural language text summaries did not show improvement over time-series visualizations.106,113 Given that the comparator was the visual time-series data trend, a desirable feature, the text summaries may not have provided the objective data clinicians consistently rely on. Time series were the primary visual representation of data trends, except in 4 studies which integrated data for patient status assessment.95,107,110,112 Metaphor representations were used in 6 studies.96,97,103,104,107,114 Metaphor representations of physiological data improved accuracy of decisions, but not for respiratory parameters.107 This suggests that while basic vital signs and new visualizations were typically studied, respiratory data have been explored less, and rarely with their typical end users, respiratory therapists. This may also point to a research gap in the integration of other physiological systems, including neuromonitoring using cerebral oxygenation and other composite indicators. While hemodynamic monitoring was studied, further research in the post-cardiac-surgery population and cardiac intensive care specialties may also be explored in the future.
2.3.3.1.6 Meta-Analysis of Cognitive Load Impact
Since effect sizes were not typically calculated, a meta-analysis could not be performed in the traditional way. Due to the variety of controls, tasks, displays and clinical professions, the meta-analysis was instead an aggregation of raw NASA-TLX data obtained directly from study authors, using baseline data for paper systems.96,103,104,115 The median, interquartile range and sample size for each of the six dimensions of cognitive workload, self-reported on a 21-point scale, are reported below and presented in Figure 5. The scale anchor points are presented in Appendix D. For each of the cognitive load dimensions, in pairs of display conditions, we tested whether there were differences in cognitive load scores.

Significant differences were found between all three electronic displays and the paper control for the dimensions of mental demand, physical demand, temporal demand, performance and effort. There was no significant difference in frustration scores between any electronic display and paper. The median mental demand scores were lower for all electronic visualizations compared to paper, with scores of 10 for the electronic display (7-13, n=89), 8 for tabular and bar displays (6-11.5, n=63), and 8 for new visual metaphors (6-12, n=63), compared to 17 for paper (14-19, n=26). The median temporal demand scores were also lower for all electronic visualizations compared to paper, with scores of 8 for the electronic display (6-11, n=89), 7 for tabular and bar displays (6-11, n=63), and 7 for new visual metaphors (5-11, n=63), compared to 16 for paper (14.25-19, n=26). The median performance scores improved (a lower score means better self-reported performance) for all electronic visualizations compared to paper, with scores of 6 for the electronic display (3-11, n=89), 6 for tabular and bar displays (4-11, n=63), and 6 for new visual metaphors (4-11, n=63), compared to 14 for paper (11-16, n=26). The median effort scores were lower for two of the electronic displays, with 8 for tabular and bar displays (5.5-11.5, n=63) and 7 for visual metaphors (6-11, n=63), compared to 10 for paper (10-12, n=26). Physical demand and frustration did not improve with any electronic display compared to paper, except for a small improvement in physical demand, with a score of 1 for new visualizations (1-2, n=63) compared to 3 for paper (1-3.75, n=26).

Comparing the electronic displays with each other, only tabular and bar displays had a significantly lower effort score, of 8 (5.5-11.5, n=63) compared to 11 for the electronic control (7-14, n=89). In summary, mental and temporal demands were lower, and performance better, for all three visualizations compared to paper, and effort was lowest with the new visualizations. Except for the paper control, the results for the other display types pooled data from 2 or 3 studies. Descriptive statistics and non-parametric statistical test results are included in Appendix D.
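The pooling described above can be sketched in a few lines; the following is a minimal illustration only, in which the `summarize` helper and the `pooled` list are hypothetical stand-ins (the values shown are not the actual study data):

```python
# Hedged sketch of the pooling used for the NASA-TLX meta-analysis: raw
# per-participant scores are concatenated per display condition, then
# summarized as median and interquartile range (Q1-Q3).
# NOTE: the `pooled` scores below are illustrative, not the study data.
from statistics import median, quantiles

def summarize(scores: list[float]) -> tuple[float, float, float, int]:
    """Median, Q1, Q3 and n for one display condition and TLX dimension."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q1, q3, len(scores)

# e.g. pooled mental-demand scores for one hypothetical display condition
pooled = [7, 8, 10, 10, 11, 13, 13]
med, q1, q3, n = summarize(pooled)
print(f"median {med} (IQR {q1}-{q3}, n={n})")
```

Because the pooled distributions are ordinal and non-normal, the pairwise display comparisons reported above would then use non-parametric tests, as noted in Appendix D.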
Figure 5. The six dimensions of cognitive load (mental demand, physical demand, temporal demand, performance, effort, and frustration) for four display conditions: paper control (Dal Sasso 2015),115 electronic controls (Görges 2011 and 2012, Anders 2012, Dal Sasso 2015), tabular or bar graphs (Anders 2012, Görges 2011 and 2012) and novel visualizations (Anders 2012, Görges 2011 and 2012). *Pcontrol: paper control; Econtrol: electronic control; TabBar: tabular or bar graph visualization; NewVis: integrated visualization with clock and infusion representations or integrated visualization
2.3.3.2 Benchmarking Quantitative Studies
Given the high resource demands of HF studies, especially for intensive care, reporting studies
with a high degree of completeness and quality is essential. Peute et al.’s completeness checklist
and the QUASII tool were used to evaluate studies. Furthermore, the two scores can be used to
both design and assess study performance. Each item on the completeness checklist can be used
to extrapolate quality of the study. In theory, for every 1 completeness point, 1.91 points in
quality can be gained (e.g. the maximum quality divided by the completeness scores, 90/47). The
difference between extrapolated and actual quality can be taken as the added performance, see
Table 9.
Table 9. Study completeness, quality and performance analysis. Studies in bold have a positive
added performance where the study quality is higher than the projected quality based on study
completeness.
Study Author, year Completeness
(max 47)
Quality
(max 90)
Projected
quality
Added
performance
Pickering, 2010 33 76 63.2 12.8
Law, 2005 33 73 63.2 9.8
Gorges, 2012 34 74 65.1 8.9
Liu, 2004 32 70 61.3 8.7
Anders, 2012 35 74 67.0 7.0
van der Meulen, 2010 30 64 57.4 6.6
Ahmed, 2011 38 78 72.8 5.2
Gorges, 2011 37 76 70.9 5.1
Pickering, 2013 39 79 74.7 4.3
Drews, 2014 37 74 70.9 3.1
Effken, 2006 36 70 68.9 1.1
Pickering, 2015 41 79 78.5 0.5
Effken, 2008 40 76 76.6 -0.6
Dziadzko, 2016 38 72 72.8 -0.8
Miller, 2009 35 66 67.0 -1.0
Wachter, 2005 27 50 51.7 -1.7
Ellsworth, 2014 38 70 72.8 -2.8
Koch, 2013 43 74 82.3 -8.3
Forsman, 2013 36 59 68.9 -9.9
Peute, 2011 34 46 65.1 -19.1
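The extrapolation behind Table 9 can be reproduced directly from the two scores. A minimal sketch, using study values taken from Table 9 (the function name is ours, chosen for illustration):

```python
# Sketch of the Table 9 benchmarking calculation: projected quality is the
# completeness score scaled by the ratio of maximum quality to maximum
# completeness (90/47 ≈ 1.91); added performance is actual minus projected.
SCALE = 90 / 47

def added_performance(completeness: int, quality: int) -> tuple[float, float]:
    """Return (projected quality, added performance) for one study."""
    projected = completeness * SCALE
    return round(projected, 1), round(quality - projected, 1)

# Values from Table 9: (completeness, quality)
studies = {
    "Pickering, 2010": (33, 76),
    "Koch, 2013": (43, 74),
    "Peute, 2011": (34, 46),
}
for name, (comp, qual) in studies.items():
    proj, added = added_performance(comp, qual)
    print(f"{name}: projected {proj}, added {added}")
```

Running this reproduces, for example, Pickering 2010's projected quality of 63.2 and added performance of 12.8.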
Studies with a positive added performance score were defined as having a higher quality than their projected quality. Positive added performance studies used realistic interfaces, had some interactivity and used objective metrics. Surveys, observations and convoluted study designs had negative added performance. To increase experimental validity, new HF studies could use Peute's checklist and the QUASII tool to check and report their experimental design.
As new technologies and their iterations are launched and tested, research assessing the impact on decision-making efficiency could benefit from standard protocols using scenarios and tasks from the established research groups referenced in this review. Methodologies, including scenario descriptions, test data sets and scoring metrics, can be made available for researchers aiming to test their own candidate technologies. Effort can also be reduced by referencing standard scenarios and annotated data sets from collective databases such as the IMPROVE database103,104,116 and MIMIC II.117 As a starting point, these data libraries should be referenced, merged and grown to include a greater variety of data sets, parameters and patient scenarios, beyond the initial 7 parameters and 50 oxygen-transport-related annotated patient scenarios in the case of the IMPROVE database. Basic parameters, such as heart rate and blood pressure, are starting to be benchmarked according to subset ICU populations such as pediatric patients.5 Collaboration among the research groups can accelerate the design of promising technologies through these benchmarking simulation protocols and the creation of repositories of annotated continuous data sets to populate DIVTs for HF testing. A large repository could also facilitate creation of cohort patient data sets that target intensive care specialties.118 Furthermore, prioritizing standard scoring metrics for decision-making accuracy can greatly support research groups in their assessment of new metaphor visualizations.
2.3.3.3 Discussion of Meta-Analysis and Findings of Quantitative Studies
Meta-analysis was used to compare the cognitive load of four types of visualizations, drawing on 3 studies and a baseline study of paper systems.96,103,104 Pickering et al. found a strong correlation between cognitive load and the number of medical errors.110 As a proof of concept, the meta-analysis of the cognitive load metric, the NASA Task Load Index (NASA-TLX), was performed by calculating averages for groups of technologies against a baseline paper process. Figure 5 showed that metaphor visualizations lowered perceived mental demand, temporal demand and effort compared to paper systems, electronic controls, and tabular and bar visualizations. Although there was variation in experimental conditions and technologies, this was useful for gaining an initial understanding of the effects of denser data visualizations on each of the six facets of cognitive load.
2.3.3.4 Limitations
A main limitation of most studies was that control groups did not comprehensively represent the complete clinical information system (e.g., paper charts, the EMR system, and other dedicated monitors). In reality, the information system may consist of combinations of an EMR, a paper chart, other stand-alone monitoring and intervention devices, and so on. Identifying these technological components, and designing human factors studies that compare all sources with those integrated in the novel DIVT, would support a process of divestment from unnecessary technologies. In addition, as several platforms are at an advanced development stage or have been commercialized,51,77,110,119 future studies could compare DIVTs to each other to find the most appropriate DIVT.
Another limitation of the meta-analysis was the description of the electronic control and whether some of its data representations were in tabular form. We also used a paper-based nursing process to provide a baseline measurement of cognitive load, and thus generalized the paper-based data presentation.

As with the qualitative studies, the quantitative studies focused on physicians and nurses. Multiple perspectives of the intensive care team contribute to continuous care for a given patient; therefore, additional perspectives should be included in future studies so that findings may be generalized to the complete ICU team.
2.3.3.5 Summary of Quantitative Findings
This systematic review of quantitative human factors studies encompasses a variety of study designs, methods, task scenarios and outcome measures. The main limitation to understanding the interaction between clinicians and a central DIVT (the study technology) is that none integrated all clinical data parameters. The most common study design was simulation-based and used outcome measures of time efficiency, accuracy of decision, cognitive load, and self-reported user satisfaction or preference. Studies published from 2004 to 2016 described a trend from paper prototypes to fully-implemented ICU technologies. These information integration technologies supported either an instantaneous assessment or a dynamic, trend-based summary with algorithm-based projections of patients' evolving status. All technologies aimed to provide a high degree of interface interaction, but less than half offered real-time interaction. As the technology matures and is optimized to support clinical decision-making, its effects may be viewed as viable interventions, akin to medical interventions, with clear outcomes on patient care (e.g., length of stay, and cost savings to hospitals).120 Access to a vast ICU network was why one technology in particular was systematically tested, refined and used in an in-situ randomized trial.112 In the absence of such a network, researchers, designers and hospital technology managers may need to look to the published literature, and to reviews such as this one, for guidance on benchmarked outcome measures, scenarios and data sets, and preferred reporting items for human factors studies.
2.3.4 Research Gaps
Several research gaps were found in the systematic review:
• The highly specialized workforce and team decision-making require that additional perspectives, such as those of respiratory therapists and pharmacists, be included in future studies. Few studies included respiratory therapists or pharmacists, yet both contribute to the decision-making process by using data from intervention technologies (e.g., ventilators and infusion pumps) and when comparing changes in performance efficiency.
• The five data and information processing steps were defined for individual decision-making, but an understanding of how teams share and transfer information, and communicate, should also be studied.
• Qualitative studies lacked explicit descriptions of the information sources and DIVT features that could be integrated to guide user-centered design.
• The information sources to be integrated should be prioritized based on the cognitive processes employed by different clinicians and according to their needs.
• The dynamic process of decision-making using technology-mediated clinical data should be described.
• Findings on commercial clinical information systems and their features should be provided.
• Human factors testing of commercial EMR systems' data visualization functions and interactions is needed to measure the impact on decision-making in terms of time efficiency, accuracy of decision, and cognitive load.
2.3.5 Longitudinal Studies with Qualitative and Quantitative Components
Within the scope of this two-part systematic review, we found interrelated qualitative and quantitative studies; see Table 10. For example, Law and van der Meulen based their quantitative studies on Alberdi's 2001 and Cunningham's 1992 qualitative studies, thus spanning 18 years of published research. Another example is the quantitative study on integrated displays for intensive care nurses by Drews and Doig,97 based on a qualitative study published three years earlier by Doig, Drews and Keefe.78 Finally, the AWARE system spanned 8 years, with multiple surveys, a qualitative study and a patient- and hospital-level study,101 and was extended to the general inpatient ward.121 These examples show how qualitative and quantitative research informs technology design. Indeed, Carayon described human factors projects as long-term and multi-phase in nature.122 This review suggests studies are geographically, academically and institutionally grouped.
The need for engineering expertise when designing user-friendly interfaces has not been
addressed in our study because we excluded studies at the technology level that focused on the
information technology infrastructure or algorithm development, for example. Our own
experience shows that these types of technologies not only span human factors studies but also
engineering studies which shape the back-end computational demands required to modify the
interface and address usability issues. For example, published research on one commercial DIVT
for intensive care has included algorithm development and human factors testing.60,123,124 The
quantitative human factors studies reported in this systematic review dated from 2004 to 2018.
Over this period, the dramatic evolution of technology interactive capabilities was facilitated by
advancements in storage and retrieval processes. This could benefit technology acceptance, which
relies on the timely responsiveness of a technological system (i.e., responses under 0.1 seconds
are perceived as instantaneous).125 As prototypes become more interactive, quantitative and qualitative
studies further inform an understanding of changes in human cognition and performance.
Table 10. Lifespan of Quantitative Testing for a Given Data Integration and Visualization Technology (quantitative study publication years spanned 2004 to 2016; years of publication are given in parentheses after each study)

Technology | Quantitative studies | Lifespan | Seminal work
1. BABYTALK | Law (2005)106, van der Meulen (2010)113 | 5 years | (1992)48,126
2. Ecological Display | Effken (2006)99, Effken (2008)100 | 2 years | (1993)127
3. Far-View Display and Integrated Display | Görges (2011)103, Görges (2012)104, Koch (2013)105 | 3 years | (S)*
4. Integrated Graphical Information Display (IGID) | Miller (2009)108, Anders (2012)96 | 3 years | (2004)128
5. AWARE | Pickering (2010)110, Ahmed (2011)95, Pickering (2015)112 | 6 years | Multiple qualitative studies in 201698
6. Circular Display Prototype | Liu (2004)107 | Single study | (S)
7. Pulmonary Graphical Display | Wachter (2005)114 | Single study | (2002)
8. Data Query Tool for NICE | Peute (2011)109 | Single study | (2010)129
9. Visual Tool Prototype | Forsman (2013)102 | Single study | (S)
10. Configural Vital Signs Display | Drews (2014)97 | Single study | (2008)130*

Parenthesis indicates year of first published seminal work. S: indicates the study is seminal work; * indicates seminal work was found in the qualitative part of this systematic review.
2.4 Conclusions
This review addressed our two key research questions. The first was to describe the technologies
that supplied continuous data and information and how they were used for critical decision-
making. Qualitative findings described five data processing steps and various cognitive
strategies, developed by physicians and nurses, to navigate a fragmented and inefficient data
environment. Decision support technologies required data contextualization to support cognitive
conceptualization of the current and projected patient states. Meta-ethnography was successfully
applied to the synthesis of five studies. However, extracting first- and second-order constructs was
challenging with multi-method studies. Diagnosing, information gathering, hemodynamic
monitoring and medical management were the most common tasks studied. These studies
describe the landscape of clinician decision-making in the ICU.
The second research question aimed to determine the measured impact of DIVTs on clinician
performance. Though there was a variety of outcome measures for human performance,
measurements of time efficiency and accuracy of decision-making were most common in the 20
quantitative studies. Cognitive load, measured consistently using the validated NASA-TLX
instrument, was the only measure that could be aggregated. This cognitive load outcome measure
indicated that new visualizations were an improvement over paper-based data processes.
Due to the complex nature and high-risk decision-making of critical care, a single focused study
on the problem of data management represents a small piece of the puzzle. Understanding team
decision-making requires that the diverse clinical specialties be reflected in the studies. As
technologies are implemented, any human factors study on them should be synthesized to further
understand the cognitive impact on team care. To this end, the strength of the meta-ethnography
technique to distill meaning from qualitative studies was demonstrated for physicians and nurses
only. If qualitative human factors research is to guide technology developers and hospital
technology procurement, details of the DIVT’s features and functionalities should be provided.
Also, individual qualitative studies should strive to follow a similar reporting structure to be
amenable to meta-ethnographic synthesis, using the ENTREQ statement, for example.
Designing technologies to account for complex human factors including the cognitive work can
benefit from contextualized data and quantitative test phases. This review highlights the variety
of integrating technologies, comparators, study designs, clinician-participants, settings,
scenarios, tasks and outcome measures. We found that although technological descriptions were
limited, studies did offer insight into technology design. Human factors studies that test
technological solutions may also provide qualitative findings on the impact at a given design
cycle, but more may be understood through qualitative studies with rich technological detail.
Technology-focused qualitative studies may provide a deeper understanding of why performance
improved or worsened, or of the cognitive processes technology can support. We found that, as a
whole, the collection of studies from this review describes the requirements for effective data
integration and visualization technologies.
To accelerate development and standardize clinical information systems at the point of care,
academic and commercial developers should share findings about their human factors research.
We propose a checklist for reporting qualitative data of technologies used in clinical settings
with a view that such research would be periodically synthesized into systematic reviews. We
encourage the sharing of scenarios, descriptions, data sets and custom quantitative metrics, all
developed by a laborious process of experimental development with ethics approval, to be
categorized for ICU specialty and professions, and shared in a common database for future
human factors testing.
Chapter 3 Technology-Mediated Macrocognition of Intensive Care Teams:
Investigating How Physicians, Nurses, and Respiratory Therapists Make Critical Decisions
The objective of this study was to identify the macrocognitive processes, and related technology-
mediated information sources, occurring during critical decision-making by physicians, nurses
and respiratory therapists. The chapter is formatted as a manuscript and was submitted to a peer-
reviewed journal.
3.1 Abstract
Importance: It remains unclear how different intensive care specialties make data-driven critical
decisions using technology.
Objective: To identify the technology-mediated cognitive processes used in critical decision-
making by physicians, nurses, and respiratory therapists.
Design: Open-ended interviews were conducted, recorded, and transcribed. Technology-
mediated cognitive processes were analyzed using deductive and inductive coding based on the
macrocognition framework.
Setting: Interviews were conducted at locations convenient for participants: either in closed
hospital rooms or nearby research offices.
Participants: Four physician intensivists, four nurses, and four respiratory therapists were
interviewed.
Main Outcomes and Measures: Themes included macrocognitive processes and the relationships
between them during critical decision-making, and explicit references to technological data
sources.
Results: Across specialties, over half of critical decision-making macrocognition was devoted to
Sensemaking, Anticipation, and Communication, and most macrocognitive processes were
technology-mediated. Physicians primarily used technologies to extract information whereas
nurses and respiratory therapists also used them to input information and manipulate settings. Of
particular note, physicians and respiratory therapists extracted information for their own use,
while nurses extracted information to communicate to others.
For physicians, all ten macrocognitive processes were interrelated (with Problem Detection being
the central process), suggesting that data integration and visualization technologies (DIVTs)
should support their need to shift between cognitive tasks during critical decision-making. For
example, detection of a potential problem may necessitate (a) monitoring a certain parameter, (b)
anticipating tests, and (c) managing uncertainty and risk. Managing Complexity was central to
nurses’ macrocognition and involved managing direct care while attending to family, team, and
organizational requirements. These results highlight the need for nursing DIVTs that expedite the
input and extraction of information required to support other team members. Uncertainty and
Risk Management was the central macrocognitive process for respiratory therapists, often
involving troubleshooting ventilation-related technologies, indicating that DIVTs should support
them in this task.
Conclusions and Relevance
Using the macrocognitive framework, we dissected critical decision-making from representative
perspectives of team care. Sensemaking and Anticipation were found to be highly technology-
mediated and therefore, amenable to technological solutions. This study provides evidence to
systematically address multiple facets of decision-making by defining specialist-specific
macrocognition and its related technological components.
3.2 Background
High acuity and medically complex patients require clinicians to make critical decisions under
hurried and stressful conditions. To make informed data-driven decisions, clinicians are under
increased pressure to integrate multiple data streams with their own and their colleagues’
specialist knowledge. New visualization technologies support the process of decision-making by
representing data (e.g., heart rate on the physiological monitor) and information (e.g., clinical
notes organized in the EMR).131 However, clinicians using these technologies continue to
experience cognitive overload or misinterpret signals presented to them,110,115,124 which may lead
to suboptimal patient care. Therefore, to support efficient data-driven decision-making, we must
understand what technological sources clinicians access, how they use them, and how best to
integrate them for meaningful, high density data visualization.
The critical decision method (CDM) is a knowledge elicitation method used to understand expert
decision-making in complex situations.132 The macrocognition framework is a set of dynamic
and simultaneous cognitive processes, said to occur during situations in which pre-existing rules
are lacking and decisions are not straightforward.133 According to this framework, the primary
macrocognitive processes are: Naturalistic Decision Making, Sensemaking, Planning,
Adaptation, Problem Detection, and Coordination.134 Supporting macrocognitive processes are:
Maintaining Common Ground, Developing Mental Models, Mental Simulation and
Storybuilding, Managing Uncertainty and Risk, Identifying Leverage Points, and Managing
Attention.134 Schubert, et al. used this framework to understand the differences between novice
and expert decision-making in emergency medicine.135 The macrocognition framework has been
used to analyze the effect of care management on primary care practices.136 We chose the
macrocognitive framework to understand how clinicians make decisions in intensive care.
The purpose of this study was to assess how technology-mediated data influenced intensive care
decision-making in physicians, nurses, and respiratory therapists, by understanding the
macrocognitive processes employed by each specialty. We proposed solutions to guide data
integration and visualization through technologies or policies and procedures so that
macrocognitive processes most prominent to each discipline are facilitated.
3.3 Methods
3.3.1 Study Design
The CDM was performed using semi-structured interviews with physicians, nurses, and
respiratory therapists as they recalled when they had to make a critical decision. To deepen the
understanding of the process leading to the critical decision, the timeline of the incident was
revisited several times. By supplementing the interviews with technology-focused questions, we related cognitive
processes with explicit data and information sources (i.e., colleagues, patient assessment,
monitoring devices, therapeutic devices, or documentation technologies).
3.3.2 Setting
Interviews were conducted in clinical office spaces located either near or within the participants’
ICU. As schedules between physicians, nurses, and respiratory therapists varied, choice of
location was based on clinician convenience.
3.3.3 Participants
Twelve participants took part in this study: four physician intensivists, four registered nurses,
and four respiratory therapists, all from the same pediatric intensive care unit. All were
full-time-equivalent staff, and each received a $50 gift certificate.
3.3.4 Procedure
The expert knowledge elicitation technique (CDM) was applied in one-hour semi-structured
interviews conducted in June and July of 2014. A researcher trained in interviews asked
participants to recall a situation where they made a critical decision. Probe questions were based
on those from Crandall’s original CDM 132 and those modified by Baxter, et al.137 Supplemental
questions also prompted recollection of the sources of data and information deemed relevant to
their decision-making processes. Sample probe questions can be found in Appendix E.
Interviews were audio recorded, de-identified, and transcribed verbatim. Research ethics
approval was obtained from the clinician participants’ hospital.
3.3.5 Data Analysis
3.3.5.1 Macrocognitive Processes Coding
Transcribed interviews were coded using NVivo version 8 software for 1) macrocognitive
processes and 2) sources of data and information. Two reviewers (YL and JT) coded interview
transcripts deductively based on a priori macrocognitive processes identified from Klein and
Schubert's frameworks,134,135 and inductively based on emerging processes identified. Raters
reviewed their inductive codes for overlap (i.e., both raters may have identified the same
emerging process principle, but needed to come to consensus on the wording moving forward).
Coding discrepancies were discussed among the coders (YL and JT) and principal investigators
(PT and AMG) until consensus was reached. The set of agreed upon macrocognitive processes
comprised the "analytical framework" that was used by the coders to independently code
subsequent interviews. Cohen’s kappa for inter-rater reliability was calculated using NVivo’s
coding comparison query function. Once inter-rater reliability was above 0.4 on four
independently coded transcripts, the remaining eight were coded by one coder (YL). The 20
macrocognitive processes were reduced to nine from the original set of processes, plus one new
process, shown in Table 11.
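As a worked illustration of the agreement statistic, a minimal sketch of Cohen’s kappa for two raters assigning one code per transcript segment follows. This is a textbook formulation, not NVivo’s exact coding comparison calculation (which weights agreement by character coverage); the function and variable names are illustrative.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one categorical code
    per item (here, one macrocognitive process per text segment)."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: proportion of segments coded identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal code rates.
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)
```

In this sketch, values above the 0.4 threshold used in the study would permit the remaining transcripts to be coded by a single coder.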
Table 11. Macrocognition process codes, adapted from Klein et al. and Schubert et al.134,135 (*new process)

Macrocognitive process | Definition | Verbal references, n (%)
Anticipation | Anticipate how a situation might unfold, see potential problems or needs of the patient, team, and unit, and adjust plans and actions accordingly | 394 (16%)
Interprofessional and interteam communication | Communication and coordination with the ICU team and across internal and external services | 390 (16%)
Managing attention | The use of perceptual filters to determine the information a person will seek and notice | 190 (8%)
Managing complexity | Track and manage multiple patients with complex conditions while attending to family needs, the healthcare team, and organizational and systems requirements | 74 (3%)
Managing uncertainty and risk | The use of skills for coping with uncertainty, which may arise from missing data, from data whose validity is unclear, from ambiguity over competing situation assessments, and from complexity that interferes with sensemaking | 136 (6%)
Problem detection | The ability to spot potential problems at an early stage | 129 (5%)
Self-awareness and self-management | Awareness of own knowledge, capabilities, and vulnerabilities | 80 (3%)
Sensemaking | Deliberate, conscious process of fitting data into a frame | 667 (28%)
Technology management* | Process of troubleshooting problems arising from the function of technology or the management of multiple technologies used in a given situation | 199 (8%)
Time management | Skill of anticipating how long things take and how timing affects patient care | 146 (6%)
3.3.5.2 Data and Information Sources Coding
In the ICU, data and information were sourced from colleagues (e.g., clinical knowledge), the
patient (e.g., their work of breathing), and technology (e.g., vital signs on the physiological
monitor). Specifically, technology sources were any monitors detecting and displaying vitals,
other physiological parameters, or organ support data. These technologies influenced
macrocognition by providing data that clinicians perceived and used to make decisions. To
deconstruct this broad category, sources of data and information, specifically the medical devices
and software, were coded according to the code list of Table 12.
Table 12. Source codes

Code | Description | Terms

Human sources
Colleague(s) | Other clinician, consultation with another department, or calls with another institution | “nurse”, “RT”, “fellow”, “surgeon”
Patient | Assessment of the patient through clinical assessment, including visual assessment, auscultation, palpation/feeling pulse, hearing | Patient/physical/clinical assessment, examination, appearance or behavior, “listening”, “looking”, “feeling”, “watching”
Parent | The parent of the child receiving care from the intensive care team | “parents”, “mom or dad”

Monitoring technology sources
Blood analysis | Analysis of blood composition including dissolved gases, electrolytes or proteins | “blood gas analysis”, “blood work”, “blood test”
Blood pressure cuff | Manual, non-invasive device used to obtain intermittent blood pressure | “blood pressure cuff”, “non-invasive cuff”, “cuff pressure”
Chart (physical) | The physical chart where forms and signed sheets are found | “order sheet”, “order”
Electrocardiogram (ECG) | Technology to monitor heart activity | “ECG”, “12-lead”, “3-lead”, “formal ECG”, “telemetry”, “rhythm telemetry”, “full disclose telemetry”, “main telemetry”
Electroencephalography (EEG) | Technology to monitor electrical activity in the brain | “EEG”
Electronic medical record (EMR) | Electronic medical record accessed through a computer | “EMR”, software platform name
Fluid balance | The recorded change in input and output of fluids in the patient | “Urine output”, “fluid balance”, “catheter and see what’s coming out”, “volume status”, “look at drains”
Intracranial pressure monitor (ICP) | Technology to monitor intracranial pressure | “ICP monitor”
Imaging | Technologies to visualize the internal organs of the patient | “X-rays”, “CT scan”, “Echocardiogram”, “ECHO”
Lab results | Lab analysis from the central lab, as opposed to blood gas analysis conducted in the ICU, typically by RTs | “Blood cultures”, “lab work”
NIRS | Near-infrared spectroscopy | “NIRS”, “cerebral monitoring”
Physiological monitor | The monitor continuously displaying the patient’s vitals using one or several of the following detectors: oxygen saturation probe, invasive arterial lines, transcutaneous CO2 probe, temperature probe | “bedside monitor”, “physiological monitor”, “look up the screen”, “primary monitor”, “vitals”, “CVP”, “end tidal CO2”, “saturations”

Intervention technology sources
Dialysis circuit, extracorporeal membrane oxygenation (ECMO), infusion pumps, mechanical ventilation | Various organ support technologies and medication supply devices | “dialysis machine”, “circuit”, “ECMO”, “the pump, the pressure monitors, all the boxes”, “ECMO pump and all of its values, so there’s about six numbers on there”, “I look at the drugs that I’m infusing”, “ventilator”, “bi-PAP”, “ventilation”, “CPAP”, “end tidal CO2” (mentioned in the context of the ventilator), “nitrous oxide”
To understand how one macrocognitive process led to another (compound macrocognitive
processes) during critical decision-making, matrix queries were used. The matrix query function
of the NVivo software returns the frequency of two nodes (codes) located close together in the
transcribed interviews. The function has been used to find paired relationships between cognitive
and metacognitive behaviors.138 Here, the macrocognition process pairs occurring within a given
paragraph in the transcript reflect the sequence of macrocognitive processes within a clinician’s
line of thought. The frequency of the paired occurrence was normalized using their respective
proportion in each specialist macrocognition distribution. In this study, likely paired
relationships were identified using a 50% cut-off rate of the maximum normalized value, within
each clinical specialty.
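The pairing procedure above can be sketched in a few lines of Python. The exact normalization is not fully specified in the text (“normalized using their respective proportion”); dividing each pair count by the product of the two processes’ proportions is an assumption, and all names are illustrative.

```python
from collections import Counter
from itertools import combinations

def likely_pairs(paragraph_codes, proportions, cutoff=0.5):
    """paragraph_codes: one set of macrocognitive codes per transcript
    paragraph. proportions: each process's share of a specialty's verbal
    references (e.g., 0.28 for physician Sensemaking).
    Returns the pairs whose normalized co-occurrence is at least
    `cutoff` times the maximum normalized value (the 50% cut-off rule)."""
    counts = Counter()
    for codes in paragraph_codes:
        # Count each unordered pair of codes co-occurring in a paragraph.
        for pair in combinations(sorted(codes), 2):
            counts[pair] += 1
    # Normalize counts by the processes' proportions (assumed: product).
    norm = {p: n / (proportions[p[0]] * proportions[p[1]])
            for p, n in counts.items()}
    threshold = cutoff * max(norm.values())
    return {p for p, v in norm.items() if v >= threshold}
```

The normalization compensates for frequent processes (such as Sensemaking) co-occurring often simply because they are common, so rarer but strongly linked pairs can still clear the cut-off.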
3.4 Results
3.4.1 Study Participants
Table 13 shows the participant demographics.
Table 13. Demographics, years of experience, specialization

                        Physicians | Nurses | Respiratory Therapists | Total
Number, n               4 | 4 | 4 | 12
Gender, n
  Male                  2 | 0 | 1 | 3
  Female                2 | 4 | 3 | 9
ICU experience, n
  <=5 years, fellow     2 | 2 | 2 | 6
  >5 years, staff       2 | 2 | 2 | 6
ICU specialization, n
  CCCU*                 1 | 1 | N/A | 2
  PICU**                1 | 3 | N/A | 4
  PICU/CCCU             2 | - | 4 | 6

*CCCU: cardiac critical care unit; **PICU: pediatric intensive care unit; N/A: respiratory therapists in this ICU department serve both the CCCU and PICU.
3.4.2 Inter-Rater Reliability
The unweighted average inter-rater reliability between the two coders for coding macrocognitive
processes was 0.43, considered adequate. The sample contained interviews from two physicians,
one nurse, and one respiratory therapist. Sources were coded by one coder (YL) since the verbal
fragments were synonyms of the codes, see Table 12.
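Since the verbal fragments were near-verbatim synonyms of the source codes, single-coder source coding amounts to term matching. A minimal sketch follows, with a hypothetical, abridged term list (the full code list in Table 12 has many more codes and terms per code).

```python
import re

# Hypothetical, abridged term list; see Table 12 for the full code list.
SOURCE_TERMS = {
    "Colleague(s)": ["nurse", "fellow", "surgeon"],
    "Blood analysis": ["blood gas analysis", "blood work", "blood test"],
    "Physiological monitor": ["bedside monitor", "vitals", "saturations"],
    "Mechanical ventilation": ["ventilator", "CPAP", "bi-PAP"],
}

def code_sources(fragment):
    """Return the source codes whose terms appear (as whole words)
    in a transcribed verbal fragment."""
    text = fragment.lower()
    return sorted(
        code for code, terms in SOURCE_TERMS.items()
        if any(re.search(r"\b" + re.escape(t.lower()) + r"\b", text)
               for t in terms))
```

Whole-word matching avoids spurious hits from short terms embedded in longer words, which is why a plain substring test would not suffice here.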
3.4.3 Macrocognition Processes
There were 2,405 verbal references to 10 macrocognitive processes, see Figure 6. Across all
three disciplines, the ranking of the macrocognitive processes was similar, with Sensemaking,
Anticipation, Interprofessional and Interteam Communication (subsequently referred to as
Communication), accounting for approximately 60% of all macrocognition. Interestingly,
physicians and respiratory therapists exhibited equivalent distributions of macrocognitive processes,
each devoting 34% of their macrocognition to Sensemaking. Nurse macrocognition was more evenly
distributed among the three processes of Sensemaking (23%), Anticipation (18%), and
Communication (20%). Figure 6 shows the similar ranking of macrocognitive processes across
specialties, and Table 14 summarizes how each specialty executes each macrocognitive process.
Figure 6. Distribution by number of verbal references and percentages, within specialties, of macrocognitive processes.
3.4.3.1 Sensemaking
Sensemaking is the understanding of a patient’s current status using data and information from
people and technologies. This process was most common among all clinicians, and was highly
technology-mediated, see Figure 6 and Figure 8. Physicians and respiratory therapists attributed
one third (34%) of their macrocognition to Sensemaking while nurses attributed about one
quarter (23%). All specialties used continuous data from the physiological monitor to understand
the patient’s current state while other technologies differed between specialties, see Table 14.
Physicians:
Physicians systematically separated data according to physiological systems (e.g., respiratory,
circulatory, neurological). They grouped blood analysis data to understand organ status and
overall metabolic function. This suggests they assess at three levels: cellular, organ, and
physiological system. For example, a physician would use the near-infrared
spectroscopy value, measured on the surface of the patient’s forehead, to partially deduce the
status of brain oxygenation: “I’d also look at the NIRS monitor to see what had happened to the
oxygenation in terms of brain.”
Nurses:
Nurses assessed status using basic vitals at the physiological system level, but also incorporated
qualitative features such as behavior, pain, and level of consciousness, which are at the patient
level.
Respiratory Therapists:
Respiratory therapists limited their parameters to those associated with the respiratory system while
incorporating response to respiratory support technologies, to varying levels of technical
complexity (e.g. gas mixture from face mask to extracorporeal membrane oxygenation).
Respiratory therapists also emphasized patient comfort related to intubation, skin color (e.g.,
looking “blue” or “mottled”), vitals, and qualitative assessment of work of breathing and chest
breathing sounds. In this way, different clinicians assessed the patient at varying levels of
abstraction.
3.4.3.2 Anticipation
Anticipation, in critical care, is the process of predicting how a situation might unfold, seeing
potential problems or needs of the patient, team, and unit, and adjusting plans and actions
accordingly. Anticipation was the second most common macrocognitive process for all groups.
Physicians, nurses and respiratory therapists devoted 14%, 18% and 16% of their
macrocognition to this process, respectively.
Physicians:
Physicians described using the process of Anticipation to predict the effect and duration of
interventions (e.g. simultaneous medication infusions and boluses, ventilation, organ support
technologies) through their medical and pharmaceutical knowledge (e.g., the half-life of the
medication). For example, a physician would simultaneously observe and change conditions of
an intervention: “So we were doing interventions and seeing the response and processing what
we could change and not change etc.” Close monitoring allowed them to adapt the plan as they
observed changes in the patient.
Nurses:
Nurses predicted which protocols may be invoked by physicians and planned their support
actions. Some examples include a diabetic patient who would automatically be monitored for
electrolytes and blood gas exchange, or a patient with a Glasgow Coma Scale score of less than 8
who would be intubated.
Respiratory therapists:
Similar to nurses, respiratory therapists described using Anticipation to predict the effect of their
respiratory support.
In sum, the predicted actions by nurses and respiratory therapists were based on the treatment
protocols they felt would likely be applied by physicians. Physicians anticipated ordering more
data, riskier invasive procedures, medical interventions, or coordination with other departments
(e.g., surgery or catheter lab for more data).
3.4.3.3 Interprofessional and Interteam Communication
Communication was the third most common process with nurses devoting 20%, and both
physicians and respiratory therapists devoting 11% of their macrocognition.
Physicians:
Physician communication extended beyond the ICU team to consulting services but was less
technology-mediated.
Nurses:
Nurses devoted more of their macrocognition to communicating patient information than the other
clinical specialties, since they provided the most direct patient care and acted as
gatekeepers to their patients. They communicated within the ICU team, with allied health
professionals, and also during transport within the hospital. Nurses were also the most likely to
pair technologies with Communication, accounting for 14% of their technology references compared
to physicians (2%) and respiratory therapists (5%).
Respiratory therapists:
Respiratory therapists communicated within the ICU team and their specialized group as they
juggled multiple patients during their shift. They required communication with physicians, in
particular, because they shared control of respiratory support technologies.
3.4.3.4 Technology Management
Technology management both supported and hindered decision-making. However, when
technology hindered decision-making, it required clinicians to troubleshoot artefact readings by
checking them with a supplemental clinical examination (e.g., palpating for the pulse) or alternative
technologies (e.g., use of blood pressure cuff, or escalating from 3-lead ECG to 12-lead ECG),
thereby creating inefficiencies. All specialties differed in the types of technologies they
managed, see Table 14.
Physicians:
Physicians used blood analysis and a large variety of imaging technologies much more
extensively than other disciplines. Blood analysis was used to understand organ function while
imaging was used to troubleshoot the operation of technology. For example, physicians used
echography to check the position of the arterial line used for infusion pumps. When the resolution
of bedside heart rate data was insufficient, physicians relied on more detailed sources, such as an
electrocardiogram obtained as a print-out at the bedside or at a dedicated terminal away from the
bedside, to assess heart function. These examples show how data was supplemented to manage
the risk and uncertainty from the readily available data.
Nurses:
Since nurses are stationed at the bedside where ICU technologies are concentrated, they devoted
twice as much of their macrocognition to technology management (12%) as physicians and
respiratory therapists (6%). Nurses were largely responsible for Communication, as previously
mentioned, and their high use of technology management appeared to be associated with nurses’
responsibility to validate and communicate clinical data to the rest of the team. For example, one
nurse, believing an out-of-range heart rate from a saturation probe was an artefact, verified its
accuracy by feeling the pulse and getting an ECG trace. The verification steps to assess the
validity of the saturation probe value could resolve in two ways. If the value was true, it supported
early detection and appropriate planning and actions. If it was false, it was regarded as a false
alarm and added to technology mistrust.
Respiratory therapists:
Respiratory therapists focused on the physiological monitor, ventilation technologies and blood
analysis. They were responsible for respiratory support but also required vitals to understand the
patient status. Since bedside nurses controlled the alarms on the physiological monitor,
respiratory therapists may experience a higher cognitive load because they do not benefit from
programming their own alarm thresholds. For example, one respiratory therapist monitored a
patient s/he suspected was deteriorating by periodically checking, from the doorway, the oxygen
saturation displayed on the physiological monitor. S/he commented that it was not low enough to
trigger the alarm set by the nurse. As the respiratory therapist remarked: “you see that the oxygen
is lower but not necessarily low enough to trigger an alarm, then maybe something’s going on.”
In this case, the inability to control the technology required the respiratory therapist to develop a
mental trend of the raw data they perceived from the doorway.
In sum, Technology management differed across specialties and was most used by nurses. Some
of the issues requiring technology management emerged from imperfect technology design such
as noise in the data or exclusive control of the data alarms by one specialty. Due to the
complexity of the patient state multimodal monitoring of an equivalent or reinforcing parameter
confirmed deterioration. This process relates to several other macrocognitive processes and was
inextricably linked to the technologically-intense critical care setting.
3.4.3.5 Managing Attention
Managing attention is the process of filtering or selecting information. In intensive care, this
process helped clinicians focus on subsets of patients or parameters. It was the fourth most common process for physicians and respiratory therapists (10%) and fifth for nurses (7%). Perpetual filters included the type of patient, timeline in the ICU, intensity of technology at the bedside, and whether colleagues had flagged particular patients.
Physicians:
For example, knowing patients’ diagnosis helped all clinicians prioritize patients. As one
physician stated: “You just know that single ventricles [patients] are generally more fragile than
any other patient.” Repeated experience or “patient scenarios” guided data selection or filtering
by all clinicians. One physician stated: “[I]n my mind I’m always thinking about it [in terms of] ‘a, b, c’, [or] airway, breathing, circulation, neurological, electrolyte.” In this way, grouping
data helped clinicians manage the overwhelming data.
The timing of events, such as post-surgery or post-extubation, guided the expected duration of heightened attention. One physician stated: “after cardiac surgery you should be always on your guard.”
Nurses:
Nurses caring for a post-surgery patient were most concerned with alleviating pain and closely
monitoring pain-related parameters. Various personalized strategies were employed to manage
attention. For example, a senior nurse, when starting their 12-hour shift with a new patient, would
set the alarms on the bedside monitor close to the target range until s/he felt they knew the
patient’s baseline range.
Though vital parameters were the focus of nurses’ monitoring, disconcerting physiological
processes further narrowed their focus of parameters. For example, one nurse concerned with
brain oxygenation would pay closer attention to preductal oxygen saturation which s/he
explained affected the brain, rather than post-ductal oxygen saturation which affected the left
arm, abdomen and torso. For nurses, new monitoring parameters, such as NIRS, that did not fit
with the pattern of acute deterioration established from familiar basic vitals, were ignored or
given less attention.
Respiratory therapists:
For respiratory therapists, the level of ventilation support helped them prioritize patients. Respiratory therapists, among themselves, indicated which room and patient they were most concerned with. One respiratory therapist used pen and paper to write summaries of their own and colleagues’ patients to keep track of initial status, changes in respiratory function and changes to ventilation support as their shift progressed.
To summarize, managing attention was a challenge due to highly context-specific filters and uniquely complex patients. While there was prioritization of subsets of patients, there was also an internal reminder that the status of every patient in the ICU could drastically change. One physician remarked that s/he should “first of all not underestimate a stable patient.”
3.4.4 Sources of Data and Information
Colleagues represented 14%, 22% and 18% of information sources for physicians, nurses and
respiratory therapists, respectively. The patient presentation represented 8%, 20% and 23% as a
source of information for physicians, nurses and respiratory therapists, respectively. However,
probe questions emphasized the technological sources of data and information and thus, the
current analysis focused on the non-human sources of data and information, that is the medical
devices and software. Thirteen sources of data and information were accessed during critical
decision-making. The rankings for each source are shown in Figure 7. Common to all specialties
were the physiological monitor, intervention technologies, blood analyses, imaging, the EMR
and fluid balance. Physicians used the widest variety of information sources when making critical decisions.
Figure 7. Distribution of sources of data and information among all technological sources for each specialty
3.4.5 Macrocognitive Processes as a Function of Sources of Data and Information
Although the most frequent processes were the same for all disciplines, their meaning and
sources differed across professions, see Table 14. The distribution of data and information
sources, among all macrocognitive processes, are shown in Figure 8. Overwhelmingly,
technologies were used during Sensemaking. Physicians attributed 57% of technological sources
to Sensemaking while nurses attributed 32%, and respiratory therapists attributed 51%. The
sources of information for Sensemaking were similar for all disciplines. For Anticipation,
physicians used technologies to “extract” information (e.g., blood analysis and imaging), while
nurses and respiratory therapists used technologies to control technologies they were directly
responsible for (e.g., set the alarms on physiological monitors and manipulate ventilator
settings). Also, for the process of Anticipation, nurses and respiratory therapists mentioned
colleagues as sources of information, suggesting they often plan with other team members. For
the process of Communication, physicians, nurses, and respiratory therapists used technologies
to extract information. Specifically, physicians and respiratory therapists extracted information
for their own use (e.g., to make decisions about medical interventions or mechanical ventilation)
whereas nurses extracted information to subsequently update colleagues away from the bedside.
All clinicians communicated directly to each other and/or used the EMR to provide information
about the patient. Physicians, however, reported less use of EMR for patient communication
compared to nurses and respiratory therapists. Technology Management involved layers of data
verification with nurses taking charge of the initial data validation and physicians conducting a
subsequent validation when critical decisions were required. Respiratory therapists’ use of the
technologies was independent of physicians’ and nurses’ use of the technologies as they were
largely responsible for the ventilation support technologies. Finally, Managing Attention differed
between the groups with physicians relying mostly on colleagues while nurses and respiratory
therapists relied on the physiological monitor and the patient.
Table 14. Macrocognitive processes and associated sources of information or data. *Main data sources are presented as the proportion of clinicians and the number of references.

Sensemaking
Physicians — Characteristic: Systematically reflect on the different organ systems and then seek out necessary data or order the required tests or monitoring devices. Main sources*: 1. Patient (4/4, 12); 2. Physiological monitor (4/4, 11); 3. Blood analysis (4/4, 9). Example: “So I was trying to work out what was going on, what phenomenon did I have that could explain all of these clinical findings, and then would just systematically try to go: OK this is the heart problem? What tests should I do for that? This is the brain problem? What tests should I do for that?”
Nurses — Characteristic: Physically examine the patient and use their knowledge to conclude patient stability. Main sources*: 1. Physiological monitor (4/4, 17)/Patient (4/4, 31); 2. Parent (2/4, 2)/Blood analysis (2/4, 7)/ECG (2/4, 4)/EMR (2/4, 3)/Interventions (2/4, 10). Example: “[W]hen you have a very sick patient you’re not only looking at all the technologies that the patient is hooked up to, but I have to look at my patient as well.”
Respiratory therapists — Characteristic: Gather data primarily from the patient and from the detail of the waveforms from mechanical ventilation support. Main sources*: 1. Patient (4/4, 24); 2. Colleague (3/4, 4)/blood analysis (3/4, 11)/imaging (3/4, 8)/physiological monitor (3/4, 23)/interventions (3/4, 15). Example: “[I]nspiration looks a certain way, expiration looks another way, there are times when you see certain alterations in the waveforms that are difficult to explain. For instance, you can sometimes see prolonged exhalation, where you wouldn’t expect it.”
Technology solutions:
• Make specialty-specific data available.
• “Fill-in” the data trend “picture” caused by broken timeline of observations.
• Abstract to the cellular-, organ-, and system-level.
• Rank data and information according to nurses’ ranked information needs, their routines/protocols.
• Abstract to the patient-level.
• Abstract to the patient and organ-level, specific to respiration.

Anticipation
Physicians — Characteristic: Foresee the patient’s response to therapeutic (surgical or medical) interventions or illness progression. Main sources*: 1. Blood analysis (3/4, 7)/physiological monitor (3/4, 6); 2. ECG (2/4, 2)/Imaging, ECHO (2/4, 4)/intervention, ventilation (2/4, 4). Example: “No, it was predictable because we introduced a new medication that may contribute to this.”
Nurses — Characteristic: Used their experiences to mentally simulate the possible scenarios and plan for possible interventions the attending physician would order. Main sources*: 1. Physiological monitor (4/4, 6); 2. Colleagues (2/4, 6)/Imaging (2/4, 2). Example: “I was getting blood [requisitions] out so we could get some more blood up and the doctor came and I said, ‘we need some blood.’ The[n] the doctor says, ‘Give a unit right now, give it a unit.’ So, we did that and then I went over and got the racks and [I] said, ‘Do you want some more blood?’ He says, ‘yes, yes I want FFP.’”
Respiratory therapists — Characteristic: Plan for escalation, de-escalation or duplicate replacement technological respiratory support for critically-ill patients. Main sources*: 1. Intervention, ventilation (3/4, 12)/Colleague (3/4, 3)/Patient (3/4, 6). Example: “Once you’re in a low [ECMO circuit] flow state and there’s a clot within the circuit we already knew […] the whole [circuit] can fail.” “I remember one child that they described, it almost sounded like it was a twin.”
Technology solutions:
• Receive feedback on the patient outcome and commit to memory an enriched patient “pattern” that leads to earlier recognition with fewer information cues.
• Technological mechanisms such as displaying data trends, identifying similar patterns and calling up historical cases may help close the loop on individual Sensemaking and support nurses and respiratory therapists with pattern recognition of non-routine situations.

Interprofessional and interteam communication
Physicians — Characteristic: Collect data and information from the bedside team or the previous multidisciplinary team. Main sources*: 1. Colleagues (4/4, 32); 2. Patient (1/4, 1)/Blood analysis (1/4, 1)/dialysis circuit (1/4, 1)/imaging (1/4, 1)/EEG (1/4, 2). Example: “So, we would constantly talk about our goals […] and what were the problems with this patient.”
Nurses — Characteristic: Select and relay data and information based on their colleagues’ speciality. Main sources*: 1. Colleagues (4/4, 44); 2. Patient (2/4, 3)/EMR (2/4, 5)/physiological monitor (2/4, 3). Example: “I won’t tell the surgeon that [the pH has changed]. He wants to know if they’re draining. From the surgical side, if there’s any wound problems. He might come in [to] look at the drugs. He’ll want to know if we had to escalate on epi[nephrine].”
Respiratory therapists — Characteristic: Delivery of ventilation support to multiple patients means they rely on colleagues’ respiratory-specific summaries that highlight important patient details. They are the most prone to losing context when changes are made. Main sources*: 1. Colleague (4/4, 22); 2. Patient (3/4, 4); 3. EMR (1/4, 1)/physiological monitor (1/4, 1)/imaging (1/4, 1)/intervention (1/4, 3). Example: “[PEEP has] been changing and you’re […] not sure why it’s been changed, whether it’s a colleague, a fellow RT, has put it up and not put [why] anywhere in the chart, or your doctor’s come in and put it up because they’ve caught [it] before you’ve caught it and they haven’t had a chance to tell you yet that it’s gone up.”
Technology solutions:
• Changes to settings on shared technologies should have a record in the shared DIVT interfaces.
• Support nurse hand-off, post-shift.
• Changes to therapeutic technologies, if controlled by different clinicians, should be recorded and made visible.

Technology management
Physicians — Characteristic: Troubleshoot seemingly faulty data, or combine them with confirmatory technologies, if they detect a problem during analysis of those data. Main sources*: 1. Physiological monitor (4/4, 7); 2. Blood analysis (3/4, 3)/Imaging (3/4, 4). Example: “[…] if you really want to see the rhythm, then you have to go to the full disclose telemetry [to] be able to see how it transitions, but a shortcut using the [bedside] monitor is to go to the graphical trend [and] see how the heart rate transitioned.”
Nurses — Characteristic: Take responsibility for the validity of data collected from most bedside monitoring technology and are first to troubleshoot or confirm readings before communicating them to colleagues or committing them to the EMR. Main sources*: 1. Physiological monitor (4/4, 12); 2. ECG (3/4, 3); 3. NIRS (2/4, 6). Example: “if you notice that the three leads [ECG] now looks weird and [is] obviously not normal for the patient, I’d call the physician and then we would do a 12-lead [ECG].”
Respiratory therapists — Characteristic: Take responsibility for the respiratory support technologies and carry out orders to escalate or reduce support. A primary goal is to wean off the support technologies, especially invasive ventilation. Main sources*: 1. Interventions (2/4, 3); 2. EMR (1/4, 1)/physiological monitor (1/4, 2). Example: “Investigations [helped us] realize that there were software upgrades that could be done on our homecare ventilator [which] allowed us to see waveforms on the ventilator [that] we didn’t have before.”
Technology solutions:
• Improve continuous data reliability; in particular, artefact recognition should decrease the level of technology management for all disciplines.
• Qualitative patient assessment, specific to nurses, may require special decision-support to detect problems on a qualitative scale (e.g., pain and level of consciousness).
• Qualitative patient assessment, specific to respiratory therapists, may require special decision-support to detect problems on a qualitative scale (e.g., work of breathing).

Managing attention
Physicians — Characteristic: Prioritize patients based on the subset ICU population(s) they belong to or the criticality of illness based on the amount and type of support technologies and, at the patient level, organ systems-based issues, seeking currently missing data or information by prescribing orders. Main sources*: 1. Colleague (3/4, 4); 2. Imaging (2/4, 2)/Interventions (2/4, 3). Example: “There’s always one or two patients that will take more of my attention or more of my time”; “I focus on the things that are problematic just for that patient”; “your goal should be to focus on what are the things that I can’t afford to miss?”
Nurses — Characteristic: Constantly watch the patient and look for the abnormal values that fluctuate beyond thresholds. Values which fit the pattern of “normal” will not be given as much attention. They divide their monitoring attention with timely delivery of interventions. Main sources*: 1. Physiological monitor (4/4, 12); 2. Patient (2/4, 8)/EMR (2/4, 2)/NIRS (2/4, 3). Example: “I don’t worry […] if I see normal results, that’s good. I focus on what’s not going right”; “the priority obviously is the patient and making the interventions that need to be done within five minutes”
Respiratory therapists — Characteristic: Monitor for the escalation of respiratory support or episodes of desaturation that signal patients of increased concern. Main sources*: 1. Physiological monitor (2/4, 8); 2. Colleague (1/4, 2)/patient (1/4, 4)/blood analysis (1/4, 1). Example: “generally, if there’s a particular patient who’s been acting out maybe desaturating all night, maybe they’ve progressed from room air to BiPAP and they’re going to get a tube. I think we’re probably going to have some problems with that patient, and they’re definitely indicated”
Technology solutions:
• Provide user-specific, patient-load customization.
• Highlight out-of-range values.
• Facilitate likely protocols.
• Highlight patients with escalation of respiratory therapy modalities.

Time management
Physicians — Characteristic: Carry out initial assessments followed by periodic reviews to efficiently update the status of multiple patients. These reviews entail foreseeing the effect of interventions, illness evolution or the estimated time of hospital-based processes. Main sources*: 1. Colleague (2/4, 2)/blood analysis (2/4, 4); 2. Interventions (1/4, 1)/EEG (1/4, 1)/EMR (1/4, 1)/physiological monitor (1/4, 1). Example: “[B]ecause we were so focused on instantaneous changes, we were actually [obtaining] blood gases in the unit [since] RTs […] giv[e] us the print out much earlier than [if the blood gas results] would have appeared on [the EMR through the central lab].”
Nurses — Characteristic: Schedule interventions and coordinate them when they are at different intervals. The interventions range from bedside care to out-of-unit care, including surgeries and discharge, and always involve documentation. Main sources*: 1. Colleague (1/4, 1)/Patient (1/4, 1)/Blood analysis (1/4, 1)/physiological monitor (1/4, 2). Example: “Feeds were less than two and a half hours, we’re feeding him every three hours and so it wasn’t enough, but he was waking up exactly half an hour before a feed”; “This hour I’m preparing my medications, I’m starting the feeds, I’ve got to document.”
Respiratory therapists — Characteristic: Manage scheduled patient visits and calls to the bedside. When de-escalating from invasive ventilation, RTs may encounter unforeseen complications requiring them to re-organize their patient load and travel between patients. Main sources*: 1. Colleague (1/4, 1)/blood analysis (1/4, 1)/physiological monitor (1/4, 2). Example: “you can go anywhere from half an hour to two to three hours before [the patient is settled]. Maybe not solid for three hours, but you’re back and forth in there most of the time just trying to get them settled.”
Technology solutions:
• Highlight increased frequency of data demand.
• Set up auto-pilot documentation, or request a scribe, to support nurses’ patient care when documenting overtakes time available for direct patient care.
• Provide communication to the RT team when one is occupied with a high-acuity patient for longer than usual.

Managing uncertainty and risk
Physicians — Characteristic: Decrease uncertainty and risk by increasing and selecting data and information. Main sources*: 1. Imaging (4/4, 7); 2. Patient (3/4, 5); 3. Blood analysis (2/4, 2). Example: “you increase your level of monitoring [by] doing blood work much more frequently, […] doing assessments much more frequently to try [to] anticipat[e] what’s going on”
Nurses — Characteristic: Decrease uncertainty of abnormal data by combining them with patient assessments. Main sources*: 1. Patient (2/4, 4)/physiological monitor (2/4, 3); 2. Colleague (1/4, 2)/ECG (1/4, 3)/imaging (1/4, 1). Example: “you’re assessing [whether] this is an accurate reading on the monitor […] by palpating the pulse [to check if it] matches”
Respiratory therapists — Characteristic: Gather information from bedside nurses relative to respiratory function. Patient assessment will involve auscultation of the lungs or drawing blood for gas analysis. Main sources*: 1. Colleague (2/4, 4); 2. Patient (1/4, 1)/blood analysis (1/4, 1)/EMR (1/4, 1). Example: “[A] good indication if they’re not happy and they’re not settling [is] auscultation. They’ve completely decreased [and] have collapsed [on one side of the lungs] but their sats are fine”
Technology solutions:
• Support the analysis of higher-frequency data through dense data visualizations.
• Increase data reliability by automatically cross-referencing redundant data and reduce uncertainty.
• Combine auscultation data, or other RT-preferred data types, to help their decision-making.

Problem detection
Physicians — Characteristic: Combine data and information from bedside staff to identify problems. Main sources*: 1. Physiological monitor (3/4, 4); 2. ECG (2/4, 3); 3. Colleague (1/4, 1)/patient (1/4, 1)/blood analysis (1/4, 1)/interventions (1/4, 1)/NIRS (1/4, 1). Example: “this is where the ECG and the CVP come into effect, it became clear that something had changed and the patient was having, now clearly differently a dysrhythmia. Heart rates that had come down to normal, 150s, 160s, now back up again to 170s, 180s, but not sinus anymore, a clear arrhythmia that’s associated with this type of surgery”
Nurses — Characteristic: Focus on patient vitals and appearance but pay close attention to subtle fluctuations of values if trending negatively. Main sources*: 1. Physiological monitor (4/4, 5); 2. Patient (3/4, 5); 3. ECG (1/4, 3)/Interventions (1/4, 2). Example: “desaturation shouldn’t happen this frequently […] I actually had to intervene because I saw her fluctuating a little bit around [the lower target threshold], but it was fluctuating but heading down.”
Respiratory therapists — Characteristic: Focus on specific vitals and blood gas indicators related to oxygenation, the patient’s overall appearance and use of muscles for breathing, and the level of ventilation support. Main sources*: 1. Physiological monitor (4/4, 6); 2. Patient (2/4, 6)/intervention, ventilator (2/4, 2). Example: “over the course of the night they had a slow increase in the amount of oxygen they needed and occasionally were dropping their saturations more than normal and listening to them they were getting a little bit quicker just like their lungs sounded different but not necessarily significantly different but just a little bit different”
Technology solutions:
• Use redundancy and frequencies of data to automatically detect likely patient problems.
• Use redundancy of data, from non-vitals data, to automatically validate vitals data and detect anomalies in these data streams.
• Use redundancy of data, from non-respiratory data, to automatically validate respiratory data and detect respiratory issues.

Self-awareness and self-management
Physicians — Characteristic: Reflect on how they feel when faced with uncertainty regarding patient care or team management. Main sources*: 1. Colleague (2/4, 2); 2. Patient (1/4, 1)/ECG (1/4, 1)/physiological monitor (1/4, 1). Example: “[When you have] a very sick patient with multi-organ failure but without knowing the cause you can’t target the therapy. So, that makes you feel very uneasy”
Nurses — Characteristic: Develop personalized strategies to manage their responsibilities. Main sources*: 1. Physiological monitor (1/4, 1)/Intervention (1/4, 1)/Imaging (1/4, 1). Example: “If I forget anything [during handover], it’s usually the first 10 minutes driving [after my shift]. And then I pull over and then I call. [O]nce I tell, I’m finished.”
Respiratory therapists — Characteristic: Keenly aware of their specific domain knowledge and will seek help from other specialties to enable them to focus on the respiratory aspect of patient care. Main sources*: 1. Colleague (1/4, 2)/patient (1/4, 1)/intervention, ECMO (1/4, 1)/physiological monitor (1/4, 1). Example: “I quickly decided that we probably need some extra help. […] I can’t manage the patient and manage the problem with the ECMO circuit”
Technology solutions:
• None suggested.

Managing complexity
Physicians — Characteristic: Trust colleagues and parents to handle monitoring tasks to then focus on different patient aspects or different patients requiring their more urgent attention. Main sources*: 1. Colleagues (2/4, 2); 2. Parents (1/4, 2)/EEG (1/4, 1)/physiological monitor (1/4, 2). Example: “[If] they’re stable [then] some patients’ parents take care of them while they’re in the unit and they don’t get monitored at all, the parents follow the monitoring and we see them once every 12 hours. So […] we tailor the monitoring [by] how sick or potentially sick the patient is.”
Nurses — Characteristic: Gatekeepers to the patient for the coordination of different services ordered. They also facilitated parental involvement in the ICU. Main sources*: 1. Parent (1/4, 1)/Patient (1/4, 1)/Physiological monitor (1/4, 1). Example: “[When] parents are […] staring at the monitors I often push the monitor away or turn it to the side [to help them focus on their child]”
Respiratory therapists — Characteristic: Being highly mobile and with high patient loads, rely on the bedside team or other colleagues to coordinate patient care in cases where there are other urgent situation(s). Main sources*: 1. Patient (1/4, 1)/Intervention, ECMO (1/4, 1). Example: “it’s challenging when you’re trying to be in three places at once which becomes hard, which is where you’re relying on so many […] to get things done.”
Technology solutions:
• None suggested.
Figure 8. Distribution of technological data sources among macrocognitive processes, within specialties.
3.4.6 Compound Macrocognitive Processes
Due to the dynamic nature of macrocognition, processes often occur simultaneously.134 We analyzed how a single macrocognitive process led to another and illustrate their interrelatedness in Figure 9.
[Figure 9 image: per-specialty network diagrams in which nodes are macrocognitive processes labelled with their share of macrocognition (e.g., Sensemaking 34% and Anticipation 14% in the physician map) and edges are labelled with normalized co-occurrence strengths.]
Figure 9. Relationships between macrocognitive processes in intensive care for physicians, nurses and respiratory therapists with strength of relationships indicated by the number on the double arrows
Interrelationships, or macrocognitive pairs, are shown as double-sided arrows between boxes. The relative strengths of relationships are the values labelled on each arrow. This value was calculated by taking each process pair's associated NVivo matrix query output and dividing by each process's proportion. Processes with strong relationships (i.e., above half of the maximum normalized frequency value, within each group) are shown on the map and those with relatively weaker relationships are absent from the map. The similarity matrices, the normalized outputs from the matrix queries, for all possible pairs of macrocognitive processes are found in the appendix.
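As a rough illustration only, the normalization and half-maximum filtering described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the exact NVivo workflow: the function name, the inputs (`cooccurrence`, the matrix-query co-occurrence count for each process pair, and `proportion`, each process's share of coded macrocognition), and the choice to divide by the product of both processes' proportions are all hypothetical, since the text does not specify precisely how the two proportions enter the calculation.

```python
def strong_pairs(cooccurrence, proportion):
    """Normalize each pair's co-occurrence count by the proportions of the
    two processes involved, then keep only pairs whose normalized strength
    is at least half of the maximum (the map-inclusion rule above)."""
    normalized = {
        pair: count / (proportion[pair[0]] * proportion[pair[1]])
        for pair, count in cooccurrence.items()
    }
    threshold = max(normalized.values()) / 2
    return {pair: round(s) for pair, s in normalized.items() if s >= threshold}

# Hypothetical inputs; the counts and proportions are illustrative only.
cooccurrence = {("Problem detection", "Managing uncertainty and risk"): 12,
                ("Sensemaking", "Anticipation"): 3}
proportion = {"Problem detection": 0.05, "Managing uncertainty and risk": 0.07,
              "Sensemaking": 0.34, "Anticipation": 0.14}
print(strong_pairs(cooccurrence, proportion))
```

With these illustrative inputs, only the (Problem detection)–(Managing uncertainty and risk) pair survives the half-maximum cut, which is how relatively weaker relationships come to be absent from the maps.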
3.4.6.1 Physician Macrocognition Structure
For physicians, all ten macrocognitive processes were interrelated in some combination, as
illustrated in Figure 9. Problem detection was the central macrocognitive process and was related
to five other processes. A suspected problem may trigger closer monitoring, understanding the
problem in context, anticipating further tests or therapies while also minimizing uncertainty in
the data, and regulating emotions when facing potential patient crisis. This last pair was strongly
linked and indicated an emotional aspect to problem detection. The high degree of
interrelatedness between all processes suggests that physician macrocognition was the most
distributed among the three groups and that they shifted frequently between processes during
critical decision-making.
3.4.6.2 Nurse Macrocognition Structure
Nurses had five paired processes and five unpaired processes (absent from map). Managing
Complexity was central to their macrocognition since it involved managing direct patient care
while attending to the family, the ICU team, organizational and system requirements. This
process was related to reducing uncertainty of data (Managing Uncertainty and Risk), selecting
and monitoring the most important data and qualitative information (Managing Attention), and
balancing with scheduled interventions (Time Management). Sensemaking, Anticipation, and
Communication, processes with the largest proportion of macrocognition, were absent from the
macrocognitive maps, suggesting that they did not consistently relate to any other process.
3.4.6.3 Respiratory Therapist Macrocognition Structure
Respiratory therapists had seven process pairs and five unpaired processes. Uncertainty and Risk
Management was the central process with four interrelated processes, suggesting they use the
interrelated processes to minimize risk and uncertainty in the data. Technology Management was
absent from the map which suggests this process was carried out separately. For example, in a
complex situation where a patient was on high circulatory support (e.g., ECMO) the respiratory
therapist decided to concentrate on fixing the ECMO circuit. S/he stated: “Because we were only
on partial support and the heart wasn’t pumping very well, in an acute situation, I thought I was
going to have to clamp the patient out so that they weren’t [the issue] to try and fix whatever
mechanical problem there was.” [Respiratory therapist troubleshooting the ECMO circuit while having an issue with the patient physiology]
3.4.6.4 Comparison of Compound Macrocognitive Processes Between Specialties
The main difference between the maps was the absence of the main processes of Sensemaking, Anticipation and Technology Management in the nurse and respiratory therapist maps compared to the physician map. The absence of these main processes from the macrocognition maps could indicate that they were cognitively intense and did not associate frequently with any other process. Since nurses were first-line verifiers of patient data, and respiratory therapists worked primarily with mechanical ventilation, the absence of the Technology Management process could indicate that it was all-consuming. Conversely, the physicians' macrocognitive map indicated associations among all ten processes, suggesting a constant shift between processes during critical decision-making.
All three specialties shared the macrocognition pair (Problem Detection)–(Managing Uncertainty and Risk). This suggests that when clinicians encountered a problem they double-checked the data, shifting between these two processes. Moreover, for nurses and respiratory therapists, this pair extended to a three-process chain of (Problem Detection)–(Managing Uncertainty and Risk)–(Managing Complexity) within their macrocognition maps. The added process of Managing Complexity differed between the two groups: nurses also managed patient care, documentation demands and family support, while respiratory therapists juggled multiple patients and technical respiratory support.
3.5 Discussion
This study provides a specialist-specific distribution of macrocognition, within a fragmented
clinical information system, in the context of intensive care. In the following section, we discuss
the implications of our findings on individual clinicians, intensive care teams, and data
integration and visualization technology (DIVT) design.
3.5.1 Macrocognition of Individual and Team Decision-Making
3.5.1.1 Specialist Macrocognition
The clinical work of physicians, nurses, and respiratory therapists differed in scope, intervention
responsibilities, patient loads, and proximities to the bedside. Given that technology essentially
mediated Sensemaking, Anticipation, and Communication during critical decision-making, the
identical macrocognitive rankings of physicians and respiratory therapists could be explained by
their direct and exclusive control of interventions (i.e., physicians controlled medication and
high-risk interventions, and respiratory therapists controlled invasive ventilation). In contrast,
since nurses monitored changes in the patient rather than prescribed critical interventions, their
macrocognition involved processes which ensured they communicated validated data to the team
and accurately anticipated interventions.
Holtrop et al. stated that macrocognitive processes “overlap and interact extensively.”136 However, the
primary processes of Sensemaking, Anticipation, and Technology Management were absent from the
nurse and respiratory therapist compound maps, in contrast to the physician map. This could indicate that
these processes were cognitively intense and did not associate frequently with any other process.
Since nurses were first-line verifiers of patient data, and respiratory therapists worked primarily
with mechanical ventilation, the absence of the Technology Management process could indicate
that it was an “all-consuming” process for them. Conversely, the interrelatedness of all 10
processes in the physician macrocognitive map suggests a constant shift between processes
during critical decision-making. As such, physicians required DIVTs that support frequent
process switching. In addition, physicians integrated the most diverse data and information in the
absence of sophisticated computer-aided integration. For nurses and respiratory therapists,
DIVTs supporting the main processes rather than process switching between less frequent
processes (e.g., compound maps) may be more beneficial.
Across specialties, the primary macrocognitive processes were technology-mediated and
Technology Management was a highly-ranked process. These findings suggest technology
design, for better or worse, has an undeniable impact on intensive care decision-making.
Therefore, technology recommendations to support macrocognitive processes are provided in
Table 14, where appropriate. Policy or procedural recommendations are provided when
macrocognitive processes were less technology-mediated.
In our study, clinicians most commonly accessed the physiological monitor, intervention
technologies (e.g., ventilators, organ-support technologies), blood analysis results, and imaging.
These findings corroborate the ranked data elements routinely used in the NICU which were
daily weight, pH (physiological monitor), pCO2 (blood analysis), FiO2 (ventilator), and blood
culture results (lab results).101 The first priority for ICU managers who wish to reduce clinicians’
cognitive data search efforts is to integrate data streams from these four types of medical devices.
In addition, the various levels of data and information abstraction (e.g., cellular to hospital-level
services) employed by each specialty supports the notion that technology should reflect this
natural organization of data and information (e.g., abstraction hierarchy).86,139,140 DIVT interfaces
at the bedside should be designed with consideration of nurses’ ranked information needs, their
routines/protocols, and their most common macrocognitive processes.101,141 Similarly, DIVTs
should be designed for physicians and respiratory therapists when they are away from the
bedside.
In a study of care management implementation, Holtrop et al. found that the most successful facilities
used many macrocognitive processes.136 As care management is an example of collaborative care,
their findings could extend to ICU team care. For example, all clinicians needed to make sure a
technology was functioning properly, apply higher validity technologies (e.g., increasing the
number of leads for an ECG) or order a confirmatory/redundant test (e.g., imaging and blood
analysis) to confirm a trend before making a decision.
3.5.2 Expert Macrocognition and Pattern Recognition
Another theme of this cognitive investigation was the use of pattern recognition and the
prominence of Sensemaking and Anticipation processes. Clinicians recognized a likely pattern
and attempted to confirm a match with their repository of patterns (e.g., clinical experience) by
obtaining more information. Physicians anticipated data gathering through the planning of
required tests to complete the patient “puzzle”. Nurses and respiratory therapists employed this
strategy as well, but in planning the necessary tools for therapeutic interventions. Within their
own toolbox of patient-stabilizing strategies, they sometimes employed therapeutic interventions
that past experience had shown to “buy time” before more critical decisions about
higher-risk interventions were required. These findings corroborate Klein’s recognition-primed
decision (RPD) model which describes expert decision-making, under time-pressured, complex,
and uncertain conditions, based on the ability to recognize previous similar situations and
develop likely solutions.142 This strategy of pattern recognition was considered a necessary skill
for intensive care clinicians.142,143
Since physicians and respiratory therapists are responsible for multiple patients and are often away
from the bedside, they tend to observe the evolution of patients on a broken timeline and could
benefit from technologies which “fill in” the data trend “picture.” Physicians and respiratory
therapists were adept at recognizing long-term, “fingerprint”-like patterns in the continuous data,
for example in the ECG or ventilator waveforms, respectively. Nurses, operating in an
instantaneous timeframe, relied on subtle changes in the patient’s presentation. Given the different needs
of the three clinical specialties, facilitating patient typification, parametric data trends, and
qualitative information “trending” should be incorporated into future DIVTs.
3.5.3 Implications for Team Macrocognition
All three specialties exhibited the same macrocognition pair of (Problem Detection)(Managing
Uncertainty and Risk). The consistent relationship between these processes may stem from
incomplete or missing data. In practice, clinicians must repeatedly verify values which do not
fit their predicted patient trajectory. As such, technologies used by all team members should
prioritize support for this chain of macrocognitive processes.
Holtrop, et al. used the macrocognitive framework to understand changes in team care (care
management) by relating the support of macrocognition processes to facilities with successful
outcome measures.136 They found that practices that were conceptually aware of macrocognitive
processes and had explicit procedures to facilitate those processes were more successful.136 For
example, Sensemaking and Learning was supported by structured staff training (e.g., Lean
method for quality improvement).136 Similarly, DIVTs could support team decision-making if
they explicitly addressed macrocognitive processes, especially those found to be highly
dependent on technological information sources. Processes which were less technologically
mediated, including Time Management, Self-awareness and Self-management, and Managing
Complexity, could benefit from institutional policies and procedures.
Institutional policy could support Time Management and team Sensemaking. Regarding the
minutes or hours following a clinician’s official shift, one nurse stated: “If I forget anything, it’s
usually the first 10 minutes driving. And then I pull over and then I call. […] so I immediately
tell. Cause once I tell, I’m finished.” A downtime or protected time to hand over key information
should be built into the workflow, thereby ensuring clinicians have sufficient time to transmit
data and information between shifts. DIVTs supporting communication between teams could be
designed to include last-minute, off-site annotations that flag important information from off-duty
staff to on-duty staff. In the short term, facilitating team Sensemaking in this case would
benefit from procedural solutions.
3.6 Limitations
The aggregate of decision-making processes was not amenable to rigorous statistical analysis of
the similarity matrices, which lessened the validity of the compound macrocognitive process
maps. Also, we limited the cognitive investigation to one critical decision, though several
decisions are typically made during dynamic team care. By focusing on different clinical
professions, we sought to understand different types of critical decisions that contribute to
effective team care.
Another limitation was the diversity of critical situations among clinicians. This is
understandable since clinicians perform different duties and make critical decisions based on the
nature of their work. Future studies using the CDM for intensive care settings could narrow to
one of the macrocognitive processes (e.g., Sensemaking), intensive care sub-specialties (e.g.,
cardiac critical care), or patient populations (e.g., post-cardiac surgery neonate monitoring or
post-traumatic brain injury monitoring). Analysis of a shared critical incident experienced by all
specialties could reveal more subtle aspects of the dynamics of team macrocognition for
decision-making.
Macrocognition maps, here used as an analysis technique, illustrate the interrelatedness of
macrocognitive processes and could supplement other cognitive load measurements (e.g.,
NASA-TLX).144 However, the maps were derived from subjective recollection. To reinforce the
findings, they could be analyzed alongside objective measures of data source use (e.g., video
recording of the incident, logs of EMR consultation).
3.7 Conclusion
This study successfully used the critical decision method to understand the macrocognitive
processes used during critical decision-making from three perspectives of critical care. The
method placed special emphasis on the sources of data and information and revealed the central
roles of Sensemaking and Technology Management. From adapted elicitation and categorization
techniques, we described macrocognitive complexity and identified processes which were too
cognitively intense to overlap. This exercise also reiterates how the contemporary ICU
environment remains highly multimodal and fragmented, and emphasizes the need for data
integration from all available sources. Finally, some recommendations for medical device
integration and design of data visualization technologies are provided to support macrocognitive
processes that were technologically-dependent. Macrocognitive processes that were less
dependent on technologies were also identified and may be supported through policies and
procedures.
Chapter 4 Heuristic Assessment of Continuous Data Integration and
Visualization Software
The objective of this study was to identify usability issues of the data integration and
visualization technologies that violated accepted interface design rules. Usability issues were
identified from the nurse perspective and were based on simple display perception and data
retrieval tasks. These tasks could be performed by any clinician of the ICU team, including
physicians and respiratory therapists. Consequently, resolving the issues relating to these tasks
may improve basic continuous monitoring data use by all members of the multidisciplinary ICU
team. Examples of highest severity issues and recommendations are provided. This chapter was
taken verbatim from the published manuscript and can be found on the Journal of Nursing and
Care website.
4.1 Abstract
The Intensive Care Unit (ICU) is a complex and technologically advanced healthcare setting.
Technologies enable continuous monitoring through patient signals that are sensed, recorded and
displayed at the bedside. Although such technologies have significantly decreased mortality rates
in the ICU, the large amounts of data have contributed to clinician information overload. Critical
care nurses spend more than half of their time scanning and assimilating information from
disparate monitors at the bedside to assess patient status. Software that integrates and allows
visualization of large data sets on a single screen is now available. In the present study, we
evaluated software entitled T3™ (Tracking, Trajectory and Triggering). Such computationally
powerful software has great potential to support nurses’ monitoring and decision-making tasks
but the usability, efficiency, and effectiveness of the software are key to end-user adoption. As
such, we conducted a heuristic evaluation, in which the study’s evaluators interacted with the
software interfaces and were asked to describe usability issues and whether the design complied
with established usability principles, or heuristics, specific to medical device interfaces.
A total of 50 usability issues associated with 194 heuristic violations were found. Identified
issues included difficulty in choosing the time period of the patient data signals, in distinguishing
between several patient signals, and in perceiving displayed patient values; these issues could
lead nurses to misinterpret the timing and/or the physiological status of the patient (e.g., time of
shock and exact value of vitals). Heuristic evaluation, an
efficient and inexpensive method, was successfully applied to the T3™ software to identify
usability problems that if left unresolved could lead to patient safety issues. These findings may
have broad implications for the design of the T3™ and other continuous monitoring systems.
4.2 Introduction
Intensive care units (ICUs) are settings where close monitoring and interventions aimed at
achieving homeostasis (i.e. stable vitals within target ranges) are performed on the most fragile
patients. The complexity of a pediatric patient’s underlying condition is exacerbated by their
rapidly evolving developmental physiology.145 For example, the target range for a basic vital such
as heart rate is highly dependent on age.146 Long-term monitoring of the critically ill pediatric
patient is a signature feature of the intensive care unit, and is often associated with the heavy use
of monitoring technologies, which collectively, generate large quantities of data.6 Clinicians
specialized in critical care have been known to experience “information overload”147,148 due to a
high degree of multi-tasking149 and sustained prolonged vigilant monitoring.150 The negative
effects of the technology-intense ICU environment may hinder nurses’ ability to monitor and
signal changes in critically ill patients.
Due to the complexity and fragility of the critically ill patient, clinicians need to use different
technologies to get a sense of organ function, the physiological systems affected, and the overall
patient status. The use of multiple technologies simultaneously to continually assess the
patient status is termed “multimodal monitoring.”151 Practically, multimodal monitoring is
challenging since nurses must constantly scan each discrete monitoring technology to mentally
integrate the data, assess current stability and predict the future trend of the patient to anticipate
interventions. In the modern technology-driven ICU, a critical care nurse spends half of the time
assimilating information embedded in clinical information systems and 15% of the time on
monitoring live vitals.152 Thus, these aforementioned factors make continuous monitoring during
extended periods of time challenging and increase the difficulty of making critical decisions
based on large data sets. Nurses’ workload could potentially be decreased by integrating data
into one trend monitoring software from which data is easily retrieved and visualized by the
nurses through their interaction with the display interface.
Such data integrating and visualization software for continuous multimodal monitoring has been
developed and is the subject of this study. Specifically, we evaluated software entitled “T3™”,
which stands for “Tracking, Trajectory and Triggering” and which has been implemented in
several North American intensive care units. The software combines all compatible data streams
from multimodal monitoring and displays, in real time, the patient’s historical trends over the
entire length of stay (e.g., days, weeks or months) on a highly interactive and responsive user
interface. It has been developed to visualize large quantities of continuous multimodal
monitoring data and aid in determining patient risk60,153 but was originally developed to support
physician intensivists. The software interface consists of four main screens: login, unit-level
patient census, individual patient trend information and frequently-asked questions (FAQ). The
general navigation sequence is shown in Figure 10. Although T3™ has the potential to improve
the predictability and reliability of nurses’ decision-making, the design of any medical
technology’s interface may lead to incorrect decision-making or worse, create new sources of
errors154 by hindering easy information retrieval, appropriate display of data or contributing to
overloading memory capacity. To minimize the potential for user error, the usability, efficiency,
and effectiveness of the interface should be assessed.
Figure 10. The four main screens of the integrating software
In the present study, we discuss an expedited method that is commonly used to evaluate the
usability of user interfaces, called a heuristic evaluation. Specifically, the evaluation assesses
whether aspects of a design are in agreement or in violation of established usability (i.e., ease-of-
use) principles, or heuristics.155 Data resulting from this evaluation can then be used to iteratively
redesign the interface.
Several sets of heuristics have been proposed in the literature, and their application has been
extended beyond software interface evaluation. For instance, these heuristics have been modified
for and applied to several medical device interfaces.125 Heuristic evaluations are conducted by
people who have expertise in human factors, sometimes with the help of an expert knowledge
user. Typically, two or three evaluators independently conduct the evaluation and identify
usability issues.
In sum, this present study aimed to demonstrate the use of heuristic evaluation to assess and
improve current and future continuous monitoring software for intensive care. Results of this
evaluation are applicable to manufacturers and clinicians wishing to improve the user interface
through design of these and other healthcare monitoring systems.
4.3 Materials and Methods
4.4 Setting
The data integration and display software was launched at the pediatric intensive care and
cardiac critical care units of a large academic hospital in Canada. Together, these intensive care
units, located on the same floor, contain 36 beds equally distributed between the two units. There
are single- and multiple-patient rooms, and each bed space is equipped with the same patient
monitoring system and charting system.
4.4.1 Data Integrating and Visualization Software
In this study, T3™ version 1.6 was evaluated. At the time of the evaluations, the signals which
could be visualized were the basic vitals, end-tidal CO2 (integrated in 2013), intracranial
pressure, and others listed in Table 15. The display includes these abbreviations and more based
on the monitors connected to the patient. Collectively, they represent several discrete locations
which include the physiological monitor above the bedside, sometimes the mechanical ventilator
and any of three vendor-specific versions of near-infrared spectrometers. As of July 2015, near-
infrared spectroscopy (NIRS) signals, such as regional oxygen saturation (rSO2), were integrated
into the software as part of one of the research group’s goals of comprehensively integrating
continuous monitoring signals and reducing signal redundancy.
Table 15. List of selected patient signals viewable on the data integrating and visualization software
Patient signals Signal Label
Heart Rate HR
Respiratory Rate Resp
Pulse Pulse
Percent oxygen saturation SpO2
Non-Invasive Blood Pressure (systolic, mean or diastolic) NBPs, NBPm, NBPd
Arterial Blood Pressure (systolic, mean or diastolic) ABPs, ABPm, ABPd
Airway respiratory rate awRR
Temperature T
Central Venous Pressure CVP
Intracranial Pressure ICP
End-tidal Carbon Dioxide etCO2
Inspired minimum Carbon Dioxide imCO2
Regional Oxygen Saturation rSO2
Nurses can view both patients in the current census (ICU patient population) and previously
discharged patients in the archive database. The patient screen is where all continuously monitored
signals, as well as intermittent signals, such as non-invasive blood pressures, can be viewed on a
single screen.
4.4.2 Heuristic Evaluation: Applying Usability Heuristics for Medical Devices
The heuristic evaluation was conducted in three rounds: one in December 2013 and two in May
2014. During these evaluation rounds, three evaluators assessed the same version of the software
for usability issues. In the first round, one “double-specialist” with novice-level knowledge of
both the clinical work and human factors assessed the interface. In the second round, one domain
expert from bedside clinical nursing and another domain expert from human factors together
assessed the interface. A short third round to evaluate the interface in the clinical setting was
performed by the single “double-specialist” of the first round.
In the first two rounds, the software was viewed on a 15” Samsung Series 9 laptop with a screen
resolution of 1600 x 900, 8 GB of memory and an Intel Core i7-3517U central processing unit,
running the Windows 8 64-bit operating system, connected to the internal network and accessing the
day’s patient census and their continuously monitored signals.
The interface was assessed using 14 heuristics, or “rules of thumb”, developed by leading experts
in interface design and modified for medical devices,125,156,157 see Table 16 for the complete list.
When conducting a heuristic evaluation, each usability issue is described, along with which
heuristic(s) it violates and the potential impact it can have. Usability issues are often associated
with more than one type of heuristic violation; these issues are then rated for severity (0:
cosmetic to 4: usability catastrophe; see Table 17). The results of the two rounds were pooled; in
cases of discrepancy, the ratings were discussed by the human factors researchers who each
participated in the evaluation rounds until consensus on heuristic violations and severity was
reached. The potential clinical impact of the issues was confirmed with a
medical domain expert and frequent user of the software.
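The bookkeeping just described can be represented with one record per issue. The following is a minimal sketch, assuming each issue carries the heuristic codes of Table 16 and a 0–4 severity; the record fields and the merge-by-description rule are illustrative, not the study’s actual instrument.

```python
from dataclasses import dataclass

@dataclass
class UsabilityIssue:
    description: str
    heuristics: list[str]  # heuristic codes violated, e.g., ["Memory", "Visibility"]
    severity: int          # 0 (cosmetic) to 4 (usability catastrophe)

def pool_rounds(round_a, round_b):
    """Pool issues from two evaluation rounds, flagging severity
    discrepancies that require consensus discussion."""
    merged, discrepancies = {}, []
    for issue in round_a + round_b:
        if issue.description in merged and merged[issue.description].severity != issue.severity:
            discrepancies.append(issue.description)
        merged.setdefault(issue.description, issue)
    return list(merged.values()), discrepancies

# Hypothetical ratings from two rounds of the same interface
round_a = [UsabilityIssue("timeline manipulation", ["Undo", "Control"], 4)]
round_b = [UsabilityIssue("timeline manipulation", ["Undo", "Control"], 3),
           UsabilityIssue("overlapping shading", ["Visibility", "Memory"], 3)]
issues, to_discuss = pool_rounds(round_a, round_b)
print(len(issues), to_discuss)
```

Issues flagged in `to_discuss` would then be resolved by the evaluators until a single consensus severity is recorded, mirroring the pooling step described above.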
Table 16. The 14 usability heuristics for medical devices as defined by Zhang et al.125
# Code (Heuristic): Definition
1 Consistency (Consistency and Standards): Users should not have to wonder whether different words, situations, or actions mean the same thing. Standards and conventions in product design should be followed.
2 Visibility (Visibility of System State): Users should be informed about what is going on with the system through appropriate feedback and display of information.
3 Match (Match Between System and World): The image of the system perceived by users should match the model the users have about the system.
4 Minimalist (Minimalist): Any extraneous information is a distraction and a slow-down.
5 Memory (Minimize Memory Load): Users should not be required to memorize a lot of information to carry out tasks. Memory load reduces users’ capacity to carry out the main tasks.
6 Feedback (Informative Feedback): Users should be given prompt and informative feedback about their actions.
7 Flexibility (Flexibility and Efficiency): Users always learn and users are always different. Give users the flexibility of creating customizations and shortcuts to accelerate their performance.
8 Message (Good Error Messages): The messages should be informative enough such that users can understand the nature of errors, learn from errors, and recover from errors.
9 Error (Prevent Errors): It is always better to design interfaces that prevent errors from happening in the first place.
10 Closure (Clear Closure): Every task has a beginning and an end. Users should be clearly notified about the completion of a task.
11 Undo (Reversible Actions): Users should be allowed to recover from errors. Reversible actions also encourage exploratory learning.
12 Language (Use Users’ Language): The language should always be presented in a form understandable by the intended users.
13 Control (Users in Control): Do not give users the impression that they are controlled by the system.
14 Document (Help and Documentation): Always provide help when needed.
Table 17. Severity rating as defined by Zhang et al.125
Severity Description
0 not a usability problem at all
1 cosmetic problem only
2 minor usability problem
3 major usability problem
4 usability catastrophe
4.5 Results
In total, 50 usability issues were found. Two percent of usability issues were rated as a
catastrophic problem (severity = 4), 38% were rated as major usability problems (severity = 3),
56% were rated as minor usability problems (severity = 2), and 4% were cosmetic usability
problems (severity = 1).
The 50 usability issues were associated with 194 heuristic violations, as shown in Figure 11. The
most common types of heuristic violations, with over 15 occurrences, were memory, visibility,
match, error, minimalist, and flexibility. The “double-expert” team, consisting of a senior critical
care nurse and human factors expert, revealed 49 more violations than the single “double-
specialist” evaluator and attributed severity to more heuristic violations. When severity ratings for all
issues from both rounds were compared, there was a 68% severity rating match between the two.
Figure 11. Frequency of heuristic violations of the data integration and visualization software
The most important issues to be addressed were the manipulation of the timeline (severity
of 4 - usability catastrophe), the use of shading to highlight signals which were out of range, and
the lack of an undo function (severity of 3 - major usability problem). These examples and others are
discussed below.
4.5.1 Example #1 - Catastrophic Problem
Issue description: The most important usability issue involved choosing the timeframe of data
to be viewed and was rated as a catastrophic problem. Selecting the timeframe of data to be
viewed is an important task as nurses are often required to compare a patient’s stability (i.e.,
vitals are within a target range) at a given point in time to the patient’s baseline values observed
at another point in time. For example, ICU nurses who often temporarily care for other nurses’
patients, when covering during breaks, may choose to review the patient vitals from the previous
hours to get a sense of the patient’s stability over time. To do so, the nurse would need to interact
with the software interface and specify the time frame of continuous patient data s/he would like
to view. However, s/he may encounter difficulty when trying to choose the timeframe because
the icons are very small requiring high visual acuity and dexterity with the mouse cursor to select
the desired timeframe. Not being able to easily manipulate the timeframe could lead to faulty
decision-making since interpreting the patient data requires correct time orientation (e.g. start
and end of data, time period of data, relative time period). Thus, the usability of timescale
manipulation is critical since its potential impact on clinical practice is high. In Figure 12, the
illustrative example shows one way to choose the time period of the data.
Figure 12. Screenshot of single patient view showing last two-week trend; the ovals show “pull-in”
or “pull-out” functionality used to select the time window
Heuristics Violated: Consistency; visibility; match; memory; minimalist; feedback; error; undo
and control.
Recommendation: Users should be able to manipulate the trend data in a way that they feel in
control of their selection and can easily identify what they have selected. The timeframe of the
data window should be more apparent, with larger sized font.
4.5.2 Example #2 - Major Usability Issue
Issue description: A major usability issue was the use of shading as an aid to rapidly visualize
patient signals that are out of range. Rapid visualization of out-of-range patient signals is a
critical feature because it can indicate the duration and severity of patient instability. Although shading
of a single parameter may be clearly seen and understood, this feature may lead to confusion
when several patient signal trends are viewed on the same graph. Specifically, when multiple
signals are viewed, the various shadings may overlap thereby hindering nurses’ ability to detect
which specific signal or signals should be addressed. Such confusion could lead to inappropriate
interventions potentially causing patient harm. Figure 13 shows overlapped multiple signals,
each with different colored shadings.
Figure 13. Screenshot of patient signals with shading to indicate out-of-range patient vitals. Graph 1
shows overlapping out-of-range signals
Heuristics Violated: Visibility; memory; feedback and error.
Recommendation: Users should be able to interpret patient instability and detect which specific
signal is unstable, without having to rely on their memory to understand visualization cues such
as shading. When it is desirable to view many patient signals and their target ranges on the same
graph, consider cues other than shading to rapidly identify which signals are out of range.
4.5.3 Example #3 - Major Usability Issue
Issue description: There is no “undo” for many actions, including zooming in (i.e., no zooming out),
moving the time window along the timescale, and dragging-and-dropping several variables on
one graph. The absence of this function discourages exploration and learning, and could lead to
errors in time-sensitive situations.
Heuristics Violated: Consistency; match; memory; flexibility; error; undo and control.
Recommendation: New users should be able to perform and reverse actions to learn
through exploration. More importantly, when manipulating the interface to visualize data, if an
action creates a worse representation, users should be able to go back to a previous configuration
rather than start from a default setting or an inappropriate configuration. Frequent users should
be able to reverse actions to prevent serious errors or unintentional data representation. Designers
should consider programming an “undo” command for several of the functionalities mentioned
in this issue’s description and as a standard command for any actions performed at the interface.
4.5.4 Example #4 - Minor Usability Issue
Issue description: Use of words that hold different or no meaning to nurses in their clinical
practice. For example, in the census, the column “First Message” appears but does not relate to
information useful to their clinical decision-making. Also in the census, discharged patient data
are located in the “Archived patients” census. Another example is the use of computer
programming terms such as “Administrator” and “Modifier” in the FAQ, which are specialized
terms for computer programmers but may not be understood by clinicians.
Heuristics Violated: Match; memory and language.
Recommendation: Change or eliminate the words or information which are unfamiliar to
clinicians.
4.5.5 Example #5 - Positive Features
In this sixth iteration, the software interface uses design elements that have been recognized as
helpful to end-users. First, the right-hand legend provides the choice of 5-minute, 30-minute or
12-hour trends, which are similar to sparklines, developed by Tufte and described as “data-
intense, design-simple, word-sized graphics.”158 In a clinical setting, these “sparklines” (i.e.,
small representations of data) were found to be useful in providing physicians with trend
information.159
4.5.6 Example #6 - Positive Features
Second, a design feature that adhered well to the heuristics of consistency and match was the
default colors for traces of heart rate, blood pressures and oxygen saturation. Specifically, the
colors chosen to represent these vitals, on the T3™ interface, matched the colors used by the
bedside physiological monitor. Although no standard exists to represent physiological variables,
the colors used by the T3™ software matched those used in this study’s ICU setting and nurses
were familiar with them. In practice, when switching from the T3™ display back to the
physiological monitor, identifying the traces based on color would require minimal cognitive
effort due to adherence to match and consistency heuristics.
4.6 Discussion
From the heuristic evaluation, 40% of the usability issues identified were categorized as major or catastrophic, and the remaining 60% were minor or cosmetic usability problems.
problems. Collectively, the major and catastrophic usability issues could have serious impact on
patient safety and should be addressed. In particular, timescale manipulation was identified as a
catastrophic issue with physiological data representation. Past research has shown that such
timescale manipulation issues contributed to physicians’ and nurses’ inability to see when a
particular physiological parameter has reached a critical point.106 Therefore, the catastrophic problem of timescale manipulation demands close attention given the round-the-clock nature of critical care.
The three most violated heuristics were those of “memory”, “visibility” and “match”. This
indicates the need to 1) design the software so that using it minimizes cognitive load, 2) display
information which clearly indicates what the system is doing, and 3) ensure the interface displays
trend information using cues familiar to nurses.
As the T3™ system integrates more of the monitoring technologies (e.g., electroencephalogram)
and even therapeutic technologies (e.g., infusion pumps, ventilators and feeding pumps) its
impact on decision-making will extend to many other clinicians (e.g. pharmacists, respiratory
therapists and dieticians) who may have different interface requirements. The software’s
extended use to the different types of clinicians could eventually lead to an impact on team-based
clinical decision-making. Thus, consideration must be given to the expected usability issues due
to medical device integration and use with other clinical information systems. That is, continual
efforts to integrate more of the stand-alone medical devices into this display may create new
usability issues as more patient signals are visualized. Designers should consider the heuristics
for medical devices, in the context of the changing multimodal monitoring system and advances
in clinical instrumentation. In addition, as new signals, features and functions are added to the
software, these may impact the interface layout and adherence to the core heuristics. For
example, a possible usability issue may be the visualization of intermittent non-invasive blood
pressures in addition to the continuous invasive blood pressure. The ability to visualize a new
type of blood pressure, in the form of non-continuous data points, may pose a visualization
challenge. To avoid confusion, a quick heuristic evaluation when a new type of data is integrated
into the software is recommended.
Another issue is the level of detail of the trend information available at the bedside; in this case,
a higher level of detail is available from the bedside physiological monitor. The T3™ display
aims to provide long-term trend information (e.g. minutes, hours or days, with a minimum of 5
second intervals) but currently, nurses only use very short-term trends or waveforms from the
physiological monitor (e.g. 15-second timeframes with a minimum of 0.2 second intervals). This
information requirement may indicate that any new trend monitoring software must provide progressive levels of detail down to the waveform level or make this information available. The choice
may not be for one or the other but to have both trends on the same screen or near each other for
quick patient baseline comparison. This usability issue may be confirmed through usability
testing or simulated clinical decision-making experiments with nurses.
This study represents the first heuristic evaluation of clinically available, highly interactive, data
integration and visualization software. The usability issues found through the heuristic evaluation
required little cost and the time of one representative end-user (expert nurse) and two human
factors researchers, one of whom had observed the ICU and staff for eight months prior to the
first assessment. When all issues were pooled, severity ratings matched in 68% of cases. In every instance of disagreement, the ratings deviated by only one level, suggesting that a three-point (high, medium, low) severity scale rather than the four-point (4, 3, 2, 1) scale may minimize disagreement. Given the potential high-risk, high-impact nature of critical care, a three-point scale would indicate that high- and medium-severity issues should be addressed, and little is gained by categorizing into one more severity level.
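The argument that collapsing to three levels would reduce disagreement can be illustrated with a small sketch. The ratings below are fictitious (all disagreements one level apart, as observed in the study), and the 4-to-3 mapping shown is just one plausible collapse:

```python
def collapse(rating4):
    """Map a 4-point severity (4 = worst .. 1 = cosmetic) to 3 levels.
    The grouping of 4 and 3 into 'high' is an illustrative assumption."""
    return {4: "high", 3: "high", 2: "medium", 1: "low"}[rating4]

def percent_agreement(a, b):
    return 100 * sum(x == y for x, y in zip(a, b)) / len(a)

# Fictitious ratings from two evaluators; every disagreement is one level apart.
eval1 = [4, 3, 3, 2, 2, 1, 4, 3, 2, 1]
eval2 = [3, 3, 2, 2, 1, 1, 4, 4, 2, 1]
before = percent_agreement(eval1, eval2)
after = percent_agreement([collapse(r) for r in eval1],
                          [collapse(r) for r in eval2])
print(before, after)  # agreement rises once adjacent levels are merged
```

Merging adjacent levels absorbs some one-level disagreements, which is why the coarser scale tends to raise agreement on data like these.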
Fifty usability issues were found and two positive design features were highlighted. When
addressing the usability issues efforts should be made to retain the positive design features.
These issues have been shared with the software developers and already some of these issues
have been addressed. In the future, we recommend that heuristic evaluations be performed on the
user interface before software implementation in the clinical setting.
4.7 Limitations
This study was highly dependent on institutional context and on its users. The three evaluators, one “double-expert” team consisting of a domain expert from nursing paired with a domain expert from human factors, and one individual “double-expert” with intermediate knowledge of both domains, may satisfy Nielsen’s requirement of at least two to three double specialists to uncover between 81 and 90% of usability problems.155 However, this criterion may not hold for software interfaces used in complex settings and by several types of users.
Further study should involve nurses using the software to perform tasks, confirming these usability issues and observing the many others that occur with actual use. A
subsequent phase involving user physicians, nurses and respiratory therapists is planned.
The heuristic evaluation is meant to be a first step in the iterative user-centered design process.
Its strength as a quick evaluation tool means it can be applied as a change-driven process for
quick prototyping in view of optimizing the interface before testing with actual users and
different types of critical care specialists.
4.8 Conclusion
The heuristic evaluation method applied by the complementary team identified and prioritized
key interface problems according to severity and impact of the usability issues which can be
addressed during the iterative design life cycle of the software. Heuristic violations help guide
designers by specifying what type of solution is required and help match solutions with known
visualization aids. By drawing on decades of knowledge from software interface design and the heuristics for medical devices, basic usability issues were quickly identified with the time of only a few evaluators. Multidisciplinary teams that include actual end-users reveal many more usability issues than single evaluators do. Throughout the development of the data integration and
visualization software, quickly finding and addressing the interface usability issues early can
facilitate the transition and integration of these systems into the actual setting. This new software
tool has the potential to minimize the sources of disparate data and help critical care nurses
manage the numerous patient data signals, but the many usability issues must be addressed to
minimize potential use errors and realize its full potential.
Chapter 5 Usability of Continuous Data Integration and Visualization
Software
The objectives of this study were to identify and evaluate usability issues of the data integration
software and to determine the ease of use and potential safety impact on clinical decision-
making, while considering the different perspectives of the multidisciplinary critical care team.
In addition, recommendations to improve this and similar data integration platforms are
provided. This chapter was taken verbatim from the published manuscript and can be found on the BioMed Central website.
5.1 Abstract
Background: Intensive care clinicians use several sources of data in order to inform decision-
making. We set out to evaluate a new interactive data integration platform called T3™ made
available for pediatric intensive care. Three primary functions are supported: tracking of
physiologic signals, displaying trajectory, and triggering decisions, by highlighting data or
estimating risk of patient instability. We designed a human factors study to identify interface
usability issues, to measure ease of use, and to describe interface features that may enable or
hinder clinical tasks.
Methods: Twenty-two participants, consisting of bedside intensive care physicians, nurses, and
respiratory therapists, tested the T3™ interface in a simulation laboratory setting. Twenty tasks
were performed with a true-to-setting, fully functional, prototype, populated with physiological
and therapeutic intervention patient data. Primary data visualization was time series and
secondary visualizations were: 1) shading out-of-target values, 2) mini-trends with exaggerated
maxima and minima (sparklines), and 3) bar graph of a 16-parameter indicator. Task completion
was video recorded and assessed using a use error rating scale. Usability issues were classified in
the context of task and type of clinician. A severity rating scale was used to rate potential clinical
impact of usability issues.
Results: Time series supported tracking a single parameter but partially supported determining
patient trajectory using multiple parameters. Visual pattern overload was observed with multiple
parameter data streams. Automated data processing using shading and sparklines was often
ignored but the 16-parameter data reduction algorithm, displayed as a persistent bar graph, was
visually intuitive. However, by selecting or automatically processing data, triggering aids
distorted the raw data that clinicians use regularly. Consequently, clinicians could not rely on
new data representations because they did not know how they were established or derived.
Conclusions: Usability issues, observed through contextual use, provided directions for tangible
design improvements of data integration software that may lessen use errors and promote safe
use. Data-driven decision-making can benefit from iterative interface redesign involving
clinician-users in simulated environments. This study is a first step in understanding how
software can support clinicians’ decision-making with integrated continuous monitoring data.
Importantly, testing of similar platforms by all the disciplines that may become clinician users is a fundamental step toward understanding the impact of decision aids on clinical outcomes.
5.2 Background
The Intensive Care Unit (ICU) setting is a complex socio-technical environment where patients
with life-threatening conditions, frequently needing advanced organ support technologies, are
continuously monitored by teams of specialized clinicians.160,161 This setting is synonymous with
multimodal monitoring (MMM) defined as “the combined use of monitors, including […]
clinical examination, laboratory analysis, imaging studies, and physiological parameters” and
relies on human knowledge and skills to effectively use the data.3,151,162,163 However, the massive
amount of data may not be serving patient outcomes. Clifford reports a “growing awareness
within medical communities that the enormous quantity and variety of data available cannot be
effectively assimilated and processed without automated or semi-automated assistance.”17 Celi
attributes the difficulty of establishing cause and effect relationships between the interventions
and the critically-ill patients to the “exceptional complexity of the [ICU] environment […]
particularly vulnerable to variation across patient subsets and clinical contexts.”164 In pediatric
intensive care, complexity of care is increased compared with adults due to weight-based dosing and
age-dependent pharmacokinetics, pharmacodynamics, and physiological norms.165,166
Multidisciplinary team care that complements physician care has improved survival of this
complex patient population.167 In fact, the Society of Critical Care Medicine maintains that
“Right Care, Right Now™” is best provided by an integrated team of dedicated medical
experts.45 In the Canadian setting, the core team is comprised of physician intensivists, nurses,
and respiratory therapists. Consequently, all these clinicians must be able to effectively detect
and react to changes in patient status informed by the vast array of MMM data. As such, we
propose that data integration and visualization software may be a solution to help clinicians
process MMM data. The study’s purpose was to test the data integration and visualization
software, specifically the level of simplicity to detect and understand changes in the patient state.
We hypothesize that to properly display patient-specific ICU data in a manner which conveys
meaning to the clinician, software should support data processing in a thoughtful, intuitive, and
user-friendly manner.168 Sub-optimal care may be traced to “flawed user interfaces” that result in
cognitive errors and data misinterpretation.30,40,169 A human factors study approach was chosen
to empirically identify ease of use and safety issues. This approach is well established in aviation
and nuclear power industries to help inform what an optimal user-interface design is and has
recently been applied to healthcare.170-173 In this study, we tested the usability of T3™, a data
integration and visualization software program. The study is the first to report the usability of a
commercially available, interactive, data integration, and visualization software for an ICU
setting.
5.2.1 Data Integration and Visualization Software
In March of 2013, the T3™ software was implemented in a large pediatric ICU department. This
web-based tool captures and displays integrated physiologic data exported from devices and
monitors attached to patients. Specifically, it displays patient-generated physiological data and
therapeutic intervention data (e.g. from infusion pumps and/or a ventilator, or diagnostic results
from blood work with timestamps of important medical events such as chest closures or cardiac
arrests). A schematic of data sources is presented in Figure 1. Its three main functions are
tracking (e.g. supports tracking of patient parameters to their unique norms over time), trajectory
(e.g. visually integrates patient-specific data to show relationships), and triggering (e.g. derives
meaning to support clinical decision-making through real-time computation of the data). It was
available to all clinicians in the unit to either use in real-time or at a later point for review and
debriefing.
To access T3™, a login separate from the existing hospital-based network is required. The
interface is not permanently displayed, requiring the clinician to login and access the integrated
data. Prior to implementation, the clinicians were shown the T3™ platform and were provided
information about access and function. However, expectations for use within the ICU workflow were not set. It should also be noted that T3™ is not an approved patient monitor and there are
no alarms incorporated into the software. It is used at the discretion of clinicians rather than
mandated.
5.2.2 Overview of Project Phases
To evaluate the T3™ continuous multimodal monitoring software design, specifically regarding
end-user needs, a four-phase project was undertaken and loosely conforms to ISO 9241-210
standard.174 The four phases included a systematic literature review, a qualitative study of the
ICU and its clinicians, a heuristic evaluation of the software, and, finally, this usability
investigation of the software (see full description in Figure 2). All phases were part of the user-
centered design and evaluation process. The systematic review focused on studies evaluating
intensive care data integration and visualization on the clinician end-user. This review identified
and assessed human factors studies of qualitative and quantitative natures. The second phase was
an observational study in the ICU where clinicians were observed and interviewed to assess how
physicians, nurses, and respiratory therapists used data, information, and technologies to
influence critical decisions. The third phase of the project was a heuristic evaluation, which is a
cost-effective usability technique. It identifies potential usability issues and associates them to
violations of established good interface design principles.125 Two human factors specialists, a
senior ICU nurse and a senior ICU physician, found 50 potential usability issues associated with
194 heuristic violations.175 While heuristic evaluation is an efficient and inexpensive method to
uncover potential usability issues, usability testing is recognized as a better method because
obstacles are obtained directly from the end-user’s interaction with the system. The fourth phase
was a usability study, of which results are presented here. The goal of this final phase is to assess
how existing data integration software can support physicians, nurses, and respiratory therapists
with their use of continuous data.
[Figure: iterative user-centered design cycle, with the related ISO 9241-210 activity for each phase]
- Phase 1: Proposal and Planning (Plan the human-centered design process)
- Phase 2a: Observations (Understand and specify the context of use)
- Phase 2b: Interviews with ICU team (Specify the user requirements)
- Recommendations from all Phases (Produce design solutions to meet user requirements)
- Phase 3: Heuristic Assessment; Phase 4: Usability Testing (Evaluate the design against requirements)
- Ongoing Monitoring Post-Implementation of Design Changes (Designed solution meets user requirements)
- Iterate, where appropriate
Figure 14. User-Centered Design and Evaluation Process of an Existing Data Integration and
Visualization Platform in Accordance with the ISO 9241-210 Standard
The iterative design and evaluation cycle is broken down into phases with the related ISO 9241-210 standard’s phases in parenthesis. The cycle was carried out once with each phase described for the design/evaluation of data integration and visualization software for intensive care monitoring and decision-making. Phase 1 was an initial phase where the user-centered design process was identified and work included gathering existing studies in the form of a systematic review. Phase 2 included both unit-level observations and clinician-level interviews to gather information about intensive care work using continuous data. Phase 3 was a heuristic assessment of the software to determine usability issues that violate accepted interface design principles and to suggest design solutions. Phase 4 was a usability test method where issues were identified by actual users performing true-to-work tasks and recommendations for design solutions were provided. Results from this last phase are presented here.
Usability testing has been used to evaluate a number of healthcare technologies such as infusion
pumps, computerized physician order entry, radiation therapy systems, and electronic medical
record systems.166,176-180 There has been little focus on usability testing of data integration
software from MMM devices.102,181 By observing users as they carry out realistic tasks, human
factors specialists evaluate how technology helps users accomplish their work goals while
assessing their needs and satisfaction. The strength of usability testing stems from the qualitative
information revealed while using the software. Through these observations, human factors
specialists identified the following: 1) what content is missing, and 2) what design elements went
undetected, led to confusion, and/or led to errors. Based on this data, the design can be refined to
provide better support mechanisms. Consequently, corrective actions are primarily system-based
as opposed to changing human behavior.
The objectives of this study were to identify and evaluate usability issues of the data integration
software and to determine the ease of use and potential safety impact on clinical decision-
making, while considering the different perspectives of the multidisciplinary critical care team.
In addition, recommendations to improve this and similar data integration platforms are
provided.
5.3 Method
5.3.1 Study Design
This is a human factors usability study to assess specific continuous monitoring data integration
software. The study was approved by the Research Ethics Board of the test site institution and
the clinician participants’ hospital.
5.3.2 Setting
Testing sessions were conducted from January to February of 2016 in a usability laboratory
equipped with observational booths behind one-way glass and multiple ceiling-mounted cameras
and microphones. During the two-month study period, ICU usage logs showed low usage: between five and 17 weekly users (an average of 10), or approximately 6% of a staff of over 300 clinicians. Physicians accounted for nearly all use (96%), compared to nurses (4%) and respiratory therapists (0%), and the active users collectively used the software a total of 30 hours per week.
5.3.3 Software
T3™ is a web-based software available at multiple tertiary hospitals in North America, which
continuously collects, integrates, and displays data from monitoring and intervention devices
every five seconds. Four types of visual aids are generated: 1) time series of continuous
numerical data (e.g. trend lines) displayed as an average over five seconds, 2) colored
highlighted layer over time series (e.g. shading of trend lines), 3) automatic short-term trends
(e.g. sparklines), and 4) persistent bar graph representation of percent risk (e.g. IDO2 indicator).
All are shown in Figure 3. We tested T3™ version 1.6 as a fully-interactive working prototype
software, identical to what was available in the ICU. The version we tested included a 16-
parameter proprietary algorithm which estimated the risk of inadequate oxygen delivery.60
Software was accessed through an intranet website, hosted on a virtual server behind the
hospital’s firewall, and used a Google Chrome™ web browser installed on a computer running a
Microsoft® Windows™ operating system. TechSmith® Morae® software version 2.0.1 was
used to collect audio and video data from the computer screen and the participant’s facial
expressions as they interacted with the software during the simulations (See Figure 15). R
software version x64 3.2.2, package irr, function kappa2, was used to calculate statistics.
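The five-second averaging behind the trend lines described above can be sketched as a fixed-window downsampler. This is an illustrative Python sketch of the concept, not the T3™ implementation, and the sample values are fictitious:

```python
def downsample(samples, window=5):
    """Average consecutive fixed-width windows, e.g. 1 Hz samples
    collapsed into 5-second trend points (conceptual sketch only)."""
    return [sum(samples[i:i + window]) / len(samples[i:i + window])
            for i in range(0, len(samples), window)]

hr_1hz = [120, 122, 121, 125, 122, 130, 128, 131, 129, 132]  # fictitious 1 Hz heart rates
print(downsample(hr_1hz))  # two 5-second averages
```

Averaging over windows smooths high-frequency detail, which is why the bedside monitor's raw waveforms (0.2-second resolution, discussed earlier) remain more detailed than any trend display built on such averages.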
Figure 15. Representation of time series fictitious data and triggering visual aids: 1) shading, 2)
sparklines, and 3) bar graph of single indicator IDO2 algorithm
Composite screenshot showing time series (center, all parametric trends), the primary visual aid,
with 1) overlaid out-of-range target shading (third graph area), 2) sparklines showing condensed
trend line of fixed time period with exaggerated minima and maxima (far-right), and 3) bar
graphs representing the single indicator which calculates the risk of inadequate oxygen delivery
(IDO2) (bottom).
5.3.4 Scenarios and Tasks
The three scenarios were based on post-cardiac surgery newborn patients, their
data sets, and the events they experienced while in the unnamed North American pediatric
hospital’s ICU. The comprehensive data sets included dozens of monitoring and intervention
data streams and were good representations of closely monitored ICU patients. The data sets,
provided by the software developers, were populated with fictitious names, medical record
numbers, and background information (See Table 18). During each test session, the data replayed
from the same start time and presented the patient’s evolving status in real-time. Each scenario
contained at least 24 parameters of continuously collected data, and clinicians could
simultaneously visualize data from up to 16 parameters (four per panel).
Table 18. Description, parameters available, key data features of three scenarios, based on real
patients, used to test the T3™ software functions.
Scenario 1. Main events or interventions: 2 episodes of hypotension; 1 cardiac arrest; 1 initiation onto extracorporeal membrane oxygenation. Parameters: 32 total, 24 active. Key data features: physiological monitoring; infusion pump data; temperature data; laboratory data.
Scenario 2. Main events or interventions: 1 increased erroneous medical infusion (dopamine); 1 intervention (inhaled nitric oxide therapy). Parameters: 34 total, 28 active. Key data features: physiological monitoring; infusion pump data; laboratory data.
Scenario 3. Main events or interventions: 1 attempt at bedside chest closure; 1 cardiac arrest. Parameters: 46 total, 36 active. Key data features: physiological monitoring; infusion pump data; ventilator data; three oxygen saturation parameters; laboratory data.
These three scenarios were the overall context in which participants were asked to carry out 20
types of tasks regarding continuous data use. The tasks are described in Appendix G, Table 26.
5.3.5 Participants
Participants were pediatric intensive care clinicians from three critical care disciplines: seven
physicians, eight nurses, and seven respiratory therapists. They were from the same institute
where the software was implemented. They were the equivalent of full time staff of a large,
tertiary, Canadian, pediatric hospital and all had access to T3™ in their ICU. To detect at least
80% of possible discipline-specific usability issues, seven participants from each discipline were
sufficient.182
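Sample-size claims of this kind typically rest on the cumulative problem-discovery model, found(n) = 1 − (1 − p)^n, where p is the probability that any one participant encounters a given problem. A sketch of the arithmetic (the value p = 0.21 is an illustrative assumption chosen so that seven participants clear the 80% threshold, not a figure from the cited study):

```python
def discovery_rate(p, n):
    """Expected proportion of usability problems found by n participants,
    assuming each independently finds a given problem with probability p."""
    return 1 - (1 - p) ** n

# p = 0.21 is an illustrative assumption, not a study parameter.
for n in range(1, 9):
    print(n, round(discovery_rate(0.21, n), 3))
```

Under this assumption, six participants fall just short of 80% discovery while seven exceed it, which is consistent with the study's choice of seven participants per discipline.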
5.3.6 Procedure
Upon arriving to the simulation lab, each participant received a brief orientation, outlining the
purpose and objectives of the evaluation, and consent was formally obtained. Participants were
informed that they would be observed, videotaped, and audiotaped. The study facilitator
addressed any questions or concerns before the participants reviewed and signed the consent
form. Participants then completed the pre-test questionnaire. No training was provided before the
experiment, although some clinicians received introductory training sessions post ICU software
launch.
During the simulations, participants were asked to “think aloud” as they executed each task. This
was to gain insight into their thought process, as well as providing insight into their use of data
and the information available to them. Both audio and video recordings were made of
simulations (See Figure 15). When a participant was challenged, they verbalized their thoughts
to indicate the cause. A facilitator and two data recorders were in the observation room behind a
one-way mirror. They facilitated, observed, and recorded participant performance (e.g. use,
difficulties, and errors) and feedback. After participants completed scenarios or the allotted time
was exhausted, the facilitator conducted a debrief interview and a post-test questionnaire.
Feedback about participant experience with the T3™ system was collected, comments during
simulation were clarified, and any concerns and/or questions arising from the evaluation were
addressed.
5.3.7 Data Analysis
5.3.7.1 Scoring Task Completion and Usability Error Definition: Use Error Rating
Within this study, we established Use Error Ratings (UERs) on a scale of 0 to 2 to assess clinicians’ software competency. (UER definitions are presented in Table 19, both in nominal form and as numerical codes.) A rating of 2 (“Pass”) indicates clear task completion with no hint, clarification, or reminder required; 1 (“Help”) indicates one hint was provided for the task to be completed; and 0 (“Fail”) indicates the task could not be completed despite two or more hints. Task-related usability issues occurred with an average UER of 1.1 (see Appendix G, Table 27 for the detailed analysis). The same usability issues were also analyzed using a percentage pass rate.
Table 19. Use error rating definitions, shown as nominal and numerical codes
Normative Use Error Rating | Numerical Use Error Rating | Definition
Pass 2 User completed task with no hint, clarification, or reminder
Help 1 User completed task with one hint
Fail 0 User did not complete task despite several hints
To ensure appropriate evaluation, the 20 types of tasks attempted were coded by two raters (authors YL and JT). Interrater reliability was reported both as absolute percent agreement and as equally weighted Cohen’s kappa, which accounts for chance agreement.183 Disagreements were resolved through discussion. Based on the associated numerical
code for each clinical group, an average UER was calculated for each task. The average UER
was also calculated for each group of tasks which represented the three general functions of the
software (e.g. tracking, trajectory, and triggering). Finally, a global UER average was calculated
for all participants and for all tasks.
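The UER averaging described above can be sketched as follows. The ratings here are fictitious, and the function and group names are illustrative only:

```python
from statistics import mean

# Fictitious Pass=2 / Help=1 / Fail=0 ratings per task, per clinician group.
ratings = {
    "tracking":   {"physicians": [2, 2, 1], "nurses": [2, 2, 2]},
    "triggering": {"physicians": [0, 1, 0], "nurses": [1, 0, 0]},
}

def function_uer(function):
    """Average UER over all clinicians and tasks for one software function."""
    scores = [s for group in ratings[function].values() for s in group]
    return round(mean(scores), 1)

print(function_uer("tracking"), function_uer("triggering"))
```

The same pooling can be restricted to a single clinician group or extended over all tasks to obtain the per-discipline and global averages the study reports.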
5.3.8 Usability Issue Severity Level
Potential severity of the use error was categorized as minor, if patient was unlikely to be harmed;
moderate, if patient could be temporarily harmed; or high, if patient could be permanently
harmed. This was coded by one rater (YL) and confirmed with an expert physician intensivist
(author AMG). In the case of discrepancies, final score was determined through discussion. This
approach was used to rate the importance of a use error.184
5.4 Results
5.4.1 Participants
At the time of the study, the 22 participants were full-time ICU staff. Participant demographics
are shown in Table 20. From the pre-session questionnaire, only 27% of participants had received the formal training offered over two years earlier. Though 64% of the participants were aware of the software, 82% rarely or never used it.
The extent of underuse was unknown when usability testing was carried out. Consequently, the
pre-session questionnaire did not ask participants why they did not use the software. The
research team included the question “Did you know that T3™ is accessible from all PC
workstations?” because they suspected that staff were unaware they had access to the software.
Of the 20 participants who answered this question, 14, or 70%, were aware they could access the
software. Two participants, who knew they had access but did not use T3™, provided insight as
to why they did not use it. One nurse preferred to look at the physiological monitor because it
offered a real-time view of the patient status with more detail than T3™. The other nurse stated it could complement his/her view of the patient status if s/he had time to use it during a shift.
These findings suggest that, at the very least, most participants did not extensively use the T3™ software and were largely naïve to it.
Table 20. Demographics, clinician specialization, training, current use, and awareness of data
integration software. * CCCU: Cardiac Critical Care Unit; ** PICU: pediatric intensive care unit
Physicians | Nurses | Respiratory Therapists | Global Proportion
Total Number 7 8 7 22
Gender, % (n) Male 14 (n=1) - 14 (n=1) 9 (n=2)
Female 86 (n=6) 100 (n=8) 86 (n=6) 91 (n=20)
ICU Experience, %
(n)
<1 year 43 (n=3) 13 (n=1) 29 (n=2) 27 (n=6)
1-3 years 29 (n=2) 25 (n=2) - 18 (n=4)
4-10 years 29 (n=2) 25 (n=2) 57 (n=4) 36 (n=8)
>10 years - 38 (n=3) 14 (n=1) 18 (n=4)
ICU Shifts/Week,
% (n)
1-2 times/week - 25 (n=2) 29 (n=2) 18 (n=4)
3-4 times/week 29 (n=2) 75 (n=6) 71 (n=5) 59 (n=13)
>4 times/week 71 (n=5) - - 23 (n=5)
ICU Specialization,
% (n)
CCCU* 29 (n=2) 63 (n=5) - 32 (n=7)
PICU** 29 (n=2) 38 (n=3) - 23 (n=5)
PICU/CCCU 43 (n=3) - 100 (n=7) 45 (n=10)
Previous Training
with Software, %
(n)
Yes 14 (n=1) 50 (n=4) 14 (n=1) 27 (n=6)
No 86 (n=6) 50 (n=4) 86 (n=6) 73 (n=16)
Software Use/Shift,
% (n)
Several times/shift 29 (n=2) - - 9 (n=2)
Once/shift 14 (n=1) 13 (n=1) - 9 (n=2)
Rarely during a shift 43 (n=3) - - 14 (n=3)
Never 14 (n=1) 88 (n=7) 100 (n=7) 68 (n=15)
Awareness of
Software, % (n)
Yes 71 (n=5) 75 (n=6) 43 (n=3) 64 (n=14)
No 29 (n=2) 25 (n=2) 57 (n=4) 36 (n=8)
5.4.2 Interrater Reliability
Across all tasks attempted by each participant, the interrater reliability of the UER between the
two raters (YL and JT) was 89% absolute agreement, corresponding to an equally weighted
Cohen’s kappa of 0.85, a strong level of agreement.183
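To make these two statistics concrete, a minimal sketch follows, using standard definitions and hypothetical ratings (the actual YL/JT ratings are not reproduced here). With identity weights on an exact-match agreement, the equally weighted kappa reduces to the unweighted form computed below.

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Proportion of tasks on which the two raters gave the same UER."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2, levels=(0, 1, 2)):
    """Unweighted Cohen's kappa for two raters over categorical UER levels."""
    n = len(r1)
    po = percent_agreement(r1, r2)  # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    # Expected chance agreement from each rater's marginal distribution
    pe = sum((c1[lvl] / n) * (c2[lvl] / n) for lvl in levels)
    return (po - pe) / (1 - pe)

# Hypothetical UER assignments (0=Fail, 1=Help, 2=Pass) for ten tasks
rater_yl = [2, 2, 1, 0, 2, 1, 1, 2, 0, 2]
rater_jt = [2, 2, 1, 0, 2, 1, 2, 2, 0, 2]
print(percent_agreement(rater_yl, rater_jt))          # 0.9
print(round(cohens_kappa(rater_yl, rater_jt), 2))     # 0.83
```

Kappa discounts the agreement expected by chance, which is why it is lower than raw percent agreement for the same ratings.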
5.4.3 Software Strengths (Aid to Task Completion) and Usability Issues (Hindrance to Task Completion)
5.4.3.1 Overview
Due to time constraints, not all 20 task types could be attempted by every participant.
Participants attempted an average of 18 of the 20 task types (88%). The task groups representing the three main
software functions had the following UER: 1.5/2, or “Pass”, for tracking; 1.3/2, or “Help”, for
trajectory; and 0.4/2, or “Fail”, for triggering. For all tasks, the overall UER was similar across
disciplines with a UER of 1.3 for physicians, a UER of 1.3 for nurses, and a UER of 1.2 for
respiratory therapists. A summary of the average ratings, by task and clinician group, is shown
in Table 21 and illustrated in Figure 16.
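The aggregation behind these numbers can be sketched as follows: both raters' scores for a task are averaged across participants, then banded into Pass/Help/Fail. The cut-off values used here are illustrative assumptions; the thesis does not state exact banding thresholds.

```python
# Sketch of UER aggregation (0=Fail, 1=Help, 2=Pass). Banding cut-offs are
# assumptions for illustration only.

def average_uer(scores_rater1, scores_rater2):
    """Mean of both raters' scores across all participants for one task."""
    all_scores = scores_rater1 + scores_rater2
    return sum(all_scores) / len(all_scores)

def uer_label(avg, pass_cutoff=1.5, help_cutoff=0.5):
    # Assumed banding: Pass >= 1.5, Help >= 0.5, otherwise Fail
    if avg >= pass_cutoff:
        return "P"
    if avg >= help_cutoff:
        return "H"
    return "F"

# Hypothetical scores for Task 4 (manipulating time scale), seven physicians
yl = [2, 1, 1, 1, 0, 1, 1]
jt = [2, 1, 1, 1, 0, 1, 1]
avg = average_uer(yl, jt)
print(f"{uer_label(avg)} ({avg:.1f})")  # H (1.0), matching the table's format
```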
Table 21. Usability tasks tested with severity levels and use error ratings.

Tasks Tested for Each Function                            Error     Average Use Error Rating by Clinician Type
                                                          Severity  Physicians  Nurses    Respiratory  Average
                                                          Level     (n=7)       (n=8)     Therapists   by Task
                                                                                          (n=7)

Tracking: Orientation (4 tasks)
 1. Locating patient file                                 High      P (2.0)     P (2.0)   P (2.0)      P (2.0)
 2. Identifying a value for a specific
    physiological variable                                High      P (1.8)     P (1.8)   P (1.5)      P (1.7)
 3. Estimating duration of event by
    identifying two time points                           High      H (1.4)     P (2.0)   P (2.0)      P (1.8)
 4. Manipulating time scale                               High      H (1.0)     F (0.4)   H (0.6)      H (0.6)
 Function use error rating by clinician type                        P (1.5)     H (1.5)   P (1.5)      P (1.5)

Trajectory: Relationships between Parameters (10 tasks)
 5. Comparing trends for two specific parameters          High      H (1.4)     P (1.6)   P (1.5)      H (1.5)
 6. Comparing different patient physiological states      High      H (1.3)     H (1.4)   H (1.2)      H (1.3)
 7. Identifying values for two specific
    parameters at an event                                High      H (1.4)     H (1.1)   H (0.6)      H (1.0)
 8. Identifying vital signs (group of
    parameters) prior to an event                         High      H (0.7)     F (0.4)   H (1.3)      H (0.8)
 9. Viewing trend of three redundant
    overlapping parameters                                High      H (1.3)     H (1.4)   H (0.7)      H (1.1)
 10. Viewing infusion medication data                     High      P (1.8)     H (1.3)   P (2.0)      P (1.7)
 11. Comparing infusion medications with vital signs      High      P (1.9)     P (1.7)   P (1.6)      P (1.7)
 12. Detecting change in infusion
     medication rate over time                            High      H (1.4)     F (0.4)   H (0.5)      H (0.8)
 13. Viewing ventilator data                              High      P (2.0)     P (1.6)   P (1.6)      P (1.7)
 14. Viewing laboratory data                              High      H (1.0)     P (1.8)   P (1.7)      H (1.5)
 Function use error rating by clinician type                        H (1.4)     H (1.3)   H (1.3)      H (1.3)

Triggering: Automated Integration (3 tasks)
 15. Viewing target ranges using shading
     (semi-automatic aid)                                 Moderate  F (0.4)     H (0.6)   F (0.4)      F (0.5)
 16. Sparkline (automatic trend line
     for one variable)                                    Minor     F (0.4)     H (0.8)   F (0.0)      F (0.6)
 17. IDO2 indicator (automatic computation
     using 16 parameters)                                 High      F (0.4)     H (0.5)   F (0.3)      F (0.4)
 Function use error rating by clinician type                        F (0.4)     H (0.6)   F (0.2)      F (0.4)

Other Functions (3 tasks)
 18. Finding notes                                        High      H (1.1)     P (1.9)   H (1.4)      H (1.5)
 19. Modifying/adding note                                Moderate  H (0.9)     H (1.3)   P (1.5)      H (1.2)
 20. Setting targets                                      Moderate  P (1.9)     P (1.9)   P (2.0)      P (1.9)
 Function use error rating by clinician type                        H (1.3)     P (1.7)   P (1.6)      P (1.5)

All functions
 Global function use error rating, for all functions
 by clinician type and for all clinicians                           H (1.3)     H (1.3)   H (1.2)      H (1.2)
Figure 16. Variation of use error ratings across clinician disciplines for all tasks related to
tracking, trajectory, and triggering as well as other software functions
Three levels of use error ratings (UERs) were employed by two raters and averaged for each type of clinician for 20 tasks. The UER distribution was further grouped by function: tracking (Tasks 1-4), trajectory (Tasks 5-14), triggering (Tasks 15-17), and other (Tasks 18-20). Usability issues, defined as tasks with a UER of 1 or less and highlighted in yellow or red, were dependent on the type of task and, to a lesser extent, on the type of clinician. Most usability issues were centered on the trajectory and triggering functions. UER: Pass (P)=2 (green), Help (H)=1 (yellow) and Fail (F)=0 (pink). Clinician groups: DR: physician intensivists, RN: intensive care nurses, and RT: respiratory therapists.
5.4.3.2 Tracking Function
Tracking describes the general function of patient census navigation (using the dedicated census
page or short-cut drop-down menu) and time orientation (using time series visual aids). It is a
critical function since making time-sensitive decisions on the wrong patient, or with data that
corresponds to a mistaken time period, can potentially lead to patient harm. The average UER for
patient tracking tasks was 1.8, indicating that participants completed tasks with little or no help.
All tracking tasks had potentially high clinical impact severity. Clinicians easily completed three
of four patient tracking tasks: Task 1) locating a patient in the census, Task 2) identifying a value
for a specific physiological variable, and Task 3) estimating duration of an event by identifying
two time points. However, clinicians had difficulty completing Task 4) manipulating the
timeline, which corresponded to a UER of 0.6.
5.4.3.2.1 Tracking Usability Issue: Situating the Patient Data in Time
Though clinicians could choose their patient and view data from a given time period, they
they could not easily select specific time periods. To test participants’ ability to situate the data
in time, participants were asked to determine the patient’s length of stay by manipulating the
interface from a default view showing partial patient data. All clinician groups encountered
difficulty with this task, demonstrated by UER scores of 1.0 for physicians, 0.4 for nurses, and
0.6 for respiratory therapists.
This task can be parsed into three successive steps: 1) condense all the collected data into a
single window, 2) check the start and end of the data, and 3) mentally calculate the entire length
of stay. Task difficulty may be due to the first two steps which required clinicians to understand
how to use the six interactive features for time manipulation (See circles in Figure 5a). Clinicians
needed to manipulate the interface and find the start and end of the patient data. Since this was a
“live” patient, the start and end of the data indicated to clinicians when continuous monitoring of
the patient started and, consequently, when they first came to the ICU. The six interactive
features were, at times, imperceptible to participants and required high visual acuity, as well as
manual dexterity. As participants looked back at the parametric data in time, they assumed they
had found the start of the patient data if they encountered a gap (See Figure 17a). When
prompted to continue to look back, they found that there was still more data (See Figure 17b).
These two screenshots show how the interface did not communicate to clinicians the start and
end of patient data and could leave users with a sense of uncertainty about whether they were
seeing all the data for a particular patient.
Figure 17. Usability issue of time manipulation interface
Screenshots of patient view with time manipulation interactive light blue icons, circled on top section, with heart rate, arterial blood pressures, and oxygen saturation data streams. Screenshot a) appears to show start and end of data but screenshot b) shows the same gap as an interruption in the data streams, signifying the patient was away from the ICU and therefore, was not continuously monitored.
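The three-step task above can be sketched in code. This is a minimal illustration, assuming timestamped samples; its point is that a gap in the stream (e.g. the patient away from the ICU) must be distinguished from the true start or end of the data, which is exactly what the interface failed to communicate.

```python
# Sketch of the length-of-stay task: ignore internal gaps when computing the
# overall span, but surface them so users are not misled. The 5-minute gap
# threshold is an illustrative assumption.
from datetime import datetime, timedelta

def length_of_stay(timestamps):
    """Steps 1-3: span of ALL collected data, ignoring internal gaps."""
    return max(timestamps) - min(timestamps)

def monitoring_gaps(timestamps, max_gap=timedelta(minutes=5)):
    """Internal interruptions longer than max_gap."""
    ts = sorted(timestamps)
    return [(a, b) for a, b in zip(ts, ts[1:]) if b - a > max_gap]

# Hypothetical 1-minute sampling, with monitoring interrupted 12:00-14:00
start = datetime(2018, 3, 10, 8, 0)
stream = [start + timedelta(minutes=m) for m in range(0, 241)]            # 08:00-12:00
stream += [datetime(2018, 3, 10, 14, 0) + timedelta(minutes=m)
           for m in range(0, 241)]                                        # 14:00-18:00

print(length_of_stay(stream))   # 10:00:00 -- the full stay, despite the gap
print(monitoring_gaps(stream))  # one gap: 12:00 -> 14:00, not the data start
```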
In conclusion, for the tracking function, most clinician groups could complete three of four tasks,
but the main usability issue centered on the task requiring precise and accurate manipulation of
data presented as a time series. For the specific task of viewing all the data for a particular
patient, exploring the data may leave users with a sense of uncertainty or frustration. Some
participants asked for a manual input of the horizontal (time) range suggesting they did not feel
they could choose the time window of data to a satisfying extent.
5.4.3.3 Trajectory Function
Clinicians closely monitor patient trajectory for rapid or gradual changes by comparing current
physiological monitor data to daily target thresholds. With the availability of continuous data
from a patient’s entire ICU stay, determining trajectory then involves viewing related parameters
and investigating both overall trends and point data. To support such analysis, we asked
clinicians to complete ten tasks (Tasks 5 to 14), which tested how easily clinicians could create
multiparametric visualizations (Tasks 5, 6, 9, 10, 11, 13 and 14) and extract data from these
complex visualizations (Tasks 7, 8 and 12). Creating multiparametric visualizations required
clinicians to intuitively understand how to select parameters from a list and view them together
on one of four panels (See Figure 6). Identifying a single point of data required clinicians to home
in on the time series visualization and read off the chosen parameter’s value on the left-hand side
(See Figure 6). Of the ten trajectory tasks, clinicians failed to complete four (Tasks 7, 8, 9 and
12) and required little or no help (an average UER above 1) to complete the remaining six tasks
(Task 5, 6, 10, 11, 13, and 14) (See Figure 16 and Table 21). All clinician groups had similar
UERs for this set of tasks with 1.4, 1.3, and 1.2 for physicians, nurses, and respiratory therapists,
respectively (See Table 21).
5.4.3.3.1 Trajectory Software Strength: Creating Multiple Parametric Visualizations
Generally, seven tasks (Tasks 5, 6, 9, 10, 11, 13, and 14) were used to test how clinicians used
the software to visualize multiple parameter trends. Essentially, the tasks were to find parameters
and add them to a default of three basic vitals: heart rate; systolic, diastolic, and mean blood
pressures; and oxygen saturation. Task completion was generally good, with UERs above 1.3, except
for Task 9, which had a UER of 1.1 due to unfamiliar data labels assigned at a different ICU.
Physicians, nurses, and respiratory therapists required little or no facilitation to accomplish the
task of combining different parameters (Task 5, 6, 10, 11, 13 and 14), and the combined average
of all three groups was above 1.0 when creating complex visualizations.
Task 11 was used to test how easily clinicians could visualize both intervention and
physiological data streams, thereby, investigating their interrelationships. Most clinicians
successfully completed this task and had a group UER of 1.9, 1.7, and 1.6 for physicians, nurses,
and respiratory therapists, respectively. One nurse stated that instead of looking at infusions and
vitals separately, making it necessary to recall a child’s baseline physiological vitals from
memory, the software supported this task by displaying both types of parameters on the same
graph. A second nurse remarked that it was “easier to put together the picture [compared to the
current electronic charting system]” and, similarly, one physician remarked “I’m not working as
hard with T3™ to make a mental visualization”. These comments indicate that participants liked
how the software helped them to visualize parameter trends or see all the pieces of the puzzle.
5.4.3.3.2 Trajectory Usability Issues: Using Multiple Parametric Visualizations
Intensive care requires knowledge of both overall patient trajectory, spanning their ICU stay, and
the immediate trajectory, such as in response to a therapeutic intervention. Software should
partially off-load the cognitive processes required to transform numerical, short-term data into
longitudinal trends without losing the granularity of the point data. In this study, once clinicians
chose and viewed a set of parameters from dozens available, they were asked to extract and
understand nuances about the combined trends. Two types of tasks tested how clinicians
interpreted multiparametric visualization: 1) identifying point data (Tasks 7 and 8), and 2)
detecting change (Task 12).
To home in on the time of an event, both Tasks 7 and 8 required time manipulation, a core
usability issue previously discussed. Participants were asked to report values for parameters by
identifying point data from the trends. This dynamic manipulation of the interface required high
visual acuity, manual dexterity, and visual sensitivity to display data for a given time period. It
also required the ability to scan values associated with each parameter chosen (See Figure 18).
Clinicians had more difficulty reporting values for groups of parameters (Task 8) than two
specific parameters (Task 7).
Figure 18. Time series data visualization of multiple physiological signals and therapeutic
interventions
Screenshot of patient view showing four view panels with data streams for heart rate and arterial blood pressures in the top panel; oxygen saturation in the second from top panel; medical infusions for epinephrine and norepinephrine in the third from top panel; and blood gas analyses for hemoglobin and carbon dioxide partial pressure in the bottom panel. The identified values for March 13th at 21:17 are found at the left-hand side of the screen and are related to the point in the time series by arrows.
Task 12 required clinicians to detect when a continuous infusion was stopped. Though
physicians (a task average UER of 1.4) could better detect an interruption in the infusion than
nurses (a task average UER of 0.4) and respiratory therapists (a task average UER of 0.5), most
participants failed to notice it. This may be because 1) an infusion rate of 0 μg/kg/min was
plotted as a continuous line, and/or 2) the automatic vertical scaling feature, called “best-fit”,
created a vertical range of -0.1 to +0.1 μg/kg/min (See Figure 19). Participants were often
surprised that a rate of 0 μg/kg/min was plotted as a line in the middle of the graph and, instead,
expected a gap in the data when the rate was 0 μg/kg/min. A higher physician UER may also be
explained by the investigative nature of physician work, more advanced training in
pharmacokinetics, and their role as initiators of medical infusions.
The failure to detect change could be attributed to distraction from the multiple viewing panels
(four) that were populated by several parameters of different scales and may have divided
participant attention, making detecting parameter changes more challenging. Furthermore,
detecting change only from the time series pattern may have been troublesome due to a small
font size.
Figure 19. Usability issue of auto-fit scaling resulting in misinterpretation of when the medical infusion
ceased
Screenshot of epinephrine infusion with auto-fit scaling resulting in a negative infusion rate of -0.1 μg/kg/min and a rate of 0 μg/kg/min plotted as a line in the middle of the graph area.
Participants suggested scaling based on realistic parameter ranges. For example, medication
infusion scales should always start at 0, since negative infusion rates are impossible, and
infusions that differ by orders of magnitude should be graphed so as not to dwarf each other
(e.g. dopamine and epinephrine differ by two orders of magnitude). Additionally, scales for
temperature plots should start near normal body temperature to highlight clinically important
variance around the baseline, which is more informative than a scale starting at 0.
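These suggestions amount to parameter-aware scaling rules in place of a naive best-fit. A minimal sketch follows; the parameter type names and padding values are illustrative assumptions, not the T3™ implementation.

```python
# Sketch of parameter-aware axis scaling, replacing a "one size fits all"
# best-fit that allowed impossible negative infusion rates.

def axis_range(param_type, values):
    lo, hi = min(values), max(values)
    if param_type == "infusion":
        # Rates cannot be negative: clamp the axis floor to 0, with a small
        # assumed minimum headroom so a stopped infusion (rate 0) sits on
        # the baseline rather than mid-graph.
        return 0.0, max(hi * 1.1, 0.1)
    if param_type == "temperature":
        # Centre around normal body temperature rather than starting at 0.
        return min(lo, 36.0) - 0.5, max(hi, 38.0) + 0.5
    # Default: best-fit with 10% padding
    pad = (hi - lo) * 0.1 or 1.0
    return lo - pad, hi + pad

# A stopped epinephrine infusion no longer plots in the middle of the panel
print(axis_range("infusion", [0.05, 0.05, 0.0]))   # (0.0, 0.1)
print(axis_range("temperature", [36.4, 37.2]))     # (35.5, 38.5)
```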
5.4.3.4 Triggering Function
Currently, monitoring a patient involves data from physiological monitors displayed as short-
term (e.g. below a minute) waveforms and numerical values with visual or audible alarms to
signal out-of-range targets. To partially off-load the cognitive processing of monitoring, the
software provided three visual aids, or triggers to decision-making, which are overlaid on the
long-term time series data to make unstable time periods more apparent. The triggers were either
semi-automated, requiring clinician input, or fully-automated visualizations, derived only from
the data. Deviations from baselines were highlighted by 1) shading time series data, 2) displaying
mini-trends (sparklines) with exaggerated minima and maxima, and 3) by automatically
computing and displaying the risk of inadequate oxygen delivery (IDO2) as a color-coded bar
graph (See Figure 15). Thus, the software highlighted periods of continuous data with
undesirable trajectory, either for single or combinations of parameters. In this way, clinicians
may interpret data faster by focusing their attention on a portion of data from the computer-
generated visual trends instead of memorizing and creating their own long-term mental trends.
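Of the three aids, the sparkline is the simplest to illustrate: a long series is reduced to per-bin minima and maxima so extremes stand out in a small graphic. The sketch below is a generic min/max binning, an assumed reading of how sparklines exaggerate extremes, not the vendor's algorithm.

```python
# Sketch of sparkline-style data reduction: keep only (min, max) per bin so a
# brief excursion survives compression. Bin count is an illustrative choice.

def sparkline_minmax(series, bins=10):
    """Reduce a series to (min, max) per bin for compact plotting."""
    size = max(1, len(series) // bins)
    chunks = [series[i:i + size] for i in range(0, len(series), size)]
    return [(min(c), max(c)) for c in chunks]

# A mostly stable heart-rate trace with one brief excursion to 165 bpm
hr = [120] * 40 + [165] + [120] * 39
reduced = sparkline_minmax(hr, bins=8)
print(reduced)  # the excursion survives reduction as a bin maximum of 165
```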
To test the triggering function, clinicians were asked to use the visual aids of shading (Task 15),
sparklines (Task 16), and the IDO2 indicator (Task 17). In general, participants ignored the visual
aids until they were asked to attempt the task and all had UERs below 1, with a global triggering
UER, aggregated by task and clinician type, of 0.4 (See Table 21). One physician commented on
the shading of out-of-range values: “I'm not sure if the shading is helpful, I'm getting
distracted by the area under the curve or the shape or something. Maybe if target ranges were
shown as two straight lines across the graph." Clinicians stated that sparklines did not provide
enough detail to be useful. One nurse commented that "[they] prefer[red] just looking at the
graph than looking at that little [graph] on the side because it's bigger, you can see [the graph]
better." Usability of the IDO2 trigger is discussed in the following section.
5.4.3.4.1 Data Reduction: IDO2 Indicator
The IDO2 indicator, derived using a 16-parameter algorithm, calculates and displays the risk of
inadequate oxygen delivery. Six out of seven physicians were unaware of the IDO2 indicator and
were skeptical of it because they did not know how it was derived. One physician’s mistrust was
voiced as follows: “I don’t believe this [indicator] because I don’t know where it came from [or]
what formula [it is based on].” In addition, since physicians regularly integrate data and derive
their own assessment of patient instability, they stated that the indicator was redundant with their
own assessment. For physicians using IDO2 for the first time, the indicator did not provide
enough predictive value for them to incorporate it into their clinical practice. As one remarked
“It’s almost too late. [IDO2] shows you when they are unstable rather than trying to predict
adverse events. [IDO2] tells me what I already know.” However, one physician who had prior
knowledge of the IDO2 indicator and trusted the underlying algorithm remarked that it would
prompt investigation and “pull in more variables to explain what was seen [as a period of
instability].”
Nurse impressions of IDO2 were mixed with some voicing confusion as to the meaning of the
indicator as well as annoyance, and others voicing usefulness to confirm their own assessment.
As one nurse stated "I don't really know what this graph is trying to tell me. If it's just telling me
that the [saturations] are low, I already know that the [saturations] are low." Though all nurses
found a correlation between the IDO2 indicator and their own assessment, made directly from
vitals data, they disagreed on whether this was advantageous or redundant. Similar to the
physician group, nurses were skeptical of the new indicator because its derivation was unclear.
Respiratory therapists found that the indicator correlated with their assessment of instability from
the hemodynamic data and also indicated that they would use it if they could trust it. As one
respiratory therapist stated, "because I don't know how this percentage is calculated or what it
takes into account, I don’t find that it is useful other than the color coding which is very
intuitive." Respiratory therapists stated the indicator could help them assess quickly and prompt
further investigation: “[IDO2 is] a first look at what’s going on. If you want details you can look
at parameters more closely”; “it only really tells me that I need more information”; and "[it]
might prompt me to be proactive about suggesting different modalities. Especially because it's
O2-related, it would prompt me as an RT to think outside the box." In summation, the IDO2
indicator may potentially help clinicians proactively detect deterioration, but the software should
allow users to understand how it was derived.
Visually, when attention was called to the persistent bar graph of the IDO2 indicator, all
participants interpreted it correctly: higher values were represented as red bars, signaling that
the patient was at high risk of inadequate oxygen delivery. This composite parameter could also be viewed as
a time series on one of the four graph areas. Two physicians, one nurse, and one respiratory
therapist viewed the IDO2 indicator as a time series. One physician found that the indicator
correlated well with the charted events while the respiratory therapist found that it correlated
with the hemodynamic data. The nurse preferred the bar graph visualization to the time series.
One important difference between the bar graph and the time series was that each bar was color
coded with low IDO2 values in green, intermediate values in yellow, and high values in red.
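The colour coding participants found intuitive can be sketched as a simple banding of the risk value. The band boundaries below are assumptions for illustration; the actual thresholds of the proprietary algorithm are not published.

```python
# Sketch of the IDO2 bar-graph colour coding. Cut-offs are assumed values.

def ido2_colour(risk_percent, low=25, high=75):
    if risk_percent < low:
        return "green"    # low risk of inadequate oxygen delivery
    if risk_percent < high:
        return "yellow"   # intermediate risk
    return "red"          # high risk: draws the clinician's attention

print([ido2_colour(r) for r in (10, 50, 90)])  # ['green', 'yellow', 'red']
```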
5.4.3.5 Other Functions: Charting
In general, clinicians could easily use the charting features of the software. UERs were high for
physicians, nurses, and respiratory therapists with 1.3, 1.7, and 1.6, respectively (See Table 21).
All clinician groups could easily set targets (Task 20) and visually highlight out-of-target values
for a given parameter trend line but were less able to find and write notes (Task 18 and 19).
5.5 Discussion
Usability testing revealed how data integration software supported or hindered tasks that require
use of continuous patient data by a representative sample of end users. The qualitative nature of
our study provided insight into the user experience and opportunities for user-centered design
modifications. Clinicians had a high degree of flexibility and, consequently, easily produced data-dense
visualizations but encountered usability issues of time manipulation, point data
identification, and detection of trend deviations. These issues confirm those identified using the
heuristic evaluation method.175 Attributable themes include the transformation of point data into
time series visualization, the emergence of visual pattern overload, visual aids representing
computer-processed data, data trustworthiness, and use variability among clinical disciplines.
5.5.1 Transforming Numerical Point Data to Long-Term, Time-Scaled Visualizations
Through tracking and trajectory tasks, we found that time series visualizations were appreciated
by clinicians since they off-loaded the existing cognitive task of creating visualizations from
continuous numerical patient data. This may indicate that the software alleviated point data
overload. Point data recall was effective for single-parameter trends. However, multiparametric
visualizations led to denser, overlapping time series, making the recall of multiple point data
difficult. While some confusion and misinterpretation was observed, we found that time series
data displays allowed for quick determination of the duration of instability. The software
provided clinicians with a high degree of choice and flexibility to create multiparametric
visualizations. However, it consequently limited their ability to interpret and extract point data.
The effort required to make distinctions between the alternatives appears to outweigh the benefits
of having many options and is consistent with Schwartz’s statement in “The Paradox of Choice”:
“choice no longer liberates, but debilitates.”185 A suggestion for improvement is to simplify the
user interface by eliminating some interactive features and communicating to the user the
meaning of each feature to facilitate clear action, to reduce confusion, and to make the user feel
in control of the system.
As previously mentioned, manipulating the software interface required high visual acuity and
manual dexterity causing tasks to be somewhat time consuming. In addition, clinicians were
unaccustomed to the large choice of continuous parameters. Consequently, participants voiced
frustration after completing tasks because they expected to have completed them much faster.
This is consistent with Hick’s law which postulates that time on a task is positively correlated
with number and complexity of choices, and as time to decision increases user satisfaction
decreases.186 This further reinforces that if the interface has poor usability, then low uptake may
result since real ICU tasks are highly time-sensitive.
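Hick's law is commonly stated as T = b·log2(n + 1) for n equally likely choices. A small worked sketch follows; the coefficient b is an assumed illustrative constant, not an empirical fit from this study.

```python
# Sketch of Hick's law: decision time grows logarithmically with the number
# of equally likely choices. b (seconds per bit) is an assumed constant.
import math

def hick_decision_time(n_choices, b=0.2):
    """Expected decision time in seconds for n equally likely choices."""
    return b * math.log2(n_choices + 1)

# More parameters on the menu -> slower decisions -> lower satisfaction
for n in (3, 7, 15):
    print(n, round(hick_decision_time(n), 2))
```

The logarithmic growth means each doubling of alternatives adds a constant increment of decision time, which still matters in time-sensitive ICU tasks.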
5.5.2 Integrating Data Trends: Visual Pattern Overload
The availability of dozens of data streams on a single software platform is an undeniable
advantage over existing dispersed clinical information systems and is a crucial step to
understanding relationships between parameters.78,81 The software helped clinicians visualize
multiple parameters as a time series on a single chart. These represented thousands or millions of
data points but were boiled down to single patterns which resulted in dense visualizations. For a
given parameter, individual data points were transformed to more discernable patterns. However,
when participants combined multiple parameters we observed a phenomenon of “visual pattern
overload”. Consequently, participants experienced difficulties in extracting specific data or
detecting subtle changes among the many patterns. To address the usability issues associated
with multiparametric trends, we suggest four strategies outlined below to better support their use.
5.5.2.1 Pre-Defined Parametric Grouping
The need for integrating technologies to show relationships between parameters is a paramount
function of data integration software.78,81 As Feyen stated, “It is not the monitoring that makes
the difference but how this is translated into more appropriate and targeted treatments.”187 Since
interventions act on groups of physiological parameters, visual clutter may be reduced if displays
are reassembled according to intervention, helping to inform targeted treatments. For example,
dopamine infusions can be automatically grouped together with heart rate and blood pressure,
upon which they are known to act. Similarly, mechanical ventilation, which acts on pulmonary
physiological parameters, could be grouped with peripheral oxygen saturation and carbon
dioxide data. Parameter grouping through configurable displays has been studied for basic
vitals.97 However, with more advanced monitoring modalities, further systematic selection of
parameters is warranted for each medical infusion or organ support data stream. Hajdukiewicz et
al. postulated the use of the abstraction hierarchy (AH) framework to represent the patient data
and information at several levels of aggregation and abstraction.139 This framework supports
problem-solving and embodying the current state of biomedical knowledge.139,140 For interface
designers, AH patient representations offer a means of allocating roles and responsibilities to
different clinical specialties. Also, it structures data from monitoring devices and therapeutic
interventions by mapping the types of data onto the patient model, at defined levels of
abstraction and aggregation. In this way, configurable displays may off-load the task of selecting
relevant parameters and minimize superfluous data streams. Thus, the clinician is aided in
determining cause-and-effect relationships and supported in their problem-solving activities.
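The grouping idea can be sketched as a static mapping from each intervention to the parameters it is known to act on, so that selecting an intervention auto-populates a display panel. The groupings follow the examples in the text; the channel names are illustrative, not the software's identifiers.

```python
# Sketch of pre-defined parametric grouping for configurable displays.
# Channel names are hypothetical.

INTERVENTION_GROUPS = {
    "dopamine_infusion": ["heart_rate", "arterial_blood_pressure"],
    "mechanical_ventilation": ["spo2", "etco2"],
}

def panel_for(intervention):
    """Parameters to plot alongside a chosen intervention data stream."""
    related = INTERVENTION_GROUPS.get(intervention, [])
    return [intervention] + related

print(panel_for("dopamine_infusion"))
# ['dopamine_infusion', 'heart_rate', 'arterial_blood_pressure']
```

In a fuller design, such a table would be derived systematically from an abstraction-hierarchy patient model rather than hand-coded per intervention.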
5.5.2.2 Scaling According to the Nature of the Parameters
Automatic scaling was provided as a “one size fits all” solution; however, clinical
parameters have known limits anchored in the use of the medical devices or in knowledge of
human physiology. A few examples are that medical infusions cannot be negative, differences in
infusion rates vary by orders of magnitude, and the temperature of the living human body
generally stays within a few degrees of baseline. Inappropriate scaling led clinicians to ignore
parametric changes or created mistrust of the software. Usability testing prompted clinicians to
describe appropriate and realistic scaling for different types of parameters. These preferences
could easily be programmed into the software, avoiding or minimizing the false conclusions
observed during testing. Usability testing was therefore instrumental in highlighting how data-dense
visualizations can be misinterpreted and, consequently, how they can be rectified in a subsequent software
iteration.
5.5.2.3 Data Reduction Using Algorithms
Visual pattern overload reduction and pre-defined parametric grouping were automatically
performed through the IDO2 algorithm. The percentage risk of inadequate oxygen delivery was
displayed as a persistent bar graph at the bottom of the screen. In this way, data for 16
parameters were effectively reduced to a display that was intuitive to understand. Now, the physician’s
complex cognitive process of relating respiratory physiology and medical interventions to gauge
oxygen delivery, typically from disparate monitors, can be off-loaded. In addition, IDO2’s
estimation of risk addresses the problem of uncertainty inherent to the dynamic nature of critical
care and supports the high-level analytical task of decision-making.188,189
Lack of transparency and published evidence of the new composite IDO2 parameter led to
mistrust and was the main barrier to its use. The only clinician familiar with it was prompted to
investigate further. Thus, our findings suggest that although triggering functions such as the
IDO2 indicator have the potential to aid with patient monitoring, it is imperative that the interface
communicate how new indicators were derived. As an early warning system, the IDO2 indicator
could achieve what Bion described as the proactive identification of early changes to “empower
ward staff to call for help and initiate further investigation to prevent or limit the magnitude of
adverse events.”190 Observations from usability testing warn that without consistent exposure and
integration into clinical practice, data interpretation aids may be ignored, and, thus, excluded
from critical decision-making where they would be most useful.
5.5.2.4 Novel Visualizations
In our study, we tested four types of visualizations and found that time series visualization
worked well for single parameters but was less usable when parameters were combined. In
addition, highlighting out-of-target-range data using shading, as well as exaggerating minima
and maxima using sparklines, went largely unnoticed by participants. Tasks that required specific
use of multiparametric data should be developed to further test these and other types of data-
dense visualizations. For example, metaphor displays that use various shapes to represent
physiological processes have been explored in anesthesia.191 Indeed, Doig suggested using
shapes to help nurses better visualize hemodynamic parameters.78
5.5.3 Data Trustworthiness
The integrated single-view of multiple data streams improved the trustworthiness of the data as a
whole. For example, by viewing both etCO2 data and intermittent CO2 blood gas data,
respiratory therapists could confirm and trust the continuous etCO2 trend. Also, continuous data
streams complemented the event notes and may benefit charted notes on the electronic medical
record (EMR). For example, when ventilator pressure drops to 0 mm Hg, respiratory therapists
could assume this was the exact time the ventilator was disconnected and manual bagging was
initiated. Indeed, Doig found that to prevent data from going unused, it was necessary to
contextualize data.78 Redundant data and additional clinical context may improve the
trustworthiness of both the continuous data itself and the charted patient record. Future clinical
information systems should integrate MMM data with EMR qualitative information to provide a
complete picture of the patient and automatically check data integrity.
5.5.4 Usability Testing with Diverse Clinician Groups
Bion states that information technologies require “staged ‘bottom-up’ development, pilot testing,
and appropriate implementation into existing hospital culture.”190 Given the complexity of
intensive care and the high degree of specialization of the critical care professional, feedback
from representative end-users is essential for acceptance of the software. This study provides
recommendations for appropriate implementation by revealing aspects of the ICU culture that
would impact software acceptance.
Different types of clinicians required different levels of data granularity. Physicians operated on
a longer patient timeline than nurses, who usually operated within seconds or minutes, and
respiratory therapists, who usually operated on an intermediate timeline. Therefore, values
averaged over five seconds were less useful to nurses but more usable to physicians and
respiratory therapists. To encourage system usage by nurses, more precise data should be made
available. Furthermore, senior physicians requested supplemental information to contextualize
data more frequently than junior physicians did.77 To support appropriate decision-making, the
display should show the preferred and appropriate number of data streams for each clinical specialty.
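As an illustrative sketch of this granularity difference, a display could average a raw once-per-second feed over a role-dependent window. The role names and window lengths below are hypothetical assumptions for illustration, not parameters measured in this study.

```python
# Hypothetical role-dependent aggregation windows, in seconds; these values
# are illustrative assumptions, not figures reported in this study.
WINDOWS = {"nurse": 1, "respiratory_therapist": 5, "physician": 60}

def downsample(samples, role):
    """Average a 1 Hz parameter feed into role-appropriate windows.

    samples: list of numeric values sampled once per second.
    Returns one mean value per complete window for the given role,
    discarding any trailing partial window.
    """
    w = WINDOWS[role]
    return [sum(samples[i:i + w]) / w
            for i in range(0, len(samples) - w + 1, w)]
```

Under these assumed windows, a respiratory therapist's view would show five-second means while a nurse's view would keep every raw sample, reflecting the finding that averaged values suited some specialties but not others.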
5.5.5 Proposed Iteration and Improvements
At the time of writing, a new version of the software, which addressed the usability issues found
in the heuristic evaluation, had been launched. This new version improved the reading of values
on the time series trend by displaying the changing value close to the scanning cursor and by
improving its font size and style. Also, the absolute maximum and minimum of the trend are now
always visible. In addition, shortcuts for viewing grouped parameters related to respiratory or
hemodynamic functions are now available. This study’s usability issues should be addressed if
the software is to be useful to clinicians. Future iterations should offer support to select, filter,
and reduce redundant data streams; provide contextual meaning to the data; and provide novel
visualizations that are intuitively
understood. For example, the pairing of etCO2 with pCO2, which respiratory therapists use to
determine the trustworthiness of the continuous etCO2 trend, could be made readily available
whenever a respiratory therapist is detected as the user. Better still, algorithms could correct the
etCO2 trend using the more reliable pCO2 blood gas values and eliminate redundant data,
thereby reducing overall data and pattern overload. In the long term, future versions of the
software should integrate with the medical record system, or new medical record systems should
integrate with the existing data visualization software, so as to provide a single source of patient
data and information. Table 22
provides practical suggestions for data integration and visualization software.
Table 22. Practical Improvement Suggestions for Data Integration and Visualization Software

Improvement: Reduce redundant data streams.
Rationale: Removal of redundant data is required to allow clinicians to efficiently and easily abstract, trend, and interact with the data.
Suggestion: Ensure preprocessing of the mass volumes of continuous real-time data. For example, employ algorithms that correct etCO2 trends using pCO2 blood gas values.

Improvement: Provide user awareness.
Rationale: User-aware applications that dynamically adjust the data display mode based on the user context can ensure that adequate and relevant data are displayed and enhance clinicians’ efficiency and efficacy in extracting meaningful information.
Suggestion: Provide a customized view of patient data tailored to the clinician’s needs. For example, if a respiratory therapist is detected as the user, the system would display etCO2 with pCO2 to help the respiratory therapist judge whether to trust the continuous etCO2 trend data.

Improvement: Reduce clinician cognitive demand in interacting with the visual displays.
Rationale: Representing components that are important for decision-making in a perceptually similar manner improves the clinician’s decision-making accuracy and efficiency.
Suggestion: Present the components that are important for decision-making as an integrated object and/or present them close together spatially or temporally.

Improvement: Mandate integration of data integration and visualization software with existing medical record systems.
Rationale: Integration of data integration and visualization software with medical record systems provides a single source of patient data, which facilitates data synchronization and may reduce use errors.
Suggestion: Technology procurement policies should require incoming data platforms to freely exchange data and information with existing clinical information systems.

Improvement: Provide easy time navigation.
Rationale: A critical function of the interface is enabling the user to rapidly select the time frame of continuous data, relative to the patient’s stay in the ICU.
Suggestion: Provide interface controls which support both exploratory data navigation across time and specific user-defined timeframes.

Improvement: Ensure the interface is flexible to different types of users and levels of expertise.
Rationale: Functions which are learned should provide shortcuts for accelerated performance.
Suggestion: Provide layered function descriptions and interface shortcuts.

Improvement: Ensure software responsiveness.
Rationale: Additional data streams and access to denser data visualizations may slow down system performance and diminish user satisfaction and decision-making quality.
Suggestion: Ensure new data streams are compressed or back-end processing is sufficient to maintain adequate responsiveness.
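To make the etCO2 correction suggestion concrete, one minimal approach, sketched here under assumed data shapes and not drawn from T3™ or any implementation in this thesis, is to treat each blood gas draw as a calibration point and interpolate the measured pCO2–etCO2 offset across the continuous trend:

```python
from bisect import bisect_left

def correct_etco2(etco2, blood_gases):
    """Recalibrate a continuous etCO2 trend against sparse pCO2 blood gas values.

    etco2: list of (time, value) pairs, sorted by time.
    blood_gases: list of (time, pco2) pairs, sorted by time (non-empty).
    Returns a new list of (time, corrected_value) pairs.

    At each blood gas draw, the offset pCO2 - etCO2 is measured; between
    draws the offset is linearly interpolated (held constant at the ends).
    """
    times = [t for t, _ in etco2]

    def etco2_at(t):
        # Look up the trend sample at, or just after, time t.
        i = min(bisect_left(times, t), len(times) - 1)
        return etco2[i][1]

    # Measured calibration offsets at each blood gas draw.
    offsets = [(t, pco2 - etco2_at(t)) for t, pco2 in blood_gases]

    def offset_at(t):
        if t <= offsets[0][0]:
            return offsets[0][1]
        if t >= offsets[-1][0]:
            return offsets[-1][1]
        for (t0, o0), (t1, o1) in zip(offsets, offsets[1:]):
            if t0 <= t <= t1:
                return o0 + (o1 - o0) * (t - t0) / (t1 - t0)

    return [(t, v + offset_at(t)) for t, v in etco2]
```

This is only one of several possible correction schemes; a clinical implementation would also need to handle artefactual samples and stale calibration points.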
Although this study focused on the usability of data integration software, the pre- and post-session
questionnaires and the think-aloud nature of the test method revealed aspects of clinical work which
may explain software underuse in clinical practice. For example, the pre-session questionnaire
indicated that six of the 22 participants received training. Training was provided when the
software was launched in the unit but was not mandatory and was not provided on an ongoing
basis to incoming staff. In addition, nurses and respiratory therapists dedicate a large proportion
of their time to charting on the medical record accessed from the bedside computer terminal.
Since the software was web-based, it required clinicians to stop charting, pull up the web
browser, and log in on the computer they use to chart. Therefore, staff may deprioritize accessing
the data visualization software because of the numerous steps required to do so. In addition, the
unit’s UNIX-based EMR system was replaced by a Windows-based system during the study
period. Staff may also have devoted more time to learning the new EMR system and had even
less time to explore auxiliary data platforms such as T3™.
In the end, our work indicates that the ideal system for capturing and utilizing continuous
physiologic data in the intensive care unit will allow seamless integration into workflow, will be
intuitive and fit the way clinicians think and work, and will be trusted as a platform that
diminishes work and enhances decision-making rather than contributing to additional confusion,
uncertainty, or skepticism.
5.5.6 Improvements Over Existing Work
While planning the overall project, we carried out a systematic review on data integration and
visualization technologies, finding nine studies that reported self-reported usability and
satisfaction by means of a questionnaire rated on a scale.96,97,100,102-104,106,107,114 Another study,
by Peute et al., reported the change in the number and type of usability issues following user-centered
design changes.109 These studies did not relate issues to specific clinician tasks or
software interface features. Our study assessed usability based on tasks, users, and software
features to provide tangible suggestions to improve software design of interfaces running on
similar computational hardware and operating platforms.
We identified factors influencing usability of a fully interactive, commercial data integration
system by physicians, nurses, and respiratory therapists. Also, usability testing enabled a deeper
understanding of how continuous data was identified and interpreted by three distinct clinical
specialties. Finally, recommendations for future iterations of the current software were provided
as well as a description of an ideal integrated data and visualization software platform.
5.5.7 Limitations
The software we tested addressed several theoretical informatics barriers and our findings may
be generally applied to software with a similar level of data integration. Our simulations tested
how untrained participants used the software and provided insight as to the intuitiveness and ease
of the basic tracking functions. A training session focused on the tracking function tasks may
have aided the use of trajectory and triggering functions. Future usability testing could include
training and focus on tasks related to higher-level visualization functions.
The simulations tested 20 types of tasks, most of which were explorative in nature and more
closely related to physician work than to that of the other occupations. As such, these simulations forced
nurses and respiratory therapists to perform investigative tasks outside their usual work scope. In
reality, they spend more time charting or working directly with the patient or other monitoring
devices. Any new software should aim to integrate the data from these technologies and reduce
the burden of charting if it is to be useful to these groups of users.
The simulation environment differed from an actual ICU due to differences in time pressure and stakes. As a
a result, transferable information from the simulation may be limited. In the simulation, for
example, clinicians were assigned one patient at a time, removing the realism of a multi-patient
workload. Also, the clinicians did not have access to other existing clinical information systems
(e.g., the EMR, monitoring and intervention medical device interfaces, and physical paper chart
components), nor to an actual patient presenting physical symptoms. In addition, the data included
artefacts inherent to medical device signal noise, and sharp peaks or dips in the data could not be
verified as true values rather than false positives or false negatives. Finally, the scenarios were based
on newborn patients who were post-cardiac surgery, limiting transferable information to other
patient populations such as medical-surgical, trauma, or adult. However, we designed tasks using
the software to be plausible to clinicians from both medical-surgical and cardiac specialties.
Future work may include high-fidelity simulations with realistic patients; complementary
technologies; a larger variety of reliable and validated scenarios; and, eventually, in-situ clinical
simulations with appropriate metrics that replicate conditions for higher-level decision-making
tasks.192,193
5.6 Conclusions
Data integration and visualization software offers new ways of perceiving and interpreting data.
Time series visual aids to represent continuous intensive care data were found to be satisfactory
for single parameters but were less useful for multiparametric visualization and single point
recall. Shading to highlight data overlaid on time series visualizations, as well as miniaturized
time series (sparklines) with exaggerated extreme data values, were ignored. The
multiparametric single indicator, which uses a visual aid to summarize the dynamic calculation
of a 16-parameter algorithm, may support the dense use of data but should be tested further in the
context of clinically relevant tasks. These findings highlight the importance and value of
conducting usability testing to uncover potential ease of use and safety issues that can impact the
acceptance of a data integration and visualization system. A recent review of 39 articles on
physiologic data visualization found only one study on the usability of this type of software.58
Our unique contribution to the study of interactive data integration systems is an understanding
of how different clinical specialties interact with a commercial data integration and visualization
technology. We also identified potential interface barriers to the use of such technology in each
discipline-specific practice. The barriers include difficulty with acquiring multiple-parameter
data from data-dense visualizations and perceiving out-of-target data. Another barrier is the
limited clinical context of continuous data due to the separate medical recording systems (EMR).
While this study was based specifically on the T3™ system, findings from this study may be
applied generally to other data integration and visualization platforms. Specifically, the practical
improvement suggestions may be applied to other platforms. For instance, features such as
reducing redundant data streams and clinician cognitive demand in interacting with the visual
displays can be effective in allowing clinicians to efficiently and easily abstract, trend, and
interact with data, thus improving clinicians’ decision-making accuracy and efficiency.
Many opportunities exist to uncover other contributory factors beyond usability issues (e.g.,
perceived usefulness, implementation and change management strategy, and training) that can
influence adoption of data integration technologies into clinical practice. Future research
directions include the optimization of the software interface to improve data acquisition and
interpretation; impact assessment of the optimized interfaces during realistic simulations; and,
finally, naturalistic decision-making in the ICU setting. Design solutions, iteratively
implemented and focused on the software system, are expected to mitigate use errors and
promote the safe use of such novel software for intensive care. If tested in simulation, these
solutions should be evaluated in a more realistic setting regarding environment and task load.
Alternatively, solutions could be evaluated during use in the real ICU.
Intensive care clinicians must comprehensively integrate data from disparate technologies to
closely monitor patients. The availability of multimodal continuous data may improve patient
outcomes but risks being simply ignored or, worse, inadvertently introducing new problems such
as cognitive overload that could lead to sub-optimal decision-making.3,38 Data integration
software that enables real-time computation and visualization of continuous monitoring data is
in rapid development.60,100,194,195 However, research has shown that poorly designed
technologies lead to unintended issues, including cognitive overload, mental fatigue, and device
recalls.6,38,187,196-201 Grinspan et al. suggest that the ideal system should “allow clinicians to
abstract, trend, and interact with copious amounts of data through an intuitive user interface.”166
Moving forward, ICUs and vendors should consider how staff usability testing can assist
selecting or customizing data integration software for improved acceptance of new technologies
into high-risk, technologically-intense settings.
Chapter 6 Conclusions
6.1 Key Findings
• The systematic review identified several research gaps. These included 1) a focus limited
to either physicians or nurses; 2) a lack of explicit description of the technological
information sources and how they were used to complete clinical tasks or make
decisions; 3) broad reporting of the decision-making process using clinical data; 4) no
analysis of traditional time series visualization format or an upper limit of parameters
useful for meaningful multi-parametric data visualization; 5) limited descriptions of the
two common types of commercial DIVTs, either physiological monitors or enhanced
EMR systems with integrated physiological data.
• The investigation of macrocognition and technological data and information sources for
clinical decision-making revealed three primary macrocognitive processes used by
physicians, nurses and respiratory therapists: 1) sensemaking, 2) anticipation, and 3)
communication. Sensemaking was the most technology-mediated process, across all three
specialties. Furthermore, macrocognitive process switching was identified in all three
specialists, and most prominently in physicians. The priority order for integrating data and
information sources for critical decision-making is: 1) the physiological monitor, 2)
intervention technologies, 3) blood analyses, 4) imaging, 5) the EMR and 6) fluid
balance. These discrete device sources fit the rankings of data elements found in the
literature. Additionally, our study confirmed that nurses were the heaviest users of data
technologies and that DIVTs should be designed for their ease of use.
• The first human factors evaluation of the T3™ DIVT used the heuristic evaluation
method. It identified timescale manipulation and visualization of out-of-range patient
signals as potentially catastrophic issues that need to be addressed.
• Usability testing results identified potential interface barriers to the use of such
technology by each of the three specialties studied. The barriers included: (1) difficulty
with acquiring multiple parameter data from data-dense visualizations and perceiving
out-of-target data and (2) limited clinical context of continuous data due to the separate
medical recording systems (EMR). These usability issues may indicate that time-series
visualizations are an inadequate form of visualization for integrated intensive care data.
• While these study findings were based specifically on the T3™ system, they may be
applied generally to other DIVTs.
This multi-phase research project describes technology-mediated intensive care decision-making
in both its fragmented and integrated forms. Its impact was shown on physicians, nurses and
respiratory therapists, the three primary clinical specialties. This work is the first to explore the
relation between macrocognition and technological data and information sources. It emphasizes
nurses’ frontline work with technology and physicians’ process switching.
6.2 Contributions to the Field
This research provides an in-depth understanding of how decision-making, by different intensive
care specialists, is mediated by data-providing technologies and the challenges unifying
technologies must overcome when integrating high-resolution continuous data streams on a
single display. Our original research studies found differences and similarities between
physicians, nurses and respiratory therapists particularly in the macrocognitive processes, their
interrelations, and subsequent impact on cognitive load. Current data display interfaces struggle
to give users adequate control of time windows that inform their particular clinical work, mental
preparation and decision-making. Also, while data integration technologies have been developed
for the physician decision maker, much of the relaying of data and information was done by
nurses. To improve team decision-making, DIVTs must be particularly suited to this specialty,
since nurses inform both physicians and respiratory therapists.
6.3 Future Work
The original research studies of this thesis project were qualitative in nature and served as a
guide to quantitative study design for intensive care data and visualization technologies. Findings
from the cognitive task analyses prioritized the medical devices and clinical patient parameters
that developers should integrate or group to render their DIVT immediately useful.
Candidate technologies should be further evaluated using human factors testing before testing in
the clinical setting, as controlled trials, for example. Usability testing results may be viewed as
criteria for stage-gate design of data integration ICU technologies in view of deployment and
integration into clinical workflow. To arrive at this point, much work could be done to facilitate,
standardize and accelerate testing of these types of technologies, including creating databases of
continuous patient data set scenarios, test metrics, and even registering phases of the
development. These suggestions may seem to stifle innovation but are necessary if we are to treat
technologies as clinical interventions that act directly on the multi-professional clinical staff and
latently on the care and safety of the patient.
The macrocognition framework was used to describe the dynamic decision-making process, from
three perspectives of team care, in the fragmented clinical information technology system of a
pediatric ICU. An extension of this work may be to compare the fragmented CIT system with
one which is integrated by a comprehensive DIVT. At the time of writing, T3™ was augmented
to include ventilator parameters, blood gas, and waveforms of the physiological signals. These
types of data were shown to be important to clinicians in Chapter 3. Therefore, comparing the
macrocognition with the implemented and comprehensive T3™ may be warranted. Furthermore,
applying the macrocognition framework to a shared incident may reveal new aspects of team
macrocognition and direction for design of DIVTs for shared patient data visualizations.
Meta-analysis of human factors studies requires studies with similar outcome methods and
reported power and sample sizes, something difficult to obtain given the limited pool of potential
intensive care clinician participants. However, building on studies found in the systematic
review, we can suggest using objective metrics such as time efficiency and cognitive load.
Specifically, time efficiency, in terms of time to make a decision, detect a change, or choose
between two patients, may also be replicated in future studies.
Team care certainly consists of more specialties than physicians, nurses, and the respiratory
therapists who provide ventilation support. The latter group would benefit from technologies that
are able to display simultaneous respiratory physiological signals, blood gas results and
interventions data. How they contribute to team care in situations of escalated support, such as
extracorporeal membrane oxygenation or high-frequency oscillation ventilation, could be studied
further. Including other clinical specialties, such as pharmacists, nurse practitioners and other
allied health professionals, may also extend the relevance of DIVTs to team intensive care.
Finally, findings from the usability testing studies of T3™, Chapters 4 and 5, may inform the
experimental design of high-fidelity simulation of complex decision-making using computer-
aided data integration by refining the scenarios, updating with counter-balanced visualizations
(e.g., novel metaphor visualizations) and higher levels of device integration. These high-fidelity
simulations may test the effect of common metrics related to time efficiency, accuracy of
decisions and cognitive load. Once improvement in clinician performance is confirmed through
HF testing, the DIVT could then be studied for its effect on patient outcomes, using pre-post
experimental design. By applying the UCD cycle and HF methods to the design of
comprehensive DIVTs, data-driven decision-making may be achieved at the individual and team
levels of intensive care.
References
1 Almerud, S., Alapack, R. J., Fridlund, B. & Ekebergh, M. Of vigilance and invisibility--being a patient in technologically intense environments. Nursing in critical care 12, 151-158, doi:10.1111/j.1478-5153.2007.00216.x (2007).
2 Ospina-Tascon, G. A., Cordioli, R. L. & Vincent, J. L. What type of monitoring has been shown to improve outcomes in acutely ill patients? Intensive Care Medicine 34, 800-820, doi:http://dx.doi.org/10.1007/s00134-007-0967-6 (2008).
3 Ashworth, P. High technology and humanity for intensive care. Intensive Care Nursing 6, 150-160, doi:http://dx.doi.org/10.1016/0266-612X(90)90074-H (1990).
4 Donchin, Y. et al. A look into the nature and causes of human errors in the intensive care unit. Crit Care Med 23, doi:10.1097/00003246-199502000-00015 (1995).
5 Eytan, D., Goodwin, A. J., Greer, R., Guerguerian, A.-M. & Laussen, P. C. heart rate and Blood Pressure centile curves and Distributions by age of hospitalized critically ill children. Frontiers in Pediatrics 5 (2017).
6 Manor-Shulman, O., Beyene, J., Frndova, H. & Parshuram, C. S. Quantifying the volume of documented clinical information in critical illness. Journal of critical care 23, 245-250, doi:10.1016/j.jcrc.2007.06.003 (2008).
7 Elliott, M. & Coventry, A. Critical care: the eight vital signs of patient monitoring. British journal of nursing (Mark Allen Publishing) 21, 621-625, doi:10.12968/bjon.2012.21.10.621 (2012).
8 Graf, J., von den Driesch, A., Koch, K. C. & Janssens, U. Identification and characterization of errors and incidents in a medical intensive care unit. Acta
Anaesthesiol Scand 49, 930-939, doi:10.1111/j.1399-6576.2005.00731.x (2005).
9 Montgomery, V. L. Effect of fatigue, workload, and environment on patient safety in the pediatric intensive care unit. Pediatric critical care medicine : a journal of the Society of Critical Care Medicine and the World Federation of Pediatric Intensive and Critical Care Societies 8, S11-16, doi:10.1097/01.PCC.0000257735.49562.8F (2007).
10 Colford, J. M., Jr. & McPhee, S. J. The ravelled sleeve of care. Managing the stresses of residency training. Jama 261, 889-893 (1989).
11 L, S.-P., M, A. & Saint-Jean, M. Challenges and Issues in Adult Intensive Care Nursing. Journal of Nursing & Care 1, 6 (2012).
12 Aiken, L. H., Clarke, S. P., Sloane, D. M., Sochalski, J. & Silber, J. H. Hospital nurse staffing and patient mortality, nurse burnout, and job dissatisfaction. JAMA: the journal
of the American Medical Association 288, 1987-1993 (2002).
13 Schaufeli, W. B., Keijsers, G. J. & Reis Miranda, D. Burnout, technology use, and ICU-performance. Organizational risk factors for job stress 12, 259-271 (1995).
14 Lighthall, G. K. & Vazquez-Guillamet, C. Understanding Decision Making in Critical Care. Clinical Medicine & Research 13, 156-168, doi:10.3121/cmr.2015.1289 (2015).
15 Rothschild, J. M. et al. The Critical Care Safety Study: The incidence and nature of adverse events and serious medical errors in intensive care*. Critical Care Medicine 33, 1694-1700, doi:10.1097/01.ccm.0000171609.91035.bd (2005).
16 La Pietra, L., Calligaris, L., Molendini, L., Quattrin, R. & Brusaferro, S. Medical errors and clinical risk management: state of the art. Acta Otorhinolaryngologica Italica 25, 339-346 (2005).
17 Clifford, G. D., Long, W. J., Moody, G. B. & Szolovits, P. Robust parameter extraction for decision support using multimodal intensive care data. Philosophical transactions.
Series A, Mathematical, physical, and engineering sciences 367, 411-429, doi:10.1098/rsta.2008.0157 (2009).
18 Chaudhry, B. et al. Systematic review: impact of health information technology on quality, efficiency, and costs of medical care. Annals of internal medicine 144, 742-752 (2006).
19 Wikström, A.-C., Cederborg, A.-C. & Johanson, M. The meaning of technology in an intensive care unit--an interview study. Intensive & critical care nursing 23, 187-195, doi:http://dx.doi.org/10.1016/j.iccn.2007.03.003 (2007).
20 Tang, P. C. & Patel, V. L. Major issues in user interface design for health professional workstations: summary and recommendations. International journal of bio-medical
computing 34, 139-148 (1994).
21 Miller, G. A. The magical number seven plus or minus two: some limits on our capacity for processing information. Psychol Rev 63, 81-97 (1956).
22 Saaty, T. L. & Ozdemir, M. S. Why the magic number seven plus or minus two. Mathematical and Computer Modelling 38, 233-244, doi:https://doi.org/10.1016/S0895-7177(03)90083-5 (2003).
23 Jennings, D., Amabile, T. M. & Ross, L. Informal covariation assessment: Data-based vs. theory-based judgments. (1982).
24 Imhoff, M., Fried, R. & Gather, U. in Proceedings of the AMIA Symposium. 340 (American Medical Informatics Association).
25 Walsh, T. & Beatty, P. Human factors error and patient monitoring. Physiological
measurement 23, R111 (2002).
26 Mack, E. H., Wheeler, D. S. & Embi, P. J. Clinical decision support systems in the pediatric intensive care unit. Pediatric critical care medicine : a journal of the Society of Critical Care Medicine and the World Federation of Pediatric Intensive and Critical Care Societies 10, 23-28, doi:10.1097/PCC.0b013e3181936b23 (2009).
27 Simpao, A. F., Ahumada, L. M. & Rehman, M. A. Big data and visual analytics in anaesthesia and health care. British Journal of Anaesthesia 115, 350-356 (2015).
28 CHFG. What is Human Factors? – CHFG – Clinical Human Factors Group, <http://chfg.org/about-us/what-is-human-factors/> (2017).
29 Mantei, M. M. & Teorey, T. J. Cost/benefit analysis for incorporating human factors in the software lifecycle. Communications of the ACM 31, 428-439 (1988).
30 Fairbanks Rollin, J. & Caplan, S. Poor Interface Design and Lack of Usability Testing Facilitate Medical Error. Joint Commission Journal on Quality and Patient Safety 30, 579-584 (2004).
31 Coiera, E. & Tombs, V. Communication behaviours in a hospital setting: an observational study. BMJ : British Medical Journal 316, 673-676 (1998).
32 Gosbee, J. Communication among health professionals. Human factors engineering can
help make sense of the chaos Information in practice p 673 316, 642, doi:10.1136/bmj.316.7132.642 (1998).
33 Leonard, M., Graham, S. & Bonacum, D. The human factor: the critical importance of effective teamwork and communication in providing safe care. Quality and Safety in
Health Care 13, i85-i90, doi:10.1136/qshc.2004.010033 (2004).
34 Sintchenko, V. & Coiera, E. W. Which clinical decisions benefit from automation? A task complexity approach. International journal of medical informatics 70, 309-316 (2003).
35 Harder, K. A. & Marc, D. Human factors issues in the intensive care unit. AACN
advanced critical care 24, 405-414 (2013).
36 Green, C. A., Gilhooly, K. J., Logie, R. & Ross, D. G. Human factors and computerisation in intensive care units: a review. International journal of clinical
monitoring and computing 8, 167-178, doi:10.1007/bf01738889 (1991).
37 Bion, J. F., Abrusci, T. & Hibbert, P. Human factors in the management of the critically ill patient. British Journal of Anaesthesia 105, 26-33, doi:10.1093/bja/aeq126 (2010).
38 Sahuquillo, J. Does multimodality monitoring make a difference in neurocritical care? European journal of anaesthesiology. Supplement 42, 83-86, doi:10.1017/s0265021507003353 (2008).
39 Harrison, M. I., Koppel, R. & Bar-Lev, S. Unintended consequences of information technologies in health care—an interactive sociotechnical analysis. Journal of the
American medical informatics Association 14, 542-549 (2007).
40 Georgiou, A. & Westbrook, J. I. Clinician reports of the impact of electronic ordering on an emergency department. Studies in health technology and informatics 150, 678-682 (2009).
41 Thursky, K. A. & Mahemoff, M. User-centered design techniques for a computerised antibiotic decision support system in an intensive care unit. International Journal of
Medical Informatics 76, 760-768, doi:http://dx.doi.org/10.1016/j.ijmedinf.2006.07.011 (2007).
42 Martin, J. L., Clark, D. J., Morgan, S. P., Crowe, J. A. & Murphy, E. A user-centred approach to requirements elicitation in medical device development: a case study from an industry perspective. Appl Ergon 43, 184-190 (2012).
43 International Organization for Standardization (ISO). DIS. Switzerland (2009).
44 Travis, D. ISO 13407 is dead. Long live ISO 9241-210! (2011). <http://www.userfocus.co.uk/articles/iso-13407-is-dead.html>.
45 Angood, P. B. Right Care, Right Now™—You can make a difference. Critical care
medicine 33, 2729-2732 (2005).
46 Custer, J. W. et al. A qualitative study of expert and team cognition on complex patients in the pediatric intensive care unit. Pediatric critical care medicine : a journal of the
Society of Critical Care Medicine and the World Federation of Pediatric Intensive and
Critical Care Societies 13, 278-284, doi:10.1097/PCC.0b013e31822f1766 (2012).
47 De Georgia, M. A., Kaffashi, F., Jacono, F. J. & Loparo, K. A. Information Technology in Critical Care: Review of Monitoring and Data Acquisition Systems for Patient Care and Research. The Scientific World Journal 2015, 9, doi:10.1155/2015/727694 (2015).
48 Cunningham, S., Deere, S., Elton, R. A. & McIntosh, N. Neonatal physiological trend monitoring by computer. Int J Clin Monit Comput 9, 221-227 (1992).
49 Cunningham, S., Deere, S., Symon, A., Elton, R. A. & McIntosh, N. A randomized, controlled trial of computerized physiologic trend monitoring in an intensive care unit. Crit Care Med 26, 2053-2060 (1998).
50 Cole, W. G. & Stewart, J. G. Human performance evaluation of a metaphor graphic display for respiratory data. Methods Inf Med 33, 390-396 (1994).
51 Lin, Y. L., Guerguerian, A. M., Tomasi, J., Laussen, P. & Trbovich, P. Usability of data integration and visualization software for multidisciplinary pediatric intensive care: a human factors approach to assessing technology. Bmc Medical Informatics and Decision
Making 17, doi:10.1186/s12911-017-0520-7 (2017).
52 Cunningham, S., Deere, S., Elton, R. A. & McIntosh, N. Comparison of nurse and computer charting of physiological variables in an intensive care unit. International
journal of clinical monitoring and computing 13, 235-241 (1996).
53 Dijkema, L. M., Dieperink, W., van Meurs, M. & Zijlstra, J. G. Preventable mortality evaluation in the ICU. Crit Care 16, 309, doi:10.1186/cc11212 (2012).
54 Moher, D., Liberati, A., Tetzlaff, J. & Altman, D. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA Statement. PLOS Med 6, doi:10.1371/journal.pmed.1000097 (2009).
55 Alexander, G. & Staggers, N. A systematic review of the designs of clinical technology: findings and recommendations for future research. ANS. Advances in nursing science 32, 252-279, doi:10.1097/ANS.0b013e3181b0d737 (2009).
56 Görges, M. & Staggers, N. Evaluations of physiological monitoring displays: a systematic review. Journal of clinical monitoring and computing 22, 45-66 (2008).
57 Garg, A. X. et al. Effects of computerized clinical decision support systems on practitioner performance and patient outcomes: a systematic review. Jama 293, 1223-1238 (2005).
58 Kamaleswaran, R. & McGregor, C. A Review of Visual Representations of Physiologic Data. JMIR Medical Informatics 4, e31, doi:10.2196/medinform.5186 (2016).
59 Hayes-Roth, B. et al. Guardian: A prototype intelligent agent for intensive-care monitoring. Artificial intelligence in medicine 4, 165-185 (1992).
60 McManus, M., Baronov, D., Almodovar, M., Laussen, P. & Butler, E. in Decision and
Control (CDC), 2013 IEEE 52nd Annual Conference on. 763-769 (IEEE).
61 Brown, H., Terrence, J., Vasquez, P., Bates, D. W. & Zimlichman, E. Continuous Monitoring in an Inpatient Medical-Surgical Unit: A Controlled Clinical Trial. The
American journal of medicine 127, 226-232 (2014).
62 Ahmad, S. et al. Continuous multi-parameter heart rate variability analysis heralds onset of sepsis in adults. PLoS ONE 4, doi:http://dx.doi.org/10.1371/journal.pone.0006642 (2009).
63 Gomez, H. et al. Development of a multimodal monitoring platform for medical research. Conf Proc IEEE Eng Med Biol Soc 2010, 2358-2361, doi:http://dx.doi.org/10.1109/IEMBS.2010.5627936 (2010).
64 McAlpine, B. & VanKampen, D. Clinical engineering and information technology: Working together to implement device integration. Biomedical Instrumentation and Technology 45, 445-449 (2011).
65 Bowling, A. (Open University Press, 2002).
66 Staggers, N. & Blaz, J. W. Research on nursing handoffs for medical and surgical settings: an integrative review. J Adv Nurs 69, 247-262, doi:10.1111/j.1365-2648.2012.06087.x (2012).
67 Toye, F. et al. Meta-ethnography 25 years on: challenges and insights for synthesising a large number of qualitative studies. BMC medical research methodology 14, 80 (2014).
68 Hannes, K. & Macaitis, K. A move to more systematic and transparent approaches in qualitative evidence synthesis: update on a review of published papers. Qualitative
Research 12, 402-442, doi:10.1177/1468794111432992 (2012).
69 Tong, A., Flemming, K., McInnes, E., Oliver, S. & Craig, J. Enhancing transparency in reporting the synthesis of qualitative research: ENTREQ. BMC Medical Research
Methodology 12, 181, doi:10.1186/1471-2288-12-181 (2012).
70 Noblit, G. W. & Hare, R. D. Meta-ethnography: Synthesizing qualitative studies. Vol. 11 (Sage, 1988).
71 Schütz, A. Collected Papers 1. The Hague (1962).
72 Britten, N. et al. Using meta ethnography to synthesise qualitative research: a worked example. Journal of Health Services Research & Policy 7, 209-215 (2002).
73 Peute, L. W. et al. A framework for reporting on Human Factor/Usability studies of Health Information Technologies. Studies in health technology and informatics 194, 54-60 (2013).
74 Weir, C. R., Staggers, N. & Phansalkar, S. The state of the evidence for computerized provider order entry: a systematic review and analysis of the quality of the literature. Int J
Med Inform 78, 365-374, doi:10.1016/j.ijmedinf.2008.12.001 (2009).
75 Shadish, W. R. Revisiting field experimentation: field notes for the future. Psychological
methods 7, 3-18 (2002).
76 West, V. L., Borland, D. & Hammond, W. E. Innovative information visualization of electronic health record data: a systematic review. Journal of the American Medical
Informatics Association : JAMIA 22, 330-339, doi:10.1136/amiajnl-2014-002955 (2015).
77 Alberdi, E. et al. Expertise and the interpretation of computerized physiological data: implications for the design of computerized monitoring in neonatal intensive care. International Journal of Human-Computer Studies 55, 191-216, doi:http://dx.doi.org/10.1006/ijhc.2001.0477 (2001).
78 Doig, A. K., Drews, F. A. & Keefe, M. R. Informing the design of hemodynamic monitoring displays. CIN - Computers Informatics Nursing 29, 706-713, doi:http://dx.doi.org/10.1097/NCN.0b013e3182148eba (2011).
79 Kannampallil, T. G. et al. Understanding the nature of information seeking behavior in critical care: implications for the design of health information technology. Artificial
intelligence in medicine 57, 21-29 (2013).
80 Koch, S. H. et al. Intensive care unit nurses' information needs and recommendations for integrated displays to improve nurses' situation awareness. Journal of the American
Medical Informatics Association 19, 583-590, doi:http://dx.doi.org/10.1136/amiajnl-2011-000678 (2012).
81 Sharit, J., Czaja, S. J., Augenstein, J. S., Balasubramanian, G. & Schell, V. Assessing the information environment in intensive care units. Behaviour & Information Technology 25, 207-220 (2006).
82 Smielewski, P. et al. ICM+, a flexible platform for investigations of cerebrospinal dynamics in clinical practice. Acta Neurochir Suppl 102, 145-151 (2008).
83 Gomez, H. et al. Development of a multimodal monitoring platform for medical research. Conf Proc IEEE Eng Med Biol Soc 2010, 2358-2361 (2010).
84 Engelman, D., Higgins, T. L., Talati, R. & Grimsman, J. Maintaining situational awareness in a cardiac intensive care unit. J Thorac Cardiovasc Surg 147, 1105-1106, doi:http://dx.doi.org/10.1016/j.jtcvs.2013.10.044 (2014).
85 Rasmussen, J. Skills, rules, and knowledge; signals, signs, and symbols, and other distinctions in human performance models. Systems, Man and Cybernetics, IEEE
Transactions on, 257-266 (1983).
86 Vicente, K. J. & Rasmussen, J. Ecological interface design: Theoretical foundations. Systems, Man and Cybernetics, IEEE Transactions on 22, 589-606 (1992).
87 Hollnagel, E. Modeling the orderliness of human action. Cognitive engineering in the
aviation domain, 65-98 (2000).
88 Benner, P., Hughes, R. G. & Sutphen, M. in Patient Safety and Quality: An Evidence-
Based Handbook for Nurses (ed R. G. Hughes) (Agency for Healthcare Research and Quality (US), 2008).
89 Scheffer, B. K. & Rubenfeld, M. G. A consensus statement on critical thinking in nursing. The Journal of nursing education 39, 352-359 (2000).
90 Drews, F. A. & Westenskow, D. R. The right picture is worth a thousand numbers: data displays in anesthesia. Hum Factors 48, 59-71, doi:10.1518/001872006776412270 (2006).
91 Gilhooly, K. J. et al. Biomedical knowledge in diagnostic thinking: the case of electrocardiogram (ECG) interpretation. European Journal of Cognitive Psychology 9, 199-223 (1997).
92 Puri, N., Puri, V. & Dellinger, R. P. History of technology in the intensive care unit. Crit
Care Clin 25, 185-200 (2009).
93 Jordan, D. & Rose, S. E. Multimedia abstract generation of intensive care data: the automation of clinical processes through AI methodologies. World J Surg 34, 637-645, doi:http://dx.doi.org/10.1007/s00268-009-0319-5 (2010).
94 Zeng, Q., Cimino, J. J. & Zou, K. H. Providing concept-oriented views for clinical data using a knowledge-based system: an evaluation. Journal of the American Medical
Informatics Association : JAMIA 9, 294-305 (2002).
95 Ahmed, A., Chandra, S., Herasevich, V., Gajic, O. & Pickering, B. W. The effect of two different electronic health record user interfaces on intensive care provider task load, errors of cognition, and performance. Critical care medicine 39, 1626-1634 (2011).
96 Anders, S. et al. Evaluation of an integrated graphical display to promote acute change detection in ICU patients. International Journal of Medical Informatics 81, 842-851, doi:http://dx.doi.org/10.1016/j.ijmedinf.2012.04.004 (2012).
97 Drews, F. A. & Doig, A. Evaluation of a Configural Vital Signs Display for Intensive Care Unit Nurses. Hum. Factors 56, 569-580 (2014).
98 Dziadzko, M. A. et al. User perception and experience of the introduction of a novel critical care patient viewer in the ICU setting. International Journal of Medical
Informatics 88, 86-91 (2016).
99 Effken, J. A. Improving clinical decision making through ecological interfaces. Ecological Psychology 18, 283-318, doi:10.1207/s15326969eco1804_4 (2006).
100 Effken, J. A., Loeb, R. G., Kang, Y. & Lin, Z. C. Clinical information displays to improve ICU outcomes. International Journal of Medical Informatics 77, 765-777, doi:http://dx.doi.org/10.1016/j.ijmedinf.2008.05.004 (2008).
101 Ellsworth, M. A., Lang, T. R., Pickering, B. W. & Herasevich, V. Clinical data needs in the neonatal intensive care unit electronic medical record. BMC medical informatics and
decision making 14, 92 (2014).
102 Forsman, J., Anani, N., Eghdam, A., Falkenhav, M. & Koch, S. Integrated information visualization to support decision making for use of antibiotics in intensive care: design and usability evaluation. Inform Health Soc Care 38, 330-353, doi:http://dx.doi.org/10.3109/17538157.2013.812649 (2013).
103 Gorges, M., Kuck, K., Koch, S. H., Agutter, J. & Westenskow, D. R. A far-view intensive care unit monitoring display enables faster triage. Dimens Crit Care Nurs 30, 206-217, doi:http://dx.doi.org/10.1097/DCC.0b013e31821b7f08 (2011).
104 Gorges, M., Westenskow, D. R. & Markewitz, B. A. Evaluation of an integrated intensive care unit monitoring display by critical care fellow physicians. Journal of clinical
monitoring and computing 26, 429-436 (2012).
105 Koch, S. H. et al. Evaluation of the effect of information integration in displays for ICU nurses on situation awareness and task completion time: A prospective randomized controlled study. International journal of medical informatics 82, 665-675 (2013).
106 Law, A. S. et al. A comparison of graphical and textual presentations of time series data to support medical decision making in the neonatal intensive care unit. Journal of
Clinical Monitoring & Computing 19, 183-194 (2005).
107 Liu, Y. & Osvalder, A.-L. Usability evaluation of a GUI prototype for a ventilator machine. Journal of clinical monitoring and computing 18, 365-372 (2004).
108 Miller, A., Scheinkestel, C. & Steele, C. The effects of clinical information presentation on physicians' and nurses' decision-making in ICUs. Appl Ergon 40, 753-761, doi:http://dx.doi.org/10.1016/j.apergo.2008.07.004 (2008).
109 Peute, L. W., De Keizer, N. F., Van Der Zwan, E. P. & Jaspers, M. W. Reducing clinicians' cognitive workload by system redesign; a pre-post think aloud usability study. Studies in Health Technology & Informatics 169, 925-929 (2011).
110 Pickering, B. W., Herasevich, V., Ahmed, A. & Gajic, O. Novel Representation of Clinical Information in the ICU Developing User Interfaces which Reduce Information Overload. Applied Clinical Informatics 1, 116-131, doi:10.4338/aci-2009-12-cr-0027 (2010).
111 Pickering, B. W., Gajic, O., Ahmed, A., Herasevich, V. & Keegan, M. T. Data utilization for medical decision making at the time of patient admission to ICU. Crit Care Med 41, 1502-1510, doi:10.1097/CCM.0b013e318287f0c0 (2013).
112 Pickering, B. W. et al. The implementation of clinician designed, human-centered electronic medical record viewer in the intensive care unit: a pilot step-wedge cluster randomized trial. International Journal of Medical Informatics 84, 299-307 (2015).
113 van der Meulen, M. et al. When a graph is poorer than 100 words: A comparison of computerised natural language generation, human generated descriptions and graphical displays in neonatal intensive care. Applied Cognitive Psychology 24, 77-89, doi:http://dx.doi.org/10.1002/acp.1545 (2010).
114 Wachter, S. B., Markewitz, B., Rose, R. & Westenskow, D. Evaluation of a pulmonary graphical display in the medical intensive care unit: An observational study. Journal of
Biomedical Informatics 38, 239-243, doi:http://dx.doi.org/10.1016/j.jbi.2004.11.003 (2005).
115 Dal Sasso, G. M. & Barra, D. C. Cognitive Workload of Computerized Nursing Process in Intensive Care Units. Computers, informatics, nursing : CIN 33, 339-345; quiz E331 (2015).
116 Korhonen, I. et al. Building the IMPROVE Data Library. IEEE engineering in medicine
and biology magazine : the quarterly magazine of the Engineering in Medicine &
Biology Society 16, 25-32 (1997).
117 Saeed, M. et al. Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II): a public-access intensive care unit database. Critical care medicine 39, 952 (2011).
118 Ebadollahi, S. et al. Predicting Patient's Trajectory of Physiological Data using Temporal Trends in Similar Patients: A System for Near-Term Prognostics. AMIA Annu Symp Proc 2010, 192-196 (2010).
119 Asan, O. et al. Provider Use of a Novel EHR display in the Pediatric Intensive Care Unit. Large Customizable Interactive Monitor (LCIM). Applied Clinical Informatics 7, 682-692 (2016).
120 Olchanski, N. et al. Can a Novel ICU Data Display Positively Affect Patient Outcomes and Save Lives? J Med Syst 41, 171, doi:10.1007/s10916-017-0810-8 (2017).
121 Aakre, C. A., Chaudhry, R., Pickering, B. & Herasevich, V. Information Needs Assessment for a Medicine Ward-Focused Rounding Dashboard. J. Med. Syst. 40, doi:10.1007/s10916-016-0542-1 (2016).
122 Carayon, P. et al. A Systematic Review of Mixed Methods Research on Human Factors and Ergonomics in Health Care. Appl Ergon 51, 291-321, doi:10.1016/j.apergo.2015.06.001 (2015).
123 Lin, Y., Guerguerian, A., Laussen, P. & Trbovich, P. Heuristic Evaluation of Data Integration and Visualization Software Used for Continuous Monitoring to Support Intensive Care: A Bedside Nurses Perspective. J Nurs Care 4, doi:http://dx.doi.org/10.4172/2167-1168.1000300 (2015).
124 Lin, Y. L., Guerguerian, A.-M., Tomasi, J., Laussen, P. & Trbovich, P. Usability of data integration and visualization software for multidisciplinary pediatric intensive care: a human factors approach to assessing technology. BMC Medical Informatics and Decision
Making 17, 122, doi:10.1186/s12911-017-0520-7 (2017).
125 Zhang, J., Johnson, T. R., Patel, V. L., Paige, D. L. & Kubose, T. Using usability heuristics to evaluate patient safety of medical devices. Journal of Biomedical
Informatics 36, 23-30, doi:http://dx.doi.org/10.1016/S1532-0464(03)00060-1 (2003).
126 Ewing, G., Ferguson, L., Freer, Y., Hunter, J. & McIntosh, N. Observational Data Acquired on a Neonatal Intensive Care Unit. University of Aberdeen Computing Science
Departmental Technical Report: TR 205 (2002).
127 Effken, J. A. Coordination of hemodynamic monitoring and control performance. (1993).
128 Miller, A. A work domain analysis framework for modelling intensive care unit patients. Cognition, Technology & Work 6, 207-222 (2004).
129 Peute, L. W., de Keizer, N. F. & Jaspers, M. Cognitive evaluation of a physician data query tool for a national ICU registry: comparing two think aloud variants and their application in redesign. Studies in health technology and informatics 160, 309-313 (2009).
130 Henriksen, K., Battles, J., Keyes, M. & Grady, M. Patient Monitors in Critical Care: Lessons for Improvement--Advances in Patient Safety: New Directions and Alternative Approaches (Vol. 3: Performance and Tools). (2008).
131 Chen, M. et al. Data, information, and knowledge in visualization. IEEE Computer
Graphics and Applications 29 (2009).
132 Crandall, B. & Getchell-Reiter, K. Critical decision method: A technique for eliciting concrete assessment indicators from the intuition of NICU nurses. Advances in Nursing
Science 16, 42-51 (1993).
133 Patterson, E. S. & Miller, J. E. Macrocognition metrics and scenarios: design and
evaluation for real-world teams. (Ashgate Publishing, Ltd., 2012).
134 Klein, G. et al. Macrocognition. Intelligent Systems, IEEE 18, 81-85 (2003).
135 Schubert, C. C., Denmark, T. K., Crandall, B., Grome, A. & Pappas, J. Characterizing novice-expert differences in macrocognition: an exploratory study of cognitive work in the emergency department. Ann Emerg Med 61, 96-109, doi:10.1016/j.annemergmed.2012.08.034 (2013).
136 Holtrop, J. S., Potworowski, G., Fitzpatrick, L., Kowalk, A. & Green, L. A. Understanding effective care management implementation in primary care: a macrocognition perspective analysis. Implementation Science : IS 10, 122, doi:10.1186/s13012-015-0316-z (2015).
137 Baxter, G. D., Monk, A. F., Tan, K., Dear, P. R. & Newell, S. J. Using cognitive task analysis to facilitate the integration of decision support systems into the neonatal intensive care unit. Artif Intell Med 35, 243-257, doi:10.1016/j.artmed.2005.01.004 (2005).
138 Vivian, R., Falkner, K. & Falkner, N. in Learning and Teaching in Computing and
Engineering (LaTiCE), 2013. 154-161 (IEEE).
139 Hajdukiewicz, J. R., Vicente, K. J., Doyle, D. J., Milgram, P. & Burns, C. M. Modeling a medical environment: an ontology for integrated medical informatics design. International journal of medical informatics 62, 79-99 (2001).
140 Sharp, T. D. & Helmicki, A. J. The Application of the Ecological Interface Design Approach to Neonatal Intensive Care Medicine. Proceedings of the Human Factors and
Ergonomics Society Annual Meeting 42, 350-354, doi:10.1177/154193129804200336 (1998).
141 Asan, O., Flynn, K. E., Azam, L. & Scanlon, M. C. Nurses' perceptions of a novel health information technology: A qualitative study in the pediatric intensive care unit. International Journal of Human-Computer Interaction 33, 258-264 (2017).
142 Klein, G. The recognition-primed decision (RPD) model: Looking back, looking forward. Naturalistic decision making, 285-292 (1997).
143 Fackler, J. C. et al. Critical care physician cognitive task analysis: an exploratory study. Crit Care 13, R33, doi:10.1186/cc7740 (2009).
144 Hart, S. G. in Proceedings of the Human Factors and Ergonomics Society Annual
Meeting. 904-908 (Sage Publications).
145 Kliegman, R. Nelson textbook of pediatrics. Vol. 994 (Saunders Elsevier Philadelphia, 2007).
146 Massin, M. & von Bernuth, G. Normal Ranges of Heart Rate Variability During Infancy and Childhood. Pediatric Cardiology 18, 297-302, doi:10.1007/s002469900178 (1997).
147 Martich, G. D., Waldmann, C. S. & Imhoff, M. Clinical Informatics in Critical Care. Journal of Intensive Care Medicine 19, 154-163, doi:10.1177/0885066604264016 (2004).
148 Hall, A. & Walton, G. Information overload within the health care system: a literature review. Health Info Libr J 21, 102-108, doi:10.1111/j.1471-1842.2004.00506.x (2004).
149 Laxmisan, A. et al. The multitasking clinician: Decision-making and cognitive demand during and after team handoffs in emergency care. International Journal of Medical
Informatics 76, 801-811, doi:http://dx.doi.org/10.1016/j.ijmedinf.2006.09.019 (2007).
150 Weinger, M. Vigilance, Boredom, and Sleepiness. Journal of clinical monitoring and
computing 15, 549-552, doi:10.1023/a:1009993614060 (1999).
151 Le Roux, P. Physiological monitoring of the severe traumatic brain injury patient in the intensive care unit. Curr Neurol Neurosci Rep 13, 331, doi:10.1007/s11910-012-0331-2 (2013).
152 Tang, Z., Mazabob, J., Weavind, L., Thomas, E. & Johnson, T. R. A time-motion study of registered nurses' workflow in intensive care unit remote monitoring. AMIA Annu Symp Proc, 759-763 (2006).
153 Baronov, D. et al. in Pediatric and Congenital Cardiac Care 387-395 (Springer, 2015).
154 McQuillan, P. et al. Confidential inquiry into quality of care before admission to intensive care. BMJ 316, 1853-1858 (1998).
155 Nielsen, J. in Proceedings of the SIGCHI conference on Human factors in computing
systems. 373-380 (ACM).
156 Nielsen, J. 10 Heuristics for User Interface Design: Article by Jakob Nielsen, <http://www.nngroup.com/articles/ten-usability-heuristics/> (1995).
157 Shneiderman, B. & Plaisant, C. Designing the User Interface: Strategies for Effective Human-Computer Interaction. (Pearson Addison Wesley, USA, 2005).
158 Tufte, E. R. Beautiful evidence. New York (2006).
159 Bauer, D. T., Guerlain, S. & Brown, P. J. The design and evaluation of a graphical display for laboratory data. Journal of the American Medical Informatics Association 17, 416-424 (2010).
160 Almerud, S., Alapack, R. J., Fridlund, B. & Ekebergh, M. Of vigilance and invisibility – being a patient in technologically intense environments. Nursing in critical care 12, 151-158, doi:10.1111/j.1478-5153.2007.00216.x (2007).
161 Wenham, T. & Pittard, A. Intensive care unit environment. Continuing Education in
Anaesthesia, Critical Care & Pain 9, 178-183, doi:10.1093/bjaceaccp/mkp036 (2009).
162 Hemphill, J. C., Andrews, P. & Georgia, M. Multimodal monitoring and neurocritical care bioinformatics. Nat Rev Neurol 7, doi:10.1038/nrneurol.2011.101 (2011).
163 Citerio, G. et al. Data Collection and Interpretation. Neurocritical Care 22, 360-368, doi:10.1007/s12028-015-0139-4 (2015).
164 Celi, L. A., Mark, R. G., Stone, D. J. & Montgomery, R. A. "Big data" in the intensive care unit. Closing the data loop. Am J Respir Crit Care Med 187, 1157-1160, doi:10.1164/rccm.201212-2311ED (2013).
165 Spooner, S. A. Special requirements of electronic health record systems in pediatrics. Pediatrics 119, 631-637, doi:10.1542/peds.2006-3527 (2007).
166 Grinspan, Z. M., Pon, S., Greenfield, J. P., Malhotra, S. & Kosofsky, B. E. Multimodal monitoring in the pediatric intensive care unit: new modalities and informatics challenges. Seminars in pediatric neurology 21, 291-298, doi:10.1016/j.spen.2014.10.005 (2014).
167 Kim, M. M., Barnato, A. E., Angus, D. C., Fleisher, L. F. & Kahn, J. M. The effect of multidisciplinary care teams on intensive care unit mortality. Archives of internal
medicine 170, 369-376, doi:10.1001/archinternmed.2009.521 (2010).
168 Schmidt, J. M. & Georgia, M. Multimodality monitoring: informatics, integration data display and analysis. Neurocrit Care 21, doi:10.1007/s12028-014-0037-1 (2014).
169 Rothman, B. S. in Monitoring Technologies in Acute Care Environments: A
Comprehensive Guide to Patient Monitoring Technology (eds M. Jesse Ehrenfeld & Maxime Cannesson) 13-22 (Springer New York, 2014).
170 Cooper, J. B., Newbower, R. S., Long, C. D. & McPeek, B. Preventable Anesthesia Mishaps: A Study of Human Factors. Anesthesiology 49, 399-406 (1978).
171 Carayon, P. Human factors in patient safety as an innovation. Appl Ergon 41, 657-665, doi:http://dx.doi.org/10.1016/j.apergo.2009.12.011 (2010).
172 Carayon, P. Handbook of human factors and ergonomics in health care and patient
safety. (CRC Press, 2011).
173 Reader, T. W. & Cuthbertson, B. H. Teamwork and team training in the ICU: Where do the similarities with aviation end? Critical Care 15, 313, doi:10.1186/cc10353 (2011).
174 International Organization for Standardization. Switzerland (2010).
175 Lin, Y. L., Guerguerian, A.-M., Laussen, P. & Trbovich, P. Heuristic evaluation of a data integration and visualization software used for continuous monitoring to support intensive care: a bedside nurse's perspective. Journal of Nursing & Care (2015).
176 Garmer, K., Liljegren, E., Osvalder, A. L. & Dahlman, S. Arguing for the need of triangulation and iteration when designing medical equipment. Journal of clinical
monitoring and computing 17, 105-114 (2002).
177 Chan, A. J. et al. The use of human factors methods to identify and mitigate safety issues in radiation therapy. Radiotherapy and Oncology 97, 596-600 (2010).
178 Chan, J., Shojania, K. G., Easty, A. C. & Etchells, E. E. Usability evaluation of order sets in a computerised provider order entry system. BMJ quality & safety 20, 932-940, doi:10.1136/bmjqs.2010.050021 (2011).
179 Middleton, B. et al. Enhancing patient safety and quality of care by improving the usability of electronic health record systems: recommendations from AMIA. Journal of
the American Medical Informatics Association 20, e2-e8 (2013).
180 Patterson, E. S. et al. Enhancing electronic health record usability in pediatric patient care: a scenario-based approach. Jt Comm J Qual Patient Saf 39, 129-135 (2013).
181 Daniels, J., Fels, S., Kushniruk, A., Lim, J. & Ansermino, J. M. A framework for evaluating usability of clinical monitoring technology. Journal of clinical monitoring and
computing 21, 323-330, doi:10.1007/s10877-007-9091-y [doi] (2007).
182 Nielsen, J. & Landauer, T. K. in Proceedings of the INTERACT'93 and CHI'93
conference on Human factors in computing systems. 206-213 (ACM).
183 McHugh, M. L. Interrater reliability: the kappa statistic. Biochemia Medica 22, 276-282 (2012).
184 Cassano-Piché, A., Trbovich, P., Griffin, M., Lin, Y. L. & Easty, T. Human Factors For Health Technology Safety: Evaluating and Improving the Use of Health Technology In The Real World. 1 edn (HumanEra @ UHN, Global Centre for eHealth Innovation, University Health Network; International Federation of Medical and Biological Engineering, Clinical Engineering Division, 2015).
185 Schwartz, B. The paradox of choice. (Ecco 2004).
186 Hick, W. E. On the rate of gain of information. Quarterly Journal of Experimental
Psychology 4, 11-26 (1952).
187 Feyen, B. F., Sener, S., Jorens, P. G., Menovsky, T. & Maas, A. I. Neuromonitoring in traumatic brain injury. Minerva anestesiologica 78, 949-958 (2012).
188 Amar, R. & Stasko, J. in Information Visualization, 2004. INFOVIS 2004. IEEE
Symposium on. 143-150 (IEEE).
189 Amar, R., Eagan, J. & Stasko, J. in IEEE Symposium on Information Visualization, 2005.
INFOVIS 2005. 111-117.
190 Bion, J. & Heffner, J. E. Challenges in the care of the acutely ill. Lancet (London,
England) 363, doi:10.1016/s0140-6736(04)15793-0 (2004).
191 Michard, F. Decision support for hemodynamic management: from graphical displays to closed loop systems. Anesthesia & Analgesia 117, 876-882 (2013).
192 McBride, M. E., Waldrop, W. B., Fehr, J. J., Boulet, J. R. & Murray, D. J. Simulation in pediatrics: the reliability and validity of a multiscenario assessment. Pediatrics, doi:10.1542/peds.2010-3278 (2011).
193 Kushniruk, A., Nohr, C., Jensen, S. & Borycki, E. M. From Usability Testing to Clinical Simulations: Bringing Context into the Design and Evaluation of Usable and Safe Health Information Technologies. IMIA Yearbook 2013: Evidence-based Health Informatics 8, 78-85 (2013).
194 Pickering, B., Herasevich, V., Ahmed, A. & Gajic, O. Novel representation of clinical information in the ICU: developing user interfaces which reduce information overload. Appl Clin Inform 1, 116-131 (2010).
195 Moorman, J. R. et al. in Engineering in Medicine and Biology Society, EMBC, 2011
Annual International Conference of the IEEE. 5515-5518 (IEEE).
196 Tuman, K. J., Carroll, G. C. & Ivankovich, A. D. Pitfalls in interpretation of pulmonary artery catheter data. Journal of cardiothoracic anesthesia 3, 625-641 (1989).
197 Andrews, F. J. & Nolan, J. P. Critical care in the emergency department: monitoring the critically ill patient. Emergency Medicine Journal : EMJ 23, 561-564, doi:10.1136/emj.2005.029926 (2006).
198 Freeman, J. M. Beware: The Misuse of Technology and the Law of Unintended Consequences. Neurotherapeutics 4, 549-554, doi:http://dx.doi.org/10.1016/j.nurt.2007.04.003 (2007).
199 Drews, F. A. Patient Monitors in Critical Care: Lessons for Improvement; Advances in
Patient Safety: New Directions and Alternative Approaches (Vol. 3: Performance and
Tools). (2008).
200 Reader, T., Cuthbertson, B. & Decruyenaere, J. Burnout in the ICU: potential consequences for staff and patient well-being. Intensive Care Med 34, doi:10.1007/s00134-007-0908-4 (2008).
201 Simone, L. K. Software-Related Recalls: An Analysis of Records. Biomedical
Instrumentation & Technology 47, 514-522 (2013).
Appendix A: Systematic Search Strategies
May 2014
Search Strategy (Medline, Embase, Web of Science, PsycINFO):
1 exp Critical Care/ (43531)
2 exp Intensive Care Units/ (55611)
3 ((critical or intensive) adj2 care).mp. (130846)
4 (PICU or NICU or ICU).mp. (32491)
5 ((patient* or ambulator* or "body temperature*" or electrocardiograph* or ekg or ecg or "electric cardiogram*" or electrocardiogram* or brain* or cerebral*) adj2 monitor*).mp. (39649)
6 ("life support*" or CPR or resuscitat* or reanimat* or capnogra* or capnometry or neuromonitor* or telemonitor*).mp. (72347)
7 ((airway* or breath*) adj2 (control* or manage* or regulat*)).mp. (9010)
8 (high adj2 frequenc* adj2 ventilat*).mp. (3582)
9 ((invasive or noninvasive or "non invasive") adj2 ventilat*).mp. (3469)
10 (pressure* adj2 (breath* or respirat* or ventilat*)).mp. (23265)
11 (respirat* adj2 (control* or regulat*)).mp. (6807)
12 (therapeutic* adj2 hyperventilat*).mp. (10)
13 (ventilat* adj2 support).mp. (5872)
14 or/1-13 (280140)
15 (data* adj2 display*).mp. (7541)
16 (physiologic* adj2 monitor*).mp. (45714)
17 (graph* adj2 display*).mp. (1509)
18 (data* adj2 interface*).mp. (586)
19 (monitor* adj2 (system* or platform*)).mp. (12029)
20 (software* adj2 (system* or platform*)).mp. (2698)
21 ((multimodal* or multi-modal*) adj2 monitor*).mp. (259)
22 (continuous adj2 monitor*).mp. (9941)
23 (computer* adj2 (design* or graphic*)).mp. (25269)
24 (software* adj2 design*).mp. (6419)
25 informatic*.mp. (17315)
26 (data* adj2 acquisition*).mp. (8040)
27 (integrat* adj2 (display* or monitor* or platform* or software*)).mp. (1802)
28 or/15-27 (129019)
29 evaluation studies as topic/ or device approval/ or diagnostic test approval/ or feasibility studies/ or pilot projects/ or program evaluation/ or validation studies as topic/ (284249)
30 decision support techniques/ or data interpretation, statistical/ (60230)
31 Decision Support Systems, Clinical/ (4743)
32 Technology Assessment, Biomedical/ (8069)
33 "outcome assessment (health care)"/ or patient outcome assessment/ or watchful waiting/ or "process assessment (health care)"/ (52518)
34 quality assurance, health care/ or total quality management/ or quality improvement/ (64251)
35 user-computer interface/ (27480)
36 Adaptation, Psychological/ (73905)
37 human engineering/ or man-machine systems/ or "task performance and analysis"/ or "time and motion studies"/ or work simplification/ (36767)
38 Consumer Satisfaction/ (17443)
39 human factor*.mp. (4523)
40 (adapt* or adjust* or analys* or assess* or "clinical prediction*" or coping or "critical incident techni*" or critique* or "decision aid*" or "decision* model*" or "decision* support model*" or "decision* support system*" or "decision* support techni*" or "device approv*" or "diagnostic test* approv*" or effective* or ergonomic* or evaluat* or "feasibility stud*" or "human engineer*" or interpret* or "man-machine system*" or outcome* or "pilot project*" or "pilot stud*" or preference* or "pre-post test*" or "process* measure*" or "program* sustainab*" or "psychology engineer*" or "quality assurance" or "quality improve*" or "quality manage*" or satisf* or "task* performance*" or "technology assess*" or "time and motion stud*" or "time stud*" or "user-computer interface*" or "validation stud*" or "virtual system*" or "watchful wait*" or "work simplif*").mp. (7676889)
41 or/29-40 (7678638)
42 14 and 28 and 41 (10389)
43 limit 42 to yr="2000 -Current" (6209)
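The numbered Ovid lines above combine earlier result sets with boolean operators: "or/1-13" unions sets 1 through 13 into set 14, and line 42 intersects the three concept blocks (sets 14, 28 and 41). A minimal sketch of that set algebra, using invented record IDs rather than real database accession numbers:

```python
# Each set stands in for one numbered Ovid result set; the record IDs
# below are hypothetical placeholders, not real search results.
critical_care = {"rec1", "rec2", "rec3"}   # like set 14 (or/1-13): ICU concepts
data_display  = {"rec2", "rec3", "rec4"}   # like set 28 (or/15-27): display concepts
evaluation    = {"rec3", "rec4", "rec5"}   # like set 41 (or/29-40): evaluation concepts

# "42  14 and 28 and 41" keeps only records matching all three blocks.
combined = critical_care & data_display & evaluation
print(sorted(combined))   # ['rec3']
```

Only records appearing in every concept block survive the final intersection, which is why set 42 (10,389 records) is far smaller than any of its inputs.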
January 2018, Web of Science
Set    Results
# 38 1,794 (#35 OR #28) AND LANGUAGE: (English) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=2014-2018
# 37 5,283 (#35 OR #28) AND LANGUAGE: (English) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 36 5,495 #35 OR #28 Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 35 171 #34 AND #7 Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 34 29,939 #33 OR #32 OR #31 OR #30 OR #29 Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 33 14,704 TS=(human* NEAR/2 computer* NEAR/2 (interface* or interact*)) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 32 737 TS="man-machine system*" Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 31 12,259 TS=(graph* NEAR/1 user* NEAR/1 interface*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 30 2,300 TS=(ecological NEAR/2 (display* or interface* or monitor*)) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 29 333 TS="user-computer interface*" Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 28 5,395 #27 AND #23 AND #7 Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 27 18,330,350 #26 OR #25 OR #24 Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 26 2,601,342 TS=simulat* Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 25 16,751,072 TS=(adapt* or adjust* or analys* or assess* or "clinical prediction*" or "critical incident techni*" or critique* or "decision aid*" or "decision* model*" or "decision* support model*" or "decision* support system*" or "decision* support techni*" or effective* or ergonomic* or evaluat* or "human engineer*" or interpret* or outcome* or preference* or "process* measure*" or "psychology engineer*" or satisf* or "task* performance*" or "technology assess*" or "time and motion stud*" or "time stud*" or "watchful wait*" or "work simplif*") Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 24 662,501 TS=human factor* Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 23 406,575 #22 OR #21 OR #20 OR #19 OR #18 OR #17 OR #16 OR #15 OR #14 OR #13 OR #12 OR #11 OR #10 OR #9 OR #8 Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 22 83,732 TS=(information NEAR/2 technolog*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 21 15,777 TS=((data* or information*) NEAR/2 visualization) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 20 21,211 TS=(integrat* NEAR/2 (display* or monitor* or platform* or software*)) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 19 54,416 TS=(data* NEAR/2 acquisition*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 18 20,236 TS=informatic* Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 17 27,330 TS=(software* NEAR/2 design*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 16 43,047 TS=(computer* NEAR/2 (design* or graphic*)) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 15 25,505 TS=(continuous NEAR/2 monitor*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 14 606 TS=((multimodal* or multi-modal*) NEAR/2 monitor*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 13 48,254 TS=(software* NEAR/2 (system* or platform*)) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 12 76,107 TS=(monitor* NEAR/2 (system* or platform*)) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 11 7,252 TS=(data* NEAR/2 interface*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 10 6,151 TS=(graph* NEAR/2 display*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 9 3,764 TS=(physiologic* NEAR/2 monitor*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 8 7,941 TS=(data* NEAR/2 display*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 7 209,300 #6 OR #5 OR #4 OR #3 OR #2 OR #1 Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 6 1,749 TS=(med* NEAR/2 surg* NEAR/2 unit*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 5 16,384 TS=(pressure* NEAR/2 (breath* or respirat* or ventilat*)) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 4 1,384 TS=neuromonitor* Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 3 36,275 TS=((patient* or brain* or cerebral*) NEAR/2 monitor*) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 2 51,821 TS=(PICU or NICU or ICU) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
# 1 143,145 TS=((critical or intensive) NEAR/2 care) Indexes=SCI-EXPANDED, SSCI, CPCI-S, CPCI-SSH Timespan=All years
Appendix B: Qualitative Study Assessment Tools
Table 23. Bowling’s completeness checklist for healthcare research
Point Description
1 Clearly stated aims & objectives
2 Study design adequately described
3 Appropriate use of instruments (reliability, validity)
4 Adequate description of source of sample, inclusion/exclusion criteria, response rates
5 Statistical power discussed
6 Ethical Considerations discussed
7 Study piloted
8 Appropriate analyses (statistical or qualitative)
9 Results clear, adequately supported
10 Discussion relates results to study question and relevant literature
11 Limitations of research and design discussed
12 Implications discussed
Study completeness of qualitative studies; checklist items from Bowling (2002)65
ENTREQ Statement, modifications are in bold
No Item Description
1 Aim In critical care, how and why do clinicians use clinical data and data integration technologies for intensive care monitoring?
new Rationale for review
Technology in intensive care has historically been given little attention, but it is tightly coupled with the delivery of care by clinicians. Understanding how human factors are shaped by existing technologies is necessary to design better technologies for the future.
2 Synthesis methodology
Meta-ethnography
3 Approach to searching
The search was iterative; after the systematic review, relevant studies were divided into qualitative and quantitative parts according to Chaudhry's definition.
4 Inclusion criteria
All studies focused on ICU clinician participants with a data integration technology; English and French; January 2000 to May 2014; original peer-reviewed research publications.
5 Data Sources Professional medical librarian ran a systematic search on Ovid, Embase, CCRCT, Web of Science, and PsycINFO databases. Additional hand search was performed on references from the final selection of papers and related systematic reviews.
6 Electronic Search strategy
Provided by CN
7 Study screening methods
Title and abstract screening by YL. Full text screening independently by YL and PT.
8 Study characteristics
Evidence table provided
9 Study selection results
After removing duplicates, 6111 articles were screened, see PRISMA formatted flowchart
10 Rationale for appraisal
To set minimum criteria for qualitative studies
11 Appraisal items Bowling’s 12-item checklist for health information technologies was used to assess study completeness.
12 Appraisal process
Appraisal for completeness was conducted by two independent reviewers (YL and LK)
13 Appraisal results
All studies were retained; each provided at least 9 of the 12 necessary items.
14 Data extraction Background section was screened for mention of the theoretical basis for the study. Methods section was screened for methods used. First order constructs and direct quotations were extracted from the results section. Second order constructs were extracted from discussion and conclusion. Two independent reviewers read and reread the final studies and extracted second order constructs. A list of constructs was finalized through discussion and consensus was reached where labels were different.
15 Software NVivo 8 was used to code first and second order constructs from the data.
16 Number of reviewers
Four reviewers.
17 Coding First and second order constructs were searched line by line in the pertinent sections.
18 Study comparison
Concepts were pooled and harmonized into a final list which was re-found within text and verified for consistency.
19 Derivation of themes
Third order constructs were derived inductively from the first and second order constructs observed in the studies.
20 Quotations First order constructs are the quotations (direct from the participants)
21 Synthesis output
HF qualitative studies that are useful to data integration technology designers and ICU technology managers should, whenever possible, query clinician users of a mature software implementation within the actual workplace. Clinician impressions of data integration technologies should relate tasks to technology functionalities, to understand whether the technology helps or hinders intensive care delivery to the patient. Qualitative studies should not shy away from more detailed analysis of the technology as it relates to clinician use. Studies that do not describe the functions and interface features of the software are too general to inform the design of technologies that are already converging to provide similar integration and functionality.
Figure 20. Proportion of qualitative and quantitative studies from 2001 to 2018.
Appendix C: Quantitative Study Assessment Tools
Study completeness of quantitative studies (Peute 2013)73; added criteria are highlighted in
yellow and bold, removed criteria are in red
Heading Sub-Heading Item
Introduction
Keywords
Type or functionality of the system
UCD phases
scientific domain
methods applied
usability as mesh term
Essential information
Conclusion or recommendations previous HF/usability studies
purpose and reason for study
scientific aims
potential health implications and ethical principles
Background information
If HF/usability study is an integrated part in HIT development
Support for HF/usability activities within organization
system design/development team
UCD phases that are covered
system design principles or existing standards used specifications/goals/requirements depending on UCD phases
If the study is scientifically oriented
User interface design principles applied or methods evaluated
theories underlying the interface design principles or methods evaluated
System type or its part/functionality
Version
release date
graphical view
system behavior view
the setting
the user tasks to be supported
main system functionalities
the ICT architecture
number of users
overview actual/intended users’ profile
if the system is in use
the context of the system
user characteristics
organizational and physical environment and equipment
Method
Method section
Applied method(s)
suitability of each method
number and expertise background of study evaluators
description of study variables
outcome measures and quality metrics
if study used test scenarios or tasks
if scenarios developed based on Delphi or expert consensus*
if study participants are (representative) end users
IRB or REB approval
Background study participants
Age
gender
linguistic and culture background
level of education
professional competence
potential disabilities
level of experience using IT
level of experience with similar system
Generalizability and reproducibility of the study
Setting of the study
study period and evaluation time
instructions provided to participants and the recruitment
resources required and their availability
Results
Result section
If HF/usability methods have been applied
results are reported on per method
unexpected events encountered
unexpected results uncovered
If the study reports on usability problems
Presentation of results should rely on classification scheme
usability problems rated for their severity
usability problems rated for their potential impact on patient safety.
Discussion
Discussion section
Intended purpose of the study is achieved
limitations of the study
contribution of the study to the UCD process
added value of method applied
knowledge/evidence gained in terms of HF/usability principles
added value of this paper
Modified QUality ASessment Informatics Instrument (QUASII) score from Weir et al. 200974,
on a scale of 1 to 5. Modifications are in red and bold
Q ID #
1 What is your estimate of the overall degree of research quality? (used for validation purposes)
2 To what degree does the manipulation and/or measurement of the Independent Variable(s) reflect the underlying construct that was proposed by the authors?
3 To what degree was the technology implementation sufficient?
4 To what degree were the dependent variable(s) valid and clinically significant? (Is the selected DV appropriate? Is the impact of the technology on the DV large enough to justify changing clinical practice?)
5 To what degree was the proposed relationship between the independent variable and the dependent variable specified in terms of mediators and moderators? Was there evidence of selection bias in terms of measuring the types of effects (e.g. choosing only those outcomes thought to be favorable)?
6 To what degree do deficiencies in design impact the conclusions? (pre/post designs getting the worst scores)
7 To what degree do differences in the type of clinicians between study groups impact the conclusions?
8 To what degree do differences in the technology implementation between study groups impact the conclusions?
9 To what degree do differences in the way that groups were treated during the study period impact the conclusions?
10 To what degree do differences in the way that measurements were taken during the study period impact the conclusions?
11 To what degree did the measurement of the dependent variables impact the conclusions? (reliability, validity, floor and ceiling effects)
12 To what degree did inappropriate unit of analysis impact the conclusions? (e.g. using a patient level analysis when provider behavior was the target)
13 To what degree did the way that confounders were included in the statistical analysis impact the conclusions?
14 To what degree did possible problems with missing data impact the conclusions?
15 To what degree did the type of statistical analysis done impact the conclusions?
16 To what degree did “fishing” or conducting multiple tests impact the conclusions?
17 To what degree are the study results generalizable?
18 Do the conclusions match the results reported?
SUM Max 90
Appendix D: Cognitive Load Assessment Tool and Statistical Analysis
Rating scale end points and descriptions, for each of the six dimensions of cognitive load,
assessed using the NASA-TLX instrument.144
Mental demand (Low/High)
Definition: How much mental and perceptual activity was required (e.g., thinking, deciding,
calculating, remembering, looking, searching, etc.)? Was the task easy or demanding, simple or
complex, exacting or forgiving?
Physical demand (Low/High)
Definition: How much physical activity was required (e.g., pushing, pulling, turning, controlling,
activating, etc.)? Was the task easy or demanding, slow or brisk, slack or strenuous, restful or
laborious?
Temporal demand (Low/High)
Definition: How much time pressure did you feel due to the rate or pace at which the tasks or
task elements occurred? Was the pace slow and leisurely or rapid and frantic?
Performance (Good/Poor)
Definition: How successful do you think you were in accomplishing the goals of the task set by
the experimenter (or yourself)? How satisfied were you with your performance in accomplishing
these goals?
Effort (Low/High)
Definition: How hard did you have to work (mentally and physically) to accomplish your level
of performance?
Frustration level (Low/High)
Definition: How insecure, discouraged, irritated, stressed and annoyed versus secure, gratified, content, relaxed and complacent did you feel during the task?
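Overall cognitive load under the unweighted "Raw TLX" convention is simply the mean of the six subscale ratings (the weighted variant would add 15 pairwise importance comparisons). A minimal sketch with hypothetical ratings on a 0-20 scale; these values are illustrative only, not study data:

```python
from statistics import mean

# Hypothetical ratings for the six NASA-TLX dimensions defined above
# (illustrative values only; not data from this study).
ratings = {
    "mental": 14, "physical": 3, "temporal": 11,
    "performance": 6, "effort": 12, "frustration": 8,
}

# Raw TLX ("RTLX"): the unweighted mean of the six subscale ratings.
raw_tlx = mean(ratings.values())
print(f"Raw TLX = {raw_tlx:.1f}")   # Raw TLX = 9.0
```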
Descriptive statistics and non-parametric tests, R-output
> round(stat.desc(CogLoadFacet$Score, basic=FALSE, norm=TRUE), digit=3)
      median        mean     SE.mean CI.mean.0.95         var     std.dev
       7.000       7.472       0.124        0.244      22.295       4.722
    coef.var    skewness    skew.2SE    kurtosis    kurt.2SE
       0.632       0.499       3.878      -0.423      -1.646
  normtest.W  normtest.p
       0.953       0.000

Levene's tests for homogeneity (six TLX dimensions, no control for display type)

> CogLoadFacet_ss_mental <- subset(CogLoadFacet, TLX.Dimension=="Mental")
> leveneTest(CogLoadFacet_ss_mental$Score, CogLoadFacet_ss_mental$Display, center = "mean")
Levene's Test for Homogeneity of Variance (center = "mean")
       Df F value Pr(>F)
group   3  0.5625 0.6402
      237

> CogLoadFacet_ss_physical <- subset(CogLoadFacet, TLX.Dimension=="Physical")
> leveneTest(CogLoadFacet_ss_physical$Score, CogLoadFacet_ss_physical$Display, center = "mean")
Levene's Test for Homogeneity of Variance (center = "mean")
       Df F value Pr(>F)
group   3  1.8099  0.146
      237

> CogLoadFacet_ss_temporal <- subset(CogLoadFacet, TLX.Dimension=="Temporal")
> leveneTest(CogLoadFacet_ss_temporal$Score, CogLoadFacet_ss_temporal$Display, center = "mean")
Levene's Test for Homogeneity of Variance (center = "mean")
       Df F value Pr(>F)
group   3  0.2967 0.8278
      237

> CogLoadFacet_ss_performance <- subset(CogLoadFacet, TLX.Dimension=="Performance")
> leveneTest(CogLoadFacet_ss_performance$Score, CogLoadFacet_ss_performance$Display, center = "mean")
Levene's Test for Homogeneity of Variance (center = "mean")
       Df F value   Pr(>F)
group   3   5.402 0.001301 **
      237
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> CogLoadFacet_ss_effort <- subset(CogLoadFacet, TLX.Dimension=="Effort")
> leveneTest(CogLoadFacet_ss_effort$Score, CogLoadFacet_ss_effort$Display, center = "mean")
Levene's Test for Homogeneity of Variance (center = "mean")
       Df F value    Pr(>F)
group   3  5.6704 0.0009113 ***
      237
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> CogLoadFacet_ss_frustration <- subset(CogLoadFacet, TLX.Dimension=="Frustration")
> leveneTest(CogLoadFacet_ss_frustration$Score, CogLoadFacet_ss_frustration$Display, center = "mean")
Levene's Test for Homogeneity of Variance (center = "mean")
       Df F value   Pr(>F)
group   3  5.0193 0.002162 **
      237
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
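The leveneTest calls above come from R's car package. For readers who want to check the mechanics, the statistic with center = "mean" is a one-way ANOVA F computed on absolute deviations of each score from its group mean. A pure-Python sketch (illustrative only, not the study code, with made-up score vectors):

```python
from statistics import mean

def levene_w(groups):
    """Levene's W with center='mean': one-way ANOVA F statistic computed
    on absolute deviations of each score from its group mean."""
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    k = len(z)                                  # number of groups
    n = sum(len(g) for g in z)                  # total observations
    grand = mean([v for g in z for v in g])
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in z for v in g) / (n - k)
    return between / within

# Two hypothetical display-condition score vectors (illustrative only).
print(round(levene_w([[1, 3, 2], [2, 2, 2]]), 3))   # 4.0
```

With k groups and N observations, W is referenced to an F distribution with (k-1, N-k) degrees of freedom, matching the Df columns (3 and 237) in the R output above.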
Mental Demand, descriptive statistics and Levene's test for homogeneity of variance [ns]

> by(CogLoadFacet_ss_mental$Score, CogLoadFacet_ss_mental$Display, stat.desc, desc = FALSE, basic=FALSE, norm=TRUE)
CogLoadFacet_ss_mental$Display: Econtrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
-0.02089611 -0.04090707 -1.02476778 -1.01339524 0.96521975 0.01735854
------------------------------------------------------------
CogLoadFacet_ss_mental$Display: NewVis
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.2265491 0.3755930 -0.9362997 -0.7870173 0.9645258 0.0662923
------------------------------------------------------------
CogLoadFacet_ss_mental$Display: Pcontrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
-0.63405614 -0.69590814 0.07693079 0.04338976 0.94984901 0.23001658
------------------------------------------------------------
CogLoadFacet_ss_mental$Display: TabBar
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.25210380 0.41795981 -0.85340908 -0.71734264 0.95792651 0.03055001

> leveneTest(CogLoadFacet_ss_mental$Score, CogLoadFacet_ss_mental$Display)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   3  0.6494  0.584
      237
Physical Demand, descriptive statistics and Levene's test for homogeneity of variance [ns]

> by(CogLoadFacet_ss_physical$Score, CogLoadFacet_ss_physical$Display, stat.desc, desc = FALSE, basic=FALSE, norm=TRUE)
CogLoadFacet_ss_physical$Display: Pcontrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.982868694 1.078747259 0.533354654 0.300817553 0.860856638 0.002327122
------------------------------------------------------------
CogLoadFacet_ss_physical$Display: Econtrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
1.476380e+00 2.890220e+00 1.368671e+00 1.353482e+00 7.691583e-01 1.655871e-10
------------------------------------------------------------
CogLoadFacet_ss_physical$Display: NewVis
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
2.561539e+00 4.246745e+00 7.707489e+00 6.478617e+00 5.667602e-01 2.368908e-12
------------------------------------------------------------
CogLoadFacet_ss_physical$Display: TabBar
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
2.793910e+00 4.631989e+00 7.840875e+00 6.590736e+00 5.347042e-01 7.886747e-13

> leveneTest(CogLoadFacet_ss_physical$Score, CogLoadFacet_ss_physical$Display)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value  Pr(>F)
group   3  2.3583 0.07234 .
      237
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Temporal Demand, descriptive statistics and Levene's test for homogeneity of variance [ns]

> by(CogLoadFacet_ss_temporal$Score, CogLoadFacet_ss_temporal$Display, stat.desc, desc = FALSE, basic=FALSE, norm=TRUE)
CogLoadFacet_ss_temporal$Display: Econtrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.004116476 0.008058579 -0.961427374 -0.950757771 0.972884617 0.059035916
------------------------------------------------------------
CogLoadFacet_ss_temporal$Display: NewVis
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.38121570 0.63201286 -0.66807270 -0.56155605 0.95870563 0.03344828
------------------------------------------------------------
CogLoadFacet_ss_temporal$Display: Pcontrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
-0.4054866 -0.4450417 -0.2156405 -0.1216235 0.9398394 0.1331283
------------------------------------------------------------
CogLoadFacet_ss_temporal$Display: TabBar
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.17425317 0.28889220 -0.81562778 -0.68558514 0.96795705 0.09955443

> leveneTest(CogLoadFacet_ss_temporal$Score, CogLoadFacet_ss_temporal$Display)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   3   0.298 0.8268
      237
Performance, descriptive statistics and Levene's test for homogeneity of variance [passes]

> by(CogLoadFacet_ss_performance$Score, CogLoadFacet_ss_performance$Display, stat.desc, desc = FALSE, basic=FALSE, norm=TRUE)
CogLoadFacet_ss_performance$Display: Econtrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.198265016735 0.388131531094 -1.496233749868 -1.479629042344 0.891857842249 0.000001993633
------------------------------------------------------------
CogLoadFacet_ss_performance$Display: NewVis
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.25171185975 0.41731001779 -1.47832372804 -1.24262170079 0.87625479567 0.00001307379
------------------------------------------------------------
CogLoadFacet_ss_performance$Display: Pcontrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
-0.2636540 -0.2893734 -0.6881362 -0.3881160 0.9703002 0.6311245
------------------------------------------------------------
CogLoadFacet_ss_performance$Display: TabBar
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.41974830587 0.69589558936 -1.28092105866 -1.07669265825 0.88574446332 0.00002762905

> leveneTest(CogLoadFacet_ss_performance$Score, CogLoadFacet_ss_performance$Display)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value   Pr(>F)
group   3   4.062 0.007705 **
      237
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Effort, descriptive statistics and Levene's test for homogeneity of variance [passes]

> by(CogLoadFacet_ss_effort$Score, CogLoadFacet_ss_effort$Display, stat.desc, desc = FALSE, basic=FALSE, norm=TRUE)
CogLoadFacet_ss_effort$Display: Econtrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.32262690 0.63158734 -0.83074067 -0.82152139 0.96336239 0.01299651
------------------------------------------------------------
CogLoadFacet_ss_effort$Display: NewVis
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.31666046 0.52498751 -0.74103875 -0.62288849 0.96084054 0.04293255
------------------------------------------------------------
CogLoadFacet_ss_effort$Display: Pcontrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.555386997 0.609564842 -0.002967817 -0.001673879 0.931481597 0.084054361
------------------------------------------------------------
CogLoadFacet_ss_effort$Display: TabBar
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.259625557 0.430430040 -1.181234720 -0.992900180 0.933153083 0.002029508

> leveneTest(CogLoadFacet_ss_effort$Score, CogLoadFacet_ss_effort$Display)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value  Pr(>F)
group   3  5.4949 0.00115 **
      237
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Frustration, descriptive statistics and Levene's test for homogeneity of variance [passes]

> by(CogLoadFacet_ss_frustration$Score, CogLoadFacet_ss_frustration$Display, stat.desc, desc = FALSE, basic=FALSE, norm=TRUE)
CogLoadFacet_ss_frustration$Display: Econtrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.87360904249 1.71021202239 0.21790589946 0.21548765183 0.91572435791 0.00002428254
------------------------------------------------------------
CogLoadFacet_ss_frustration$Display: NewVis
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.520019146 0.862133391 -0.829964476 -0.697636011 0.934923231 0.002434858
------------------------------------------------------------
CogLoadFacet_ss_frustration$Display: Pcontrol
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.57720186 0.63350773 -0.49494182 -0.27915232 0.89899938 0.01487188
------------------------------------------------------------
CogLoadFacet_ss_frustration$Display: TabBar
skewness skew.2SE kurtosis kurt.2SE normtest.W normtest.p
0.2515352 0.4170171 -0.4979296 -0.4185403 0.9689750 0.1123145

> leveneTest(CogLoadFacet_ss_frustration$Score, CogLoadFacet_ss_frustration$Display)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value   Pr(>F)
group   3  4.3262 0.005427 **
      237
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Kruskal-Wallis test for mental demand, all combinations of displays

Multiple comparison test after Kruskal-Wallis
p.value: 0.05
Comparisons          obs.dif  critical.dif  difference
Econtrol-NewVis    17.464687      30.28302       FALSE
Econtrol-Pcontrol  87.375972      41.00247        TRUE
Econtrol-TabBar    22.012306      30.28302       FALSE
NewVis-Pcontrol   104.840659      42.87269        TRUE
NewVis-TabBar       4.547619      32.77083       FALSE
Pcontrol-TabBar   109.388278      42.87269        TRUE
Kruskal-Wallis test for physical demand
> kruskalmc(Score~Display, data=CogLoadFacet_ss_physical)
Multiple comparison test after Kruskal-Wallis
p.value: 0.05
Comparisons          obs.dif  critical.dif  difference
Econtrol-NewVis    29.134475      30.28302       FALSE
Econtrol-Pcontrol  17.729689      41.00247       FALSE
Econtrol-TabBar    24.753522      30.28302       FALSE
NewVis-Pcontrol    46.864164      42.87269        TRUE
NewVis-TabBar       4.380952      32.77083       FALSE
Pcontrol-TabBar    42.483211      42.87269       FALSE
Kruskal-Wallis test for temporal demand
> kruskalmc(Score~Display, data=CogLoadFacet_ss_temporal)
Multiple comparison test after Kruskal-Wallis
p.value: 0.05
Comparisons           obs.dif  critical.dif  difference
Econtrol-NewVis    12.5022294      30.28302       FALSE
Econtrol-Pcontrol 100.3515557      41.00247        TRUE
Econtrol-TabBar    12.1927055      30.28302       FALSE
NewVis-Pcontrol   112.8537851      42.87269        TRUE
NewVis-TabBar       0.3095238      32.77083       FALSE
Pcontrol-TabBar   112.5442613      42.87269        TRUE
Kruskal-Wallis test for performance
> kruskalmc(Score~Display, data=CogLoadFacet_ss_performance)
Multiple comparison test after Kruskal-Wallis
p.value: 0.05
Comparisons          obs.dif  critical.dif  difference
Econtrol-NewVis     0.5856965     30.28302       FALSE
Econtrol-Pcontrol  90.8692740     41.00247        TRUE
Econtrol-TabBar     3.7126806     30.28302       FALSE
NewVis-Pcontrol    90.2835775     42.87269        TRUE
NewVis-TabBar       3.1269841     32.77083       FALSE
Pcontrol-TabBar    87.1565934     42.87269        TRUE
Kruskal-Wallis test for effort
> kruskalmc(Score~Display, data=CogLoadFacet_ss_effort)
Multiple comparison test after Kruskal-Wallis
p.value: 0.05
Comparisons          obs.dif  critical.dif  difference
Econtrol-NewVis    39.932138      30.28302        TRUE
Econtrol-Pcontrol  10.277874      41.00247       FALSE
Econtrol-TabBar    35.487694      30.28302        TRUE
NewVis-Pcontrol    50.210012      42.87269        TRUE
NewVis-TabBar       4.444444      32.77083       FALSE
Pcontrol-TabBar    45.765568      42.87269        TRUE
Kruskal-Wallis test for frustration
> kruskalmc(Score~Display, data=CogLoadFacet_ss_frustration)
Multiple comparison test after Kruskal-Wallis
p.value: 0.05
Comparisons          obs.dif  critical.dif  difference
Econtrol-NewVis     9.8181737     30.28302       FALSE
Econtrol-Pcontrol   8.8636560     41.00247       FALSE
Econtrol-TabBar    18.7388086     30.28302       FALSE
NewVis-Pcontrol     0.9545177     42.87269       FALSE
NewVis-TabBar       8.9206349     32.77083       FALSE
Pcontrol-TabBar     9.8751526     42.87269       FALSE
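The kruskalmc output above (R package pgirmess) compares observed differences in mean ranks against a critical difference after a Kruskal-Wallis test. The underlying H statistic can be sketched in pure Python; this is illustrative only, not the study code, and it omits the tie-variance correction:

```python
from collections import defaultdict

def kruskal_h(groups):
    """Kruskal-Wallis H (no tie correction of the variance): rank the
    pooled scores with average ranks for ties, then compare rank sums."""
    pooled = sorted(x for g in groups for x in g)
    positions = defaultdict(list)
    for i, v in enumerate(pooled, start=1):
        positions[v].append(i)                  # 1-based rank positions
    rank = {v: sum(idx) / len(idx) for v, idx in positions.items()}
    n = len(pooled)
    return 12 / (n * (n + 1)) * sum(
        sum(rank[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)

# Two fully separated hypothetical groups (illustrative only).
print(round(kruskal_h([[1, 2], [3, 4]]), 3))   # 2.4
```

Large H values indicate that at least one display condition's scores tend to rank systematically higher or lower than the others; the pairwise comparisons then locate which conditions differ.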
Friedman ranked sum for all six dimensions, with paper control data removed (missing values; complete sets of data comparing Gorges 2011 and 2012 and Anders 2012; assumes Econtrol is similar for all three studies, and Tabular (Anders) and Bar (Gorges) are equivalent)

> friedman.test(as.matrix(completenewdf_mental))
        Friedman rank sum test
data: as.matrix(completenewdf_mental)
Friedman chi-squared = 58.482, df = 3, p-value = 1.24e-12

> friedman.test(as.matrix(completenewdf_physical))
        Friedman rank sum test
data: as.matrix(completenewdf_physical)
Friedman chi-squared = 124.03, df = 3, p-value < 2.2e-16

> friedman.test(as.matrix(completenewdf_temporal))
        Friedman rank sum test
data: as.matrix(completenewdf_temporal)
Friedman chi-squared = 70.548, df = 3, p-value = 3.258e-15

> friedman.test(as.matrix(completenewdf_performance))
        Friedman rank sum test
data: as.matrix(completenewdf_performance)
Friedman chi-squared = 68.967, df = 3, p-value = 7.104e-15

> friedman.test(as.matrix(completenewdf_effort))
        Friedman rank sum test
data: as.matrix(completenewdf_effort)
Friedman chi-squared = 63.683, df = 3, p-value = 9.594e-14

> friedman.test(as.matrix(completenewdf_frustration))
        Friedman rank sum test
data: as.matrix(completenewdf_frustration)
Friedman chi-squared = 77.064, df = 3, p-value < 2.2e-16
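R's friedman.test has a direct scipy analogue. A minimal sketch on invented toy ratings (six subjects rating four conditions, mirroring the Econtrol/NewVis/Tabular/Bar layout above; these are not the study data):

```python
from scipy.stats import friedmanchisquare

# Each list is one condition's ratings across the same six subjects
# (repeated measures). All values are invented toy data.
econtrol = [7, 8, 6, 7, 9, 8]
newvis   = [3, 2, 4, 3, 2, 3]
tabular  = [4, 5, 4, 6, 5, 4]
bar      = [5, 4, 5, 5, 6, 5]

# Friedman ranks each subject's four ratings, then tests whether the
# rank sums differ more than chance would allow (df = k - 1 = 3).
stat, p = friedmanchisquare(econtrol, newvis, tabular, bar)
print(f"Friedman chi-squared = {stat:.2f}, df = 3, p = {p:.4g}")
```

The output mirrors the R report format above: a chi-squared statistic with df = 3 and its p-value.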
PostHoc Friedman ANOVA, without comparing to Paper control

Mental Demand (1=Econtrol, 2=NewVis, 3=TabBar)
Multiple comparisons between groups after Friedman test
p.value: 0.05
Comparisons
    obs.dif critical.dif difference
1-2    76.0     38.23198       TRUE
1-3    94.5     38.23198       TRUE
1-4    87.5     38.23198       TRUE
2-3    18.5     38.23198      FALSE
2-4    11.5     38.23198      FALSE
3-4     7.0     38.23198      FALSE
PostHoc Friedman ANOVA

Physical Demand (1=Econtrol, 2=NewVis, 3=TabBar)
Multiple comparisons between groups after Friedman test
p.value: 0.05
Comparisons
    obs.dif critical.dif difference
1-2   102.0     38.23198       TRUE
1-3   121.5     38.23198       TRUE
1-4   114.5     38.23198       TRUE
2-3    19.5     38.23198      FALSE
2-4    12.5     38.23198      FALSE
3-4     7.0     38.23198      FALSE
PostHoc Friedman ANOVA

Temporal Demand (1=Econtrol, 2=NewVis, 3=TabBar)
Multiple comparisons between groups after Friedman test
p.value: 0.05
Comparisons
    obs.dif critical.dif difference
1-2    86.5     38.23198       TRUE
1-3   100.5     38.23198       TRUE
1-4    93.0     38.23198       TRUE
2-3    14.0     38.23198      FALSE
2-4     6.5     38.23198      FALSE
3-4     7.5     38.23198      FALSE
PostHoc Friedman ANOVA

Performance Demand (1=Econtrol, 2=NewVis, 3=TabBar)
Multiple comparisons between groups after Friedman test
p.value: 0.05
Comparisons
    obs.dif critical.dif difference
1-2     101     38.23198       TRUE
1-3      88     38.23198       TRUE
1-4      79     38.23198       TRUE
2-3      13     38.23198      FALSE
2-4      22     38.23198      FALSE
3-4       9     38.23198      FALSE
PostHoc Friedman ANOVA

Effort Demand (1=Econtrol, 2=NewVis, 3=TabBar)
Multiple comparisons between groups after Friedman test
p.value: 0.05
Comparisons
    obs.dif critical.dif difference
1-2    83.0     38.23198       TRUE
1-3    98.5     38.23198       TRUE
1-4    88.5     38.23198       TRUE
2-3    15.5     38.23198      FALSE
2-4     5.5     38.23198      FALSE
3-4    10.0     38.23198      FALSE
PostHoc Friedman ANOVA

Frustration Demand (1=Econtrol, 2=NewVis, 3=TabBar)
Multiple comparisons between groups after Friedman test
p.value: 0.05
Comparisons
    obs.dif critical.dif difference
1-2    84.0     38.23198       TRUE
1-3   103.5     38.23198       TRUE
1-4   104.5     38.23198       TRUE
2-3    19.5     38.23198      FALSE
2-4    20.5     38.23198      FALSE
3-4     1.0     38.23198      FALSE
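The TRUE/FALSE column in the post-hoc tables above reflects a simple rule: a pairwise rank-sum difference is significant when it exceeds the critical difference (38.23198 here). The sketch below illustrates that rule; the critical-difference formula shown is the standard Bonferroni-adjusted form for Friedman post-hoc comparisons and is only an assumption about how 38.23 was derived, and the rank sums are invented toy values, not the study data.

```python
import math
from itertools import combinations

from scipy.stats import norm

def friedman_critical_dif(n_subjects, k_groups, alpha=0.05):
    """Critical difference of rank sums for Friedman post-hoc pairs
    (standard Bonferroni-adjusted form; an assumption, not confirmed
    to be the exact computation used in the thesis)."""
    n_pairs = k_groups * (k_groups - 1) / 2
    z = norm.ppf(1 - alpha / (2 * n_pairs))  # Bonferroni-adjusted z
    return z * math.sqrt(n_subjects * k_groups * (k_groups + 1) / 6)

# Invented toy rank sums for four conditions (not the study data).
rank_sums = {"Econtrol": 160.0, "NewVis": 84.0, "Tabular": 98.5, "Bar": 91.5}
cd = friedman_critical_dif(n_subjects=46, k_groups=4)
for a, b in combinations(rank_sums, 2):
    obs = abs(rank_sums[a] - rank_sums[b])
    print(f"{a}-{b}: obs.dif = {obs:5.1f}, significant = {obs > cd}")
```

A pair is flagged significant (difference = TRUE) exactly when obs.dif > critical.dif, which is the comparison each row of the tables above encodes.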
Appendix E: Critical Decision Method Sample Questions
Critical Decision Making Probe Questions
Critical decision method probe questions, adapted from Baxter et al.137; see the second page of this appendix.
Incident probe questions:
1. Please think back to a situation where you had to make a critical decision regarding a patient. Can you describe the situation/patient state? What was the critical decision?
2. What specifically about this situation heightened your awareness?
3. Was the development of the situation expected? If not, what was different from your past experiences?
4. Can you describe the timeline of how soon this happened? Did this happen between x time and y time?
5. What specifically about the situation led you to investigate further?
6. Let me repeat my understanding of your timeline and the events leading to the critical decision; is this accurate?
7. How did you realize the situation was more critical?
Source probe questions:
1. What information did you use to make this decision?
2. Which technologies were you referencing? Were some more useful than others? Was the data reliable? If not, how did you use it?
3. In addition to yourself, who else provided information to help inform the situation?
4. Would you have preferred other additional sources?
5. Was this a situation in which you were trying to get every piece of information available to you? Which pieces of information were most important? Which were less important but available to you?
Key: AN=Anticipation, IC=Interprofessional and interteam communication, MA=Managing attention, MC=Managing complexity, MU=Managing uncertainty and risk, PD=Problem detection, SA=Self-awareness and self-management, SM=Sensemaking, TM=Technology management, TiM=Time management

       AN     IC     MA     MC     MU     PD     SA     SM     TM    TiM
AN   4325   1109   1077   1804   1075   1521      0    660   1260    773
IC   1109   4980    737      0    502    532   1653    423   1455    882
MA   1077    737   6326      0    438   1240      0    647    188    128
MC   1804      0      0  27650   1049      0      0    332      0   1382
MU   1075    502    438   1049   9746   1970   2098    881   1795    524
PD   1521    532   1240      0   1970  13136   2225   1484      0    742
SA      0   1653      0      0   2098   2225  22120    553   1352      0
SM    660    423    647    332    881   1484    553   1672    703    258
TM   1260   1455    188      0   1795      0   1352    703  14209    451
TiM   773    882    128   1382    524    742      0    258    451   6605

Figure 21. Physician common macrocognitive processes, based on normalized frequency of coded references, with upper 50% bold and underlined. Non-paired processes have a cell value of 0.
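The figure captions mark cells in the "upper 50%" of normalized co-occurrence frequencies. A minimal sketch of that thresholding on the physician matrix of Figure 21, assuming the cut-off is the median of the non-zero off-diagonal cells (the exact rule used in the thesis is not restated here, so this is an illustrative assumption):

```python
import numpy as np

# 10x10 physician co-occurrence matrix from Figure 21 (rows/columns in
# the same order: Anticipation ... Time management). The matrix is
# symmetric; non-paired processes are 0.
M = np.array([
    [4325, 1109, 1077, 1804, 1075, 1521,    0,  660, 1260,  773],
    [1109, 4980,  737,    0,  502,  532, 1653,  423, 1455,  882],
    [1077,  737, 6326,    0,  438, 1240,    0,  647,  188,  128],
    [1804,    0,    0,27650, 1049,    0,    0,  332,    0, 1382],
    [1075,  502,  438, 1049, 9746, 1970, 2098,  881, 1795,  524],
    [1521,  532, 1240,    0, 1970,13136, 2225, 1484,    0,  742],
    [   0, 1653,    0,    0, 2098, 2225,22120,  553, 1352,    0],
    [ 660,  423,  647,  332,  881, 1484,  553, 1672,  703,  258],
    [1260, 1455,  188,    0, 1795,    0, 1352,  703,14209,  451],
    [ 773,  882,  128, 1382,  524,  742,    0,  258,  451, 6605],
])

off = M[~np.eye(10, dtype=bool)]        # off-diagonal cells only
cutoff = np.median(off[off > 0])        # median of non-zero pairings
highlight = (M > cutoff) & ~np.eye(10, dtype=bool)
print(f"cut-off = {cutoff}, highlighted cells = {highlight.sum()}")
```

Because the matrix is symmetric, highlighted cells always come in mirror pairs, matching the symmetric bold/underline pattern in the original figures.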
Key: AN=Anticipation, IC=Interprofessional and interteam communication, MA=Managing attention, MC=Managing complexity, MU=Managing uncertainty and risk, PD=Problem detection, SA=Self-awareness and self-management, SM=Sensemaking, TM=Technology management, TiM=Time management

       AN     IC     MA     MC     MU     PD     SA     SM     TM    TiM
AN   4018    913   1975   1305   1950   2158   1650   1450    632   1227
IC    913   3791   1301    785   1438   1777   1488    915    488    221
MA   1975   1301  10344   3770    878   2556   2043   1317    671      0
MC   1305    785   3770  32389   6284   2661      0   1714    640   3480
MU   1950   1438    878   6284  23552   4337   1907   2653   1565      0
PD   2158   1777   2556   2661   4337  12628   2019   2218   1243   2252
SA   1650   1488   2043      0   1907   2019  24854   1040   1214      0
SM   1450    915   1317   1714   2653   2218   1040   3485    640    773
TM    632    488    671    640   1565   1243   1214    640   6910      0
TiM  1227    221      0   3480      0   2252      0    773      0  15215

Figure 22. Nurse common macrocognitive processes, based on normalized frequency of coded references, with upper 50% bold and underlined. Non-paired processes have a cell value of 0.
Key: AN=Anticipation, IC=Interprofessional and interteam communication, MA=Managing attention, MC=Managing complexity, MU=Managing uncertainty and risk, PD=Problem detection, SA=Self-awareness and self-management, SM=Sensemaking, TM=Technology management, TiM=Time management

       AN     IC     MA     MC     MU     PD     SA     SM     TM    TiM
AN   5094   1664    502   1031   1173   1106   1804    784   1386   1203
IC   1664   8459   1473   3637   3010   1242   3637    661   1939   1212
MA    502   1473   4074   2310    730    413   1925    185      0    513
MC   1031   3637   2310  30415   2622      0   4147    221      0      0
MU   1173   3010    730   2622   9548   2814   5244    965   2307   1748
PD   1106   1242    413      0   2814  12738   1484    890    725    495
SA   1804   3637   1925   4147   5244   1484  34562   1106   2028    922
SM    784    661    185    221    965    890   1106   1522    595    369
TM   1386   1939      0      0   2307    725   2028    595  15200    451
TiM  1203   1212    513      0   1748    495    922    369    451   4762

Figure 23. Respiratory therapist common macrocognitive processes, based on normalized frequency of coded references, with upper 50% bold and underlined. Non-paired processes have a cell value of 0.
Figure 24. Charts showing distribution of macrocognitive processes within specialties.
Figure 25. Charts showing distribution of technologies according to macrocognitive processes, within specialties.
Figure 26. Distribution of technological information sources, within each specialty.
Appendix F: Summary of Major and Minor Usability Problems
Table 24. Summary of major usability issues
Issue# Place of
Occurrence
Usability problem description Impact
1 Patient window
No zoom out function. Users are discouraged from exploring the software for fear of making mistakes
2 Patient window
Arrow button does not work when timeline is long
Limits long-term trend window to 2 weeks. Clinical diagnosis with comparison to patient state prior to 2 weeks ago, not possible.
3 Patient window
Shading unclear – at first glance it is unclear which lines correspond to which vital, e.g. too many target ranges for similar parameters such as systolic, mean and diastolic arterial blood pressures
Closely spaced trendlines with specific target ranges may be visually confusing, requiring the clinician to distinguish them by elimination or reasoning. This increases their cognitive load.
4 Patient window
Inconsistent use of the color green for shading
Green shading for target ranges used for sparklines, in the legend, and for heart rate trendline, in the main graph area.
5 Patient window
Inconsistent use of the color blue for hyperlinks
May frustrate user who may assume a hyperlink but experiences no feedback
6 Patient Window - legend
Small dots indicate drag and drop functionality, but are so small and unclear that users may not realize this function exists
This functionality may be ignored and restricts interaction with the DIVS
7 Patient window - legend
Flags as filters – unclear to RN user May remain a mystery and is superfluous information that users may need to consistently ignore
8 Patient window – graph
Cursor position on graph indicates current values in small fonts on the far left and date aligned with cursor "moving line" – chosen time point values are not easily viewed
Requires user to be focussed on the screen to discover the values but this does not fit with ICU setting
9 Patient window – graph y-axis
Small, and best-fit must be changed from a pull down menu; pull down menu options are limited
Values may not be viewable, limiting view of out-of-graph values.
10 Patient Window - graph
Choosing different axis by hovering over line and clicking on arrow that appears – actions are invisible to users
Function may be overlooked and users may go back to other, more responsive information systems.
11 Patient Window - Legend trendline
(sparkline) unclear what time span is showing; changing it does not indicate a new timeline or message. "Current Status" unclear: within the legend, current status may mean a numerical value or short-term trends, which are redundant with the main graph area
The many sparklines may be confusing or present their own version of data overload in the form of dense graphs within the legend.
12 Patient window – notes legend
No way to access notes quickly – need to scroll through manipulation of the timeline to find a note; the current workaround is to click "2weeks" and "jump to current"
Users must manipulate the timeline to search for physician notes. This restricts information retrieval.
13 Patient window - Pop-up
Calculate statistics – access to function unclear (e.g. select time points, pop-up for function, then choose based on 2, 3, 4 sigma); may not be necessary for all users; although accessible, not all users can save targets
Does not let users view long-term values which are out of range, something they currently do with the physiological monitor.
14 Patient window – more info
Pop-up for y-axis choice still visible in the "More info" window – this choice is not possible in the new window but does not disappear. The user must make a y-axis choice for the window to disappear, but this brings them back to the previous window
Forces user to find a work around to make these windows disappear.
15 Patient window – more info
Meaning of Priority unclear Users may ignore this information if not clinically relevant but may feel this is important.
16 Archived Patients selection
Clicking on the x for a vital sign is irreversible – no undo
Users will tend not to explore visualization capabilities.
17 FAQ Measures does not mean same thing to RN user – should be modified for clinician language (i.e. vitals)
May create confusion.
18 FAQ No word/topic search functionality Help function may be difficult to access
19 FAQ Back button inconsistent – sometimes found in FAQ, not found in patient window to get back to Census Overview
Users will tend not to explore navigation.
Table 25. Summary of minor usability issues
Issue# Place of
Occurrence
Usability problem description
1 Census Overview, Patient screen
Font size too small for vitals and values, based on laptop screen view (similar to ICU bedside PC) – does not accommodate clinicians with different levels of eyesight
2 Census Overview Ability to see all patients – conflict with current practice of restricting patients to RN load; different specialities require different
3 Census Overview No understanding of “First message” information – extraneous information for NB
4 Census Overview Archive and Current Patients – not sure which census is being presented
5 Census Overview Search window – unclear what can be searched; was not apparent since position was below user name
6 Census Overview “Program” in “SickKids ICU T3 Program” has no meaning to RN – extraneous information
7 Census Overview Spacing not optimized. Example: "DOB" occupies the same space as the "Name" column
8 Census Overview CCCU patients in PICU beds not marked as per the current Daily Paper Census convention (with heart symbol)
9 Census overview "Archived patients" list looks the same as "Current patients" list at first glance; no apparent color or shading difference until selecting archived patients – only then does the overall screen show the distinction
10 Patient Window Top of patient window – "jammed information" – becomes important with long names, which may not be fully visible
11 Patient Window - graph
Extra graphs not able to be removed; some users may want to see fewer graphs, with more trends on fewer graphs, to see details
12 Patient window Shading unclear – at first glance it is unclear which lines correspond to which vital
13 Patient window- Pop-up Notes window
No way to see Note information right away, need to scroll down to view note – RNs currently cannot create Notes so this function is useless and greyed out
14 Patient window Over two week period not possible – only use arrow to view datasets of 2 weeks
15 Patient window Missing points – no message to state why
16 Patient window Note shows greyed pencil and changes to colored pencil when post-it note is selected - inconsistent
17 Patient window Target origins not known or transparent – RN user would like to know if this is standard, physician or clinical practice guidelines
18 Overall Help or information not easily available and does not encourage self-learning or discovery
19 Patient Window – bottom right
Restore to default view not obvious
20 Patient Window - Legend
Default view to all vitals being captured
21 Patient Window - Legend
Current status is of larger font than trendline time, and font style is different from vitals and total mini-trendline (sparklines) times
22 Patient Window - graph
Cannot repeat vitals on different graphs
23 Patient window – more info
SI unclear
Six other issues related to the in-use function of the software, and two other cosmetic issues, had a severity of 1.
Appendix G: Usability Tasks, Checklist and Detailed Data
Table 26. List of usability tasks tested and representative questions posed to the participants. A checked box indicates a pass rate of less than 50%.
Task ID# Optimistic Conservative
Description of Tasks Manuscript
Description Tasks/Questions Asked to Participant
DR RN RT DR RN RT
1 Locating patient file 1. Locating patient file Find patient file.
2 Recalling maximum or minimum values for a specific variable 2. Identifying a value for a specific physiological variable What was the lowest etCO2 value recorded during cardiac arrest?
3 Estimating duration of event 3. Estimating duration of event by identifying two time points How long after chest closure did the cardiac arrest happen?
4 ✓ ✓ ✓ ✓ ✓ Time scale manipulation 4. Manipulating time scale How long has the patient been in the unit?
5 Comparing trends for two variables 5. Comparing trends for two specific parameters Did the blood pressure or oxygen saturation fall first?
6 ✓ Comparing different patient states 6. Comparing different patient physiological states Provide values for HR and SpO2 pre- and post-surgery. How are these signals different from the current signals?
7 ✓ ✓ ✓ ✓ Recalling range of values for specific variables, surrounding a specific event 7. Identifying values for two specific parameters at an event What was the range of vitals for blood pressure and saturation during this event [cardiac arrest]?
8 ✓ ✓ Recalling change of values for several variables, prior to a specific event. Finding notes 8. Identifying vital signs (group of parameters) prior to an event Identify and report the vitals prior to the post-surgery hypotension event.
Did these values change significantly after chest closure and prior to cardiac arrest?
Comment on the vitals you would use to indicate readiness for chest closure.
9 ✓ ✓ Selection of inactive variables. Time scale manipulation 9. Viewing trend of three redundant, overlapping parameters View and compare saturation data (SpO2, SpO2 r, SpO2 l) for entire length of stay.
10 ✓ Viewing infusion rates over time 10. Viewing infusion medication data The dose of dopamine was increased over time. Please go back to the time period when this occurred. By how much and over what time period was dopamine increased?
11 Comparing infusions with vitals 11. Comparing infusion medications with vital signs What was the impact of the dopamine infusion on 1 or 2 vitals of concern?
12 ✓ ✓ ✓ Viewing infusion data 12. Detecting change in infusion medication rate over time
Around what time was epinephrine stopped?
13 Viewing ventilator data 13. Viewing ventilator data
How did values of peak inspiratory airway pressure, mean airway pressure, and positive end expiratory pressure change during cardiac arrest?
14 ✓ Viewing laboratory data 14. Viewing laboratory data What was the trend of the hematocrit, glucose, and PaCO2?
15 ✓ ✓ ✓ ✓ ✓ ✓ Visual representation of target ranges
15. Viewing target ranges (semi-automated visual aid)
When were the targets set? When did the values go out of range for these set targets?
16 ✓ ✓ ✓ ✓ ✓ Sparklines 16. Sparkline (automatic trend line for one variable)
Is there a faster way to visualize the trend for the past 30 minutes? How does the 30-minute automatic trend compare to the trends shown in the main graph?
17 ✓ ✓ ✓ ✓ ✓ IDO2 indicator 17. IDO2 indicator (automatic computation using 16 parameters)
Does this indicator signal approximate duration of patient instability? Is this a meaningful indication of patient instability?
18 ✓ Finding notes 18. Finding notes A therapeutic intervention is required, please find this source of information and state the time of the intervention [inhaled nitric oxide].
Approximately when did the physician attempt chest closure?
For ECMO initiation: A previous physician wrote a note indicating a therapeutic intervention, please find this source of information.
19 ✓ ✓ Modifying the patient record by adding note 19. Modifying/adding note The plan is to keep the patient hypothermic for 48 hours. Can you add a note indicating the start of hypothermia at 33 degrees Celsius?
After observing the effect of the intervention, please use T3 to communicate your thoughts on this event.
20 Setting targets 20. Setting targets Please set targets for oxygen saturation.
Assuming you are now coming off shift, please set an appropriate target range for heart rate to help with monitoring.
Total below 50% cut-off point (Optimistic DR, RN, RT; Conservative DR, RN, RT): 3 4 5 9 9 8
Table 27. Usability tasks tested with pass rates as percentage and fraction of total users.
Columns: General Functions; Tasks Tested for Each Function; Pass Rate by Task and by Clinician Type – Physicians (max n=7), Nurses (max n=8), Respiratory Therapists (max n=7); Pass Rate by Task (max n=22); Usability Issue (Y/N)
Tracking: Orientation (4 tasks)
1. Locating patient file 100% (7/7) 100% (8/8) 100% (7/7) 100% (22/22) N
2. Identifying a value for a specific physiological variable 80% (4/5) 75% (3/4) 67% (4/6) 73% (11/15) N
3. Estimating duration of event by identifying two time points 60% (3/5) 100% (4/4) 100% (5/5) 86% (12/14) N
4. Manipulating time scale 43% (3/7) 0% (0/8) 14% (1/7) 18% (4/22) Y
Function Pass Rate by Clinician Type 71% 63% 68%
Trajectory: Relationships between Parameters (10 tasks)
5. Comparing trends for two specific parameters 60% (3/5) 57% (4/7) 67% (4/6) 61% (11/18) N
6. Comparing different patient physiological states 67% (4/6) 50% (4/8) 50% (3/6) 55% (11/20) N
7. Identifying values for two specific parameters at an event 40% (2/5) 43% (3/7) 20% (1/5) 35% (6/17) Y
8. Identifying vital signs (group of parameters) prior to an event 17% (1/6) 0% (0/8) 50% (3/6) 20% (4/20) Y
9. Viewing trend of three redundant, overlapping parameters 29% (2/7) 71% (5/7) 17% (1/6) 40% (8/20) Y
10. Viewing infusion medication data 83% (5/6) 29% (2/7) 100% (5/5) 67% (12/18) N
11. Comparing infusion medications with vital signs 86% (6/7) 83% (5/6) 57% (4/7) 75% (15/20) N
12. Detecting change in infusion medication rates over time 57% (4/7) 14% (1/7) 0% (0/6) 25% (5/20) Y
13. Viewing ventilator data 100% (6/6) 60% (3/5) 80% (4/5) 81% (13/16) N
14. Viewing laboratory data 50% (3/6) 75% (3/4) 67% (4/6) 63% (10/16) N
Function Pass Rate by Clinician Type 59% 45% 45%
Triggering: Automated Integration (3 tasks)
15. Viewing target ranges (semi-automated visual aid) 0% (0/5) 20% (1/5) 20% (1/5) 13% (2/15) Y
16. Sparkline (automatic trend line for one variable) 0% (0/5) 0% (0/5) 0% (0/4) 0% (0/14) Y
17. IDO2 indicator (automatic computation using 16 parameters) 20% (1/5) 0% (0/8) 0% (0/6) 5% (1/19) Y
Function Pass Rate by Clinician Type 7% 6% 7%
Other Functions (3 tasks)
18. Finding notes 43% (3/7) 88% (7/8) 57% (4/7) 64% (14/22) N
19. Modifying/adding note 29% (2/7) 29% (2/7) 50% (3/6) 35% (7/20) Y
20. Setting targets 86% (6/7) 86% (6/7) 100% (6/6) 90% (18/20) N
Function Pass Rate, by Clinician Type 52% 68% 68%
Number of Tasks with Usability Issues 9 9 8
All functions – Global Function Pass Rate for All Functions, by Clinician Type: 54% 47% 49%
Groups highlighted in blue were below the 50% cut-off level.
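The pass-rate bookkeeping in Table 27 can be sketched as a simple ratio plus a 50% cut-off flag; the (passes, attempts) counts below are copied from three representative rows of the table:

```python
# Per-task pass rates from (passes, attempts) counts, flagging tasks
# below the 50% cut-off used in Table 27. Counts taken from the table.
tasks = {
    "4. Manipulating time scale": (4, 22),
    "15. Viewing target ranges":  (2, 15),
    "20. Setting targets":        (18, 20),
}

for name, (passed, attempted) in tasks.items():
    rate = passed / attempted
    flag = "BELOW 50% cut-off" if rate < 0.5 else "ok"
    print(f"{name}: {rate:.0%} ({passed}/{attempted}) {flag}")
```

Applied to the counts above, tasks 4 and 15 fall below the cut-off while task 20 clears it, matching the table.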