final research report - cpoe (tsourdinis)


Upload: george-tsourdinis

Post on 08-Apr-2017


Running head: AUTOMATION BIAS AND SITUATIONAL AWARENESS IN PHYSICIAN

ORDER ENTRIES AFFECT PATIENT OUTCOME: IS THE PEN MIGHTIER THAN THE

KEYBOARD?

Automation Bias and Situational Awareness in Physician Order Entries Affect Patient Outcome: Is

the Pen Mightier than the Keyboard?

George Tsourdinis1 & Christopher Straus, MD2

1Biological Sciences Division, The University of Chicago, 5801 S Ellis Ave, Chicago, IL 60637

2Department of Radiology, The University of Chicago Medicine, 5841 S Maryland Ave., Chicago, IL 60637


Introduction

History and Research on CPOE/CDSS Systems

Aside from the stethoscope and white coat, the pen and prescription (℞) pad are the next

most recognizable tools in the doctor’s repertoire of weapons against human pathology. A blank

slate, the prescription pad possessed the power to order any prescription at the physician’s will. With

the advent of laboratory chemistry, blood workups, medical imaging, and the systematization of

hospitals as we know them today in the United States, the medical requisition sheet accompanied the

prescription pad in the doctor’s arsenal. Better known as a physician order entry (POE), the

requisition form allowed the physician to order any laboratory test, request any radiogram/scan,

make any referral, and order any procedures he or she thought necessary for the patient’s

betterment. As computer technology continued to advance into the 21st century, hospital systems

and software vendors seized the opportunity to wipe the slate clean and begin anew with the

computerized physician order entry (CPOE) system. CPOE is defined by the direct entry, and thus

authorization, of medical orders into computer software by a physician9. The motivation behind

CPOE was not just to make the famously inscrutable handwriting of a physician legible, but mainly

to introduce a system that would reduce the large number of medical errors that occur daily

and to increase efficiency and workflow in the medical environment. CPOE’s functionality arises not

in itself, but from an added component in the software called a clinical decision support system

(CDSS). A CDSS provides live feedback in the form of prompt windows to the physician about

potential errors as the physician enters his or her order. Errors can occur from incompatible drug

interactions, unnoticed preexisting allergies to an ordered medication, and aberrant weight- and

kidney-based medication dosing1. Pre-programmed “order sets” are installed into the CDSS and

CPOE systems and contain the algorithms that can make these predictions of error based on prior


extensive research demonstrating contradictory multi-drug interactions, proper dosages, and any

unfavorable patient reactions to medications. Combined, CPOE and CDSS have been

touted to set new standards of quality patient care, reduce costs, and increase workflow efficiency.

Naturally, to further achieve these goals of reduced error and increased quality of care, hospital

administrators and pharmaceutical companies have made great strides in lobbying for the

implementation of CPOE across all hospitals in the United States (US). Moreover, a consortium of

over 150 influential public and private healthcare benefit organizations called “the Leapfrog Group”

has incentivized the implementation and continued use of CPOE and CDSS systems by including

the presence of CPOE/CDSS as one of their criteria in granting hospitals A-F report card ratings

based on numerous safety factors2.

Prima facie, computer-based health systems technologies are appealing, but initial reports

advocating for CPOE’s benefits have not accounted for other aspects of the CPOE system, and

positive initial results should be interpreted cautiously. One such aspect is that as of 2009, fewer than

5% of hospitals in the US had fully implemented CPOE. While implementation numbers have

certainly increased since 2009, the Leapfrog Group predicts that only in twenty years’ time

will CPOE achieve maximum outreach and implementation within urban hospitals. Furthermore,

utilization is still low from a national perspective, with fewer than 50% of physicians entering at least

80% of their orders electronically1. Therefore, initial reports may not have been capturing the

complete sample size of potential CPOE utilization, thus inadvertently excluding negative events as

a result of CPOE.

Studies investigating the effects of CPOE/CDSS have arisen from an interest in evaluating

its efficacy, but many have reported mixed results. One way to evaluate CPOE efficacy is to measure

a broad category known as “patient outcome.” The variables that make up patient outcome

range from biologically adverse events to psychological state pre-/post-treatment and even to


whether malpractice lawsuits have been filed as a result of poor medical treatment. There are a few

robust ways to measure patient outcome:

•   Adverse drug events (ADEs) are defined as errors in drug use (or lack of

administering an intended medicine) that result in harm to the patient14.

•   A medication prescription error (MPE) is defined to be any error in the

prescribing of a medication regardless of outcome14.

•   Wrong-patient errors are detected by the “retract and reorder” method, which

identifies orders placed for a patient that were then rapidly discontinued by the same

physician. Next, the analysis searches for whether an identical order was placed for a

different patient by the same physician5.

•   Intercepted medication errors include errors that had the potential to cause great

harm to the patient but did not actually reach the patient due to an interception by

medical staff. Non-intercepted errors can result in preventable ADEs associated with

MPEs9.

•   A malpractice suit is a lawsuit issued forth by the patient and/or his/her family

members against the doctor for alleged mistreatment of the patient during his/her

time in the doctor’s care. The severity of malpractice suits is usually measured in

monetary losses of the defendant (i.e. the physician), imprisonment time (if

applicable), and number of total malpractice suits per physician career – regardless of

whether a settlement was pursued.
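The “retract and reorder” method described in the list above can be sketched in code. The following is only an illustrative sketch under stated assumptions: the `Order` record, its field names, and both ten-minute windows are hypothetical (the text specifies only that the first order is “rapidly discontinued”), and the discontinuation is assumed to have been made by the ordering physician.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Order:
    """One order-log row; all field names are hypothetical."""
    physician_id: str
    patient_id: str
    order_code: str
    placed_at: datetime
    # Time the ordering physician discontinued the order; None if never retracted.
    discontinued_at: Optional[datetime] = None


def find_retract_and_reorder(
    orders: List[Order],
    retract_window: timedelta = timedelta(minutes=10),  # assumed cutoff
    reorder_window: timedelta = timedelta(minutes=10),  # assumed cutoff
) -> List[Tuple[Order, Order]]:
    """Return (retracted, reordered) pairs that suggest a wrong-patient error."""
    flagged = []
    for retracted in orders:
        if retracted.discontinued_at is None:
            continue
        # Step 1: the order must have been discontinued rapidly.
        if retracted.discontinued_at - retracted.placed_at > retract_window:
            continue
        # Step 2: look for an identical order by the same physician,
        # placed for a different patient shortly after the retraction.
        for candidate in orders:
            delay = candidate.placed_at - retracted.discontinued_at
            if (
                candidate.physician_id == retracted.physician_id
                and candidate.order_code == retracted.order_code
                and candidate.patient_id != retracted.patient_id
                and timedelta(0) <= delay <= reorder_window
            ):
                flagged.append((retracted, candidate))
                break
    return flagged
```

Each flagged pair is a candidate wrong-patient error rather than a confirmed one; the window lengths would need to be chosen and justified before any real analysis.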

To add to the complexity, not all negative patient outcomes are even reported†. One study found

that direct observation of physicians throughout the day yielded an objective measurement of a

17.9% error rate, while the logged error rate in the medical records system by the same physicians was

0.9%. Dramatic discrepancies in reported errors hide vital information that would otherwise be

helpful to improve future patient outcomes.

†  Further, each outcome above can be classified as preventable, non-preventable, or potential (“potential” meaning likely to have occurred).

To highlight the benefits of CPOE, Shamliyan et al. conducted a meta-analysis of twelve

studies and discovered that all studies reportedly had a reduction in MPEs with CPOE

implementation (66% reduction in adult MPEs). Furthermore, they found that CPOE

implementation mediated the avoidance of 775 MPEs per 1,000 orders in one study9. Van Rosse and

colleagues came across similar findings, with uniform reporting of MPE reduction with CPOE

implementation across thirteen studies. However, their same meta-analysis found an elevated risk of

mortality with CPOE14. Another analysis by the Shamliyan group also found that even after

the implementation of CPOE, the rate of wrong-drug prescriptions did not decrease for some time9.

Another study uncovered a three-fold increase in the likelihood of wrong-patient errors among

emergency department physicians using CPOE, compared to those ordering through handwritten

means5. Additionally, CPOE’s counterpart, CDSS, has been shown to introduce error, too, despite

its complex algorithms for suggesting useful advice to the ordering doctor. One cited study reported

a 26% increase in the likelihood of an incorrect decision being made by a physician when the CDSS system

suggested an erroneous course of treatment;3 still, an experiment conducted by Goddard et al. found

that physician decision accuracy was improved in 13.1% of cases when CDSS was employed3. These

data are just a small sample of the heterogeneity surrounding research on CPOE/CDSS.

Mediators of Error in CPOE/CDSS Systems: Cognitive Biases

Evidently, there is still no consensus on whether CPOE/CDSS systems

benefit the medical system. Indeed, it is expected and observed that CPOE systems can be more

efficient than handwritten requisition forms and improve communication between physicians,

nurses, and pharmacists14, but the presence of benefits does not preclude the potential for CPOE to


introduce new errors that were not anticipated before implementation. With increasing demands on

physicians in the face of a primary care physician shortage and increasing patient influxes due to the

Affordable Care Act, doctors may be more hard-pressed to see more patients within shorter

amounts of time. External forces like the ACA and the constantly evolving landscape of medicine

could be mediating the effects of a likewise-recently introduced CPOE/CDSS system. The

expediency with which some physicians are forced to carry out their tasks, and the co-occurrence of

newly implemented CPOE/CDSS systems, could be giving rise to unwanted cognitive biases. The

human mind is especially subject to cognitive biases under new environmental stressors, and while

biases affect nearly every aspect of social interaction, they play an especially large, and sometimes

detrimental, role in communication between physicians and supporting medical staff. Out of the

plethora of biases that barrage the mind daily, automation bias stands out as the most relevant bias to

interact with CPOE/CDSS systems.

Automation Bias and Situational Awareness

Automation bias (AB) is formally known as the tendency to over-rely on automated

systems4, like the CDSS of CPOE. Given the opportunity via automated technology, AB not only

has the potential to alter the manner in which a physician conducts their work, but also to change the

types of error a physician is likely to make. In other words, AB is a very applicable bias to investigate

in the realm of CPOE effects on healthcare, since it can alleviate some forms of human error (e.g.

inefficiency) but introduce other errors, as mentioned above. AB has been heavily studied in the realm

of aviation, a field reliant on “autopilot” systems (as research is showing, perhaps too reliant). Over-reliance

on such automated interfaces in any context produces two distinct errors: commission error

(abiding by incorrect advice from the system) and omission error (not acting because one is not

prompted to act by the system)11. The general notion surrounding AB is that, when presented with


an automated system that is said to be trusted with completing/aiding with a task, a human operator

will tend to forego vigilance and information seeking, instead using the automation as a replacement

for the operator’s independent cognitive faculties. Preliminary studies have shown that CDSS

recommendations during use, for example, can reverse a physician’s correct order choice to an

incorrect one 8% of the time5. While still a relatively new idea, AB can have grave consequences for

the patient and could underlie medical errors experienced through CPOE. While numerically small,

the effect of 8% preventable incorrect answers could be fatal for the patients on the receiving end.

Because it is such a relatively new concept, AB does not have a validated, widespread

method of measurement at this time. Nonetheless, another related psychological measurement has

been well-developed and strongly defined since the 1980s, when Dr. Mica Endsley published her

model of situational awareness. Situational awareness (SA) is stratified into hierarchical ‘levels,’

specifically defined as a person’s perception of elements within a current situation (Level I SA),

that person’s comprehension of the meaning of a situation (Level II SA), and lastly their ability to

predict how elements in the situation will change in the future (Level III SA)10. Each step in SA level

represents a leap from concrete observations to abstract thoughts and predictions. Endsley’s model

for SA is much more understandable in its visual form, below in Figure 1.

Figure 1 – Model of situational awareness, as proposed by Endsley, M.; adapted from Singh et al.10

 


The Situation Awareness Global Assessment Technique (SAGAT) is the primary method by

which one measures SA. This tool is used to assess SA across all three levels of the Endsley model.

Experimental models for utilizing SAGAT always involve a simulation, whereby an experimental

subject is instructed to proceed routinely through a specific task in their field. Once the

experimenters believe the environmental stimuli have given the subject a chance to obtain enough

information to satisfy each SA level, the experimenters will “freeze” the simulation and issue

questions to the trainee to test their perception of the situation at that precise moment in time. The

simulation resumes and the next “freeze” occurs at a subsequent time point. A SAGAT score is

generated by the end of the simulation, which simply compares the subject’s self-reported SA with

the objective “reality” of the situation, per the lead investigator’s expert opinion6,10. The more

accurate the subject is, the higher their SA is, and thus the closer their subjective experience is to the

objective reality of the situation. Traditionally, medical teams in various departments utilize a simple

checklist that enumerates a sequential list of actions the performer must accomplish to properly

complete the task at hand. However, the SAGAT does what the traditional checklist cannot, and

that is to capture a snapshot of the subject’s decision-making processes in time, before errors are

resolved and before outcome is determined. While the checklist does measure the subject’s

knowledge of the task, the SAGAT measures actual SA. A significant positive correlation between

these two scoring systems is defined as a measure of “overall task expertise and management”6.
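To make the two scores concrete, the sketch below computes a SAGAT score and a checklist score as a percentage of correct responses, then correlates the two across subjects. It is a minimal illustration under assumptions: the function names are hypothetical, the answer key stands in for the lead investigator's expert judgment, and the Pearson coefficient is our assumed choice of correlation measure, since the text does not name one.

```python
from math import sqrt
from typing import Sequence


def percent_correct(responses: Sequence[str], answer_key: Sequence[str]) -> float:
    """Score responses against an expert answer key, as a percentage."""
    if not answer_key or len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must be non-empty and equal in length")
    correct = sum(given == expected for given, expected in zip(responses, answer_key))
    return 100.0 * correct / len(answer_key)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)
```

In use, `percent_correct` would be called once per subject for the pooled SAGAT freeze questions and once for the checklist, and `pearson_r` would then be applied to the two resulting lists of per-subject scores.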

Situational awareness has been shown to be related to automation, as

highlighted in Figure 1, and automation can be one of the many mediators influencing SA. Furthermore, Skitka

and colleagues, who are at the forefront of AB research, found significant interactions between

automated systems and SA, showing that non-medical automated systems could “short circuit” SA

and cause its decline11. Therefore, it is highly likely that SA is inversely correlated with AB.


General Research Design

Given our understanding of AB, SA, and CPOE/CDSS systems, we will investigate the role of

AB, as measured via SA (due to prior demonstrated success in validated SA methods, with no well-

defined methods for AB at this time), in how a physician operates a CPOE automated system within

an emergency department context at the University of Chicago Medical Center. This study will also

be restricted to the specialty of radiology, meaning that all patient medical records and scenarios

should contain some form of medical imaging that was ordered for the patient. The study will be

divided into two parts:

•   Experiment I will be a randomized, retrospective archival analysis of the University of

Chicago medical records pre- and post-CPOE implementation to determine whether patient

outcome was worse or better with CPOE versus handwritten requisition forms. The

independent variables here will be 1) the condition of strictly either handwritten or CPOE

orders – as shown by the medical record – and 2) the ordering physician’s medical expertise

in terms of ‘years after graduating medical school.’ The dependent variable for this portion

of the study will be patient outcome, as determined by reported ADEs, MPEs, and severity

of malpractice suits filed.

o   We expect there to be a statistically significant increase in the number of ADEs and

severity of malpractice suits filed after CPOE implementation, from the written

condition to the CPOE condition, validating prior studies’ estimates. In addition, we

predict a non-statistically significant decrease in MPEs, as the literature has shown to

occur, from written to CPOE conditions [Hypothesis 1.1]. This effect would likely

be due to the new set of cognitive biases, including AB and decreased SA, that arose

after CPOE implementation. While this serves as supportive evidence, it is not


definitive. These effects remain to be confirmed in Experiment II, in accordance with

our hypotheses.

o   We also anticipate all dependent variables to steadily decrease in frequency as they

vary with increasing medical expertise during the handwritten time period. However,

we predict that there will be a positive correlation between ADEs and severity of

malpractice suits with increasing medical experience during the CPOE time period

[Hypothesis 1.2]. This effect is expected because there has been a demonstrated

negative correlation in the literature between susceptibility to change and physician

experience level3. One study seems to verify this, as it found older CDSS users

making more overall errors4. This implies that older/more experienced physicians

may be more reluctant to learn a completely new way of conducting order entries

and may spend less time training on the system, while younger physicians

being trained in CPOE during medical school will have relatively fewer errors – but

still more errors overall compared to handwritten conditions – since they already are

naturally habituated to the system. This habituation may also be the reason we expect

to see a decrease of dependent variables during the handwritten period, since all

physicians, regardless of experience level, have been trained throughout their

education to conduct handwritten orders. Therefore, it is expected that younger

physicians will generally have more errors in the handwritten condition only, since

they simply have less experience in medicine.

•   Experiment II will comprise an experimental simulation design with hired patient-actors,

employing the use of the SAGAT method for assessing the SA of physicians across varying

expertise levels in written requisition form versus CPOE conditions. There will be 3


simulations in order to control for inpatient symptomatology and increase generalizability of

our results; in other words, we want to avoid holding only one simulation type (e.g.

only a pulmonary embolism case), since we cannot disprove that the effect of cognitive

biases discovered might be specific to that illness case only – we hope to see the

hypothesized effects across a diverse array of pathologies. Lastly, we will implement a

specific type of order entry system in both paper and computerized forms called the

radiology order entry (ROE) that was created and validated by Rosenthal et al.8. The ROE

was chosen in order to measure the accuracy of physicians’ choice of radiological imaging

orders, based on an “appropriateness criteria” scale adopted from the American College of

Radiology (ACR). The independent variables in this portion of the study will be 1) the

simulated condition of being given a requisition form to fill out a handwritten ROE versus

entering the order through the computerized ROE, and 2) the expertise of the recruited physician-subject

(stratified by third/fourth-year medical student, first-year resident “intern,” and attending

physician). The dependent variables here will be mean SAGAT score and a traditional

checklist score (both as a percentage of correct responses), along with the ACR

appropriateness score (as a measure of “utility”; please see Methods for utility score

breakdown).

o   We anticipate observing a significant decrease in the mean SAGAT score across all

experience groups from the handwritten ROE to computerized ROE conditions

[Hypothesis 2.1]. This effect is expected due to the introduction of an automatic

system, and presumably a higher AB as indicated by the lower SA (lower mean

SAGAT).

o   We also expect to see a significant increase in SAGAT as medical experience

increases within the handwritten ROE condition, as the constancy of trained


handwriting across age groups does not impose any biases on physicians, and

SAGAT will be purely dependent on medical knowledge and expertise. However, we

expect to see a decrease in mean SAGAT as medical experience increases within the

computerized ROE condition, given that older physicians experience a reluctance to

a new system of training, as mentioned prior [Hypothesis 2.2].

o   Finally, we expect to see a significant increase in the ACR appropriateness score with

increasing medical experience within the handwritten ROE condition, since

appropriateness (or utility) here would depend on experience alone. However, we

hypothesize that the ACR appropriateness score will decrease with medical

experience in the computerized ROE condition, due to the mediating effects of AB

[Hypothesis 2.3]. Here, the ACR appropriateness score is analogous to ADEs or any

other medical metric of patient outcome, since lower appropriateness scores imply a

radiogram was ordered that will be of less benefit/utility to the patient based on their

illness. Errors from lack of appropriateness can translate into a missed pathology (a

false negative), whereby the type of imaging ordered does not detect a particular

bodily cue that confirms/rejects a diagnosis.

Importance

United States hospitals collectively incur over $2 billion in annual costs of ADEs alone1, and

potentially even higher costs arise from other forms of error in the medical system. Concurrently,

the ordering and utilization of diagnostic radiology continue to rise; for instance, the period

from 1996 to 2010 saw 7.8%, 10%, and 3.9% increases in computed tomography (CT), magnetic

resonance imaging (MRI), and ultrasound, respectively8. It is perplexing that the costs of care and

medical errors continue to rise even with a steady increase in radiological diagnostic equipment


utilization, since it would be expected that early diagnosis would reduce costs and errors. More

importantly, there are patients who are victims of rising costs of care and increasing medical errors.

We strongly suspect this increase in errors can be attributed to communication gaps in medicine,

which are further underpinned by sometimes unnoticeable cognitive bias. The introduction of

CPOE into our hospital institutions may have solved workflow inefficiencies, but it also appears to

be creating new problems that result from the automated nature of CPOE-CDSS systems. However,

the heterogeneity of data in the literature prevents one from drawing a definitive conclusion about

the effects of CPOE on patient outcome, and even less research has been done to investigate the

cognitive biases that mediate this interaction. Here, we combine a statistical retrospective analysis

with an experimental approach on situational awareness to create a new paradigm of CPOE

research. The novelty of our study lies in its dual-experimental approach, its focus on radiology, and

its application of the SAGAT technique to a live patient-actor simulation, which to our knowledge has not been

explored before. By studying these two methods in parallel and stressing the psychological mediators

of error, our research will shed more light on possible mechanisms by which medical errors occur,

leading to potential policy changes and solutions in the long run that can ameliorate these systemic

symptoms in our hospitals. The overall goal of this research is not to abolish computerized systems

and reinstate writing as our primary form of physician ordering, but instead to find ways to preserve

the benefits of CPOE while eliminating its newly introduced negative effects. Uncovering the

mechanisms at play will allow physicians to confront said biases to improve overall patient outcome.

Methods

Experiment I

This experiment will be a retrospective archival analysis of past physician requisition forms

pre-CPOE implementation and of past CPOE orders post-CPOE implementation. The University


of Chicago Medical Center in Hyde Park, IL will serve as our electronic medical record sample. The

University of Chicago has received an “A” rating from the Leapfrog Group for the past five consecutive

years13. CPOE implementation is a major factor in determining higher hospital ratings. Assuming

the University of Chicago implemented its CPOE system in 2005*, we will draw patient records

from the two-year time span from 2000 to 2002 for our pre-CPOE implementation condition (fully

handwritten orders). We will allow an acclimation period of at least two years after CPOE implementation

before drawing our next sample; this prevents a lack of familiarity with CPOE,

rather than AB itself, from contributing to error rates. Thus, once all

physicians have been acclimated to CPOE, we will draw patient records from a two-year time span

from 2008 to 2010 for our post-CPOE implementation condition (fully computerized orders).

Patient medical records and associated physician order entries (either handwritten scanned

copies or digital computerized copies) will be acquired with respect to patient privacy laws (HIPAA)

and IRB guidelines after receiving authorization and approval for both Experiments I and II. For

each time condition (pre-/post-CPOE implementation), we will screen collected patient files. Our

screening criteria for medical record selection include:

1.   Patients must have no history of “hand-offs” and must have complete continuity of care. Hand-offs

are defined as a state of transition for the patient from the care of one physician to

another. This criterion prevents the attribution of any detected errors to another physician

who had the patient handed to him or her. This further isolates the potential effects of AB

from other confounding factors.

2.   Patients must have had at least one physician order entry (whether handwritten or

computerized) for any type of radiological examination. This further focuses our patient

population (which is still large due to high rates of radiological utilization) to patients that

may have experienced adverse events as a result of inappropriate radiological exam orders

and thus missed pathologies. Assuming a missed pathology occurred, certain drugs that

were subsequently prescribed could have had negative interactions with the missed

pathology (e.g. a missed tumor that could have been detected on MRI but instead an

ultrasound was ordered).

*  The date of CPOE implementation is not known or reported on the University’s website or annual hospital records. For the purposes of this proposal, we will assume implementation occurred at the beginning of 2005.

Once the patient records have been screened, we will randomly select a nationally representative

sample of n=1,000 for each pre/post condition from this pool of screened records. A large

sample size plus randomization provides high statistical power to detect any

significant differences in these populations, while a nationally representative sample supports the

generalizability of our results in the future.
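The screening and random-selection steps above could be carried out along the following lines. This is a sketch under assumptions, not a finalized protocol: the record fields `had_handoff` and `radiology_order_count` are hypothetical names for the two screening criteria, and the fixed seed is included only so that the draw can be reproduced.

```python
import random
from typing import Dict, List


def screen_records(records: List[Dict]) -> List[Dict]:
    """Keep records that meet both screening criteria."""
    return [
        record
        for record in records
        # Criterion 1: no hand-offs (complete continuity of care).
        if not record["had_handoff"]
        # Criterion 2: at least one radiological order entry.
        and record["radiology_order_count"] >= 1
    ]


def draw_sample(screened: List[Dict], n: int = 1000, seed: int = 0) -> List[Dict]:
    """Draw n screened records uniformly at random, without replacement."""
    if len(screened) < n:
        raise ValueError("screened pool is smaller than the requested sample")
    rng = random.Random(seed)  # seeded so the selection can be reproduced
    return rng.sample(screened, n)
```

The same two functions would be run separately on the pre-CPOE and post-CPOE record pools, yielding one sample of n=1,000 per condition.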

Individually, each patient medical record will be examined by trained research assistants, who

are fourth-year medical students naïve to the study hypothesis (i.e. “blinded”). It is necessary to have

research assistants who are sufficiently medically literate and familiar with medical records to

conduct the following analysis. Furthermore, blinding our research assistants will prevent the biased

mis-categorization of patients or underestimation/overestimation of the severity of their reported

medical errors that could occur if they knew the hypothesis of the study. Assistants will record the

following in data collection software such as Microsoft Excel or GraphPad Prism: whether the

order was handwritten or computerized; the physician’s number of years in medical practice (post-

graduation); and the dependent variables of ADEs, MPEs, and severity of malpractice suits filed.

The data will then undergo statistical analysis to compare the effects of the handwritten-vs.-

computerized and medical experience independent variables on the aforementioned dependent

variables.
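As one illustration of how the handwritten and computerized conditions might be compared, the sketch below applies a two-sided two-proportion z-test to the share of records with at least one reported ADE. The choice of test is our assumption, since the analysis plan above does not name a specific statistical method, and the function name and arguments are hypothetical.

```python
from math import erfc, sqrt
from typing import Tuple


def two_proportion_z_test(
    events_a: int, n_a: int, events_b: int, n_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p) comparing the event rate in group B to group A."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups must contain at least one record")
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    # Pooled rate under the null hypothesis of equal proportions.
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("test is undefined when no record or every record has an event")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value
```

Here group A would be the pre-CPOE sample and group B the post-CPOE sample, so a positive z indicates a higher event rate after implementation. Examining the effect of physician experience would require a different model, such as a regression with an interaction term, which this sketch does not cover.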


Experiment II

This experiment will use a simulation design intended to

further support the results of Experiment I and to further define the psychological mechanisms of

cognitive biases in the physician ordering process, namely AB as measured by level of SA. We will

adopt a similar study design to that of Hogan et al., who utilized a trauma Human Patient Simulator

mannequin with programmable vital signs and realistic symptoms to emulate various scenarios of

pulmonary trauma in the emergency room (ER). They placed teams of physicians based on different

experience levels in this simulation room, conducted a SAGAT protocol (outlined in the Introduction),

and measured SA as it relates to medical experience. Similarly, we will utilize various scenarios to

generalize our findings across multiple situations in the ER. However, we will diverge from the original

study in several ways. First, we plan to hire patient-actors by the hour with our research funds to

emulate a sick patient in need of emergency care. Each actor will be briefed beforehand by an ER physician on

the symptoms they should display and will be allowed to practice their roles in

front of said physician to make sure their acting is convincing to the participants. Patient-actors

have been employed in simulations for medical school training for years and have been

shown to enhance the experience by making it more realistic than an unresponsive

mannequin12. Actors will be in a room designed to feel like an ER, with all of the materials and

medical supplies the participants would expect to find in the environment.

Participants will be physicians recruited from the University of Chicago Medical Center.

They will be informed of their eligibility to partake in the study via invitation cards placed in their

individual mailboxes, only if they are ER attendings, ER residents, or are currently on clinical

rotations in the ER as medical students (either third or fourth years). Invitations will ask

them to “participate in a study exploring decision making in medicine.” As mentioned, we will aim

to recruit the following levels of medical experience: third/fourth-year medical students, first-year


resident “interns,” and attending physicians who have just completed residency. Because we will hold

three illness scenarios, across two conditions (handwritten and computerized ROEs), with four

participants desired for each scenario to obtain a mean score, and three experience levels of

physicians, we will ideally need to recruit 72 (3 x 2 x 4 x 3) participants with a relatively equal spread of

medical experiences across the sample. We intend to conduct this experiment as a between-subject

design, whereby each participant is observed in and exposed to only one “treatment” (e.g. only one

scenario for each participant of each medical experience level). Such a between-subject design

permits us to compare the effects of the treatment on the dependent variables directly. A within-

subject design might require fewer participants, indeed, but it risks participant bias, in which the

physician participant begins to understand what is being tested across multiple ER scenarios and

begins to alter their behavior to suit the study objective.
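
The recruitment target can be checked with a short calculation. The sketch below is illustrative only: it assumes the three experience levels listed above (medical students, interns, and newly graduated attendings) and four participants per cell, and all variable names are our own.

```python
from itertools import product

# Factors of the between-subject design (labels are illustrative)
scenarios = ["osteomyelitis", "pulmonary embolism", "aortic aneurysm"]
conditions = ["handwritten ROE", "computerized ROE"]
# Assumed grouping: the three experience levels named in the recruitment plan
experience_levels = ["medical student", "intern", "new attending"]
participants_per_cell = 4

# Each participant is assigned to exactly one cell (one "treatment")
cells = list(product(scenarios, conditions, experience_levels))
total_participants = len(cells) * participants_per_cell

print(f"{len(cells)} cells, {total_participants} participants")
```

If third- and fourth-year students were instead treated as separate levels, the same calculation would yield a larger target, so the grouping should be fixed before recruitment begins.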

A second difference is that we will design a SAGAT procedure with a similar template to the

Hogan et al. study, but with different inquiries that are more relevant to the ER and radiology. We

will utilize three illness scenarios common in the ER that require radiological tests to be properly

diagnosed and treated. Before each scenario, the physician participant will be briefed and will complete a

20-minute training session, during which an experimenter orients them to the medical record

charting/ROE computer system and written form; this is a necessary step to ensure that the

participant is familiar with the tools they are using, thereby controlling for lack of familiarity that

could affect our dependent variables. In each scenario, the actor will be dressed in a hospital gown,

lying on an ER gurney, and connected by leads to a monitor that displays false vital signs (heart rate,

oxygen saturation, etc.) programmed by the investigator.

These false values will be symptomatic of the illness being suffered by the patient in a specific

scenario. The participant will also be presented with edited blood workup/lab results that are

appropriate for diagnosis in the specific illness scenario. Each scenario will be scripted in such a way

that the patient progressively reveals more information and medical history to the physician upon

interview; in all three scenarios, the actor will be instructed to show visible discomfort and to

express pain only when the physician presses on a specific area during the routine physical

examination. After some arbitrary interval of time (at least three minutes elapsed before the first freeze;

at least one minute elapsed between freezes), the experimenters will “freeze” the scenario by

turning the audio of the monitors off and instructing the participant to turn towards a blank wall to

eliminate any audio-visual cues. During this “freeze,” the participant will be asked questions that are

meant to assess his or her SA at specific levels from memory alone. Once the inquiry is complete,

the monitors will be reactivated and the participant will resume the examination. During the

examination, the physician may be charting the patient’s medical history and writing his or her ROE

by hand, or charting and placing an ROE on a computerized system. It is possible that a freeze will

occur while he or she is writing or typing. If we were to postpone the charting or ROE submission

until after all three levels of SA had been assessed, the very fact that the experimenters made the

physician more aware of the situation by asking these questions might artificially increase his or her

ACR appropriateness score for the ROE. By allowing the events of the examination to proceed

naturally and interjecting at certain time intervals unbeknownst to the participants, we can more

accurately capture an organic “snapshot” of the physician’s decision-making processes as measured

by SA. The three scenarios designed for this experiment are described below, followed by the questions

asked during each freeze to assess Levels I, II, and III of Endsley’s SA model. Levels of SA are defined

in the Introduction, but a brief reminder of each definition is included in parentheses below:

1.   Osteomyelitis Scenario: The patient presents with heel, back, and bone pain. They

experience chills, fatigue, fever, and occasional night sweats. Their lab results show heavily

elevated blood sugar (this illness occurs mostly in diabetics), and their blood pressure is high.

The skin may or may not have ulcers/redness painted on with makeup to simulate a visual

symptom.

2.   Pulmonary Embolism Scenario: The patient presents with deep chest pain, shortness of

breath, and a dry cough. Their heart rate is abnormal or racing. The patient expresses

concerns of lightheadedness and difficulty with breathing.

3.   Aortic Aneurysm Scenario: The patient complains of sharp pain in their abdomen and back.

They may present with bleeding and/or strong headaches. The patient usually does not

present severe symptoms until the aneurysm ruptures. It is up to the physician to determine

the cause and proceed with a “watch-and-wait” route or order the patient to surgery

depending on the case’s severity.

SAGAT Inquiries at each SA Level Freeze (applicable across all illness scenarios):

•   Level 1 (perception of basic, factual information in the environment; symptoms):

“What is the patient’s workup?” “What is their heart rate?” “What are your findings on

the neurological and respiratory exams?” “How long has the patient been

experiencing these symptoms?”

•   Level 2 (comprehension of the situation; differential diagnosis/-es): “What is/are the

potential cause(s) of the patient’s physiological abnormalities?”

•   Level 3 (projection of patient’s future status): “What would you expect to happen to

the patient’s blood pressure in the next five minutes?” “How will the patient’s

parameters change in the next fifteen minutes?” “What would you do to exclude

alternative diagnoses [if more than one exists]?”
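
The timing constraints on the freezes (at least three minutes before the first freeze and at least one minute between freezes) could be enforced with a small scheduling routine. The following is a minimal sketch under stated assumptions: one freeze per SA level (three in total), and a hypothetical `max_extra` bound on the random additional delay, which the protocol itself does not specify.

```python
import random

def schedule_freezes(n_freezes=3, first_min=180, gap_min=60, max_extra=120, seed=None):
    """Return freeze times in whole seconds from the start of the scenario.

    first_min: minimum delay before the first freeze (three minutes).
    gap_min:   minimum delay between successive freezes (one minute).
    max_extra: assumed upper bound on the random extra delay; not part of
               the written protocol, chosen here only for illustration.
    """
    rng = random.Random(seed)
    times = []
    t = first_min + rng.randint(0, max_extra)
    for _ in range(n_freezes):
        times.append(t)
        # The next freeze is at least gap_min seconds after this one
        t += gap_min + rng.randint(0, max_extra)
    return times

freeze_times = schedule_freezes(seed=42)
print(freeze_times)
```

Drawing the extra delay at random keeps the freezes unpredictable to the participant while guaranteeing the minimum intervals.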

Once all freezes have been conducted, the physician will conclude the examination by debriefing the

patient on their condition, wishing them well, and exiting the room. Each session will be video and

audio recorded with cameras whose presence will be disclosed to the participant during the 20-minute

orientation briefing. In the analysis phase, medical student research assistants naïve to the hypothesis

(as in Experiment I) will watch the audio-visual recordings to assess the correctness of each

physician’s responses and determine a SAGAT score. The score will be calculated as the percentage of

correct responses (correct responses / total responses × 100). Answers to each SA level freeze inquiry

will have been pre-determined by three expert emergency medicine physicians who will have reached a

consensus on specific values for lab results and monitoring (to assess Level I), probable differential

diagnosis or diagnoses (to assess Level II), and planned courses of action based on whether certain

unexpected conditions arise in the patient (to assess Level III). Of note, physicians will be granted a

±10% error range around the actual, pre-determined answers in Level I.
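
The scoring rule can be illustrated with a brief sketch. It assumes that the ±10% range is taken relative to the pre-determined value and that Level II and III responses are graded as simply correct or incorrect by the raters; the function names and the example values are hypothetical and do not represent study data.

```python
def within_tolerance(answer, expected, tolerance=0.10):
    """True if a numeric Level I answer lies within ±tolerance of the expected value.

    Assumption: the error range is relative to the pre-determined value.
    """
    return abs(answer - expected) <= tolerance * abs(expected)

def sagat_score(graded):
    """Percent correct = correct responses / total responses * 100."""
    if not graded:
        raise ValueError("no responses to score")
    return 100.0 * sum(graded) / len(graded)

# Hypothetical responses, for illustration only
graded = [
    within_tolerance(96, expected=100),  # Level I: heart rate within 10%
    within_tolerance(85, expected=100),  # Level I: outside the 10% range
    True,                                # Level II: rater judged correct
    True,                                # Level III: rater judged correct
]
print(sagat_score(graded))  # 75.0
```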

In the meantime, a separate group of medical student research assistants naïve to the

hypothesis will examine the participant’s written or computerized ROE and use objective criteria

based on the ACR appropriateness scale to assess the appropriateness of the radiogram order, given

the specific illness presented in the scenario. The ACR score is a nine-point scale that categorizes

appropriateness levels into ranges of utility: “low utility” (score: 1-3), “moderate utility” (score: 4-6),

and “high utility” (score: 7-9).
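
For the analysis, the mapping from a nine-point ACR score to its utility range could be coded as follows. This is a minimal sketch of the three ranges described above; the function name is our own.

```python
def acr_utility_category(score: int) -> str:
    """Map a 1-9 ACR appropriateness score to its utility range."""
    if not 1 <= score <= 9:
        raise ValueError("ACR appropriateness scores range from 1 to 9")
    if score <= 3:
        return "low utility"
    if score <= 6:
        return "moderate utility"
    return "high utility"

for s in (2, 5, 8):
    print(s, acr_utility_category(s))
```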

Lastly, another separate research assistant will analyze the audio-visual recordings and

establish a percent-correctness score for each participant based on a traditional checklist. The

purpose of the traditional checklist score is to make sure that the SAGAT score is measuring similar

qualities in the participants that are relevant to SA in a medical scenario. If the SAGAT and traditional

checklist scores correlate to a significant degree, we can conclude that our study displayed proper

construct validity, since the SAGAT was not testing anything we did not want to test for. The

traditional checklist here functions as a guideline of sorts to ensure this relevance in the SAGAT

score⁶.
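
One way to examine the agreement between the two measures is a correlation coefficient. The sketch below assumes a Pearson correlation, which is our choice for illustration rather than a method specified in the protocol, and the paired scores shown are placeholder values, not results.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (sd_x * sd_y)

# Placeholder percent-correct scores for four hypothetical participants
sagat_scores = [50, 75, 100, 25]
checklist_scores = [55, 70, 90, 40]
print(round(pearson_r(sagat_scores, checklist_scores), 3))
```

A coefficient close to 1 would support the claim that the SAGAT and the traditional checklist capture similar qualities; a significance test would still be needed before drawing that conclusion.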

Statistical analyses will be conducted in the same manner as Experiment I to compare the

effects of written-vs.-computerized ROE and medical experience on SAGAT/traditional checklist

scores and ACR appropriateness score for radiograms ordered.

Limitations

One central limitation of our research design lies in the sheer difficulty of separating medical expertise

from familiarity with (or usability of) computerized systems. There could be uncontrollable, subjective

negative past experiences with computers that cause reluctance and lower usability for the ordering

physician, irrespective of age. We attempted to control for familiarity and set a baseline usability for

a system the doctors have most likely never encountered before in Experiment II by orienting the

participants to the user systems (written and computerized forms) for charting and ROE entry before

the procedure even began. However, subjective past experiences with technology should be

acknowledged as a particularly evasive confounding factor that can impact usability and participant

affect towards the automation technology. This negative affect could artificially induce errors

without the physician’s realization, since the bias could be unconscious. Therefore, studies in CPOE

should aim to increase the exposure of psychological biases that underpin automation-induced

medical error. Making physicians more aware could be one solution to the issue of medical error.

We hypothesized that increasing age and experience in medicine would be positively correlated

with an increase in ADEs and lawsuit severity. However, it could very well be argued that older

physicians have more diagnostic and medical experience overall, which allows them to circumvent

their biases and focus on what truly matters. This would instead result in a decrease in

ADEs and lawsuit severity with increasing expertise. While the data in the literature show the

opposite trend for now⁴, a scenario in which expertise trumps computer interface familiarity can be

imagined.

Furthermore, we did not account for trust/reliance interactions. Here, trust can be defined as

the physician’s confidence in the system or him-/herself. Trust in the computer over oneself can

result in reliance, which leads to an increased susceptibility to committing an error due to AB.

However, greater trust in oneself can allow the individual to override cognitive shortcuts like AB in

times of stress and make more independent decisions that lead to better outcomes. Physicians have

actually been shown to accept CDSS recommendations that were incorrect when they reported less

confidence in their own diagnosis⁴. This effect of a physician’s confidence in their diagnostic ability

could still be variable in physicians who have been in a medical career for many years. Increased age

does not necessarily imply increased confidence in one’s abilities. Therefore, future studies

repeating our paradigm should choose different subjective variables and administer surveys assessing

the participant’s confidence in their scenario. Accounting for subjective factors can reveal a hidden

side of physician decision making that can also mediate whether he or she commits an unintended

medical error.

One last consideration for future studies is the finding that additional options

have the potential to increase the difficulty of decision making. When asked to choose between

sending a patient for hip replacement surgery and placing them on ibuprofen, significantly more

physicians opted for the ibuprofen route than for the surgery. However, when another group

of physicians was confronted with the same scenario, but now with an additional drug option (two

total drug options and one surgery option), significantly more physicians opted for the

surgery instead.⁷ Psychologists take these results to mean that, when additional options are

presented, choosing between two similar, yet slightly different, options is much harder

than choosing between two clearly different options. Adding alternative options coaxes the mind to follow

the path of least resistance and choose the most different option available (in this case, surgery

instead of two similar pain pills). Others attribute this to a “technological imperative,”⁷ a so-called

inner feeling of unease when the option to simply ‘do nothing’ seems unappealing to us, given the

vast array of interventional technology at our disposal. Thus, the imperative to intervene may be

stronger when fewer alternatives are available to us than when many alternatives are present and the

decision-making process consumes too much cognitive energy. Considerations of the technological

imperative in CPOE are important, too, as CDSS recommendations providing too many alternatives

could result in omission errors, thereby preventing the patient from receiving a necessary

intervention. Overcoming these biases is difficult work. But recognizing the principle of primum non

nocere, or “first, do no harm,” is a vital first step towards confronting one’s cognitive biases to reduce

adverse patient events.

Works Cited

1.   Dixon, B.E. & Zafar, A. (2009). Inpatient Computerized Provider Order Entry (CPOE):

Findings from the AHRQ Health IT Portfolio. Agency for Healthcare Research and Quality.

AHRQ Publication No. 09-0031-EF, 1-18. www.ahrq.gov

2.   Eikel, C., & Delbanco, S. (2003). John M. Eisenberg Patient Safety Awards. The Leapfrog

Group for Patient Safety: rewarding higher standards.  The Joint Commission Journal on

Quality and Patient Safety, 29(12), 634–639. www.jcrinc.com

3.   Goddard, K., Roudsari, A.V., & Wyatt, J.C. (2014). Automation bias: Empirical results

assessing influencing factors. International Journal of Medical Informatics, 83(5), 368–375.

http://dx.doi.org/10.1016/j.ijmedinf.2014.01.001

4.   Goddard, K., Roudsari, A.V., & Wyatt, J.C. (2012). Automation bias: a systematic review of

frequency, effect mediators, and mitigators. Journal of the American Medical Informatics

Association : JAMIA, 19(1), 121–127. http://doi.org/10.1136/amiajnl-2011-000089

5.   Green, R.A., Hripcsak, G., Salmasian, H., Lazar, E.J., Bostwick, S.B., Bakken, S.R., &

Vawdrey, D.K. (2015). Intercepting wrong-patient orders in a computerized provider

order entry system. Annals of Emergency Medicine, 65(6), 679-686.

doi:10.1016/j.annemergmed.2014.11.017

6.   Hogan, M.P., Pace, D.E., Hapgood, J., & Boone, D.C. (2006). Use of human patient

simulation and the situation awareness global assessment technique in practical trauma

skills assessment. The Journal of Trauma: Injury, Infection, and Critical Care, 61(5), 1047–1052.

doi: 10.1097/01.ta.0000238687.23622.89

7.   Redelmeier, D.A., & Shafir, E. (1995). Medical Decision Making in Situations That Offer

Multiple Alternatives. The Journal of the American Medical Association: JAMA, 273(4), 302-

305. doi: 10.1001/jama.1995.03520280048038.

8.   Rosenthal, D.I., Weilburg, J.B., Schultz, T., Miller, J.C., Nixon, V., Dreyer, K.J., & Thrall,

J.H. (2006). Radiology order entry with decision support: initial clinical experience.

Journal of the American College of Radiology, 3(10), 799–806. doi: 10.1016/j.jacr.2006.05.006

9.   Shamliyan, T. A., Duval, S., Du, J., & Kane, R. L. (2008). Just What the Doctor Ordered.

Review of the Evidence of the Impact of Computerized Physician Order Entry System

on Medication Errors. Health Services Research, 43, 32–53. doi: 10.1111/j.1475-

6773.2007.00751.x

10.  Singh, H., Petersen, L.A., & Thomas, E.J. (2006). Understanding diagnostic errors in

medicine: a lesson from aviation. BMJ Quality & Safety in Health Care, 15(3), 159–164.

doi: 10.1136/qshc.2005.016444

11.  Skitka, L. J., Mosier, K. L., & Burdick, M. (2000). Accountability and automation bias.

International Journal of Human-Computer Studies, 52, 701–717. doi: 10.1006/ijhc.1999.0349

12.  Spencer, J. & Dales, J. (2006). Meeting the needs of simulated patients and caring for the

person behind them? Medical Education, 40(1), 3-5.

13.  The University of Chicago Medicine & Biological Sciences. (2014). Clinical Effectives Report

– Fourth Edition. 1-12.

14.  van Rosse, F., Maat, B., Rademaker, C.M.A., van Vught, A.J., Egberts, A.C.G., & Bollen,

C.W. (2009). The Effect of Computerized Physician Order Entry on Medication

Prescription Errors and Clinical Outcome in Pediatric and Intensive Care: A Systematic

Review. Pediatrics, 123(4), 1184-1190. doi: 10.1542/peds.2008-1494