


Journal of Clinical Epidemiology 67 (2014) 1353-1357

Dual computer monitors to increase efficiency of conducting systematic reviews

Zhen Wang a,*, Noor Asi a, Tarig A. Elraiyah a, Abd Moain Abu Dabrh a, Chaitanya Undavalli a, Paul Glasziou b, Victor Montori a, Mohammad Hassan Murad a

a Knowledge & Evaluation Research Unit, Mayo Clinic, 200 First Street SW, Rochester, MN 55904, USA
b Centre for Research in Evidence-Based Practice, Faculty of Health Sciences and Medicine, Bond University, Queensland 4229, Australia

Accepted 1 June 2014; Published online 30 July 2014

The authors declare no conflict of interest or financial interests.

* Corresponding author. Tel.: +1-507-538-6153; fax: +1-507-538-0850. E-mail address: [email protected] (Z. Wang).

0895-4356/$ - see front matter © 2014 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.jclinepi.2014.06.011

Abstract

Objective: Systematic reviews (SRs) are the cornerstone of evidence-based medicine. In this study, we evaluated the effectiveness of using two computer screens on the efficiency of conducting SRs.

Study Design and Setting: A cohort of reviewers before and after using dual monitors was compared with a control group that did not use dual monitors. The outcomes were time spent on abstract screening, full-text screening, and data extraction, and inter-rater agreement. We adopted multivariate difference-in-differences linear regression models.

Results: A total of 60 SRs conducted by 54 reviewers were included in this analysis. We found a significant reduction of 23.81 minutes per article in data extraction in the intervention group relative to the control group (95% confidence interval: -46.03, -1.58; P = 0.04), which was a 36.85% reduction in time. There was no significant difference in time spent on abstract screening, full-text screening, or inter-rater agreement between the two groups.

Conclusion: Using dual monitors when conducting SRs is associated with a significant reduction of time spent on data extraction. No significant difference was observed in time spent on abstract screening or full-text screening. Using dual monitors is one strategy that may improve the efficiency of conducting SRs. © 2014 Elsevier Inc. All rights reserved.

Keywords: Evidence-based medicine; Systematic reviews; Research design; Efficiency; Validity; Technology

1. Introduction

In 1979, Archie Cochrane [1] urged the medical community to have critical summaries, adapted periodically, of all relevant randomized controlled trials. This call for systematic reviews (SRs) acknowledged that these summaries increase the precision and applicability of evidence and should always be sought for clinical and policy decision making. However, we are very far from this goal. Most published SRs are outdated, and some are already outdated on the day they are published [2,3]. Arguably, most decisions made in health care, from policy, benefits, coverage, guidelines, and quality of care to clinical decisions at every level of care, are not based on SRs of the best available evidence.

Why don't decision makers (patients, clinicians, policy makers) have access to the high-quality SRs necessary to make better choices, despite the abundance of primary studies? Time is one of the barriers impeding SRs. A typical SR, adequately resourced and using state-of-the-art methods, takes between 6 and 18 months; projects with a wider scope take even longer [4]. Shortening this time is essential and strongly required [3]. Until sophisticated software can conduct SRs, the process remains heavily dependent on human factors and skills [5].

Screening studies and extracting data in SRs involve typical computer operations, including cut-and-paste operations, text and spreadsheet editing, and tracing and recording keywords. These tasks require constantly switching among different computer windows and shifting focus away from the actual work. Dual monitors (ie, two screens for each computer) have been shown to improve productivity in tasks similar to those involved in conducting SRs [6-9]. One study, by James A. Anderson at the University of Utah, found that productivity among people working on editing tasks was higher with two monitors than with one [6]. More monitors cut down on toggling time among windows on a single screen, which saved about 10 seconds for every 5 minutes of work. Microsoft researchers conducted several studies evaluating the effect of multiple monitors [8]. Users were asked to complete several different tasks, switching from one task to another. They found that users' productivity increased by 9% on average and at times by up to 50% for tasks such as cutting and pasting. Recognizing these benefits, software manufacturers developed multiple applications and innovations to support multiple monitors. In medicine, dual-monitor views were found to improve performance of laparoscopic tasks by reducing errors and improving visualization of the surgical field [9].

What is new?

• Using dual computer monitors was associated with a significant reduction of time spent on data extraction when conducting systematic reviews (SRs).

• No significant changes were observed in abstract screening, full-text screening, or inter-rater agreement.

• Using dual monitors is only one strategy for expediting the process of SRs.

• Other methods are also greatly needed.

In this study, we evaluated the effectiveness of using dual monitors (ie, two screens for each computer) on the efficiency of conducting SRs. To our knowledge, this is the first study of this topic on SRs.

2. Methods

This study was considered exempt by the Mayo Clinic Institutional Review Board.

2.1. Study design and setting

The study was conducted at an evidence synthesis center specialized in conducting SRs and meta-analyses. The study subjects were the investigators conducting SRs. The investigators consist of a core group with expertise in methodology, evidence-based medicine, and evidence synthesis (10-15 investigators) and external collaborators with either methodology or topic (content) expertise. The center produces 10-20 SRs per year, supported by intramural and extramural funding.

This study used convenience sampling: we included all systematic reviewers and all SRs conducted between January 2009 and April 2013. In March 2012, all the core members started using two computer screens (dual monitors); before that date, they were provided only with single monitors. External collaborators continued their normal practices and were queried via e-mail about whether they used a single or a dual monitor during the SR process.

Using a quasi-experimental design, we adopted a difference-in-differences approach to compare changes in efficiency and accuracy of conducting SRs for reviewers who used dual monitors (the intervention group) vs. reviewers who did not (the control group). Specifically, this approach compares the change (pre/post) in the intervention group with the change (pre/post) in the control group. The cutoff time defining the pre- and postperiods was January 3, 2012 (the date when the intervention group started using dual monitors). The same date was used as a cutoff to define pre- and postperiods in the control group. This design with a counterfactual control can potentially control for unobserved trends in efficiency and accuracy over time.
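To make the contrast concrete, here is a minimal sketch, not the authors' code, of the unadjusted difference-in-differences calculation described above; the data frame layout and column names (group, period, minutes_per_article) are assumptions for illustration only.

```python
# A minimal sketch (assumed column names, made-up data) of the unadjusted
# difference-in-differences contrast: the pre/post change in the dual-monitor
# group minus the pre/post change in the single-monitor control group.
import pandas as pd

def raw_did(df: pd.DataFrame, outcome: str = "minutes_per_article") -> float:
    """Unadjusted difference-in-differences estimate for one outcome."""
    means = df.groupby(["group", "period"])[outcome].mean()
    change_intervention = means[("dual", "post")] - means[("dual", "pre")]
    change_control = means[("single", "post")] - means[("single", "pre")]
    return change_intervention - change_control

# Toy example: a negative value means time saved per article in the
# intervention group over and above any change seen in the control group.
example = pd.DataFrame({
    "group": ["dual", "dual", "single", "single"] * 2,
    "period": ["pre", "post", "pre", "post"] * 2,
    "minutes_per_article": [70, 45, 60, 58, 68, 43, 62, 60],
})
print(raw_did(example))  # -23.0 on this toy data
```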

2.2. Data source

We retrieved data accrued between January 2009 and April 2013 for both groups from DistillerSR (Evidence Partners Incorporated, Ottawa, Ontario, Canada). DistillerSR is a web-based system specifically designed to conduct and manage reference screening and data extraction. Supported by centralized databases, it automatically records the time each reviewer spent on each reference at each stage and summarizes performance data per SR per reviewer.
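Per-reference time logs of this kind could be rolled up into the per-reviewer, per-stage outcome used here along the following lines; this is a hedged sketch with assumed column names, not the DistillerSR export schema or the authors' code.

```python
# Hypothetical log layout: one row per reference handled by a reviewer,
# with the review, stage (abstract, full-text, extraction), and minutes spent.
import pandas as pd

def mean_minutes_per_article(log: pd.DataFrame) -> pd.DataFrame:
    """Average minutes per article for each review, reviewer, and stage."""
    return (
        log.groupby(["review_id", "reviewer_id", "stage"])["minutes"]
           .mean()
           .rename("mean_minutes_per_article")
           .reset_index()
    )
```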

Reviewers with mixed usage (using a single monitor and dual monitors at different locations or in the same project) or who were unable to report were excluded from the analysis. SRs that started before March 2012 and were completed after March 2012 were also excluded.

2.3. Variables definition

Experience of the systematic reviewers was defined as the number of SRs the reviewer had conducted before the investigated SR and was categorized as substantial experience if the reviewer had participated in more than 10 SRs. A systematic reviewer was considered to have content knowledge of the study topic if he or she had specialized clinical or research training in the topic of the SR (eg, a vascular surgery resident and a vascular surgeon were considered to have content expertise in an SR about aortic transection). We defined simple questions in data extraction as those only needing to be filled with numbers or simple text (eg, what is the number of patients in the intervention group? What is the geographic location of a study?). We defined complicated questions as those requiring judgment or inference (eg, was the allocation concealed? Were the two groups balanced at baseline?).
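A small sketch of how these reviewer-level covariates could be coded for the regression models; the function and field names are hypothetical and not taken from the study.

```python
# Hypothetical covariate coding following the definitions above: experience
# dichotomized at 10 prior SRs, a content-knowledge flag, and the ratio of
# simple to complicated data-extraction questions.
def code_covariates(prior_srs: int,
                    has_topic_training: bool,
                    n_simple: int,
                    n_complicated: int) -> dict:
    return {
        "substantial_experience": prior_srs > 10,
        "content_knowledge": has_topic_training,
        "simple_to_complicated_ratio": n_simple / n_complicated,
    }

# Example: a reviewer with 12 prior SRs, no topic training, and an extraction
# form with 18 simple and 8 complicated questions.
print(code_covariates(12, False, 18, 8))
```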

2.4. Outcome measures

The primary outcome of interest was the efficiency of conducting SRs, measured as the average minutes per article during abstract screening, full-text screening, and data extraction. The secondary outcome was chance-adjusted inter-rater agreement (a measure of accuracy and of a possible adverse effect of speed).
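For illustration, a minimal sketch of the chance-adjusted agreement measure (Cohen's kappa) applied to two reviewers' include/exclude decisions; it uses scikit-learn for convenience, whereas the authors report using Stata, and the decisions below are invented.

```python
# Cohen's kappa: observed agreement corrected for the agreement expected by
# chance. 1 = include, 0 = exclude; these toy decisions are made up.
from sklearn.metrics import cohen_kappa_score

reviewer_a = [1, 1, 0, 0, 1, 0, 1, 0]
reviewer_b = [1, 0, 0, 0, 1, 0, 1, 1]

# Observed agreement is 6/8 = 0.75; agreement expected by chance is 0.50,
# so kappa = (0.75 - 0.50) / (1 - 0.50) = 0.50.
print(cohen_kappa_score(reviewer_a, reviewer_b))  # 0.5
```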



2.5. Statistical analysis

We conducted descriptive analyses to compare the basic characteristics of the reviewers and the SRs between the control group and the intervention group. The chi-square test was used to compare dichotomized variables between the two groups, and the Student t-test was used for continuous variables. Cohen's kappa was used to measure chance-adjusted inter-rater agreement. We constructed multivariate linear regression models to predict changes in outcomes. Adjusted covariates were experience of the systematic reviewers, content knowledge of the study topic, number of studies eligible for abstract screening, full-text screening, and data extraction, and the ratio of simple to complicated questions in data extraction.

All analyses were conducted using STATA, version 12.1 (StataCorp LP, College Station, TX, USA).
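As a rough illustration of the adjusted model described above, the following sketch fits a difference-in-differences linear regression with a group-by-period interaction; it is written in Python with statsmodels rather than Stata, and all variable names are assumptions.

```python
# Hedged sketch of an adjusted difference-in-differences regression: the
# coefficient on the dual_monitor:post interaction is the DiD estimate.
import statsmodels.formula.api as smf

def fit_did(df):
    """OLS of minutes per article on group, period, their interaction,
    and the covariates listed in the text (assumed column names)."""
    formula = (
        "minutes_per_article ~ dual_monitor * post"
        " + experience + content_knowledge + n_studies"
        " + simple_to_complicated_ratio"
    )
    return smf.ols(formula, data=df).fit()

# result = fit_did(analysis_df)
# result.params["dual_monitor:post"]          # difference-in-differences estimate
# result.conf_int().loc["dual_monitor:post"]  # 95% confidence interval
```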

3. Results

A total of 60 SRs were conducted by 54 different reviewers between January 2009 and April 2013 and were included in this analysis. Table 1 shows the basic characteristics of these SRs and reviewers. The intervention group (those who used dual monitors) had less content knowledge (8% vs. 34%) and more SR experience (86% vs. 28%). Of the 60 SRs included in this study, 28 were conducted by members from both groups, four were conducted only by the control group, and 27 solely by the intervention group.

Fig. 1A shows the efficiency results of the multivariate analysis of the difference-in-differences models. We found a significant reduction of 23.81 minutes per article in data extraction in the intervention group relative to the control group (95% confidence interval: -46.03, -1.58; P = 0.04). There was no significant difference in time spent on abstract screening or full-text screening. No significant difference was found in inter-rater agreement between the two groups (Fig. 1B).

Fig. 1. (A) Effects of using dual computer monitors when conducting systematic reviews on time spent by reviewers (the intervention group vs. the control group). The vertical line indicates no difference. Dots and horizontal lines indicate the effect size (difference in differences) and associated 95% confidence interval (CI) for each outcome. A negative effect size implies time saved in minutes, favoring the intervention. *Adjusted for the number of studies screened, experience of the reviewer, and content knowledge of study topic. †Adjusted for the number of studies screened, experience of the reviewer, content knowledge of study topic, and rate of complicated questions. (B) Effects of using dual computer monitors when conducting systematic reviews on the inter-rater agreement level between reviewers (the intervention group vs. the control group). The vertical line indicates no difference. Dots and horizontal lines indicate the effect size (difference in differences) and associated 95% confidence interval (CI). A negative effect size implies declining agreement associated with the intervention. †Adjusted for the number of studies screened, experience of the reviewer, content knowledge of study topic, and rate of complicated questions.

4. Discussion

We adopted difference-in-differences linear regression models to evaluate the effectiveness of using dual monitors for conducting SRs by comparing a cohort of SR reviewers using dual monitors with a control group that did not use dual monitors. In this cohort of SRs, reviewers spent, on average, 65 minutes per article on data extraction. The significant reduction of 24 minutes represents a 36.85% savings in time. No significant changes were observed in abstract screening, full-text screening, or inter-rater agreement.

Table 1. Basic characteristics of the reviewers and SRs

Characteristics | Control group | Intervention group | P-value
No. of systematic reviewers (N = 54) | 36 | 18 |
Topic/content knowledge | 34.19% | 7.86% | <0.001
SR experience (>10 prior SRs) | 28.21% | 85.85% | <0.001
No. of SRs (N = 60) | 32 | 55 |
No. of studies in abstract screening, mean (range) | 2,400 (104-6,682) | 1,653 (110-6,682) | 0.01
No. of studies in full-text screening, mean (range) | 302 (35-875) | 264 (14-875) | 0.19
No. of studies in data extraction, mean (range) | 123 (7-520) | 129 (7-520) | 0.57
Ratio of simple questions vs. complicated questions (mean) | 2.27 | 1.97 | 0.27

Abbreviation: SRs, systematic reviews.

4.1. Limitations

The sample size of this study is relatively small, which can limit our ability to detect beneficial or deleterious effects of using dual monitors. Nevertheless, our group is one of the largest in the field, and accruing a larger number of SRs could only occur via multicenter collaboration. Second, we only included reviews conducted by our research group and using specific reference management systems and procedures. Using data from this homogeneous group reduces the risk of potential confounding but can also reduce generalizability to groups with different procedures. Third, we used time spent on conducting SRs (abstract screening, full-text screening, and data extraction) as a proxy of efficiency and inter-rater agreement as a proxy of accuracy. We recognize that these measures are not perfect, and other factors are also likely to affect efficiency and accuracy. Finally, reviewers in the intervention and control groups were significantly different regarding experience of conducting SRs and content knowledge of the study topic. We used the difference-in-differences approach with a counterfactual control to adjust for any unobserved trends over time and multivariate linear regression models to adjust for observed covariates. However, imbalances between the groups may still exist and affect the findings. A randomized controlled trial, randomizing reviewers to use one or two monitors for the same set of SRs, is needed and could alleviate or even eliminate this imbalance.

4.2. Implications for research

There is increasing recognition that efficiency in conducting SRs is highly important. For clinicians, patients, and decision makers, the production of timely evidence syntheses will lead to better decisions (based on evidence), higher appreciation for, and increased reliance on SRs. For systematic reviewers, efficient methods will result in more reviews produced with fewer resources and in less time. This increased productivity means that more insights from these reviews will become available, helping to refine research questions to target knowledge gaps, to justify research funding decisions, and to quickly place the results of emerging new research in the context of the extant evidence [10]. In this cohort of SRs, reviewers spent, on average, 65 minutes per article on data extraction. The significant reduction of 24 minutes represents a 36.85% savings in time. With an average of 127 articles eligible for data extraction per SR, 49 hours can be saved in total for each SR.

Other methods for expediting SRs have been proposed and can be categorized as either (1) process changes (restricting the number of databases searched, restricting publication language [11-13], eliminating content expert consultation [14], reusing existing SRs [4,15]) or (2) technical improvements (online data entry, automated SRs [16-18]). Our intervention falls in the second category. A key question, however, is to what extent these methods affect the accuracy and credibility of SRs. We propose that each method needs to be tested in various settings to retain the confidence of SR users in the results.

5. Conclusions

Using dual monitors when conducting SRs is associated with a significant reduction of time spent on data extraction.

No significant difference was observed in time spent on abstract screening or full-text screening. Using dual monitors is one strategy that may improve the efficiency of conducting SRs.

Acknowledgments

None to report.

Competing interests: None.

References

[1] The Cochrane Collaboration. Archie Cochrane: the name behind The Cochrane Collaboration. Oxford: The Cochrane Collaboration; 2010.

[2] Shojania KG, Sampson M, Ansari MT, Ji J, Doucette S, Moher D. How quickly do systematic reviews go out of date? A survival analysis. Ann Intern Med 2007;147:224-33.

[3] Beller EM, Chen JK, Wang UL, Glasziou PP. Are systematic reviews up-to-date at the time of publication? Syst Rev 2013;2:36.

[4] Smith V, Devane D, Begley CM, Clarke M. Methodology in conducting a systematic review of systematic reviews of healthcare interventions. BMC Med Res Methodol 2011;11:15.

[5] Kiritchenko S, de Bruijn B, Carini S, Martin J, Sim I. ExaCT: automatic extraction of clinical trial characteristics from journal publications. BMC Med Inform Decis Mak 2010;10:56.

[6] Colvin J, Tobler N, Anderson JA. Productivity and multi-screen computer displays. Rocky Mountain Comm Rev 2004;2:31-53.

[7] Dell Computer. Dual monitors boost productivity, user satisfaction. New York: Ziff Davis Enterprise; 2011.

[8] Ross S. Two screens are better than one. Microsoft research news & highlights. 2006.

[9] Shah RD, Cao A, Golenberg L, Ellis RD, Auner GW, Pandya AK, et al. Performance of basic manipulation and intracorporeal suturing tasks in a robotic surgical system: single- versus dual-monitor views. Surg Endosc 2009;23:727-33.

[10] Murad MH, Montori VM. Focusing on the body of evidence - reply. JAMA 2013;310:1290.

[11] Pham B, Klassen TP, Lawson ML, Moher D. Language of publication restrictions in systematic reviews gave different results depending on whether the intervention was conventional or complementary. J Clin Epidemiol 2005;58:769-76.

[12] Morrison A, Polisena J, Husereau D, Moulton K, Clark M, Fiander M, et al. The effect of English-language restriction on systematic review-based meta-analyses: a systematic review of empirical studies. Int J Technol Assess Health Care 2012;28:138-44.

[13] Shiwa SR, Moseley AM, Maher CG, Pena Costa LO. Language of publication has a small influence on the quality of reports of controlled trials of physiotherapy interventions. J Clin Epidemiol 2013;66:78-84.

[14] Gotzsche PC, Ioannidis JP. Content area experts as authors: helpful or harmful for systematic reviews and meta-analyses? BMJ 2012;345:e7031.

[15] Thomson D, Russell K, Becker L, Klassen T, Hartling L. The evolution of a new publication type: steps and challenges of producing overviews of reviews. Res Synth Methods 2010;1:198-211.

[16] Tsafnat G, Dunn A, Glasziou P, Coiera E. The automation of systematic reviews. BMJ 2013;346:f139.

[17] Cohen AM. Optimizing feature representation for automated systematic review work prioritization. AMIA Annu Symp Proc 2008;2008:121-5.

[18] Cohen AM, Ambert K, McDonagh M. Studying the potential impact of automated document classification on scheduling a systematic review update. BMC Med Inform Decis Mak 2012;12:33.