March 3, 2009: I. Sim Informatics for Clinical Research Epi 206 – Medical Informatics Ida Sim, MD, PhD March 3, 2009 Division of General Internal Medicine, and Center for Clinical and Translational Informatics UCSF Methods for Internet-Based Research Copyright Ida Sim, 2009. All federal and state rights reserved for all original material presented in this course through any medium, including lecture or print.

March 3, 2009: I. Sim
Informatics for Clinical Research
Epi 206 – Medical Informatics

Ida Sim, MD, PhD

March 3, 2009

Division of General Internal Medicine, and Center for Clinical and Translational Informatics

UCSF

Methods for Internet-Based Research

Copyright Ida Sim, 2009. All federal and state rights reserved for all original material presented in this course through any medium, including lecture or print.

Homeworks

• Questions/discussion on Homework #1
• Problem Set #3 due in 2 weeks
  – you need to design and deploy an online survey and analyze the results
• Problem Set #4 will be issued next week and is due at the same time as PS #3

Big Picture of Health Informatics

[Diagram labels: PATIENT CARE / WELLNESS; RESEARCH; Virtual Patient; Transactions; Raw data; Medical knowledge; Clinical research transactions; Raw research data; Decision support; Medical logic; CTMSs]

Workflow modeling and support, usability, cognitive support, computer-supported cooperative work (CSCW), etc.

Outline

• Leftovers
  – CDE Browser
  – IDR demo
• Internet-based health research
  – e-health research tools
  – methodological considerations
• Summary

Standardization Across Studies

• Interventional studies are fundamentally done to demonstrate differences between interventions, by types of patients [Clarke M, Trials 2007]
  – common outcome measures necessary for pooling/meta-analysis
    • e.g., 5-year cancer-free survival, common asthma measures
  – also common eligibility criteria, e.g., Post-menopause
    • post (Prior bilateral ovariectomy, OR >12 mo since LMP with no prior hysterectomy and not currently receiving therapy with LH-RH analogs [e.g., Zoladex])
    • post (Prior bilateral ovariectomy, OR >12 mo since LMP with no prior hysterectomy)
    • pre (<6 mo since LMP AND no prior bilateral ovariectomy, AND not on estrogen replacement)
    • above categories not applicable AND Age >=50
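The two "post" definitions above differ only in the LH-RH clause, which is enough to classify the same patient differently. The following is a minimal sketch of that point, not an NCI or study implementation: the `Patient` fields, the function name, the returned labels, and the reading of the clause order (ovariectomy alone qualifies; the LH-RH exclusion applies only to the >12 months branch) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Hypothetical fields; a real study would take these from its case report form.
    age: int
    months_since_lmp: Optional[float]  # None if unknown
    bilateral_ovariectomy: bool = False
    hysterectomy: bool = False
    on_lhrh_analog: bool = False
    on_estrogen_replacement: bool = False


def menopausal_status(p: Patient, exclude_lhrh_therapy: bool) -> str:
    """Classify a patient using one of the two slide definitions of 'post'.

    exclude_lhrh_therapy=True uses the first definition (with the LH-RH clause),
    False uses the second definition (without it).
    """
    known_lmp = p.months_since_lmp is not None
    long_amenorrhea = known_lmp and p.months_since_lmp > 12 and not p.hysterectomy
    if exclude_lhrh_therapy:
        long_amenorrhea = long_amenorrhea and not p.on_lhrh_analog

    if p.bilateral_ovariectomy or long_amenorrhea:
        return "post"
    if (known_lmp and p.months_since_lmp < 6
            and not p.bilateral_ovariectomy
            and not p.on_estrogen_replacement):
        return "pre"
    if p.age >= 50:
        return "not applicable, age >= 50"
    return "unclassified"


# Same patient, two studies' definitions, two different answers.
patient = Patient(age=45, months_since_lmp=14, on_lhrh_analog=True)
print(menopausal_status(patient, exclude_lhrh_therapy=True))
print(menopausal_status(patient, exclude_lhrh_therapy=False))
```

If two trials each used a different definition, their "post-menopausal" subgroups would not be directly poolable, which is the motivation for shared data elements.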

NCI Approach in Cancer

• NCI caDSR (Data Standards Repository)
  – library of Common Data Elements (CDEs) that others have defined
  – you can define new CDEs using terms from NCI Thesaurus
• Let’s go search...
  – https://cdebrowser.nci.nih.gov/CDEBrowser/

Using UCare via IDR

• Cannot easily query UCare directly
  – no user interface for group-level queries
  – queries may slow system response time for clinical care
  – lots of important data (e.g., STOR outpatient data) not in UCare
• Solution is to copy UCare data to IDR
  – autofeed nightly, data stored securely with backup
  – supports ad hoc group-level queries, e.g., cohort identification
    • how many potentially eligible patients in UCare?
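A cohort-identification query of the kind described above can be sketched against a toy repository. This is only an illustration: the table names, columns, and rows below are invented, and an in-memory SQLite database stands in for the IDR, whose actual schema is not described in these slides.

```python
import sqlite3


def count_eligible(conn: sqlite3.Connection, diagnosis: str, min_age: int) -> int:
    """Count distinct patients with a given diagnosis at or above a minimum age."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT p.patient_id)
        FROM patients AS p
        JOIN diagnoses AS d ON d.patient_id = p.patient_id
        WHERE d.code = ? AND p.age >= ?
        """,
        (diagnosis, min_age),
    ).fetchone()
    return row[0]


# Hypothetical copy of clinical data, loaded into a separate research database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE diagnoses (patient_id INTEGER, code TEXT);
    INSERT INTO patients VALUES (1, 34), (2, 61), (3, 58), (4, 72);
    INSERT INTO diagnoses VALUES
        (1, 'asthma'), (2, 'asthma'), (3, 'diabetes'), (4, 'asthma'), (4, 'asthma');
    """
)

print(count_eligible(conn, "asthma", 50))  # patients 2 and 4 qualify
```

Running the query against a copy rather than the clinical system is the design choice the slide argues for: the group-level query cannot slow down care delivery.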

Integrated Data Repository

• Integrated historical data common to entire enterprise

[Diagram labels: MICU; Finance; Research; QA; Integrated Data Repository; Internet; ADT, Chem, EHR, XRay, PBM, Claims]

i2b2 Demo

• UCSF IDR will be built using the i2b2 software suite from Harvard Partners
• Demo of i2b2 query interface to over 5000 anonymized real records from Partners
  – https://38.99.4.62:8443/i2b2/

UCSF IDR Status

• Current focus on bringing up IDR content
  – public datasets (e.g., NHANES)
  – individual PI data (for sharing just with your group, by request, or with everyone)
  – negotiating with Med Center on UCare data
• Semantic standardization still problematic
  – need to map source data to standardized terms (e.g., asthma)
  – need data models of clinical research (e.g., primary outcome, Ontology of Clinical Research)
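Mapping source data to standardized terms can be sketched as a lookup from local strings to one shared concept. Everything in this sketch is hypothetical: the local term variants, the `STD:ASTHMA` identifier, and the function names are placeholders, not entries from any real terminology.

```python
import re
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical mapping table: several local spellings, one standard concept.
TERM_MAP: Dict[str, str] = {
    "asthma": "STD:ASTHMA",
    "bronchial asthma": "STD:ASTHMA",
    "asthma, unspecified": "STD:ASTHMA",
}


def normalize(term: str) -> str:
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return re.sub(r"\s+", " ", term.strip().lower())


def map_to_standard(term: str) -> Optional[str]:
    """Return the standard concept for a source term, or None if unmapped."""
    return TERM_MAP.get(normalize(term))


def map_all(terms: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Split source terms into mapped ones and ones needing manual review."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for term in terms:
        concept = map_to_standard(term)
        if concept is None:
            unmapped.append(term)
        else:
            mapped[term] = concept
    return mapped, unmapped


mapped, unmapped = map_all(["Asthma", "  Bronchial   asthma", "wheezing"])
print(mapped)
print(unmapped)
```

The unmapped list is where the effort goes in practice: each entry needs a person to decide which standard concept, if any, it belongs to.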

Outline

• Leftovers
  – CDE Browser
  – IDR demo
• Internet-based health research
  – e-health research tools
  – methodological considerations
• Summary

Internet vs. Web

[Diagram: network map. Labels: ucsf.edu LAN (itsa, medicine); nci.nih.gov; cochrane.uk; myhome.com; amazon.com; pacbell.net; aol.com; Main Trunk Cables; local trunk cable through Berkeley; at home: dial-in to itsa.ucsf.edu via modem, or Internet Service Provider (ISP) via DSL or cable]

Internet vs. Web

• Internet = network of networks
  – computers and cables all linked to one another and talking to one another using protocols
  – supports lots of different internet protocols
    • e.g., http, ftp, smtp, https, etc.
• Web is the internet traffic that uses http
  – servers send out information in HTML
    • Hypertext Markup Language
  – web browsers can decode HTML and display it
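The server/client exchange described above can be sketched with Python's standard library. This is a toy under stated assumptions: the page content, the handler class, and `fetch_once` are invented for illustration, and the server runs only on the local machine on a port chosen by the operating system.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Survey page</h1></body></html>"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The server answers an HTTP GET with HTML content.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Start a local web server, act as a client once, and return the response."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: let the OS pick
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    # Bypass any configured proxy so the request stays on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"http://127.0.0.1:{port}/", timeout=5) as resp:
            return resp.status, resp.headers.get_content_type(), resp.read()
    finally:
        server.shutdown()
        server.server_close()


status, content_type, body = fetch_once()
print(status, content_type)
print(body.decode("utf-8"))
```

Here the client only prints the HTML; a browser would decode and render it, and a research tool might parse it for computation instead.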

Clients and Servers

[Diagram: network map with nodes marked Server or Client. Labels: ucsf.edu LAN (itsa, medicine); nci.nih.gov; cochrane.uk; myhome.com; amazon.com; pacbell.net; aol.com; at home; Main Trunk Cables]

Research IT on Internet/Web

• Research IT using Internet (e.g., CTMS over internet)
  – uses Internet network of networks to send data and commands back and forth
  – servers and clients do the storage, query, retrieval, computation, reporting
  – may have nothing to do with a web browser
• Research IT using Web
  – web servers send HTML content over the Internet using HTTP
  – web browsers and other “clients” receive that content for display or computation
• What are logistical and methodological issues?

Web-Based Health Research

• Surveys
• Interventional studies (e.g., quit smoking trial)
  – target audience
    • English and Spanish-speaking smokers
  – pre- and post- demographic, etc. survey
  – randomized interventions
    • downloadable brochure vs. brochure + email reminders + diary
  – outcome
    • quit rate
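Randomizing web enrollees between the two arms above could be done in many ways; the slides do not say which method the trial used. The sketch below assumes permuted-block randomization, chosen here only because it keeps arm sizes balanced as people enroll over time. The function name, block size, and participant IDs are illustrative.

```python
import random
from typing import Dict, Iterable, List, Optional

ARMS = ("brochure", "brochure + email reminders + diary")


def block_randomize(participant_ids: Iterable[str],
                    block_size: int = 4,
                    seed: Optional[int] = None) -> Dict[str, str]:
    """Assign participants to arms in shuffled blocks with equal arm counts."""
    if block_size <= 0 or block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a positive multiple of the number of arms")
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    assignment: Dict[str, str] = {}
    block: List[str] = []
    for pid in participant_ids:
        if not block:
            # Start a new block containing each arm equally often, in random order.
            block = list(ARMS) * (block_size // len(ARMS))
            rng.shuffle(block)
        assignment[pid] = block.pop()
    return assignment


ids = [f"P{i:03d}" for i in range(1, 9)]
allocation = block_randomize(ids, block_size=4, seed=206)
for pid, arm in allocation.items():
    print(pid, arm)
```

In a real trial the allocation sequence would be generated and concealed on the server, never in the participant's browser.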

Web Surveys are Cheaper

• Web surveys have higher fixed cost but cost per additional respondent is much lower
  – marginal cost per mail survey respondent $1.93
  – phone $40 to $100
  – web $0
• Buy or build?
  – buy: many companies offer survey design, deployment, and data management services
  – build: do-it-yourself
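The trade-off between fixed and marginal cost can be made concrete with a break-even calculation. The marginal costs below ($1.93 for mail, $0 for web) are from the slide; the fixed costs are invented placeholders, since the slide gives none, so the resulting break-even number is illustrative only.

```python
import math


def total_cost(fixed: float, marginal: float, n: int) -> float:
    """Total cost of surveying n respondents."""
    return fixed + marginal * n


def break_even_respondents(fixed_a: float, marginal_a: float,
                           fixed_b: float, marginal_b: float) -> int:
    """Smallest n at which option A costs no more than option B.

    Assumes A has the lower marginal cost (e.g., A = web, B = mail).
    """
    if marginal_b <= marginal_a:
        raise ValueError("option A must have the lower marginal cost")
    if fixed_a <= fixed_b:
        return 0
    return math.ceil((fixed_a - fixed_b) / (marginal_b - marginal_a))


# Hypothetical fixed costs; marginal costs per respondent are from the slide.
WEB_FIXED, WEB_MARGINAL = 5000.0, 0.0
MAIL_FIXED, MAIL_MARGINAL = 1000.0, 1.93

n = break_even_respondents(WEB_FIXED, WEB_MARGINAL, MAIL_FIXED, MAIL_MARGINAL)
print(n)
print(total_cost(WEB_FIXED, WEB_MARGINAL, n), total_cost(MAIL_FIXED, MAIL_MARGINAL, n))
```

Beyond the break-even point every additional web respondent is effectively free, which is why large web samples are common.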

Buying Survey Services

• Many, many companies exist
• Survey Monkey www.surveymonkey.com
  – free for 10 questions, 100 responses per survey
  – professional subscription $19.95/mo, or $200/yr unlimited
    • up to 1000 responses per month, $0.05 per additional response
• DatStat’s Illume
  – web-based survey creation and management
  – real-time data access and complex query capabilities
  – exports data to SAS, SPSS, etc.
  – Internet World Health Research Center is beta user
    • $7000/yr first year, $3000/yr thereafter
    • $4000 license/user (e.g., you)

Disclosure: I have no ties to SurveyMonkey or DatStat

e-Interventions

• Educational
• Behavioral
  – e.g., cognitive behavioral therapy
• Simulation
  – e.g., simulation of infectious virus in Second Life
  – for teaching (e.g., med students)
• Modality
  – web pages, brochures, video, games (for asthma)
  – text messaging (for weight loss)

eHealth Tools Summary

• Survey systems
  – SurveyMonkey most common, but NOT HIPAA-compliant
  – Enterprise Feedback Management systems often more secure
• Interventional systems
  – web is new platform for behavioral/educational interventions (e.g., Illume)
  – very little so far on health research through personal devices/cell phone

Outline

• Leftovers
  – CDE Browser
  – IDR demo
• Internet-based health research
  – e-health research tools
  – methodological considerations
• Summary

Methodological Considerations

• eHealth research is a very new field
  – http://www.isrii.org/ and http://www.jmir.org/
• Survey/intervention design
  – measurement error
  – non-response bias
• Subject recruitment
  – selection bias: who is on the web? who isn’t?
  – sampling error
• Sample size

Survey Design

• Usual survey design issues apply, PLUS
• Technical design of survey
  – platform (e.g., Mac) and browser (e.g., Safari) incompatibilities
  – use of Flash, Java, etc. requiring plug-ins or version compatibility
  – readability (font too small), need to scroll, confusing navigation, bugs
• What technology does respondent group use?
  – check some browser statistics sources
    • e.g., http://www.w3schools.com/browsers/browsers_stats.asp
  – need to test and double-test in various platforms and browsers used, various versions of HTML, Java, Flash, etc.

Measurement Bias

• What you designed may not be what respondent sees
• Client’s browser displays the survey based on
  – platform, browser, monitor, screen/window size
  – different users see different survey, e.g.,
    • small screen/window size makes “Next” button not visible
    • text doesn’t fit on small window, or requires scrolling for some respondents and not others
    • colors, graphics (e.g., visual analog scales) may appear differently

Non-Completion Bias

• Influenced by
  – respondent familiarity with web (e.g., click on link)
  – technical design of survey
  – bandwidth
  – convenience (can interrupt survey?)
• Can use mixed-mode surveys to address
  – e.g., combined web/phone, web/mail

Subject Recruitment

• Recruitment is biggest bottleneck of clinical research
  – 30-40% of clinical trial costs
  – >80% of trials have recruitment delays
  – 1/20 recruited patients actually enroll
• Web-based recruitment can be international, cheap, fast
  – e.g., www.stopsmoking.ucsf.edu Dec 05 - Feb 07
    • 350,000 hits, 60,000 entered data, 20,000 enrolled
    • 2/3 Spanish-speaking, 1/3 English
    • 131,517 visits from 121 countries Jan 12, 05 to April 5, 06
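The stopsmoking.ucsf.edu counts above form a recruitment funnel, and the stage-to-stage conversion rates follow directly from them. The sketch below uses only the three counts on the slide; the function name is illustrative, and note that "hits" are page requests rather than unique people, so the first rate understates per-person conversion.

```python
from typing import List, Tuple


def funnel_rates(stages: List[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Conversion rate from each funnel stage to the next."""
    rates = []
    for (prev_name, prev_n), (name, n) in zip(stages, stages[1:]):
        rates.append((f"{prev_name} -> {name}", n / prev_n))
    return rates


# Counts reported on the slide for Dec 05 - Feb 07.
STAGES = [("hits", 350_000), ("entered data", 60_000), ("enrolled", 20_000)]

for step, rate in funnel_rates(STAGES):
    print(f"{step}: {rate:.1%}")

overall = STAGES[-1][1] / STAGES[0][1]
print(f"hits -> enrolled: {overall:.1%}")
```

The overall rate of roughly 1 in 17.5 hits is in the same range as the 1/20 enrollment figure quoted for traditional recruitment, but it is reached at far larger volume.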

Distribution of Visits to www.stopsmoking.ucsf.edu, Jan 12, 2005 to April 5, 2006 (131,517 visits from 121 countries)

[Figure legend: Visits 0, ≥1, ≥100, ≥1,000, ≥10,000]

Methodological Considerations

• Survey/intervention design
  – measurement error
  – non-response bias
• Subject recruitment
  – selection bias: who is on the web? who isn’t?
    • digital divide
  – sampling error
    • avoiding biased sampling of subject populations
• Sample size

Digital Divide

Household           Internet Access   Broadband Access
<$30,000            41%               8%
$30-49,000          71%               16%
>$50,000            89%               39%
No children         59%               16%
Children in home    76%               29%
White               69%               23%
African-American    56%               15%
Hispanic            48%               14%

"Digital Divide" Still Shapes Media Landscape (10/19/04, Knowledge Networks/SRI); http://www.knowledgenetworks.com/info/press/releases/2004/101904_htmtrends.htm

Digital Health Divide

• Spanish-language sites have lower quality
  – 45% of English-language sites vs. 22% of Spanish-language sites with minimal coverage & complete accuracy (JAMA 2001; 285:2612-2621)
• Broadband more available to higher-income white households with children
  – uneven potential access to Flash, tele-consultation, etc.
• Most of divide attributable to income, not to race

Reducing Sampling Error

• Social sciences and marketing are most advanced in web survey methodology
  – e.g., Joint Statistical Meetings of the American Statistical Association
  – http://www.knowledgenetworks.com/dmg/index.html
• Recruit a representative sample
• Use a pre-assembled representative cohort

Disclosure: I have no relationship with KnowledgeNetworks

Recruit Representative Sample

• Web analog of random digit dialing (RDD) is as representative as (land-line) telephone RDD
  – RDD sampling
  – if respondent agrees, provide them with free Internet access (via MSNTV, aka WebTV) or other necessary hardware for duration of participation
  – e.g., http://knowledgenetworks.com/

Representative Cohorts

• Maintained by, e.g., large survey and marketing firms
  – www.knowledgenetworks.com
    • KnowledgePanel is representative of US
    • can target specific respondents, “response rates of 65-75%, abandonment rate <2%”
  – www.surveysampling.com
    • panels in 17 countries totaling 3.8 million respondents
  – http://experimentcentral.org/
    • NSF-funded representative panel for social science research

Enrollment Rates

• Response rates typically 30-60%
• Affected by
  – number of (pre) contacts, whether personalized
    • most influential factors
  – incentives (e.g., Amazon certificate)
  – population surveyed, nature of topic, official sponsorship, etc.

Other Recruitment Methods

• With higher risk of sampling bias
  – search engines, with search engine optimization (SEO) techniques
    • e.g., webrings, Google AdWords
  – links from related pages
  – email lists, social networking sites, chat rooms, newsgroups
    • friends, twittering, etc.
• Can blend traditional and web
  – give website on radio, TV, print, brochures

Search Engine Ranking

• Search engines have their own (secret) algorithm for ranking pages
  – Google uses >100 factors, esp. how many pages link into a page
• Google AdWords
  – put in your keywords, see cost-per-click
    • https://adwords.google.com/select/KeywordToolExternal?defaultView=3
  – pay only if someone clicks

Methodological Considerations

• Survey/intervention design
  – measurement error
  – non-response bias
• Subject recruitment
  – selection bias: who is on the web? who isn’t?
  – sampling error
• Sample size

Note on Sample Size

• Estimating sample size
  – e.g., Google provides traffic history for various keywords (adwords.google.com)
• Since incremental cost often negligible, less pressure to minimize sample size
  – not unusual to get large samples (>10,000)
• But high sample size ≠ high accuracy!
  – may be precise but inaccurate if sample is non-representative
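The difference between precision and accuracy can be shown with a short calculation. The numbers are hypothetical: a true population prevalence of 30%, and a web sample of 10,000 drawn from a subgroup in which prevalence is only 20%. The sketch uses the usual normal-approximation margin of error for a proportion.

```python
import math
from typing import Dict


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sample proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


def summarize(true_p: float, sampled_p: float, n: int) -> Dict[str, object]:
    """Compare a sample estimate's precision (interval width) with its bias."""
    moe = margin_of_error(sampled_p, n)
    low, high = sampled_p - moe, sampled_p + moe
    return {
        "ci": (low, high),
        "margin_of_error": moe,
        "bias": sampled_p - true_p,
        "covers_truth": low <= true_p <= high,
    }


# Hypothetical values: large n gives a tight interval around the wrong number.
result = summarize(true_p=0.30, sampled_p=0.20, n=10_000)
print(f"margin of error: {result['margin_of_error']:.4f}")
print(f"bias: {result['bias']:+.2f}")
print(f"interval covers true value: {result['covers_truth']}")
```

Increasing n shrinks the margin of error but leaves the bias untouched, so a bigger non-representative sample only makes the wrong answer look more certain.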

Summary

• Clinical research informatics moving towards modular, interoperable world

– standard data elements (CDEs) and case report forms (CRFs)

– large-scale data repositories of semantically integrated diverse data from diverse data sources

• Web surveys and interventional research offer promises and methodological pitfalls