Privacy and Security Workgroup

Big Data and Privacy

October 27, 2014

Deven McGraw, chair
Stan Crosley, co-chair

Agenda

• Background
  – Definition of Big Data
  – Why are we developing policy recommendations regarding Big Data?
• Topics in Big Data, Privacy and Health Care
• Overview of laws, risks, and mitigation strategies
• Policy Questions

Background


Definition of Big Data

• “There is no rigorous definition of big data”

• “. . . Big data refers to things one can do at a large scale that cannot be done at a smaller one, to extract insights or create new forms of value, in ways that change markets, organizations, the relationship between citizens and governments, and more.”

• “At its core, big data is about predictions . . . It’s about applying math to huge quantities of data in order to infer probabilities . . . .”

Viktor Mayer-Schoenberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think, Houghton Mifflin Harcourt Publishing, 2013.

Definition of Big Data

• Gartner (Business):
  – “High-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.”

• Adam Barker and Jonathan Stuart Ward (Technical):
  – “The storage and analysis of large and/or complex data sets using a series of techniques including, but not limited to, NoSQL, MapReduce, and machine learning.”

• Privacy Context:
  – “. . . the term ‘big data’ typically means data about one or a group of individuals, or [data] that might be analyzed to make inferences about individuals.”

President’s Council of Advisors on Science & Technology, Big Data and Privacy: A Technological Perspective, May 2014. http://www.whitehouse.gov/sites/default/files/microsites/ostp/PCAST/pcast_big_data_and_privacy_-_may_2014.pdf

Why are we considering Big Data?

• Big Data: Seizing Opportunities, Preserving Values (May 2014):
  – “The government should lead a consultative process to assess how the Health Insurance Portability and Accountability Act (HIPAA) and other relevant federal laws and regulations can best accommodate the advances in medical science and cost reduction in health care delivery enabled by big data.”

Big Data: Seizing Opportunities, Preserving Values, http://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf

Why are we considering Big Data?

• White House Open Government Partnership
  – “Use Big Data to Support Greater Openness and Accountability”
  – Ensure privacy protection for big data analyses in health.
    • Big data introduces new opportunities to advance medicine and science, improve health care, and support better public health.
    • To ensure that individual privacy is protected while capitalizing on new technologies and data, the Administration, led by the Department of Health and Human Services, will:
      (1) consult with stakeholders to assess how Federal laws and regulations can best accommodate big data analyses that promise to advance medical science and reduce health care costs; and
      (2) develop recommendations for ways to promote and facilitate research through access to data while safeguarding patient privacy and autonomy.

http://www.whitehouse.gov/the-press-office/2014/09/24/fact-sheet-announcing-new-us-open-government-commitments-third-anniversa

White House Big Data Report Observations

• Distinction between “big data” and “small data”: Big data is characterized by 3 Vs (Volume, Variety, Velocity)

• Other key observations:
  – De-identification is insufficient to protect privacy in big data analytics
  – Metadata raises significant privacy issues
    • Should not necessarily be treated as less risky than content
  – Focus on assuring responsible uses, vs. trying to control collection; role of notice and consent should be re-examined

White House Big Data Recommendations*

• Current policy frameworks may work well enough for small data, but they do not meet the challenges of big data, including in health:
  – “The complexity of complying with numerous laws when data [is] combined from various sources raises the potential need to carve out special data use authorities for the health care industry if it is to realize the potential health gains and cost reductions that could come from big data analytics.” (p. 23)

*partial list

PCAST Recommendations Regarding Big Data and Privacy

1. Policy attention should focus more on the actual uses of big data and less on its collection and analysis.

2. Policies and regulation, at all levels of government, should not embed particular technological solutions, but rather should be stated in terms of intended outcomes.

3. With coordination and encouragement from Office of Science and Technology Policy (OSTP), Networking and Information Technology Research and Development program (NITRD) agencies should strengthen U.S. research in privacy-related technologies and in the relevant areas of social science that inform the successful application of those technologies.

4. OSTP, together with the appropriate educational institutions and professional societies, should encourage increased education and training opportunities concerning privacy protection, including professional career paths.

5. The US should take the lead both in the international arena and at home by adopting policies that stimulate the use of practical privacy-protecting technologies that exist today. This country can exhibit leadership both by its convening power (for instance, by promoting the creation and adoption of standards) and also by its own procurement practices (such as its own use of privacy-preserving cloud services).

6. [E]nsure both patient privacy and patient benefit from medical research, in a world where medical data are increasingly in electronic form and where there is a growing need for real time or near real time aggregated data to improve healthcare.

PCAST, Big Data and Privacy: A Technological Perspective, May 2014. http://www.whitehouse.gov/sites/default/files/microsites/ostp/PCAST/pcast_big_data_and_privacy_-_may_2014.pdf

Big Data in Healthcare


Changing Healthcare Landscape is Driving Demand for Big Data Analytics

• Escalating costs, shifts in therapeutic and provider reimbursement trends
  – Movement from fee-for-service model to risk-sharing model focused on patient outcomes – “real world evidence”
  – Narrowing of approved therapies on formularies drives need to demonstrate effectiveness
    • The data-driven “closed loop”: illness – symptoms – therapy – outcomes
  – The rise of actual HCP performance ratings and metrics

The big-data revolution in US health care: Accelerating value and innovation http://www.mckinsey.com/insights/health_systems_and_services/the_big-data_revolution_in_us_health_care.

Changing Healthcare Landscape is Driving Demand for Big Data Analytics

• Shifts in clinical landscape
  – Clinicians begin embracing evidence-based medicine
    • Increased demand for “Translational Medicine” as means of more efficiently translating research/discovery into treatment protocols
  – Patient demand for data: self-help trend that started with early internet sites like WebMD, rapidly creating entire “patient-empowered” ecosystem that is data-driven
  – Following other industries: banking, financial services from provider-centric to customer-centric, all data-driven

The big-data revolution in US health care: Accelerating value and innovation http://www.mckinsey.com/insights/health_systems_and_services/the_big-data_revolution_in_us_health_care.

High Volume Data from Varied Sources

• Supply at scale: more data and sources
  – Clinical data (electronic medical records)
  – Claims and cost data
  – Pharmaceutical R&D data
  – Socioeconomic, demographic, behavior data of patients, consumers and HCPs (data brokers)
  – Government data
  – Patient/consumer generated data
    • Observational and sensor-based data

The big-data revolution in US health care: Accelerating value and innovation http://www.mckinsey.com/insights/health_systems_and_services/the_big-data_revolution_in_us_health_care.

Primary Data Pools

• Clinical data (electronic and medical records)
  – Owner: providers
  – Example data sets: electronic medical records, medical images
• Claims and cost data
  – Owners: payors, providers
  – Example data sets: utilization of care, cost estimates
• Pharmaceutical R&D data
  – Owner: pharmaceutical companies, academia
  – Example data sets: clinical trials, high-throughput-screening libraries
• Socioeconomic, demographic, behavior data of patients, consumers and HCPs (data brokers)
  – Owner: consumers and stakeholders outside healthcare (e.g., retail, apparel)
  – Example data sets: patient behaviors and preferences, retail purchase history, exercise data captured in running shoes
• Government data
  – Owner: government stakeholders (e.g., HHS, CDC, NIH)
  – Example data sets: community health data
• Patient/consumer generated data
  – Owner: consumers and stakeholders outside healthcare
  – Example: observational and sensor-based data

The big-data revolution in US health care: Accelerating value and innovation http://www.mckinsey.com/insights/health_systems_and_services/the_big-data_revolution_in_us_health_care.

Topics in Big Data, Privacy, and Health Care

1. Research
2. Personalized medicine
  – Pharmacogenetics
  – Precision and predictive medicine
3. Telehealth
4. Consumer-generated and stored data
5. Other topics?

President’s Council of Advisors on Science & Technology, Big Data and Privacy: A Technological Perspective, May 2014. http://www.whitehouse.gov/sites/default/files/microsites/ostp/PCAST/pcast_big_data_and_privacy_-_may_2014.pdf

Research


Research Needs Relevant to Big Data

One view of research in a Learning Health System: The future of health research and healthcare is in data

• Granular data about all aspects of individuals’ health, genetic make-up, behaviors, families, environment, etc.

• Data that will be collected from sensors and interactions, including EHRs, PHRs, home healthcare devices, smart phones, browsing behavior, social media interactions, embedded sensors, and a variety of other sources and analyzed as a whole system: what Lee Hood, President of the Institute for Systems Biology, describes as a “virtual cloud of billions of data points.”

• This approach will facilitate the movement not only to increasingly “personalized medicine,” but to medicine that is “predictive, preventive, personalized, and participatory.”

Lee Hood Group, Institute for Systems Biology, www.systemsbiology.org/hood-group

Research Needs Relevant to Big Data

• Access to Data
  – Aggregation of data from clinical trials for disease/therapeutic areas
  – Clinical data within and across EHRs
  – Genetic/biomarker, epidemiological and environmental data
  – Patient-level data without having direct patient identifiers – Dates are typically important
• Access to analytics
• Closed loop cycle: research to treatment to research
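The need for patient-level data that lacks direct identifiers but still carries usable dates can be illustrated with a small sketch. The slides do not prescribe any technique; the example below assumes one commonly discussed approach, per-patient date shifting, in which every date for a patient is moved by the same keyed offset so that intervals between events are preserved. The field names (`name`, `mrn`, `phone`), the `secret` key, and the 180-day window are hypothetical, and shifted dates on their own should not be read as satisfying any particular legal de-identification standard.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical field names treated as direct identifiers in this sketch.
DIRECT_IDENTIFIERS = {"name", "mrn", "phone"}


def shift_days(patient_key: str, secret: str, max_shift: int = 180) -> int:
    """Derive a stable per-patient offset in [-max_shift, max_shift] days."""
    digest = hmac.new(secret.encode("utf-8"),
                      patient_key.encode("utf-8"),
                      hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") % (2 * max_shift + 1) - max_shift


def deidentify(record: dict, secret: str) -> dict:
    """Drop direct identifiers and shift every date by the patient's offset."""
    offset = timedelta(days=shift_days(record["mrn"], secret))
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # remove the identifier entirely
        # Dates move together, so the time between events is unchanged.
        cleaned[field] = value + offset if isinstance(value, date) else value
    return cleaned


record = {
    "name": "Jane Doe", "mrn": "12345", "phone": "555-0100",
    "admitted": date(2014, 3, 1), "discharged": date(2014, 3, 8),
    "diagnosis": "I48",
}
clean = deidentify(record, secret="demo-secret")
length_of_stay = clean["discharged"] - clean["admitted"]
```

Because both dates for a patient move by the same offset, `length_of_stay` is still seven days, which is the kind of date-dependent analysis the slide says researchers typically need.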

Personalized Medicine


• Personalized medicine:
  – “providing ‘the right patient with the right drug at the right dose at the right time.’”
  – “the tailoring of medical treatment to the individual characteristics, needs, and preferences of a patient during all stages of care, including prevention, diagnosis, treatment, and follow-up.”

http://www.fda.gov/scienceresearch/specialtopics/personalizedmedicine/default.htm

Pharmacogenetics

• Personalized medicine is often based on pharmacogenetics modeling
  – Pharmacogenetics is the study of genetic differences in metabolic pathways which can affect individual responses to drugs, both in terms of therapeutic effect as well as adverse effects.
  – Machine learning models are used to guide medical treatments based on a patient’s genotype and background.

Pharmacogenetics. http://en.wikipedia.org/wiki/Pharmacogenetics
Matthew Fredrikson, Eric Lantz, Somesh Jha, Simon Lin, David Page, Thomas Ristenpart. Privacy in Pharmacogenetics: An End-to-End Case Study of Personalized Warfarin Dosing. University of Wisconsin, Marshfield Clinic Research Foundation. https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/fredrikson_matthew

Precision and Predictive Medicine

• Precision medicine is the application of panomic analysis and systems biology to analyze the cause of an individual patient's disease at the molecular level and then to utilize targeted treatments (possibly in combination) to address that individual patient's disease process.

• Predictive medicine is a field of medicine that entails predicting the probability of disease and instituting preventive measures in order to either prevent the disease altogether or significantly decrease its impact upon the patient (such as by preventing mortality or limiting morbidity).

Precision Medicine. http://en.wikipedia.org/wiki/Precision_medicine
Predictive Medicine. http://en.wikipedia.org/wiki/Predictive_medicine

Telehealth


Big Data & Telehealth

• Telehealth is “the use of electronic information and telecommunications technologies to support long-distance clinical health care, patient and professional health-related education, public health and health administration.” Examples include:
  – Live interactive video or the use of store and forward transmission of diagnostic images, vital signs and/or video clips along with patient data for later review.
  – Remote patient monitoring to collect and send data to a home health agency or a remote diagnostic testing facility (RDTF) for interpretation.
  – Internet and wireless device usage for consumers to obtain specialized health information, education, and on-line discussion groups to provide peer-to-peer support.

http://www.americantelemed.org/about-telemedicine/what-is-telemedicine#.VC8UFvldXW8

http://www.hrsa.gov/ruralhealth/about/telehealth/

Consumer Generated and Stored Health Data


Consumer-generated Data

• “Wearables can unobtrusively gather and transmit objective, experiential data in real time, 24 hours a day, seven days a week. With this approach, research can evolve from looking at a very small number of data points and burdensome pencil-and-paper patient diaries collected sporadically to analyzing hundreds of readings per second from thousands of patients and attaining a critical mass of data to detect patterns and make new discoveries.”

• Mobile devices might help aging people to detect diseases, such as Alzheimer’s.

• Clear trend to aggregate devices on platforms, e.g., Qualcomm 2net and Apple iOS HealthKit.

Bolluyt, Jess. August 29, 2014. http://wallstcheatsheet.com/technology/what-are-wearable-devices-really-capable-of.html/?a=viewall#ixzz3ESIxMZpX

President’s Council of Advisors on Science & Technology, Big Data and Privacy: A Technological Perspective, May 2014. http://www.whitehouse.gov/sites/default/files/microsites/ostp/PCAST/pcast_big_data_and_privacy_-_may_2014.pdf

Consumer Stored Data

• Health record banks/Personal Health Records
  – Secure repositories with internet-based interfaces that store personal health information.
  – Provide individual accounts that contain copies of medical records and additional information that the consumer may optionally add
    • Administrative functions include authentication, authorization, and certification
  – May be locally, regionally, or nationally based
  – Provided by employers, health insurers and independent consumer-facing entities.

http://www.healthbanking.org/

http://www.healthbanking.org/docs/HRBA%20Architecture%20White%20Paper%20Jan%202013.pdf

Laws and Policies


Laws and Policies That May Apply to Protecting Privacy in Health Care

• Federal laws and regulations
  – Health Insurance Portability and Accountability Act (HIPAA)
    • Applies to covered entities
      – Health care providers, health plans, and health care clearinghouses
    • Privacy Rule
    • Security Rule
  – Health Information Technology for Economic and Clinical Health (HITECH) Act
    • Extends HIPAA to business associates
    • Are pharmaceutical providers considered business associates?
  – The Common Rule, 45 CFR Part 46
  – Genetic Information Non-Discrimination Act (GINA)
  – Federal Trade Commission Act, Section 5
    • Deceptiveness
    • Unfairness
  – Fair Credit Reporting Act (related to inference of health status)
• Various state laws and regulations

McEwen, Julie. Telehealth Privacy Challenges: Reducing the Risk, March 2014. The MITRE Corporation.

Laws and Policies That May Apply to Protecting Privacy in Health Care

[Figure: types of data held by an online health network, a social network, a mobile phone app, and personal health records. Data elements shown include user name and password, age, gender, contact info, IP address, phone ID, geo-location, insurance coverage, provider seen/referred, biometric data, diagnoses, procedures, medications, allergies, immunizations, hospitalizations, lab results, genetic info, health habits, employment, drinking behaviors, religious beliefs, political views, memberships, affiliations, family, networks, activities, and preferences and interests.]

Indiana University, Center for Law, Ethics and Applied Research in Health Information


Policy Questions


Policy Questions (1 of 2)

• Research:
  – Are updates or additional policies needed to address ethical privacy frameworks and research standards?
• Personalized Medicine, Pharmacogenetics, Predictive & Precision Medicine:
  – What policies and technologies exist to protect the privacy of databases?
  – What policies should be considered (including w/r/t transparency, notice/consent) for identifying disease traits, cohort matches, testing recommendations for patients based on data within their EHR? Based on face-to-face interaction with a clinician?

Policy Questions (2 of 2)

• Telehealth/Consumer:
  – What are individuals’ protections against privacy risks pursuant to telehealth, health apps, sensor-based data generation?
  – What policies should exist around use of health and non-health data to infer health status of individuals?
    • Access to and use of data to create inference.
    • Use of inferred health status – treatment, marketing, research
    • Disclosure of inferred health status – to HCP, to third party, if app-based to app manufacturer
      – What if this is a stated condition of use of the app?
• Analytics
  – What policies can be enacted to encourage the widespread implementation of current methods?
  – Recognizing the limitations of current guidance, what are additional solutions for the de-identification of data?
• General Policy Questions
  – The PCAST notes that “the framework of notice and consent is also becoming unworkable as a useful foundation for policy.” What frameworks should be explored within the healthcare environment? For research, treatment, sharing.
  – How can we use big data to improve public health and balance collection, use, and retention needs with privacy and security imperatives?

Back-Up


Risks


Potential Big Data Privacy Risks in Health Care

• There is an opportunity to re-define the conversation around risks with respect to privacy and big data in health care
  – “ . . . the concept of risk needs a broad frame, beyond the typical tangible harms like loss of employment or insurance discrimination and encompassing risks like stereotyping, harms to dignity and harms to trust in the historic confidentiality of the clinician-patient relationship.”

Deven McGraw, Policy Frameworks to Enable Big Health Data

Potential Big Data Privacy Risks in Health Care

• Notice and Consent
  – Patients may not understand the privacy notices that are provided with health care services
  – Patients may feel that they must consent to the privacy policies and practices stated in the privacy notice in order to receive treatment
  – Big data challenges traditional concepts of notice and consent
• Collection and Use Limitation
  – Medical device transmissions may be collected by the technology manufacturer in addition to the health care provider
  – Patients may be unaware that their use of technology may provide other types of sensitive information about them besides medical information
  – Patient information collected by technology may be used in ways that the patient may not have anticipated
  – Big data value is driven by the opportunity for n = all
• Access and Accuracy
  – Patients may not be provided with access to the information about them that is collected by technology
• Security
  – Adequate security mechanisms may not be in place within technology and the environment in which it is used.

McEwen, Julie. Telehealth Privacy Challenges: Reducing the Risk, March 2014. The MITRE Corporation.

Potential Risks of De-Identification in Big Data

• One of the biggest risks around de-identification is that “. . . de-identification does not eliminate risk of re-identification, protections are still needed for the residual re-identification and other privacy risks that remain in the data.”
  – There are no standards beyond what is set forth in HIPAA
  – Non-covered entities are not required to follow HIPAA standards

Deven McGraw, Policy Frameworks to Enable Big Health Data
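The residual re-identification risk described above can be made concrete with a small sketch. It is an illustration under stated assumptions, not a method taken from the slides: it uses the k-anonymity idea of counting how many records share the same combination of quasi-identifiers, and the column names (`zip3`, `birth_year`, `sex`) and sample rows are hypothetical.

```python
from collections import Counter


def equivalence_class_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def smallest_class(rows, quasi_identifiers):
    """Return k, the size of the smallest group; k == 1 means someone is unique."""
    return min(equivalence_class_sizes(rows, quasi_identifiers).values())


def unique_rows(rows, quasi_identifiers):
    """Return the rows whose quasi-identifier combination appears only once."""
    sizes = equivalence_class_sizes(rows, quasi_identifiers)
    return [row for row in rows
            if sizes[tuple(row[q] for q in quasi_identifiers)] == 1]


# Hypothetical records with names and record numbers already removed.
rows = [
    {"zip3": "537", "birth_year": 1950, "sex": "F", "diagnosis": "I48"},
    {"zip3": "537", "birth_year": 1950, "sex": "F", "diagnosis": "E11"},
    {"zip3": "537", "birth_year": 1982, "sex": "M", "diagnosis": "J45"},
    {"zip3": "544", "birth_year": 1950, "sex": "F", "diagnosis": "I10"},
]
QUASI_IDENTIFIERS = ["zip3", "birth_year", "sex"]

k = smallest_class(rows, QUASI_IDENTIFIERS)
at_risk = unique_rows(rows, QUASI_IDENTIFIERS)
```

In this toy data set `k` is 1 and two of the four records are unique on the chosen columns, so anyone holding an outside data source with the same three attributes could potentially link those records back to individuals even though no direct identifier remains.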