Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Deploying SMC in Practice Khaled El Emam Electronic Health Information Laboratory & uOttawa EXAMPLES IN HEALTHCARE SETTINGS

Upload: kelemam

Post on 18-Dec-2014

DESCRIPTION

There is significant pressure to link and share health data for research, public health, and commercial purposes. However, such data sharing must be done responsibly and in a manner that is respectful of patient privacy. Secure multi-party computation (SMC) methods present one way to facilitate many of these analytic purposes. In fact, in some instances SMC is the only known realistic way to allow some of these data disclosures and analyses to happen (without having to change the law to selectively remove privacy protections). This talk will describe two recent real-world projects where SMC was applied to address such data sharing concerns. The first was to measure the prevalence of antimicrobial-resistant organism (e.g., MRSA) infections across all long-term care homes in Ontario. An SMC system was deployed to collect data from close to 600 long-term care homes in the province and establish a colonization and infection rate baseline. The second project pertains to securely linking large databases to allow de-duplication and secure look-up operations without revealing the identity of patients. This system performs approximate matching while maintaining a constant growth in complexity. In both of these cases a number of theoretical and engineering challenges had to be overcome to scale SMC protocols to operate efficiently and to transition them from the laboratory into practice.

TRANSCRIPT

Page 1: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Deploying SMC in Practice

Khaled El Emam

Electronic Health Information Laboratory & uOttawa

EXAMPLES IN HEALTHCARE SETTINGS

Page 2: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Electronic Health Information Laboratory, CHEO Research Institute, 401 Smyth Road, Ottawa K1H 8L1, Ontario; www.ehealthinformation.ca

Researchers’ Need for Data

• Digitization, performance-based funding, greater inter-operability and fiscal pressures make more data available for research

• Linked data allows analyses to span more of the continuum of care and look at social determinants of health

• Severe competition for research funding means there is an urgency to providing access to data to support proposals, funding, and delivery of results

Page 3: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Benefits of Sharing Research Data

• Confirmation of published results

• Availability for meta-analyses

• Feedback to improve data quality

• Cost savings from not collecting the data again

• Minimize need for participants to provide data repeatedly

• Data for instruction and education

Page 4: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Benefits of Sharing Data: Commercial

• Software testing

• Targeted marketing campaigns

• Post marketing surveillance

• Monetization of data

• Information product development

• Internal analytics (models for decision support)

• Device diagnostics

Page 5: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Regulatory Framework

• Legislation and regulations cover personally-identifiable health data

• When not mandated or permitted, use and disclosure of health data for secondary purposes requires either consent or anonymization in accordance with regulations

Page 6: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Secondary Purposes

• Secondary purposes means non-direct care uses of personal health information including:

– Research

– Public health

– Quality/safety measurement

– Payment

– Provider certification or accreditation

– Marketing

– Other commercial activities

Safran C, Bloomrosen M, Hammond E, Labkoff S, Markel-Fox S, Tang P, Detmer D. Toward a national framework for the secondary use of health data: An American Medical Informatics Association white paper. Journal of the American Medical Informatics Association, 2007; 14:1-9.

Page 7: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Types of Data Flows

Page 8: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Data Flows

• Uses by an agent/affiliate for secondary purposes (e.g., financial analysis, human resources planning)

• Mandatory disclosures (e.g., communicable diseases, gunshot wounds)

• Permitted discretionary disclosures for secondary purposes (e.g., public health and research)

• Other disclosures for secondary purposes (e.g., marketing)

Page 9: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Facilitating Disclosure

• Privacy and confidentiality concerns have made many health organizations very reluctant to share data and to take advantage of large scale analytics on the cloud. Three factors contribute:

• Regulations that limit disclosure of personal information

• Legitimate concerns about potential data leaks

• Compelled disclosures

Page 10: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Methods to Facilitate Disclosure

• Two options now exist for disclosing personal information for complex analytics to avoid being the “creepy guy in the room”

– Anonymization

– Secure multi-party computation

Page 11: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

ANONYMIZATION

Page 12: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

De-identification

Page 13: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

PARAT

Providing organizations with a scalable solution to automate the anonymization of structured & unstructured data

• Measure risk of re-identification under different attacks

• Transform data to ensure that the risk is below a given threshold

• Configure re-identification risk threshold settings directly from Privacy Analytics’ online Risk Assessment application

• Determine enterprise policies for data sharing to ensure that administrative controls are in place to manage risk

• Automate data sharing agreements and certifications that confirm risks are “very small” for re-identification

Page 14: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

PARAT Software

Page 15: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Page 16: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

SECURE COMPUTATION

What is secure computation?

Page 17: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Secure Computation

• A set of techniques (protocols) developed to allow computations to be performed on encrypted data: do analytics without knowing or exposing the raw data

• Example computations:

– Public health surveillance: rates, categorical data analysis

– Rare adverse drug event detection using regression models (GLM and GEE) for distributed data

– Secure matching: record lookup without revealing the record details, matching databases without revealing matching keys

Page 18: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

SECURE COMPUTATION

How does secure computation work?

Page 19: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Public Key Encryption

Page 20: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Randomized Public Key Encryption

Page 21: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Randomized Public Key Encryption

If $r_1 \neq r_2$ then $c_1 \neq c_2$, but $\mathrm{Dec}_{sk}(c_1) = \mathrm{Dec}_{sk}(c_2) = m$
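The property on this slide can be tried out with a toy implementation. The sketch below assumes the Paillier cryptosystem as the randomized public-key scheme (the slides do not name one) and uses two small Mersenne primes purely for illustration. The helper names `keygen`, `encrypt`, and `decrypt` are hypothetical, and keys of this size offer no real security. It needs Python 3.8 or later for modular inverses via `pow`.

```python
import math

# Toy Paillier parameters (assumption: picked for readability, NOT secure).
P = 2**31 - 1   # Mersenne prime
Q = 2**61 - 1   # Mersenne prime


def keygen(p=P, q=Q):
    """Return (public key n, private key phi) for toy Paillier with g = n + 1."""
    n = p * q
    phi = (p - 1) * (q - 1)
    assert math.gcd(n, phi) == 1  # required so that phi is invertible mod n
    return n, phi


def encrypt(n, m, r):
    """Encrypt plaintext m under public key n using the random value r."""
    n2 = n * n
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, phi, c):
    """Recover the plaintext from ciphertext c using the private key phi."""
    n2 = n * n
    u = pow(c, phi, n2)                       # equals 1 + (m * phi mod n) * n
    return (u - 1) // n * pow(phi, -1, n) % n


n, phi = keygen()
m = 42
r1, r2 = 2, 3   # two different random values, fixed here for clarity
c1 = encrypt(n, m, r1)
c2 = encrypt(n, m, r2)

print(c1 != c2)                                   # the two ciphertexts differ
print(decrypt(n, phi, c1), decrypt(n, phi, c2))   # both decrypt to m
```

In practice `r` would be drawn fresh from a cryptographic random source for every encryption; it is fixed here only so that the two ciphertexts can be compared side by side.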

Page 22: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Randomized Public Key Encryption

Notation denoting an encrypted plaintext

Page 23: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Additively Homomorphic Encryption

Page 24: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Additively Homomorphic Encryption

Page 25: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Additively Homomorphic Encryption

Page 26: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

SECURE SURVEILLANCE

Page 27: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Page 28: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

ARO Surveillance from Long Term Care Homes in Ontario

Disclosure of Colonization / Infection Rates Not Currently Legally Required From LTCHs in Ontario

Objective:

Compute colonization rates without knowing the values for any single LTCH

Page 29: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Page 30: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

DISEASE SURVEILLANCE

Encrypted counts submitted: [count1], [count2], [count3], [count4]

Data Aggregator: [count1] x [count2] x [count3] x [count4] = [count1 + count2 + count3 + count4]

Key holder: (count1 + count2 + count3 + count4) / 4
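The flow on this slide can be sketched in a few lines. The example assumes Paillier as the additively homomorphic scheme (the slides do not name the cryptosystem that was deployed), toy Mersenne-prime keys with no real security, and made-up counts; every function name is hypothetical. Multiplying the ciphertexts yields an encryption of the sum, so the aggregator never handles a plaintext count and the key holder only ever decrypts the total.

```python
import math
import secrets

# Toy Paillier key material (assumption: illustrative sizes, NOT secure).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                      # public key, shared with every facility
PHI = (P - 1) * (Q - 1)        # private key, held only by the key holder
N2 = N * N


def random_unit():
    """Pick a random value in [1, N) that is coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            return r


def encrypt(count):
    """Run at each facility: encrypt its count under the public key."""
    return (1 + count * N) % N2 * pow(random_unit(), N, N2) % N2


def aggregate(ciphertexts):
    """Run at the data aggregator: multiply ciphertexts, seeing no plaintext."""
    product = 1
    for c in ciphertexts:
        product = product * c % N2
    return product


def decrypt(c):
    """Run at the key holder: decrypt the aggregated ciphertext only."""
    u = pow(c, PHI, N2)
    return (u - 1) // N * pow(PHI, -1, N) % N


counts = [3, 0, 7, 2]                        # hypothetical per-facility counts
encrypted = [encrypt(c) for c in counts]     # [count1] ... [count4]
encrypted_sum = aggregate(encrypted)         # [count1 + ... + count4]
total = decrypt(encrypted_sum)
mean = total / len(counts)
print(total, mean)
```

Separating the aggregator from the key holder is what keeps individual values hidden in this sketch: the party that sees the individual ciphertexts cannot decrypt them, and the party that can decrypt only receives the product.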

Page 31: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

DISEASE SURVEILLANCE

High Response Rate = 82%

Page 32: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

DISEASE SURVEILLANCE

Region        0-60              61-120            121-180           180 +             Facilities participating / total (%)
North         15 / 18           19 / 25           10 / 15           5 / 5             49 / 63 (77.7)
East          23 / 23           34 / 34           25 / 25           13 / 15           95 / 97 (97.9)
Central East  16 / 16           35 / 41           43 / 49           30 / 34           124 / 140 (88.6)
Toronto       3 / 6             5 / 7             10 / 12           12 / 13           27 / 38 (71.1)
Central West  12 / 13           40 / 46           51 / 56           19 / 22           122 / 137 (89.1)
West          23 / 34           42 / 68           23 / 34           7 / 10            95 / 146 (65.1)
Total (%)     89 / 110 (80.9)   175 / 221 (79.2)  162 / 191 (84.8)  86 / 99 (86.7)    512 / 621 (82.4)

Page 33: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

DISEASE SURVEILLANCE (MRSA)

Facility number of beds  Central West  North  East   Toronto  West   Central East  Bed group prevalence
0-60                     3.31          1.57   3.17   --       8.38   0.72          3.87
61-120                   2.73          1.07   2.04   --       7.88   1.8           3.34
121-180                  3.15          0.56   2.54   0.91     7.83   1.08          2.94
180 +                    2.91          --     2.37   2.58     8.63   1.68          2.61
Regional prevalence      3.00          0.79   2.42   1.86     8.04   1.44

Page 34: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

ANONYMOUS LINKING

Page 35: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Anonymous Linking

• Typical use cases:

– The best fields to link databases on are quite sensitive: health insurance number, social security/insurance number, medical record number

– Organizations do not have the authority to exchange data, but need to de-duplicate databases or do lookups

• Anonymous linking allows the linking of records in remote databases without sharing any sensitive or personal information or sharing any secrets

Page 36: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Page 37: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Page 38: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Generation and distribution of keys

Page 39: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Encryption of OHIP# using a public key

Page 40: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Encryption of local OHIP# using the same public key

Page 41: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Perform homomorphic equality test on the two encrypted values

Page 42: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Decrypt the results of the equality tests using the private key

Page 43: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Results of matches can be used to de-duplicate, link, or return a lookup outcome
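The steps on the preceding slides (key generation, encrypting both identifiers under the same public key, a homomorphic equality test, and decryption of the result) can be sketched as follows. The slides do not say how the equality test is built; the construction assumed here is a common one for additively homomorphic schemes: compute an encryption of k * (a - b) for a random k, which decrypts to 0 exactly when a equals b and to a random-looking value otherwise. Paillier, the toy key sizes, the function names, and the identifier values are all assumptions for illustration, and this handles exact matching only.

```python
import math
import secrets

# Step 1: key generation (toy Paillier with Mersenne primes, NOT secure).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                      # public key, distributed to both data holders
PHI = (P - 1) * (Q - 1)        # private key, kept by the key holder
N2 = N * N


def random_unit():
    """Pick a random value in [1, N) that is coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            return r


def encrypt(value):
    """Steps 2 and 3: encrypt an identifier under the shared public key."""
    return (1 + value * N) % N2 * pow(random_unit(), N, N2) % N2


def blinded_difference(c_a, c_b):
    """Step 4: from Enc(a) and Enc(b), compute Enc(k * (a - b)) for random k."""
    diff = c_a * pow(c_b, -1, N2) % N2        # Enc(a - b)
    return pow(diff, random_unit(), N2)       # Enc(k * (a - b))


def decrypt(c):
    """Step 5: key holder decrypts the test result, never an identifier."""
    u = pow(c, PHI, N2)
    return (u - 1) // N * pow(PHI, -1, N) % N


def is_match(c_a, c_b):
    """A decrypted result of 0 means the two hidden identifiers are equal."""
    return decrypt(blinded_difference(c_a, c_b)) == 0


# Made-up identifiers standing in for health insurance numbers.
query = encrypt(1234567890)
local_same = encrypt(1234567890)
local_other = encrypt(9876543210)
print(is_match(query, local_same), is_match(query, local_other))
```

Because encryption is randomized, `query` and `local_same` are different ciphertexts even though they hide the same number, so matching cannot be done by comparing ciphertexts directly; that is why a homomorphic test followed by decryption is needed.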

Page 44: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

ONLINE & OFFLINE PURCHASES

Page 45: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Page 46: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Chlamydia Screening

Objective: compute screening rates and evaluate impact of interventions to improve them

Pull data from EMRs (family doctors) on females aged 14-24 who are eligible for Chlamydia screening, and match it with lab data to determine how many have been screened (match rates)

Matching on OHIP#, name, DoB

No release of personal information in the process

Page 47: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Page 48: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Critical Success Factors / Risks

• Embedding within a healthcare environment

• Large multi-disciplinary teams

• Supporting software after the initial prototype

• Academic evaluation criteria

• Publishing outside the traditional computer science community

• Managing and protecting IP

Page 49: Canadian AI 2014 Conference Keynote - Deploying SMC in Practice

Contact


@kelemam

www.ehealthinformation.ca