Anonymizing Health Data
DESCRIPTION
Slide deck from the O'Reilly webcast on the "Anonymizing Health Data" book
TRANSCRIPT
![Page 1: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/1.jpg)
Anonymizing Health Data Webcast
Case Studies and Methods to Get You Started
Khaled El Emam & Luk Arbuckle
![Page 2: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/2.jpg)
Anonymizing Health Data
Part 1 of Webcast: Intro and Methodology
Part 2 of Webcast: A Look at Our Case Studies
Part 3 of Webcast: Questions and Answers
Khaled El Emam & Luk Arbuckle
![Page 3: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/3.jpg)
Anonymizing Health Data
Part 1 of Webcast: Intro and Methodology
Khaled El Emam & Luk Arbuckle
![Page 4: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/4.jpg)
Anonymizing Health Data
To Anonymize or not to Anonymize
Khaled El Emam & Luk Arbuckle
![Page 5: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/5.jpg)
Anonymizing Health Data
Consent needs to be informed.
To Anonymize or not to Anonymize
Khaled El Emam & Luk Arbuckle
![Page 6: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/6.jpg)
Anonymizing Health Data
Consent needs to be informed.
Not all health care providers are willing to share their patients’ PHI.
To Anonymize or not to Anonymize
Khaled El Emam & Luk Arbuckle
![Page 7: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/7.jpg)
Anonymizing Health Data
Consent needs to be informed.
Not all health care providers are willing to share their patients’ PHI.
Anonymization allows for the sharing of health information.
To Anonymize or not to Anonymize
Khaled El Emam & Luk Arbuckle
![Page 8: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/8.jpg)
Anonymizing Health Data
Consent needs to be informed.
Not all health care providers are willing to share their patients’ PHI.
Anonymization allows for the sharing of health information.
To Anonymize or not to Anonymize
Compelling financial case. Breach cost ~$200 per patient.
Khaled El Emam & Luk Arbuckle
![Page 9: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/9.jpg)
Anonymizing Health Data
Consent needs to be informed.
Not all health care providers are willing to share their patients’ PHI.
Anonymization allows for the sharing of health information.
To Anonymize or not to Anonymize
Compelling financial case. Breach cost ~$200 per patient.
Khaled El Emam & Luk Arbuckle
![Page 10: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/10.jpg)
Anonymizing Health Data
Consent needs to be informed.
Not all health care providers are willing to share their patients’ PHI.
Anonymization allows for the sharing of health information.
To Anonymize or not to Anonymize
Privacy protective behaviors by patients.
Compelling financial case. Breach cost ~$200 per patient.
Khaled El Emam & Luk Arbuckle
![Page 11: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/11.jpg)
Anonymizing Health Data
Masking Standards
Khaled El Emam & Luk Arbuckle
![Page 12: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/12.jpg)
Anonymizing Health Data
Masking Standards
First name, last name, SSN.
Khaled El Emam & Luk Arbuckle
![Page 13: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/13.jpg)
Anonymizing Health Data
Masking Standards
Distortion of data—no analytics.
First name, last name, SSN.
Khaled El Emam & Luk Arbuckle
![Page 14: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/14.jpg)
Anonymizing Health Data
Masking Standards
Creating pseudonyms.
First name, last name, SSN.
Distortion of data—no analytics.
Khaled El Emam & Luk Arbuckle
![Page 15: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/15.jpg)
Anonymizing Health Data
Masking Standards
Removing a whole field.
Creating pseudonyms.
First name, last name, SSN.
Distortion of data—no analytics.
Khaled El Emam & Luk Arbuckle
![Page 16: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/16.jpg)
Anonymizing Health Data
Masking Standards
Removing a whole field.
Creating pseudonyms.
Replacing actual values with random ones.
First name, last name, SSN.
Distortion of data—no analytics.
Khaled El Emam & Luk Arbuckle
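The three masking operations listed above (removing a field, creating pseudonyms, replacing values with random ones) can be sketched as follows. This is a minimal illustration rather than the presenters' tooling: the field names, the placeholder key, and the list of replacement names are all assumptions made for the example.

```python
import hashlib
import hmac
import random

# Hypothetical secret; a real key would be stored separately from the data.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Map a direct identifier to a stable pseudonym using a keyed hash."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


def mask_record(record: dict, rng: random.Random) -> dict:
    """Apply the three masking operations to the direct identifiers."""
    masked = dict(record)
    # Removing a whole field.
    masked.pop("last_name", None)
    # Creating a pseudonym: the same input always yields the same token.
    masked["ssn"] = pseudonymize(record["ssn"])
    # Replacing the actual value with a random one (illustrative name list).
    masked["first_name"] = rng.choice(["Alex", "Sam", "Jordan", "Casey"])
    return masked


# Made-up record; "age" is not a direct identifier and is left untouched.
record = {"first_name": "Jane", "last_name": "Doe", "ssn": "000-00-0000", "age": 40}
masked = mask_record(record, random.Random(0))
print(masked)
```

The masked fields are no longer usable for analysis, which is the trade-off the slides describe as distortion of data.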
![Page 17: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/17.jpg)
Anonymizing Health Data
De-identification Standards
Khaled El Emam & Luk Arbuckle
![Page 18: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/18.jpg)
Anonymizing Health Data
De-identification Standards
Age, sex, race, address, income.
Khaled El Emam & Luk Arbuckle
![Page 19: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/19.jpg)
Anonymizing Health Data
Minimal distortion of data—for analytics.
Age, sex, race, address, income.
De-identification Standards
Khaled El Emam & Luk Arbuckle
![Page 20: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/20.jpg)
Anonymizing Health Data
Minimal distortion of data—for analytics.
Age, sex, race, address, income.
De-identification Standards
Safe Harbor in HIPAA Privacy Rule.
Khaled El Emam & Luk Arbuckle
![Page 21: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/21.jpg)
Anonymizing Health Data
What’s “Actual Knowledge”?
Privacy Rule
Safe Harbor
Khaled El Emam & Luk Arbuckle
![Page 22: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/22.jpg)
Anonymizing Health Data
What’s “Actual Knowledge”?
Info, alone or in combo, that could identify an individual.
Khaled El Emam & Luk Arbuckle
![Page 23: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/23.jpg)
Anonymizing Health Data
What’s “Actual Knowledge”?
Info, alone or in combo, that could identify an individual.
Has to be specific to the data set—not theoretical.
Khaled El Emam & Luk Arbuckle
![Page 24: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/24.jpg)
Anonymizing Health Data
What’s “Actual Knowledge”?
Info, alone or in combo, that could identify an individual.
Has to be specific to the data set—not theoretical.
Occupation: Mayor of Gotham.
Khaled El Emam & Luk Arbuckle
![Page 25: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/25.jpg)
Anonymizing Health Data
Heuristics, or rules of thumb.
Minimal distortion of data—for analytics.
Age, sex, race, address, income.
Safe Harbor in HIPAA Privacy Rule.
De-identification Standards
Khaled El Emam & Luk Arbuckle
![Page 26: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/26.jpg)
Anonymizing Health Data
Heuristics, or rules of thumb.
Statistical method in HIPAA Privacy Rule.
Minimal distortion of data—for analytics.
Age, sex, race, address, income.
Safe Harbor in HIPAA Privacy Rule.
De-identification Standards
Khaled El Emam & Luk Arbuckle
![Page 27: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/27.jpg)
Anonymizing Health Data
De-identification Myths
Khaled El Emam & Luk Arbuckle
![Page 28: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/28.jpg)
Anonymizing Health Data
De-identification Myths
Myth: It’s possible to re-identify most, if not all, data.
Khaled El Emam & Luk Arbuckle
![Page 29: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/29.jpg)
Anonymizing Health Data
De-identification Myths
Myth: It’s possible to re-identify most, if not all, data.
Using robust methods, evidence suggests risk can be very small.
Khaled El Emam & Luk Arbuckle
![Page 30: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/30.jpg)
Anonymizing Health Data
De-identification Myths
Myth: It’s possible to re-identify most, if not all, data.
Myth: Genomic sequences are not identifiable, or are easy to re-identify.
Using robust methods, evidence suggests risk can be very small.
Khaled El Emam & Luk Arbuckle
![Page 31: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/31.jpg)
Anonymizing Health Data
De-identification Myths
Myth: It’s possible to re-identify most, if not all, data.
Myth: Genomic sequences are not identifiable, or are easy to re-identify.
In some cases they can be re-identified, and they are difficult to de-identify using our methods.
Using robust methods, evidence suggests risk can be very small.
Khaled El Emam & Luk Arbuckle
![Page 32: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/32.jpg)
Anonymizing Health Data
A Risk-based De-identification Methodology
Khaled El Emam & Luk Arbuckle
![Page 33: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/33.jpg)
Anonymizing Health Data
A Risk-based De-identification Methodology
The risk of re-identification can be quantified.
Khaled El Emam & Luk Arbuckle
![Page 34: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/34.jpg)
Anonymizing Health Data
A Risk-based De-identification Methodology
The risk of re-identification can be quantified.
The Goldilocks principle: balancing privacy with data utility.
Khaled El Emam & Luk Arbuckle
![Page 35: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/35.jpg)
Anonymizing Health Data
Khaled El Emam & Luk Arbuckle
![Page 36: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/36.jpg)
Anonymizing Health Data
A Risk-based De-identification Methodology
The risk of re-identification can be quantified.
The Goldilocks principle: balancing privacy with data utility.
The re-identification risk needs to be very small.
Khaled El Emam & Luk Arbuckle
![Page 37: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/37.jpg)
Anonymizing Health Data
A Risk-based De-identification Methodology
The risk of re-identification can be quantified.
The Goldilocks principle: balancing privacy with data utility.
De-identification involves a mix of technical, contractual, and other measures.
The re-identification risk needs to be very small.
Khaled El Emam & Luk Arbuckle
![Page 38: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/38.jpg)
Anonymizing Health Data
Steps in the De-identification Methodology
Step 1: Select Direct and Indirect Identifiers
Step 2: Setting the Threshold
Step 3: Examining Plausible Attacks
Step 4: De-identifying the Data
Step 5: Documenting the Process
Khaled El Emam & Luk Arbuckle
![Page 39: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/39.jpg)
Anonymizing Health Data
Step 1: Select Direct and Indirect Identifiers
Khaled El Emam & Luk Arbuckle
![Page 40: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/40.jpg)
Anonymizing Health Data
Direct identifiers: name, telephone number, health insurance card number, medical record number.
Step 1: Select Direct and Indirect Identifiers
Khaled El Emam & Luk Arbuckle
![Page 41: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/41.jpg)
Anonymizing Health Data
Direct identifiers: name, telephone number, health insurance card number, medical record number.
Indirect identifiers, or quasi-identifiers: sex, date of birth, ethnicity, locations, event dates, medical codes.
Step 1: Select Direct and Indirect Identifiers
Khaled El Emam & Luk Arbuckle
![Page 42: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/42.jpg)
Anonymizing Health Data
Step 2: Setting the Threshold
Khaled El Emam & Luk Arbuckle
![Page 43: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/43.jpg)
Anonymizing Health Data
Maximum acceptable risk for sharing data.
Step 2: Setting the Threshold
Khaled El Emam & Luk Arbuckle
![Page 44: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/44.jpg)
Anonymizing Health Data
Maximum acceptable risk for sharing data.
Needs to be quantitative and defensible.
Step 2: Setting the Threshold
Khaled El Emam & Luk Arbuckle
![Page 45: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/45.jpg)
Anonymizing Health Data
Maximum acceptable risk for sharing data.
Needs to be quantitative and defensible.
Is the data going to be in the public domain?
Step 2: Setting the Threshold
Khaled El Emam & Luk Arbuckle
![Page 46: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/46.jpg)
Anonymizing Health Data
Maximum acceptable risk for sharing data.
Needs to be quantitative and defensible.
Is the data going to be in the public domain?
Extent of invasion-of-privacy when data was shared?
Step 2: Setting the Threshold
Khaled El Emam & Luk Arbuckle
![Page 47: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/47.jpg)
Anonymizing Health Data
Step 3: Examining Plausible Attacks
Khaled El Emam & Luk Arbuckle
![Page 48: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/48.jpg)
Anonymizing Health Data
Recipient deliberately attempts to re-identify the data.
Step 3: Examining Plausible Attacks
Khaled El Emam & Luk Arbuckle
![Page 49: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/49.jpg)
Anonymizing Health Data
Recipient deliberately attempts to re-identify the data.
Recipient inadvertently re-identifies the data.
“Holy smokes, I know her!”
Step 3: Examining Plausible Attacks
Khaled El Emam & Luk Arbuckle
![Page 50: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/50.jpg)
Anonymizing Health Data
Recipient deliberately attempts to re-identify the data.
Recipient inadvertently re-identifies the data.
Data breach at recipient’s site, “data gone wild”.
Step 3: Examining Plausible Attacks
Khaled El Emam & Luk Arbuckle
![Page 51: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/51.jpg)
Anonymizing Health Data
Recipient deliberately attempts to re-identify the data.
Recipient inadvertently re-identifies the data.
Data breach at recipient’s site, “data gone wild”.
Adversary launches a demonstration attack on the data.
Step 3: Examining Plausible Attacks
Khaled El Emam & Luk Arbuckle
![Page 52: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/52.jpg)
Anonymizing Health Data
Step 4: De-identifying the Data
Khaled El Emam & Luk Arbuckle
![Page 53: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/53.jpg)
Anonymizing Health Data
Step 4: De-identifying the Data
Generalization: reducing the precision of a field.
Dates converted to month/year, or year.
Khaled El Emam & Luk Arbuckle
![Page 54: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/54.jpg)
Anonymizing Health Data
Step 4: De-identifying the Data
Generalization: reducing the precision of a field.
Suppression: replacing a cell with NULL.
Unique 55-year-old female in birth registry.
Khaled El Emam & Luk Arbuckle
![Page 55: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/55.jpg)
Anonymizing Health Data
Step 4: De-identifying the Data
Generalization: reducing the precision of a field.
Suppression: replacing a cell with NULL.
Sub-sampling: releasing a simple random sample.
50% of data set instead of all data.
Khaled El Emam & Luk Arbuckle
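The three techniques from Step 4 can be sketched on a toy data set. This is a minimal illustration under stated assumptions: the field names (`age`, `delivery`), the records, and the rule "suppress a value that occurs only once" are invented for the example and are not the presenters' algorithm.

```python
import random
from collections import Counter
from datetime import date


def generalize_date(d: date, level: str) -> str:
    """Generalization: reduce a date's precision to month/year or year."""
    if level == "month":
        return d.strftime("%m/%Y")
    if level == "year":
        return d.strftime("%Y")
    raise ValueError(f"unknown level: {level}")


def suppress_unique(records: list, field: str) -> list:
    """Suppression: replace a value with None (NULL) when it occurs only once."""
    counts = Counter(r[field] for r in records)
    return [
        {**r, field: r[field] if counts[r[field]] > 1 else None}
        for r in records
    ]


def subsample(records: list, fraction: float, seed: int = 0) -> list:
    """Sub-sampling: release a simple random sample of the records."""
    rng = random.Random(seed)
    return rng.sample(records, round(len(records) * fraction))


# Made-up records; the single 55-year-old plays the role of the unique case.
records = [
    {"age": 29, "delivery": date(2010, 3, 14)},
    {"age": 29, "delivery": date(2010, 3, 2)},
    {"age": 31, "delivery": date(2010, 7, 9)},
    {"age": 31, "delivery": date(2011, 1, 20)},
    {"age": 55, "delivery": date(2011, 1, 5)},
    {"age": 31, "delivery": date(2011, 4, 30)},
]

generalized = [
    {**r, "delivery": generalize_date(r["delivery"], "month")} for r in records
]
suppressed = suppress_unique(generalized, "age")
released = subsample(suppressed, 0.5)
print(released)
```

In practice the three would be tuned together against the risk threshold rather than applied at fixed settings as here.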
![Page 56: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/56.jpg)
Anonymizing Health Data
Step 5: Documenting the Process
Khaled El Emam & Luk Arbuckle
![Page 57: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/57.jpg)
Anonymizing Health Data
Step 5: Documenting the Process
Process documentation—a methodology text.
Khaled El Emam & Luk Arbuckle
![Page 58: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/58.jpg)
Anonymizing Health Data
Step 5: Documenting the Process
Process documentation: a methodology text.
Results documentation: data set, risk thresholds, assumptions, evidence of low risk.
Khaled El Emam & Luk Arbuckle
![Page 59: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/59.jpg)
Anonymizing Health Data
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
![Page 60: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/60.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Pr(re-id, attempt) = Pr(attempt) × Pr(re-id | attempt)
Khaled El Emam & Luk Arbuckle
![Page 61: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/61.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T2: Inadvertent Attempt (“Holy smokes, I know her!”)
Pr(re-id, acquaintance) = Pr(acquaintance) × Pr(re-id | acquaintance)
![Page 62: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/62.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T2: Inadvertent Attempt (“Holy smokes, I know her!”)
T3: Data Breach (“data gone wild”)
Pr(re-id, breach) = Pr(breach) × Pr(re-id | breach)
![Page 63: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/63.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T2: Inadvertent Attempt (“Holy smokes, I know her!”)
T3: Data Breach (“data gone wild”)
T4: Public Data (demonstration attack)
Pr(re-id), based on data set only
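For T1 to T3 the overall risk is the product of the probability that the event occurs and the probability of re-identification given that it occurs. For T4 the risk comes from the data set alone. A small sketch of that arithmetic follows; the function names and input probabilities are illustrative assumptions, not values prescribed by the methodology.

```python
def joint_risk(pr_event: float, pr_reid_given_event: float) -> float:
    """Pr(re-id, event) = Pr(event) * Pr(re-id | event)."""
    for p in (pr_event, pr_reid_given_event):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return pr_event * pr_reid_given_event


def risk_by_attack(pr_events: dict, pr_reid: float) -> dict:
    """Risk for each non-public attack (T1 to T3), sharing one Pr(re-id | T)."""
    return {name: joint_risk(p, pr_reid) for name, p in pr_events.items()}


# Illustrative inputs, chosen only to exercise the formula.
pr_events = {"T1 deliberate": 0.4, "T2 inadvertent": 0.87, "T3 breach": 0.27}
pr_reid_from_data = 0.1  # T4: for public data this value is the risk itself

risks = risk_by_attack(pr_events, pr_reid_from_data)
worst_attack = max(risks, key=risks.get)
print(worst_attack, round(risks[worst_attack], 3))
```

Taking the attack with the largest event probability as the binding one is a design choice of this sketch. It follows from applying the same conditional probability to every attack.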
![Page 64: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/64.jpg)
Anonymizing Health Data
Choosing Thresholds
Khaled El Emam & Luk Arbuckle
![Page 65: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/65.jpg)
Anonymizing Health Data
Choosing Thresholds
Khaled El Emam & Luk Arbuckle
Many precedents going back multiple decades.
![Page 66: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/66.jpg)
Anonymizing Health Data
Choosing Thresholds
Khaled El Emam & Luk Arbuckle
Many precedents going back multiple decades.
Recommended by regulators.
![Page 67: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/67.jpg)
Anonymizing Health Data
Choosing Thresholds
Khaled El Emam & Luk Arbuckle
Many precedents going back multiple decades.
Recommended by regulators.
All based on max risk though.
![Page 68: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/68.jpg)
Anonymizing Health Data
Choosing Thresholds
Khaled El Emam & Luk Arbuckle
Many precedents going back multiple decades.
Recommended by regulators.
All based on max risk though.
![Page 69: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/69.jpg)
Anonymizing Health Data
Part 2 of Webcast: A Look at Our Case Studies
Khaled El Emam & Luk Arbuckle
![Page 70: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/70.jpg)
Anonymizing Health Data
Cross Sectional Data: Research Registries
Khaled El Emam & Luk Arbuckle
![Page 71: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/71.jpg)
Anonymizing Health Data
Cross Sectional Data: Research Registries
Khaled El Emam & Luk Arbuckle
Better Outcomes Registry & Network (BORN) of Ontario
![Page 72: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/72.jpg)
Anonymizing Health Data
Cross Sectional Data: Research Registries
Khaled El Emam & Luk Arbuckle
Better Outcomes Registry & Network (BORN) of Ontario
140,000 births per year.
![Page 73: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/73.jpg)
Anonymizing Health Data
Cross Sectional Data: Research Registries
Khaled El Emam & Luk Arbuckle
Better Outcomes Registry & Network (BORN) of Ontario
140,000 births per year.
Cross-sectional—mothers not traced over time.
![Page 74: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/74.jpg)
Anonymizing Health Data
Cross Sectional Data: Research Registries
Khaled El Emam & Luk Arbuckle
Better Outcomes Registry & Network (BORN) of Ontario
140,000 births per year.
Cross-sectional—mothers not traced over time.
Process of getting de-identified data from a research registry.
![Page 75: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/75.jpg)
Anonymizing Health Data
Cross Sectional Data: Research Registries
Khaled El Emam & Luk Arbuckle
Better Outcomes Registry & Network (BORN) of Ontario
140,000 births per year.
Cross-sectional—mothers not traced over time.
Process of getting de-identified data from a research registry.
![Page 76: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/76.jpg)
Anonymizing Health Data
Researcher Ronnie wants data!
Khaled El Emam & Luk Arbuckle
![Page 77: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/77.jpg)
Anonymizing Health Data
Researcher Ronnie wants data!
Khaled El Emam & Luk Arbuckle
919,710 records from 2005-2011
![Page 78: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/78.jpg)
Anonymizing Health Data
Researcher Ronnie wants data!
Khaled El Emam & Luk Arbuckle
919,710 records from 2005-2011
![Page 79: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/79.jpg)
Anonymizing Health Data
Choosing Thresholds
Khaled El Emam & Luk Arbuckle
![Page 80: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/80.jpg)
Anonymizing Health Data
Choosing Thresholds
Khaled El Emam & Luk Arbuckle
Average risk of 0.1 for Researcher Ronnie (and the data he specifically requested).
![Page 81: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/81.jpg)
Anonymizing Health Data
Choosing Thresholds
Khaled El Emam & Luk Arbuckle
0.05 if there were highly sensitive variables (congenital anomalies, mental health problems).
Average risk of 0.1 for Researcher Ronnie
![Page 82: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/82.jpg)
Anonymizing Health Data
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
![Page 83: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/83.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
Low motives and capacity
![Page 84: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/84.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
Low motives and capacity; low mitigating controls.
![Page 85: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/85.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
Pr(attempt) = 0.4
![Page 86: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/86.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T2: Inadvertent Attempt (“Holy smokes, I know her!”)
119,785 births out of 4,478,500 women (= 0.027)
![Page 87: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/87.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T2: Inadvertent Attempt (“Holy smokes, I know her!”)
Pr(acquaintance) = 1 − (1 − 0.027)^(150/2) = 0.87
![Page 88: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/88.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T2: Inadvertent Attempt (“Holy smokes, I know her!”)
T3: Data Breach (“data gone wild”)
Based on historical data.
![Page 89: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/89.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T2: Inadvertent Attempt (“Holy smokes, I know her!”)
T3: Data Breach (“data gone wild”)
Pr(breach) = 0.27
![Page 90: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/90.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T2: Inadvertent Attempt (“Holy smokes, I know her!”)
T3: Data Breach (“data gone wild”)
T4: Public Data (demonstration attack)
![Page 91: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/91.jpg)
Anonymizing Health Data
T1: Deliberate Attempt
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T2: Inadvertent Attempt (“Holy smokes, I know her!”)
T3: Data Breach (“data gone wild”)
Overall risk
Pr(re-id, T) = Pr(T) × Pr(re-id | T) ≤ 0.1
![Page 92: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/92.jpg)
Anonymizing Health Data
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T2: Inadvertent Attempt (“Holy smokes, I know her!”)
Pr(acquaintance) = 1 − (1 − 0.027)^(150/2) = 0.87
Overall risk Pr(re-id, acquaintance) = 0.87 × Pr(re-id | acquaintance) ≤ 0.1
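The numbers on this slide can be reproduced in a few lines. Reading 150/2 as "150 acquaintances, about half of them women" is an assumption of this sketch, since the slide gives only the expression. The last step, turning the bound into a minimum group size k via 1/k, is likewise an assumed k-anonymity style reading of the threshold.

```python
import math

births = 119_785       # births, from the slide
women = 4_478_500      # women in the population, from the slide
acquaintances = 150    # assumed meaning of the 150 in the exponent
threshold = 0.1        # overall risk threshold from the slide

# Chance that a given woman in the population gave birth.
prevalence = births / women

# Chance that at least one of roughly 75 female acquaintances is in the registry.
pr_acquaintance = 1 - (1 - prevalence) ** (acquaintances / 2)

# Largest Pr(re-id | acquaintance) that keeps the overall risk within the threshold.
max_conditional_risk = threshold / pr_acquaintance

# Assumption: risk per record in a group of k identical records is 1/k.
min_group_size = math.ceil(1 / max_conditional_risk)

print(round(prevalence, 3), round(pr_acquaintance, 2))
print(round(max_conditional_risk, 3), min_group_size)
```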
![Page 93: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/93.jpg)
Anonymizing Health Data
De-identifying the Data Set
Khaled El Emam & Luk Arbuckle
![Page 94: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/94.jpg)
Anonymizing Health Data
Meeting Thresholds: k-anonymity
Khaled El Emam & Luk Arbuckle
![Page 95: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/95.jpg)
Anonymizing Health Data
Meeting Thresholds: k-anonymity
Khaled El Emam & Luk Arbuckle
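The slides name k-anonymity without spelling it out. A data set is k-anonymous when every combination of quasi-identifier values is shared by at least k records, so each record's re-identification risk is at most 1/k. The check below is a minimal sketch with made-up records and field names, not the presenters' software.

```python
from collections import Counter


def equivalence_class_sizes(records: list, quasi_identifiers: list) -> list:
    """For each record, count the records sharing its quasi-identifier values."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return [counts[key] for key in keys]


def is_k_anonymous(records: list, quasi_identifiers: list, k: int) -> bool:
    """True when every equivalence class holds at least k records."""
    return min(equivalence_class_sizes(records, quasi_identifiers)) >= k


def max_and_average_risk(records: list, quasi_identifiers: list) -> tuple:
    """Per-record risk is 1 / class size; report the maximum and the mean."""
    risks = [1 / s for s in equivalence_class_sizes(records, quasi_identifiers)]
    return max(risks), sum(risks) / len(risks)


# Hypothetical records that have already been generalized.
qis = ["sex", "birth_year", "postal_prefix"]
records = [
    {"sex": "F", "birth_year": 1980, "postal_prefix": "K1"},
    {"sex": "F", "birth_year": 1980, "postal_prefix": "K1"},
    {"sex": "F", "birth_year": 1980, "postal_prefix": "K1"},
    {"sex": "F", "birth_year": 1975, "postal_prefix": "K2"},
    {"sex": "F", "birth_year": 1975, "postal_prefix": "K2"},
]

max_risk, avg_risk = max_and_average_risk(records, qis)
print(is_k_anonymous(records, qis, 2), max_risk, round(avg_risk, 2))
```

Reporting both values matters because a threshold can be stated on either maximum or average risk, and the two generally differ.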
![Page 96: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/96.jpg)
Anonymizing Health Data
De-identifying the Data Set
Khaled El Emam & Luk Arbuckle
Mother's date of birth (MDOB) in 1-year intervals; baby's date of birth (BDOB) in week/year; mother's postal code (MPC) of 1 character.
![Page 97: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/97.jpg)
Anonymizing Health Data
De-identifying the Data Set
Khaled El Emam & Luk Arbuckle
MDOB in 1-yy; BDOB in wk/yy; MPC of 1 char.
MDOB in 10-yy; BDOB in qtr/yy; MPC of 3 chars.
![Page 98: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/98.jpg)
Anonymizing Health Data
De-identifying the Data Set
Khaled El Emam & Luk Arbuckle
MDOB in 1-yy; BDOB in wk/yy; MPC of 1 char.
MDOB in 10-yy; BDOB in qtr/yy; MPC of 3 chars.
MDOB in 10-yy; BDOB in mm/yy; MPC of 3 chars.
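The generalization levels listed above can be sketched as small helper functions. The readings used here are assumptions, since the slides give only the abbreviations: "1-yy" and "10-yy" are taken as year intervals, "wk/yy", "mm/yy", and "qtr/yy" as week, month, and quarter of a two-digit year, and "MPC of n chars" as the first n characters of a postal code. All function names are hypothetical.

```python
from datetime import date


def generalize_birth_date(d: date, level: str) -> str:
    """Reduce a date to week/year, month/year, or quarter/year."""
    yy = f"{d.year % 100:02d}"
    if level == "wk/yy":
        iso = d.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"wk{iso[1]:02d}/{iso[0] % 100:02d}"
    if level == "mm/yy":
        return f"{d.month:02d}/{yy}"
    if level == "qtr/yy":
        return f"Q{(d.month - 1) // 3 + 1}/{yy}"
    raise ValueError(f"unknown level: {level}")


def year_interval(year: int, width: int) -> str:
    """Place a year in an interval of the given width, e.g. 1984 -> 1980-1989."""
    if width == 1:
        return str(year)
    start = year - year % width
    return f"{start}-{start + width - 1}"


def postal_prefix(code: str, n_chars: int) -> str:
    """Keep only the first n characters of a postal code."""
    return code.replace(" ", "")[:n_chars]


# Made-up values for illustration.
print(generalize_birth_date(date(2009, 8, 20), "qtr/yy"))
print(year_interval(1984, 10))
print(postal_prefix("K1A 0B1", 3))
```

Coarser settings on one field can allow finer settings on another, which is one way to read the alternative combinations on these slides.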
![Page 99: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/99.jpg)
Anonymizing Health Data
Year on Year: Re-using Risk Analyses
Khaled El Emam & Luk Arbuckle
![Page 100: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/100.jpg)
Anonymizing Health Data
Year on Year: Re-using Risk Analyses
Khaled El Emam & Luk Arbuckle
In 2006 Researcher Ronnie asks for 2005.
![Page 101: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/101.jpg)
Anonymizing Health Data
Year on Year: Re-using Risk Analyses
Khaled El Emam & Luk Arbuckle
In 2006 Researcher Ronnie asks for 2005 (deleted).
In 2007 Researcher Ronnie asks for 2006.
![Page 102: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/102.jpg)
Anonymizing Health Data
Year on Year: Re-using Risk Analyses
Khaled El Emam & Luk Arbuckle
In 2006 Researcher Ronnie asks for 2005.
In 2007 Researcher Ronnie asks for 2006 (deleted).
In 2008 Researcher Ronnie asks for 2007.
![Page 103: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/103.jpg)
Anonymizing Health Data
Year on Year: Re-using Risk Analyses
Khaled El Emam & Luk Arbuckle
In 2006 Researcher Ronnie asks for 2005.
In 2007 Researcher Ronnie asks for 2006.
In 2008 Researcher Ronnie asks for 2007 (deleted).
In 2009 Researcher Ronnie asks for 2008.
![Page 105: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/105.jpg)
Anonymizing Health Data
Year on Year: Re-using Risk Analyses
Khaled El Emam & Luk Arbuckle
In 2006 Researcher Ronnie asks for 2005.
In 2007 Researcher Ronnie asks for 2006.
In 2008 Researcher Ronnie asks for 2007.
In 2009 Researcher Ronnie asks for 2008 (deleted).
In 2010 Researcher Ronnie asks for 2009.
Can we use the same de-identification scheme every year?
![Page 106: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/106.jpg)
Anonymizing Health Data
Khaled El Emam & Luk Arbuckle
![Page 107: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/107.jpg)
Anonymizing Health Data
Khaled El Emam & Luk Arbuckle
![Page 111: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/111.jpg)
Anonymizing Health Data
Year on Year: Re-using Risk Analyses
Khaled El Emam & Luk Arbuckle
BORN data pertains to very stable populations.
No dramatic changes in the number or characteristics of births from 2005 to 2010.
Revisit de-identification scheme every 18 to 24 months.
Revisit if any new quasi-identifiers are added or changed.
![Page 114: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/114.jpg)
Anonymizing Health Data
Longitudinal Discharge Abstract Data: State Inpatient Databases
Khaled El Emam & Luk Arbuckle
Linking a patient’s records over time.
Longitudinal data needs to be de-identified differently.
![Page 115: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/115.jpg)
Anonymizing Health Data
Meeting Thresholds: k-anonymity?
Khaled El Emam & Luk Arbuckle
k?
![Page 116: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/116.jpg)
Anonymizing Health Data
Meeting Thresholds: k-anonymity?
Khaled El Emam & Luk Arbuckle
![Page 117: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/117.jpg)
Anonymizing Health Data
Meeting Thresholds: k-anonymity?
Khaled El Emam & Luk Arbuckle
![Page 118: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/118.jpg)
Anonymizing Health Data
De-identifying Under Complete Knowledge
Khaled El Emam & Luk Arbuckle
![Page 119: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/119.jpg)
Anonymizing Health Data
De-identifying Under Complete Knowledge
Khaled El Emam & Luk Arbuckle
![Page 120: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/120.jpg)
Anonymizing Health Data
De-identifying Under Complete Knowledge
Khaled El Emam & Luk Arbuckle
![Page 121: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/121.jpg)
Anonymizing Health Data
De-identifying Under Complete Knowledge
Khaled El Emam & Luk Arbuckle
![Page 124: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/124.jpg)
Anonymizing Health Data
State Inpatient Database (SID) of California
Khaled El Emam & Luk Arbuckle
Researcher Ronnie wants public data!
![Page 125: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/125.jpg)
Anonymizing Health Data
State Inpatient Database (SID) of California
Khaled El Emam & Luk Arbuckle
![Page 127: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/127.jpg)
Anonymizing Health Data
Measuring Risk Under Plausible Attacks
Khaled El Emam & Luk Arbuckle
T1: Deliberate Attempt
T2: Inadvertent Attempt (“Holy Smokes, I know her!”)
T3: Data Breach (“data gone wild”)
T4: Public Data (demonstration attack)
Pr(re-id) ≤ 0.09 (maximum risk)
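The 0.09 figure is a ceiling on maximum risk. A short sketch of how that could be checked follows, assuming maximum risk is taken as 1 divided by the size of the smallest equivalence class on the quasi-identifiers. The function names, field names, and sample records are hypothetical.

```python
from collections import Counter

def maximum_risk(records, quasi_identifiers):
    """Return 1 / (size of the smallest equivalence class)."""
    sizes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    if not sizes:
        raise ValueError("no records to assess")
    return 1.0 / min(sizes.values())

def meets_threshold(records, quasi_identifiers, threshold=0.09):
    """True when the maximum risk does not exceed the threshold."""
    return maximum_risk(records, quasi_identifiers) <= threshold

# Illustrative data: two equivalence classes of 12 and 20 records.
qis = ["birth_year_band", "admission_year"]
records = [{"birth_year_band": "1951-1955", "admission_year": 2007}] * 12
records += [{"birth_year_band": "1956-1960", "admission_year": 2007}] * 20

print(round(maximum_risk(records, qis), 4))  # 0.0833
print(meets_threshold(records, qis))         # True
```

A single record that is unique on the quasi-identifiers pushes the maximum risk to 1.0, which is why public releases tend to need heavy generalization.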
![Page 130: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/130.jpg)
Anonymizing Health Data
De-identifying the Data Set
Khaled El Emam & Luk Arbuckle
BirthYear in 5-yy (cut at 1910-); AdmissionYear unchanged; DaysSinceLastService in 28-dd (cut at 7-, 182+); LengthOfStay same as DaysSinceLastService.
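One possible reading of this scheme is sketched below. It assumes "1910-" groups every birth year up to and including 1910, "7-" groups values of 7 days or fewer, "182+" groups values of 182 days or more, and the bands in between start immediately after the lower cut-off. Those band boundaries are an assumption; the slide does not state them.

```python
def generalize_birth_year(year, floor=1910, width=5):
    """5-year bands, with everything at or below the floor grouped together."""
    if year <= floor:
        return f"{floor}-"
    start = floor + 1 + ((year - floor - 1) // width) * width
    return f"{start}-{start + width - 1}"

def generalize_days(days, low=7, high=182, width=28):
    """28-day bands, bottom-coded at `low` and top-coded at `high`."""
    if days <= low:
        return f"{low}-"
    if days >= high:
        return f"{high}+"
    start = low + 1 + ((days - low - 1) // width) * width
    end = min(start + width - 1, high - 1)  # last band stops before the top code
    return f"{start}-{end}"

def generalize_record(record):
    """Apply the scheme to one record; AdmissionYear is left unchanged."""
    return {
        "BirthYear": generalize_birth_year(record["BirthYear"]),
        "AdmissionYear": record["AdmissionYear"],
        "DaysSinceLastService": generalize_days(record["DaysSinceLastService"]),
        "LengthOfStay": generalize_days(record["LengthOfStay"]),
    }

# Illustrative record only.
print(generalize_record({
    "BirthYear": 1962, "AdmissionYear": 2007,
    "DaysSinceLastService": 40, "LengthOfStay": 3,
}))
```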
![Page 135: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/135.jpg)
Anonymizing Health Data
Connected Variables
Khaled El Emam & Luk Arbuckle
QI to QI
Similar QI? Same generalization and suppression.
QI to non-QI
Non-QI is revealing? Same suppression so both are removed.
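The rule for connected variables can be sketched as a suppression step that blanks a field together with everything connected to it. The field names and the connection map below are hypothetical examples, not fields from the case study.

```python
SUPPRESSED = None  # marker used here for a suppressed value

def suppress_connected(record, suppressed_fields, connections):
    """Return a copy of the record with the given fields, and any fields
    connected to them, replaced by the suppression marker."""
    to_blank = set(suppressed_fields)
    for field in suppressed_fields:
        to_blank.update(connections.get(field, ()))
    return {k: (SUPPRESSED if k in to_blank else v) for k, v in record.items()}

# Hypothetical example: a free-text description that reveals the coded procedure.
connections = {"procedure_code": ["procedure_description"]}
record = {
    "procedure_code": "P123",
    "procedure_description": "text that restates the procedure",
    "admission_year": 2007,
}
print(suppress_connected(record, ["procedure_code"], connections))
```

Keeping the connection map as data makes it easy to review which non-QI fields are treated as revealing.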
![Page 139: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/139.jpg)
Anonymizing Health Data
Other Issues Regarding Longitudinal Data
Khaled El Emam & Luk Arbuckle
Date shifting—maintaining order of records.
Long tails—truncation of records.
Adversary power—assumption of knowledge.
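For the date-shifting point, one simple way to keep a patient's records in order is to apply a single random offset to all of that patient's dates. This is a sketch of that idea only, not necessarily the method the authors use; the function name, the 30-day range, and the sample visits are assumptions.

```python
import random
from datetime import date, timedelta

def shift_patient_dates(visits, max_shift_days=30, rng=None):
    """visits maps a patient id to a list of dates. Each patient gets one
    random offset, so the order of and gaps between their records are kept."""
    rng = rng or random.Random()
    shifted = {}
    for patient_id, dates in visits.items():
        offset = timedelta(days=rng.randint(-max_shift_days, max_shift_days))
        shifted[patient_id] = [d + offset for d in dates]
    return shifted

# Illustrative visits only.
visits = {
    "patient-1": [date(2007, 1, 10), date(2007, 2, 7), date(2007, 6, 1)],
    "patient-2": [date(2008, 3, 3), date(2008, 3, 20)],
}
shifted = shift_patient_dates(visits, rng=random.Random(0))
for pid in visits:
    gaps_before = [b - a for a, b in zip(visits[pid], visits[pid][1:])]
    gaps_after = [b - a for a, b in zip(shifted[pid], shifted[pid][1:])]
    print(pid, gaps_before == gaps_after)  # True for each patient
```

Because exact gaps survive a constant offset, whether those gaps are themselves identifying remains a separate risk question.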
![Page 144: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/144.jpg)
Anonymizing Health Data
Other Concerns to Think About
Khaled El Emam & Luk Arbuckle
Free-form text—anonymization.
Geospatial information—aggregation and geoproxy risk.
Medical codes—generalization, suppression, shuffling (yes, as in cards).
Secure linking—linking data through encryption before anonymization.
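As an illustration of the medical-codes point, generalization and suppression can be sketched by truncating hierarchical codes to a shorter prefix and then suppressing generalized codes that remain rare. The webcast does not specify a procedure, so the prefix length, the minimum count, and the sample ICD-10-style codes are all assumptions.

```python
from collections import Counter

def generalize_code(code, length=3):
    """Generalize a hierarchical code by keeping only its leading characters."""
    return code[:length]

def generalize_and_suppress(codes, min_count, length=3):
    """Truncate each code, then suppress (None) any generalized code that
    appears fewer than min_count times in the list."""
    generalized = [generalize_code(c, length) for c in codes]
    counts = Counter(generalized)
    return [g if counts[g] >= min_count else None for g in generalized]

# Illustrative ICD-10-style codes only.
codes = ["E11.9", "E11.65", "E10.9", "I10", "E11.9"]
print(generalize_and_suppress(codes, min_count=2))
# ['E11', 'E11', None, None, 'E11']
```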
![Page 145: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/145.jpg)
Anonymizing Health Data
Part 3 of Webcast: Questions and Answers
Khaled El Emam & Luk Arbuckle
![Page 147: Anonymizing Health Data](https://reader037.vdocuments.us/reader037/viewer/2022103015/54c8d2094a7959a5058b457f/html5/thumbnails/147.jpg)
Anonymizing Health Data
Khaled El Emam & Luk Arbuckle
Khaled El Emam: [email protected]
Luk Arbuckle: [email protected]
More Comments or Questions: Contact us!