the effects of applying cell-suppression and perturbation to aggregated genetic data
Posted on 23-Feb-2016
Linked2Safety Project (FP7-ICT-2011-7 – 5.3)
A NEXT-GENERATION, SECURE LINKED DATA MEDICAL INFORMATION SPACE FOR SEMANTICALLY-INTERCONNECTING ELECTRONIC HEALTH RECORDS AND CLINICAL TRIALS SYSTEMS ADVANCING PATIENTS SAFETY IN CLINICAL RESEARCH
12th International Conference on Bioinformatics and Bioengineering, Larnaka
The effects of applying cell-suppression and perturbation to aggregated genetic data
Athos Antoniades, John Keane, Aristos Aristodimou, Christa Philipou, Andreas Constantinou, Christos Georgousopoulos, Federica Tozzi, Kyriacos Kyriacou, Andreas Hadjisavas, Maria Loizidou, Christiana Demitriou and Constantinos Pattichis
FP7, ICT-2011 – 5.3 Page 2
Introduction
• Why Share Data?
• What are the current legal and ethical limitations?
• How have scientists shared medical data so far?
• Key Problems
• Perturbation
• Cell Suppression
The Problem
Why share data:
• Replication Testing
• Statistical Power
• Multiple Testing Problem
Legal and Ethical Issues:
• Anonymization vs Pseudonymization
• Limitations derived from the consent form signed by subjects
• Other regional, study, or subject-specific issues
How have scientists shared medical data
Contingency Table and Data Cube example

          aa    aA    AA
Case      U00   U01   U02
Control   U10   U11   U12
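
As a rough illustration of how such a table is produced, the sketch below counts subjects per (group, genotype) cell, so that only the aggregated counts need to be shared rather than the subject-level rows. The function name `contingency_table` and the sample records are hypothetical, invented for this example; they are not taken from the study.

```python
from collections import Counter

GENOTYPES = ("aa", "aA", "AA")
GROUPS = ("Case", "Control")


def contingency_table(records):
    """Count subjects per (group, genotype) cell.

    `records` is an iterable of (group, genotype) pairs. The result is a
    2x3 list of counts laid out like the Case/Control table above.
    """
    counts = Counter(records)
    return [[counts[(group, gt)] for gt in GENOTYPES] for group in GROUPS]


# Hypothetical subject-level data, invented for illustration only.
subjects = [
    ("Case", "aa"), ("Case", "aA"), ("Case", "aA"), ("Case", "AA"),
    ("Control", "aa"), ("Control", "aa"), ("Control", "aA"),
]

table = contingency_table(subjects)
for group, row in zip(GROUPS, table):
    print(group, row)
```

A data cube extends the same idea to more than two dimensions, with one count per combination of category values.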
16 year old widow Problem
A paper that analyzes data from a specific study reports:

         Marital Status
Age      Married  Widowed  Single
0-16     0        1        50
18-24    10       5        50
25-34    40       7        40
35~      60       15       20
Categorization Differences
Paper 1 that analyzes data from a specific study reports:

         Marital Status
Age      Married  Widowed  Single
0-16     NA       NA       50
18-24    10       7        50
25-34    40       7        40
35~      60       15       20

Paper 2 that analyzes data from the same study reports:

         Marital Status
Age      Married  Widowed  Single
0-16     NA       NA       50
18-25    10       8        50
26-35    45       7        40
36~      55       14       20
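
One way to read this slide is as a differencing problem: because the two papers bin age differently, subtracting overlapping bins isolates single ages. The sketch below works this through for the Widowed column, assuming integer ages and that both tables describe exactly the same subjects with no noise added. The variable names are illustrative.

```python
# Widowed counts as reported by the two papers.
paper1 = {"18-24": 7, "25-34": 7, "35~": 15}
paper2 = {"18-25": 8, "26-35": 7, "36~": 14}

# Subjects aged exactly 25 fall in Paper 2's first bin but not Paper 1's.
widowed_aged_25 = paper2["18-25"] - paper1["18-24"]

# Subjects aged exactly 35 fall in Paper 1's last bin but not Paper 2's.
widowed_aged_35 = paper1["35~"] - paper2["36~"]

# Consistency check on the middle bins:
# (26-35) = (25-34) - (aged 25) + (aged 35)
assert paper2["26-35"] == paper1["25-34"] - widowed_aged_25 + widowed_aged_35

print(widowed_aged_25, widowed_aged_35)  # prints: 1 1
```

Under these assumptions each difference is a cell of size one, so suppressing small cells in each paper separately does not by itself prevent small counts from being recovered.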
Perturbation and Cell Suppression
Original Data

         Marital Status
Age      Married  Widowed  Single
0-16     0        1        50
18-24    10       7        50
25-34    40       7        40
35~      60       15       20

Perturbation (±1) and Cell Suppression (<5)

         Marital Status
Age      Married  Widowed  Single
0-16     NA       NA       51
18-24    9        8        49
25-34    40       7        41
35~      61       14       21
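
The two operations can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the function name `protect`, the uniform integer noise, the clamping at zero, and the choice to decide suppression from the original count before adding noise are all assumptions made for the example.

```python
import random


def protect(table, noise=1, threshold=5, seed=0):
    """Suppress small cells, then perturb the remaining counts.

    Assumptions (not confirmed by the slides): suppression is decided on
    the original count, noise is a uniform integer in [-noise, noise],
    and perturbed counts are clamped at zero.
    """
    rng = random.Random(seed)
    protected = []
    for row in table:
        new_row = []
        for count in row:
            if count < threshold:
                new_row.append(None)  # suppressed cell, shown as NA
            else:
                new_row.append(max(0, count + rng.randint(-noise, noise)))
        protected.append(new_row)
    return protected


# Original data from the slide (rows: 0-16, 18-24, 25-34, 35~).
original = [
    [0, 1, 50],
    [10, 7, 50],
    [40, 7, 40],
    [60, 15, 20],
]

protected = protect(original, noise=1, threshold=5)
for row in protected:
    print(["NA" if c is None else c for c in row])
```

The exact perturbed values depend on the random seed, so they will generally differ from the ones shown on the slide; only the suppressed cells and the ±1 bound are fixed.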
Evaluation
• Most common parameters tested
  Perturbation: [0], [-1,1], [-3,3], [-5,5], [-10,10]
  Cell Suppression: <0, <=1, <=3, <=5, <=10
• Standard main effect test using Chi Square
• Pearson’s Correlation Coefficient used to evaluate the deviation of each parameter combination from the original results.
• A priori defined threshold for Pearson’s correlation coefficient <=0.95.
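
The evaluation described above can be sketched as follows: compute a chi-square statistic per marker on the original and on the protected tables, then correlate the two sets of statistics. This is a simplified reading of the bullets, under the assumption that the correlation is taken over the test statistics; the slides do not say whether statistics or p-values were compared. The helper names and the three example tables are hypothetical.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical case/control genotype tables for three markers.
original_tables = [
    [[30, 50, 20], [40, 45, 15]],
    [[10, 40, 50], [25, 45, 30]],
    [[33, 34, 33], [30, 40, 30]],
]
# The same tables after a hypothetical +-1 perturbation.
perturbed_tables = [
    [[31, 49, 20], [39, 46, 16]],
    [[11, 40, 49], [24, 44, 31]],
    [[32, 35, 33], [31, 39, 30]],
]

original_stats = [chi_square(t) for t in original_tables]
perturbed_stats = [chi_square(t) for t in perturbed_tables]
r = pearson(original_stats, perturbed_stats)

# One reading of the threshold: a setting is flagged when r drops to 0.95 or below.
print(r, "flagged" if r <= 0.95 else "acceptable")
```

Repeating this for every combination of perturbation range and suppression threshold gives one correlation value per parameter setting.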
Evaluating Parameters with a matrix of graphs
Conclusion and Future Work
For this dataset, we were able to identify the maximum noise that can be added to the data without significantly affecting the outcomes.
The results are relevant only to MASTOS; for any other dataset, the analytical approach described here needs to be repeated.
Further investigation is necessary to identify the minimum parameter settings that satisfy legal and ethical requirements.
Who to Contact
Athos Antoniades
University of Cyprus
email: athos@cs.ucy.ac.cy