Grid Security/Edinburgh
5th & 6th December 2002
Confidentiality, Consent & Access
Peter Singleton - Cambridge Health Informatics
CLEF: CLinical e-Science Framework
• Supported by MRC Funding
• To capture broad medical information
• Render it safe/confidential and useful/accessible for the research community
• Develop exemplar approaches for confidentiality and security
• Bring together new technologies for managing complex data-sets and formats
CLEF Consortium
• University of Manchester
• CHIME/University College London
• University of Brighton
• University of Sheffield
• Cambridge University Health
Started 1st October 2002
Cambridge University Health Team
• Prof. Don Detmer, Dennis Gillings Professor of Healthcare Management, Judge Institute, University of Cambridge
• Peter Singleton, Senior Associate, Judge Institute/Cambridge Health Informatics
Past Related Work
• Detmer
– Chair, Board of Regents, National Library of Medicine, NIH (1989-91)
– Chair, National C’ttee on Vital & Health Statistics, DHHS (1996-98)
– Chair, IOM C’ttee: The Computer-based Patient Record (1991 & 1997)
– Review “Information for Health” UK Undersecretary for State, 2000
– IOM Committee Member, Crossing the Quality Chasm, 2002
• Singleton
– DoH: Gaining Patient Consent to Disclosure (2000)
– DoH: Confidentiality Code of Practice (2002)
– NHS LifeHouse: Confidentiality & Security paper (2002)
– NHS IA – Security & Confidentiality in ERDIP Programme
– N&Mid Hants – Confidentiality Policy
– S. Staffs. HA – Public Awareness Campaign
Diagram labels: RMH; UCLH; Clinical records; Narrative record (letters, reports); Structured data; Pseudonymisation; One-way key encrypt; Original EHR Data
Patient information from the Royal Marsden, UCLH and the cancer network is pseudonymised with one-way key encryption to derive a repository of patient information, with data held in its original form.
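The slides do not say which algorithm implements the one-way key encryption. As a minimal sketch, assuming a keyed hash (HMAC-SHA256) stands in for it, pseudonymisation of a record could look like the following. The key, the field names and the helper functions are all hypothetical and used only for illustration.

```python
import hashlib
import hmac

# Hypothetical secret, assumed to be held only by the pseudonymisation service.
SECRET_KEY = b"replace-with-a-securely-stored-key"

# Illustrative field names for direct identifiers (not taken from the slides).
DIRECT_IDENTIFIERS = {"patient_id", "name", "address"}


def pseudonymise(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a one-way keyed pseudonym (hex digest) for a patient identifier."""
    return hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_record(record: dict, key: bytes = SECRET_KEY) -> dict:
    """Drop direct identifiers and attach the pseudonym in their place."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["pseudonym"] = pseudonymise(record["patient_id"], key)
    return cleaned


record = {"patient_id": "RMH-0001", "name": "Jane Doe", "diagnosis": "C50.9"}
print(pseudonymise_record(record))
```

The same identifier always maps to the same pseudonym under a given key, so records for one patient can still be linked, while the pseudonym cannot be reversed without a lookup table held at the originating site.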
Diagram labels added at this stage: GRID; Derived encoded and generated text; National Service Framework and minimum dataset; Idealised domain ontology; metadata; Text encoding / Information Extraction; template; Generated feedback text
The structure of the EHR will be informed by an idealised ontology of the domain (VUM), which in turn will be derived by examining the original data and the NSF for Cancer, with data transfer via the GRID.
The original data will then be processed using information extraction technology (Sheffield), with the extraction templates also informed by the ontology.
A separate repository, derived from the first, will be constructed, comprising encoded and extracted data and also text generated from the structured data (Brighton).
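To make the idea of template-driven extraction and text generation concrete, here is a deliberately simple sketch. It assumes a template is just a mapping from field names to regular expressions; the actual Sheffield extraction and Brighton generation technology is not described in the slides and would be far more sophisticated. The template fields, patterns and sample letter are invented for illustration.

```python
import re

# Hypothetical extraction template: field name -> pattern with one capture group.
TEMPLATE = {
    "tumour_site": re.compile(r"carcinoma of the (\w+)", re.IGNORECASE),
    "stage": re.compile(r"stage\s+(IV|I{1,3})\b", re.IGNORECASE),
}


def extract(narrative: str, template: dict) -> dict:
    """Fill the template from a narrative; fields with no match are omitted."""
    encoded = {}
    for field, pattern in template.items():
        match = pattern.search(narrative)
        if match:
            encoded[field] = match.group(1)
    return encoded


def generate_text(encoded: dict) -> str:
    """Produce a short summary from the encoded (structured) data."""
    parts = []
    if "tumour_site" in encoded:
        parts.append(f"Tumour site: {encoded['tumour_site']}.")
    if "stage" in encoded:
        parts.append(f"Stage: {encoded['stage']}.")
    return " ".join(parts)


letter = "Histology confirms carcinoma of the breast, stage II."
encoded = extract(letter, TEMPLATE)
print(encoded)
print(generate_text(encoded))
```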
Diagram labels added at this stage: archetypes; Queries; Workbench Services; E-scientist; Clinical Trials; Case-based; NHS management for cancer; Trial recruitment; Outcomes; Protocol Choice; Generated bullet point summaries - for patients?
An e-Science workbench will be constructed, with which researchers can query the derived repository for a variety of research tasks.
Data will be transferred via the GRID, with controlled and audited access.
Reports are generated for the benefit of patients.
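Controlled and audited access can be sketched as a wrapper that checks a permission and writes an audit entry before any query runs. The sketch below is an assumption about how such a wrapper might look, not a description of the CLEF workbench: the user names, task names and in-memory log are hypothetical.

```python
from datetime import datetime, timezone

# In-memory stand-in for a persistent audit trail (assumption).
AUDIT_TRAIL = []

# Hypothetical mapping of users to the research tasks they may run.
PERMISSIONS = {
    "alice": {"trial_recruitment", "outcomes"},
    "bob": {"outcomes"},
}


def audited_query(user: str, task: str, run_query):
    """Record the attempt, then run the query only if the user is permitted."""
    allowed = task in PERMISSIONS.get(user, set())
    AUDIT_TRAIL.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "task": task,
        "allowed": allowed,
    })
    if not allowed:
        raise PermissionError(f"{user} may not run {task} queries")
    return run_query()


result = audited_query("alice", "outcomes", lambda: {"patients": 42})
print(result, AUDIT_TRAIL[-1]["allowed"])
```

Writing the audit entry before the permission check fails means refused attempts are logged as well as successful ones.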
Diagram labels added at this stage: Access Control; Audit Trail; Decode the key; Data managers?
Generated text and coding are fed back to the original sites to help existing data managers.
Diagram labels added at this stage: Encoded entries & Generated text; SAFETY; Verify pseudonymisation; Represent patient wishes on disclosure?; Governance
Security issues will be addressed at key points, including verifying the pseudonymisation.
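One way to picture verification of pseudonymisation is an automated scan for identifiers that survived de-identification. The following is a sketch under stated assumptions: it looks only for ten-digit NHS-number-like strings and for names from a list supplied by the originating site. The function name, record layout and sample data are hypothetical, and a real check would need to cover many more identifier types.

```python
import re

# NHS numbers have ten digits, commonly written in a 3-3-4 grouping.
NHS_NUMBER = re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b")


def find_residual_identifiers(record: dict, known_names: set) -> list:
    """Return (field, reason) pairs for values that still look identifying."""
    findings = []
    for field, value in record.items():
        text = str(value)
        if NHS_NUMBER.search(text):
            findings.append((field, "possible NHS number"))
        for name in sorted(known_names):
            if re.search(rf"\b{re.escape(name)}\b", text, re.IGNORECASE):
                findings.append((field, f"name '{name}'"))
    return findings


leaky = {"pseudonym": "ab12", "narrative": "Mrs Doe (NHS 943 476 5919) attended clinic."}
clean = {"pseudonym": "ab12", "narrative": "The patient attended clinic."}
print(find_residual_identifiers(leaky, {"Doe"}))
print(find_residual_identifiers(clean, {"Doe"}))
```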
De-identification computer
Import into EHR
Clinical coding of narratives
Text generation from codes
CLEF Anonymised Data Repository
Ad hoc queries
Detailed queries
Record extracts
Define classes of EPR data to be excluded, or to be marked as sensitive
Identify patients for CLEF
Strike-out Patient Name in narratives
Extract relevant EPR data
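Striking out the patient name in a narrative can be sketched as a whole-word, case-insensitive substitution of the known name parts. This is only an illustration with a hypothetical function and mask token; it will miss misspellings, nicknames and relatives' names, which is one reason a manual review of narratives is still listed as a later step.

```python
import re


def strike_out_name(narrative: str, name_parts: list, mask: str = "[PATIENT]") -> str:
    """Replace each whole-word occurrence of any name part with the mask."""
    parts = [re.escape(p) for p in name_parts if p]
    if not parts:
        return narrative
    pattern = re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)
    return pattern.sub(mask, narrative)


letter = "Dear Dr Smith, I saw Jane Doe today. Mrs Doe is recovering well."
print(strike_out_name(letter, ["Jane", "Doe"]))
```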
Ethics Committee approved; includes restricted data; may drill down to individuals; only includes narratives if specifically approved
Ethics Committee approved; includes restricted data; may only access aggregate data; may not drill down to individuals
Recognised research or health care organisation; excludes restricted data; may only access aggregate data; may not drill down to individuals
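The three sets of access conditions above can be expressed as a small policy table against which each request is checked. The sketch below assumes hypothetical level names and a hypothetical `check_request` function; only the conditions themselves come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessLevel:
    requirement: str             # who may hold this level
    includes_restricted: bool    # may see data labelled restricted
    may_drill_down: bool         # may see individual records
    narratives_if_approved: bool # narratives possible with specific approval


# Level names are hypothetical; the conditions follow the slide.
LEVELS = {
    "individual": AccessLevel("ethics_committee_approved", True, True, True),
    "aggregate_restricted": AccessLevel("ethics_committee_approved", True, False, False),
    "aggregate_open": AccessLevel("recognised_organisation", False, False, False),
}


def check_request(level_name: str, *, wants_individuals: bool, wants_restricted: bool,
                  wants_narratives: bool, narratives_approved: bool = False) -> bool:
    """Return True only if every part of the request is permitted at this level."""
    level = LEVELS[level_name]
    if wants_individuals and not level.may_drill_down:
        return False
    if wants_restricted and not level.includes_restricted:
        return False
    if wants_narratives and not (level.narratives_if_approved and narratives_approved):
        return False
    return True


print(check_request("aggregate_open", wants_individuals=False,
                    wants_restricted=False, wants_narratives=False))
```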
RMH EPR system
Royal Marsden Hospital
UCL
Brighton, Manchester, Sheffield
Research community
Access control filter
Audit trail
CLEF Anonymised Data Repository
Structured data only
Advice on security and confidentiality issues
Cambridge University Health
Remove any remaining identifiers
Create CLEF 'patient' ID
Manually review all narratives
Encrypt data
Transfer securely to UCL
Decrypt data
Label sensitive data items as restricted
Label narratives as very restricted
Mark-up codes and generated text are not restricted, but original narratives remain very restricted
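The labelling rules in this slide can be sketched as a function that assigns a sensitivity level to each data item, plus a filter that returns only the items a given clearance may see. The item kinds, the set of sensitive field names and the function names are hypothetical; the slides do not specify which data classes count as sensitive.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    UNRESTRICTED = 0
    RESTRICTED = 1
    VERY_RESTRICTED = 2


# Hypothetical examples of fields a site might mark as sensitive.
SENSITIVE_FIELDS = {"hiv_status", "mental_health"}


def label_item(kind: str, field: str = "") -> Sensitivity:
    """Apply the labelling rules: narratives, mark-up/generated text, sensitive fields."""
    if kind == "narrative":
        return Sensitivity.VERY_RESTRICTED
    if kind in {"markup_code", "generated_text"}:
        return Sensitivity.UNRESTRICTED
    if field in SENSITIVE_FIELDS:
        return Sensitivity.RESTRICTED
    return Sensitivity.UNRESTRICTED


def visible_items(items: list, clearance: Sensitivity) -> list:
    """Keep only items whose label does not exceed the caller's clearance."""
    return [i for i in items if label_item(i["kind"], i.get("field", "")) <= clearance]


items = [
    {"kind": "narrative", "value": "Clinic letter text"},
    {"kind": "structured", "field": "hiv_status", "value": "negative"},
    {"kind": "structured", "field": "diagnosis", "value": "C50.9"},
    {"kind": "generated_text", "value": "Tumour site: breast."},
]
print([i["kind"] for i in visible_items(items, Sensitivity.UNRESTRICTED)])
```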