Joining up clinical care and biomedical research
CLEF / CLEF-Services
Alan Rector et al.
[email protected]@cs.man.ac.uk
www.clinical-escience.org
A Convergence of Need
Post genomic research
more & better clinical information
Clinical Practice, Audit & Governance
National Programme for IT (NPfIT)
Evidence based health care
Clinical trials recruitment & support
Patient Care Cycle
Trials & Research Cycle
Making it Work
Extract Information
Integrate & Aggregate (provenance)
Construct ‘Chronicle’ (inference)
Individual Summaries & Queries
Depersonalise text
Pseudonymise In Hospital
Reidentify By Hospital (if agreed)
Chronicle
Pseudonymised Repository
Ethical oversight committee
Formulate Queries
Knowledge enrichment (workflows+)
Hazard Monitoring
Statistical Disclosure Control
The repository so far: 20,000 patients, 1.7 million EHR nodes, 880,000 data values, 365,000 narratives, ready for Information Extraction
The coded data are not sufficient to meet the needs of research queries:
“…located in the lower or middle lobe of the lung..”
Location of the tumour is not held in structured form: it is only found consistently in the narrative of radiology reports
IE and the Chronicle are needed to enrich the original structured data
Information Extraction
Information Extraction Engine
Chest 8/5/96: …There is a lobulated mass adjacent to suture material in the right lower zone.…
Chest 30/5/96: …Once again numerous nodules are seen in the right lower lobe.…
Chest 6/8/97: …A further new lung nodule is seen in the left upper zone.…
Date: 8/5/96
Sign-1: Name: mass
Locus-1: Name: lower zone
Location-1: Sign: Sign-1, Locus: Locus-1
Date: 30/5/96
Sign-2: Name: nodules
Locus-2: Name: lower lobe
Location-2: Sign: Sign-2, Locus: Locus-2
Date: 6/8/97
Sign-3: Name: lung nodule
Locus-3: Name: upper zone
Location-3: Sign: Sign-3, Locus: Locus-3
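The step from narrative to Sign/Locus templates can be illustrated with a deliberately small sketch. It is not the CLEF Information Extraction Engine: the keyword lists `SIGN_TERMS` and `LOCUS_TERMS`, the `Finding` record and the `extract` function are all assumptions made for this example, and a real engine would rely on full terminologies and linguistic analysis rather than keyword matching.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabularies for this sketch only.
SIGN_TERMS = ["lung nodule", "nodules", "nodule", "mass"]
LOCUS_TERMS = ["lower zone", "upper zone", "lower lobe", "middle lobe", "upper lobe"]


@dataclass
class Finding:
    date: str   # report date exactly as written on the report
    sign: str   # e.g. "mass"
    locus: str  # e.g. "lower zone"


def earliest_term(terms, text):
    """Return the term that starts earliest in text; the longer term wins a tie."""
    best = None
    for term in terms:
        m = re.search(r"\b" + re.escape(term) + r"\b", text, flags=re.IGNORECASE)
        if m is None:
            continue
        rank = (m.start(), -len(term))
        if best is None or rank < best[0]:
            best = (rank, term)
    return best[1] if best else None


def extract(date: str, narrative: str) -> Optional[Finding]:
    """Build one Finding from a report, or None if no sign or locus is found."""
    sign = earliest_term(SIGN_TERMS, narrative)
    locus = earliest_term(LOCUS_TERMS, narrative)
    if sign is None or locus is None:
        return None
    return Finding(date, sign, locus)


# The three report fragments shown on the slide.
REPORTS = [
    ("8/5/96", "There is a lobulated mass adjacent to suture material in the right lower zone."),
    ("30/5/96", "Once again numerous nodules are seen in the right lower lobe."),
    ("6/8/97", "A further new lung nodule is seen in the left upper zone."),
]

findings = [extract(date, text) for date, text in REPORTS]
for finding in findings:
    print(finding)
```

On these three fragments the sketch recovers the same sign and locus names as the slide's templates; laterality ("right", "left") is ignored here, as it is in the templates.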
Information Integration
Chronicler
Date: 30/5/96
Sign-2: Name: nodules
Locus-2: Name: lower lobe
Location-2: Sign: Sign-2, Locus: Locus-2
Date: 8/5/96
Sign-1: Name: mass
Locus-1: Name: lower zone
Location-1: Sign: Sign-1, Locus: Locus-1
Date: 6/8/97
Sign-3: Name: lung nodule
Locus-3: Name: upper zone
Location-3: Sign: Sign-3, Locus: Locus-3
Problem: Name: lung cancer, Location: lower lobe of lung
ICD-O-T code: C34.3
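Two mechanical parts of the integration step can be sketched: putting the extracted events in chronological order, and looking up a topography code for a location. The slides do not describe how the Chronicler infers the Problem itself, so that inference is not attempted here. The function names and the dictionary-based event format are assumptions for this example. `ICD_O_T_LUNG` is a three-entry excerpt: the lower lobe code is the one on the slide, and the upper and middle lobe codes are taken from the ICD-O topography list.

```python
from datetime import date, datetime

# Excerpt of ICD-O topography codes for lung lobes (illustrative, not complete).
ICD_O_T_LUNG = {
    "upper lobe": "C34.1",
    "middle lobe": "C34.2",
    "lower lobe": "C34.3",
}


def parse_report_date(text: str) -> date:
    """Parse day/month/two-digit-year dates such as '8/5/96'."""
    return datetime.strptime(text, "%d/%m/%y").date()


def order_events(events):
    """Return the events sorted by report date, earliest first."""
    return sorted(events, key=lambda event: parse_report_date(event["date"]))


def topography_code(location: str):
    """Return the code of the first lobe name found in location, else None."""
    lowered = location.lower()
    for lobe, code in ICD_O_T_LUNG.items():
        if lobe in lowered:
            return code
    return None


# Events in the order they appear on the slide (not chronological).
events = [
    {"date": "30/5/96", "sign": "nodules", "locus": "lower lobe"},
    {"date": "8/5/96", "sign": "mass", "locus": "lower zone"},
    {"date": "6/8/97", "sign": "lung nodule", "locus": "upper zone"},
]

ordered = order_events(events)
print([event["sign"] for event in ordered])
print(topography_code("lower lobe of lung"))
```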
CLEF WYSIWYM Query Writer – L2
Login Query OMIM Exit
Relevant Subjects
Treatment Profiles
Outcome Measures
Patients with [this type of tumour] at [this site]
Percentage of patients in [this condition] after [this interval of time].
Patients who received [this type of treatment], compared with patients who received [this type of treatment].
AND [another characteristic].
For communities of biomedical E-Scientists: Safe & easy analysis
e-Scientists use query writer
Percentage of patients [alive] after [1 year] and after [2 years] and after [5 years].
Patients who received [radiotherapy] [daily], compared with patients who received [radiotherapy] [every other day] and those who received [no radiotherapy].
AND [BRCA1 (OMIM 113705)].
Patients with [adenocarcinoma] at [this site]
[this site]: bladder, blood, brain, breast, cervix, colon, endometrium, kidney, larynx, lung, lymph node, oesophagus, ovary, pancreas, prostate, rectum, skin, stomach, testis, tongue
Patients with [adenocarcinoma] of [this laterality] of [this part] of [breast]
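The way a query is refined by filling bracketed anchors can be sketched as simple template substitution. This is only an illustration of the editing pattern visible in the slides, not the query writer's implementation; `fill`, `open_anchors`, `choose_site` and the treatment of anchors beginning with "this" or "another" as still open are assumptions for this example. The `SITES` list is the site menu shown on the slide.

```python
import re

SITES = [
    "bladder", "blood", "brain", "breast", "cervix", "colon", "endometrium",
    "kidney", "larynx", "lung", "lymph node", "oesophagus", "ovary", "pancreas",
    "prostate", "rectum", "skin", "stomach", "testis", "tongue",
]

ANCHOR = re.compile(r"\[([^\]]+)\]")


def fill(template: str, choices: dict) -> str:
    """Replace each [anchor] that has a chosen value; leave the others as they are."""
    def substitute(match):
        anchor = match.group(1)
        return "[" + choices.get(anchor, anchor) + "]"
    return ANCHOR.sub(substitute, template)


def open_anchors(text: str) -> list:
    """List the anchors that still hold placeholder wording."""
    return [a for a in ANCHOR.findall(text) if a.startswith(("this ", "another "))]


def choose_site(text: str, site: str) -> str:
    """Fill the [this site] anchor, accepting only sites from the menu."""
    if site not in SITES:
        raise ValueError(f"unknown site: {site}")
    return fill(text, {"this site": site})


query = "Patients with [this type of tumour] at [this site]"
query = fill(query, {"this type of tumour": "adenocarcinoma"})
print(query)                 # one anchor filled, one still open
print(open_anchors(query))
query = choose_site(query, "breast")
print(query)
```

Restricting each anchor to a menu of values means every query the user can build is one the repository understands, which is the point of editing the meaning rather than free text.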
…or graphically
For all patients with adenocarcinoma of the breast, compare the survival at 1, 2 and 5 years for those patients who had daily radiotherapy, those who had radiotherapy on alternate days, and those who had no radiotherapy
Feedback Text
QUERY RESULT
1792 patients diagnosed with adenocarcinoma of the breast were found. 788 had radiotherapy daily, 513 had it on alternate days and 491 had no radiotherapy.
After 5 years, 20% (n=158) of patients who had a daily treatment were alive. After 5 years, 10% (n=49) who had alternate day treatment were alive. After 5 years, 5% (n=27) of the patients who had no treatment were alive.
Result of running query displayed as generated text…
Generated text confirms the nature of the query
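The figures in the result text can be checked, and the sentences regenerated, from the counts alone. The sketch below is an assumption about how such feedback text could be produced, not the project's generator; `survival_sentence` and the `groups` structure are names invented for this example, and the counts are those quoted in the query result.

```python
def survival_sentence(description: str, alive: int, total: int, years: int) -> str:
    """Render one survival statement with a whole-number percentage."""
    percent = round(100 * alive / total)
    return f"After {years} years, {percent}% (n={alive}) of patients who had {description} were alive."


# (description, patients in group, patients alive after 5 years)
groups = [
    ("a daily treatment", 788, 158),
    ("alternate day treatment", 513, 49),
    ("no treatment", 491, 27),
]

total_patients = sum(total for _, total, _ in groups)
print(f"{total_patients} patients diagnosed with adenocarcinoma of the breast were found.")
for description, total, alive in groups:
    print(survival_sentence(description, alive, total, 5))
```

The three group sizes add up to the 1792 patients reported, and the rounded percentages agree with the 20%, 10% and 5% in the result text.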
Submit
monitored for risk of reidentification
SAFE
AND [this genetic marker].
Links out to other bioscience resources e.g. OMIM, PubMed, Gene Ontology
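The slides do not say how a submitted query is monitored for risk of reidentification. One common statistical disclosure control technique is a minimum cell size rule, sketched here purely as an assumption: `check_counts`, the `"RESTRICTED"` label and the threshold of 5 are all hypothetical, and only the `"SAFE"` label comes from the slide.

```python
def check_counts(counts: dict, threshold: int = 5):
    """Flag a result whose non-empty groups are small enough to risk reidentification.

    Returns a (status, risky_groups) pair. The threshold is a hypothetical
    policy value, not one taken from the CLEF project.
    """
    risky = {label: n for label, n in counts.items() if 0 < n < threshold}
    status = "SAFE" if not risky else "RESTRICTED"
    return status, risky


# Group sizes from the example query result.
status, risky = check_counts({"daily": 788, "alternate days": 513, "none": 491})
print(status, risky)

# A narrower query that isolates a handful of patients would be flagged.
print(check_counts({"daily": 3, "alternate days": 12}))
```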
Query construction: specifying the age range
Query construction: selecting a cancer tissue diagnosis
The final query: ready to submit to the repository query service
Visualisation of Chronicle: A timeline of index events
[Figure: patient timeline with a time axis from 1975 to 2000. Diagnosis: Grade III infiltrating ductal carcinoma left breast. TNM labels: T1>N1>M0, T1N3cM0, T1>N3cM1. Stage labels: >Stage IIA, Stage IIIc, Stage IV. Treatment tracks: TAMOXIFEN, ARIMIDEX, RADIO, CHEMO. Site tracks: Nodes, Liver, Spleen, Kidney, Bone. Event markers: S = Staging CT, R = Recurrence. End point: Died.]
CLEF Chronicle
• Inferred “best view” of what is known about a patient
– What was done and why?
– What happened and why?
• Includes detailed provenance - sources & inferences
• Enriched by information on analyses, conclusions, workflows, related searches, etc.
• Workflows
• Designed for scalable retrieval, aggregation, and alignment
– Simulations and real data for testing of alternatives
• Virtual view of EHR ... or ...
• Persistent network using Semantic Web/Grid Technology ... or ...
• Temporal DB
A CLEF Chronicle
[Figure: network of typed, numbered nodes for one patient. Nodes: Patient:1382, Mass:1666, Pain:5735, Radio:1812, Chemo:6502, Ulcer:1945, Cancer:1914, Breast:1492, Clinic:4096, Clinic:1024, Clinic:2010, Biopsy:1066. Edge labels: locus, plans, treats, target, attends, finding, about.]
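A network of this kind can be held as labelled edges between typed nodes. The sketch below is a minimal illustration, not the CLEF data model: the `Chronicle` class is invented for this example, and although the node names and relation labels are taken from the slide, the transcript does not preserve which node each edge connects, so every edge added here is an assumption.

```python
from collections import defaultdict


class Chronicle:
    """A tiny labelled, directed graph: source --relation--> target."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, source: str, relation: str, target: str) -> None:
        self._edges[source].append((relation, target))

    def related(self, source: str, relation: str) -> list:
        """Targets reachable from source over edges with the given label."""
        return [t for r, t in self._edges[source] if r == relation]


chronicle = Chronicle()
# Illustrative edges only; the real connectivity of the figure is not recoverable.
chronicle.add("Patient:1382", "attends", "Clinic:4096")
chronicle.add("Patient:1382", "attends", "Clinic:1024")
chronicle.add("Clinic:4096", "plans", "Biopsy:1066")
chronicle.add("Biopsy:1066", "finding", "Cancer:1914")
chronicle.add("Cancer:1914", "locus", "Breast:1492")
chronicle.add("Radio:1812", "treats", "Cancer:1914")

print(chronicle.related("Patient:1382", "attends"))
print(chronicle.related("Cancer:1914", "locus"))
```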
Schema
[Figure: Chronicle schema. Entity types: PATIENT, CONSULT, INTERVENTION, PROBLEM, PATHOLOGY, LOCUS, INVESTIGATION, DRUG, REGIMEN. Attributes: Other Feature, Clinical Course, Presence / absence, Status, Goal, Evidence for, Sex, Age, Genetic Profile, Diagnostic Status, Size, Time. Relations: recommend, finding, target, treats/indicates, about, compare, has-locus, involves, subpart, causes, indicatedBy, after, indication. Further data types: Genetic, Genomic, Imaging…]
Personal data Privacy: Policy and Legal Level
• in the UK
– Common Law of Confidentiality
– Data Protection Act 1998
– Human Rights Act 1998
– Section 60 of Health & Social Care Act 2001
– BMA Guidance Oct 1999
– GMC Guidance Sept 2000
• at a European Level
– European Community Directive 95/46/EC (1995)
– Council of Europe Recommendation R(97)5 (1997)
Personal data Privacy
• The UK Data Protection Act defines "personal data" as:
– "data which relate to a living individual who can be identified (a) from those data, or (b) from those data and other information which is in the possession of, or is likely to come into the possession of, the data controller"
• This is likely to apply to any clinically useful information about living patients
• Patient consent would be required for CLEF to acquire the data into its repository, and for each new kind of research access to the data
– This is not scalable
Anonymised data Privacy
• If legitimately processed for research or statistical purposes, “can be kept indefinitely and are exempt from the subject access rights if the results of the work are not made available in a form from which data subjects can be identified”
• If CLEF could make sure the data is anonymous, consent would not be required and the data could be used for any reasonable research purpose
– This is the only scalable approach
• But.. no anonymisation can be perfect
Socio-Technical Response: brief summary
1) De-identify the data
– make overt identifying data unusable
– Several layers of pseudonymisation
– No source should be able to generate or obtain a central CLEF ID
– CLEF should never be able to get to source IDs
– Collaborating with e-science/Grid projects, BioBank, NHS,...
2) De-personalise the parts of the record which risk re-identifying the data
– depersonalising text requires specialised tools
3) Still treat the data as having some small potential risk of re-identification
– regulate, restrict and monitor access
– FAME-Permis
– statistical disclosure control
• Cathie Marsh Centre for Census and Survey Research
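The layering in step 1 can be illustrated with keyed hashing, where each party applies its own secret key. This is a sketch of one possible scheme under stated assumptions, not the mechanism CLEF used, which the slides do not specify: the choice of HMAC-SHA256, the function names, the key values and the sample identifier are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, key: bytes) -> str:
    """Derive a stable pseudonym from an identifier with a secret key (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def hospital_layer(source_id: str, hospital_key: bytes) -> str:
    """Layer 1, applied in the hospital before any data leaves it."""
    return pseudonymise(source_id, hospital_key)


def repository_layer(hospital_pseudonym: str, repository_key: bytes) -> str:
    """Layer 2, applied on receipt; the source never holds this key."""
    return pseudonymise(hospital_pseudonym, repository_key)


# Hypothetical keys and identifier, for illustration only.
HOSPITAL_KEY = b"hospital-secret"
REPOSITORY_KEY = b"repository-secret"

outbound = hospital_layer("patient-0001", HOSPITAL_KEY)
central_id = repository_layer(outbound, REPOSITORY_KEY)
print(outbound[:16], central_id[:16])
```

In this arrangement the hospital cannot compute the central ID because it lacks the repository key, and the repository only ever sees the hospital's pseudonym, never the source ID. Reidentification by the hospital, if agreed, would need the hospital to keep its own table from pseudonym back to source ID, since the hash itself cannot be reversed.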
CLEF to CLEF-Services: from development to use
• Trials: Initially
– Institute of Genetic Medicine, London collaboration sponsored by the London Development Agency
• Close links to Clinical Trials Centres - MRC / UCL / CRUK
– Genetic effects on outcome and treatment response
• Start with cancer, but ...
– Aim for re-use in cardiovascular disease, mental health, diabetes...
– Trialists and statisticians
• New forms of data require new forms of analysis
• Technical deployment: build
– An E-Science/Grid environment
• Collaborative and collection based research
• Re-usable components within a Grid Services architecture
• Build on and extend myGrid
• Collaborate with CaGrid, PsyGrid, eDiamond, ...
To link Grid & NHS computing
NHS Information (NPfIT)
Research Information
genetics
biosciences
clinical trials and longitudinal studies
knowledge management
decision support
data mining
health improvement
patient centred medicine
clinical service framework
clinical governance
outcome: effectiveness/ efficiency
evidence
Information for patients & public
To realise the research potential of the NHS
Security
Images, Language, Genomics
Architecture, Web/Grid Services, Terminology, Standards
Software components
Steps towards wide deployment
• Reducing the effort of re-use
• Common services & standards for...
– Language technology
– Repositories
– Chronicles
– Workbench analysis and workflows
• Build on myGrid & other E-Science / Grid efforts
– Knowledge resource management as Grid Services
• Build on CO-ODE/HyOntUse
The key is collaboration
• Post-genomic research & cancer
– NCRI – designed to fill gaps in planning matrix
• NTRAC/NCTR, NCRN, CaGrid, eDiamond
– PsyGrid
– BioBank
• NHS, DoH and Industry
– DTI – Funding for linkage to industry and NHS
• International
– US NCICB, Mouse/Human Anatomy projects
– Cancer Networks & National Programme
• E-Science & Standards
– myGrid, CO-ODE/HyOntUse, ESNW, Semantic Web/Grid,...
– HL7, CEN TC251, ISO TC215
– Semantic Mining NoE in 6th Framework
• Bioinformatics
– OBO (GO, GONG, MGED), EBI, SAEL, ...
CLEF Consortium
• Clinical
– Royal Marsden Hospital Trust
– London Institute for Genetic Medicine
– North and North Central London Cancer Networks
• Technical and E-Science
– University of Manchester (E-Science & knowledge)
• Biohealth Information Forum – link to myGrid
• Cathie Marsh Centre for Census and Survey Research
• E-Science Centre including link to Security projects (FAME-Permis)
– University College London (Electronic Health Records)
• Centre for Health Informatics and Multiprofessional Education
• E-Science Centre including link to Security projects
– University of Sheffield (Language technology)
• Natural Language Group, Department of Computer Science
– Cambridge University (Privacy and policy)
• Judge Institute for Management Studies
– University of Brighton (Language technology and HCI)
• Information Technology Research Institute
• Clinical and Industrial/NHS steering committees
– DTI Funded companion project
www.clinical-escience.org
CLEF/CLEF-SERVICES
Empowering the e-Clinical Research
Making new research possible
Making existing research more effective
by
Removing barriers to data sharing
Joining up health care and biomedical research
Re-usable components for clinical e-science