Presented by: Aleksandar Zivaljevic, Auckland Bioengineering Institute
Annotation of clinical datasets using openEHR Archetypes
A solution for data access issues faced in biomedical projects
Authors: Aleksandar Zivaljevic, Koray Atalag, Jim Warren, Mike Cooling, David Nickerson, Peter Hunter
Prepared for HINZ 2015
Agenda
1. Problem definition
2. Study findings
3. Proposed solution
4. Proposed technique
5. Stage of development
Shape and form of clinical information
Current practice:
• Clinical information is collected and stored in disparate and diverse formats
• Little attention is paid to potential reuse of information
As a result:
• Required information is difficult to find
• Manual mapping by experts is required to make use of information
• Clinical data is not directly machine processable, impacting interoperability and reuse of information
Impact on projects:
• Increased project time
• Increased project cost
• Validity of projects is impacted
List of clinical dataset repositories: https://accelerate.ucsf.edu/research/celdac
Case studies
Project: @neurIST complex information processing tool-chain for the integrated management of cerebral aneurysms [3]
Declared issue: Workload (timeframes / cost)
Resolution: A manual-mapping-based tool called @neuInfo was developed and used as an infrastructure component that facilitates data access.
Project: Computational modelling and evaluation of cardiovascular response under pulsatile impeller pump support [2]
Declared issue: Research validity
Resolution: None, as no clinical data was used in modelling.
Project: euHeart: personalized and integrated cardiac care using patient-specific cardiovascular modelling [1]
Declared issue: Research validity
Resolution: None. The project has not validated the models, as access to clinical data was not available.
Project: Clinically Oriented Translational Cancer Multilevel Modelling: The ContraCancrum Project [4]
Declared issue: Workload (timeframes / cost)
Resolution: Clinicians participating in the project manually prepare clinical data in accordance with the prescribed model and upload that data through a web portal.
Project: Ontology Based Data Management Systems for Post-Genomic Clinical Trials within a European Grid Infrastructure for Cancer Research [5]
Declared issue: Workload (timeframes / cost)
Resolution: A querying engine was created that can return clinical data as if it originated from a single source. Manual mapping was involved in preparing the data.
Project: Comparative Effectiveness Research [6]
Declared issue: Research validity
Resolution: None. However, development of data mapping applications is suggested as a necessary step.
Findings:
• Evidence was found that the disparate forms in which clinical information is stored have negative effects on (bioengineering) research projects
• There is a need for a computer-based system for discovery of clinical information concepts within clinical datasets
Proposed solution:
• We propose that standards-based information models, openEHR Archetypes in particular, be used as metadata of clinical datasets and that they be allocated to datasets through the process of annotation
Standards based information models: openEHR Archetypes
openEHR Archetypes are:
• Standards based information models
• “Maximum datasets” for a given clinical concept
• “Ontologies of information” as opposed to “ontologies of reality”
• Created by domain experts
• Building blocks for larger archetypes (earlier called templates)
• Composite clinical information models, built of Reference Models (RMs)
Current annotation method
Issues:
• Concept attributes are being annotated, not clinical concepts, which are normally composite in nature
• Discovery of data is relatively easy, while discovery of information is not (information is usually contained in a composite concept, not just an attribute)
• Corresponding systems have to be aware of all ontologies used to annotate concepts
Terminology – SNOMED, ICD, LOINC (for illustration purposes only)
Proposed annotation method
Benefits:
• Complete clinical concepts that are contained in the dataset are annotated, not just individual attributes
• Discovery of information becomes possible
Source: OpenEHR foundation 2015 (www.openehr.org/ckm)
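A minimal sketch of concept-level annotation, assuming a hypothetical in-memory model: each dataset carries the identifiers of the archetypes whose clinical concepts it contains, so a search for a concept returns whole datasets instead of individual columns. The `Dataset` class, the `find_datasets` helper, the dataset names and the column names are all invented for illustration; the archetype identifiers only follow the openEHR naming pattern and are not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A clinical dataset annotated with archetype identifiers (hypothetical model)."""
    name: str
    columns: list[str]
    archetypes: set[str] = field(default_factory=set)


def find_datasets(datasets, archetype_id):
    """Return the names of datasets annotated with the given archetype."""
    return [d.name for d in datasets if archetype_id in d.archetypes]


# Illustrative identifiers that follow the openEHR archetype naming pattern.
BP = "openEHR-EHR-OBSERVATION.blood_pressure.v1"
WEIGHT = "openEHR-EHR-OBSERVATION.body_weight.v1"

datasets = [
    Dataset("cardiac_cohort", ["sys", "dia", "wt_kg"], {BP, WEIGHT}),
    Dataset("oncology_trial", ["weight"], {WEIGHT}),
]

# Searching by concept finds the dataset even though no column is named "blood pressure".
matches = find_datasets(datasets, BP)
print(matches)
```

The point of the design is that the query is phrased in terms of the composite concept (blood pressure), not in terms of whatever column names a particular dataset happens to use.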
Framework
Project diagram. Phases of the project are numbered as 1, 2, 3 and 4. Subject to change.
Phases of the current project
Phase 1: Transform openEHR Archetype into ontology of reality
Source: OpenEHR foundation 2015 (www.openehr.org/ckm)
ADLS
Find composite concepts
Find simple concepts
Code all found terms
Enlarge and enrich found terms
Assemble ontology of reality
Phase 1 – detail (subject to change)
Flowchart, from Start to End:
• Start
• Find all CCs in the Archetype
• Find terminology code for the CC
• Code listed in the archetype or found in step 1? Yes: Add CC’s terminology code to ontology. No: Find code in SNOMED
• CC’s code found? Yes: Add CC’s terminology code to ontology. No: Use data/information level methods to find the CC in terminology
• CC found with certainty? Yes: Add CC’s terminology code to ontology. No: Add CC’s name to ontology
• Find all elements forming the CC
• Attribute listed in terminology? Yes: Add element’s terminology code to ontology under the concept. No: Use data/information level methods to find the element in terminology
• Element found with certainty? Yes: Add element’s terminology code to ontology under the concept. No: Add element name to ontology under the concept
• More attributes? Yes: continue with the next attribute. No: go to “More concepts?”
• More concepts? Yes: continue with the next CC. No: End
Legend:
CC – Composite Concept. CC example: Blood Pressure
Element – a concept or an element. Element example: Systolic
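The Phase 1 flow can be sketched in code under some loud assumptions: the archetype is reduced to a plain dictionary, the exact terminology lookup and the "data/information level" matching are passed in as hypothetical callables (`lookup_code`, `fuzzy_lookup`), and the codes used below are placeholders, not real SNOMED codes. The sketch only shows the fallback order: listed code, then exact lookup, then approximate matching accepted only when certain, then the plain name.

```python
def build_ontology(archetype, lookup_code, fuzzy_lookup):
    """Map each composite concept (CC) and its elements to a code, else keep the name.

    archetype    -- dict: CC name -> {"code": str or None, "elements": [names]}
    lookup_code  -- callable(name) -> code or None (exact terminology lookup)
    fuzzy_lookup -- callable(name) -> (code or None, certain: bool)
    """
    def resolve(name, listed_code=None):
        # Prefer a code listed in the archetype, then an exact lookup.
        code = listed_code or lookup_code(name)
        if code is None:
            # Approximate matching; accept the result only when it is certain.
            candidate, certain = fuzzy_lookup(name)
            code = candidate if certain else None
        # Fall back to the plain name when no code was found with certainty.
        return code if code is not None else name

    ontology = {}
    for cc_name, cc in archetype.items():
        cc_key = resolve(cc_name, cc.get("code"))
        ontology[cc_key] = [resolve(element) for element in cc["elements"]]
    return ontology


# Toy inputs: placeholder codes, not real terminology content.
toy_terminology = {"Blood Pressure": "X-BP", "Systolic": "X-SYS"}
archetype = {
    "Blood Pressure": {"code": None, "elements": ["Systolic", "Diastolic", "Cuff size"]},
}


def exact(name):
    return toy_terminology.get(name)


def fuzzy(name):
    # Pretend only "Diastolic" can be matched with certainty.
    return ("X-DIA", True) if name == "Diastolic" else (None, False)


ontology = build_ontology(archetype, exact, fuzzy)
print(ontology)
```

In this toy run the CC and two elements receive codes, while "Cuff size" is kept by name because no match was found with certainty.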
Issue: Where is the list of composite (clinical) concepts and their elements?
Potential solutions:
1) Create a database that will be an index of CCs described by archetypes, RIMs, FHIR resources...
2) Guess from SNOMED – find the first common predecessor of all or the majority of elements
3) Use information level methods, matching between the potential combination of elements in the archetype and the combination of elements in SNOMED
Issue: SNOMED server
Potential solution: Shrimp
Find out what the terminology code is by finding the first common parent for all, or as many as possible, of the sub-concepts/elements. Explore the literature for other options.
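One way to picture the "first common parent" idea is a nearest-common-ancestor search over an is-a hierarchy. The sketch below is an assumption-laden illustration: the hierarchy is a hand-made toy (not real SNOMED content), concepts may have several parents, and "first" is interpreted as the shared ancestor with the smallest total distance to the elements. It covers only the case where all elements must share the ancestor, not the "as many as possible" variant.

```python
from collections import deque


def ancestor_distances(concept, parents):
    """Breadth-first distances from a concept to itself and each of its ancestors."""
    dist = {concept: 0}
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def first_common_parent(elements, parents):
    """Return the shared ancestor closest to all elements, or None if there is none."""
    if not elements:
        return None
    maps = [ancestor_distances(e, parents) for e in elements]
    common = set(maps[0]).intersection(*maps[1:])
    if not common:
        return None
    # Smallest summed distance wins; the name breaks ties deterministically.
    return min(common, key=lambda c: (sum(m[c] for m in maps), c))


# Toy is-a hierarchy (child -> parents); invented for illustration only.
parents = {
    "Systolic blood pressure": ["Blood pressure"],
    "Diastolic blood pressure": ["Blood pressure"],
    "Blood pressure": ["Vital sign"],
    "Heart rate": ["Vital sign"],
    "Vital sign": ["Observable entity"],
}

pair = first_common_parent(
    ["Systolic blood pressure", "Diastolic blood pressure"], parents
)
trio = first_common_parent(
    ["Systolic blood pressure", "Diastolic blood pressure", "Heart rate"], parents
)
print(pair, "|", trio)
```

Adding an unrelated element pushes the result up the hierarchy, which is why a "majority of elements" rule may be preferable when an archetype mixes clinical and contextual items.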
String based techniques:
· edit distances, n-gram similarity, prefixes and suffixes (Cohen, Ravikumar, & Fienberg, 2003)
Language based techniques:
· semantic expansion (Arias et al., 2012, p. 91) and semantic enrichment (Meziane, 2004, p. 217)
Constraints based techniques:
· data type of the element, value ranges that the element's value can belong to, and similar
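Two of the string based techniques named above can be sketched briefly. This is an illustration only: the edit distance is the standard Levenshtein distance, and the n-gram similarity is taken here as the Jaccard overlap of character trigram sets, which is one common choice and not necessarily the measure used in the project. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed with dynamic programming over two rows."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def ngrams(text, n=3):
    """Set of lower-cased character n-grams of a string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of character n-gram sets, between 0.0 and 1.0."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        # Both strings are shorter than n: fall back to exact comparison.
        return 1.0 if a.lower() == b.lower() else 0.0
    return len(ga & gb) / len(ga | gb)


# Compare an archetype element name with a candidate terminology label.
print(edit_distance("systolic", "sistolic"))
print(ngram_similarity("Systolic", "systolic pressure"))
```

A match would typically be accepted only above some threshold; where that threshold sits is a tuning decision the slides do not specify.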
References
[1] Smith N, Vecchi A de, McCormick M, Nordsletten D, Camara O, Frangi AF, Delingette H, Sermesant M, Relan J, Ayache N, Krueger MW, Schulze WHW, Hose R, Valverde I, Beerbaum P, Staicu C, Siebes M, Spaan J, Hunter P, Weese J, Lehmann H, Chapelle D, Razavi R. euHeart: personalized and integrated cardiac care using patient-specific cardiovascular modelling. Interface Focus. 2011;1(3):349–64.
[2] Shi Y, Brown AG, Lawford PV, Arndt A, Nuesser P, Hose DR. Computational modelling and evaluation of cardiovascular response under pulsatile impeller pump support. Interface Focus. 2011;1(3):320–37.
[3] Villa-Uriol MC, Berti G, Hose DR, Marzo A, Chiarini A, Penrose J, Pozo J, Schmidt JG, Singh P, Lycett R, Larrabide I, Frangi AF. @neurIST complex information processing toolchain for the integrated management of cerebral aneurysms. Interface Focus. 2011;1(3):308–19.
[4] Marias K, Sakkalis V, Roniotis A, Farmaki C, Stamatakos G, Dionysiou D, Giatili S, Uzunoglou N, Graf N, Bohle R, Messe E, Coveney PV, Manos S, Wan S, Folarin A, Nagl S, Büchler P, Bardyn T, Reyes M, Clapworthy G, Mcfarlane N, Liu E, Bily T, Balek M, Karasek M, Bednar V, Sabczynski J, Opfer R, Renisch S, Carlsen IC. Clinically Oriented Translational Cancer Multilevel Modeling: The ContraCancrum Project. In: Dössel O, Schlegel WC, editors. World Congress on Medical Physics and Biomedical Engineering, September 7 - 12, 2009, Munich, Germany. Springer Berlin Heidelberg; 2009. p. 2124–7.
[5] Weiler G, Brochhausen M, Graf N, Schera F, Hoppe A, Kiefer S. Ontology Based Data Management Systems for Post-Genomic Clinical Trials within a European Grid Infrastructure for Cancer Research. In: 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS 2007). 2007. p. 6434–7.
[6] Sox HC. Comparative Effectiveness Research: A Progress Report. Ann Intern Med. 2010 Oct 5;153(7):469–72.