-
Populating the i2b2 Database With Heterogeneous EMR Data: A Semantic Network Approach
Sebastian Mate (a), Thomas Bürkle (a), Felix Köpcke (a), Bernhard Breil (b), Bernd Wullich (c), Martin Dugas (b), Hans-Ulrich Prokosch (a, d), Thomas Ganslandt (d)
(a) Chair of Medical Informatics, University of Erlangen-Nuremberg, Erlangen, Germany
(b) Institute of Medical Informatics, University of Münster, Münster, Germany
(c) Department of Urology, Erlangen University Hospital, Erlangen, Germany
(d) Center for Medical Information and Communication, Erlangen University Hospital, Erlangen, Germany
28.09.2011, Oslo
MIE2011 full paper
-
Sebastian Mate // Chair of Medical Informatics // University of Erlangen-Nuremberg [MIE2011]
Introduction: Replacing Double Data Entry With “Single Source” – The DPKK Example
[Figure: DPKK partners A and B each document values such as PSA in their EMR and again in the original DPKK database used by the DPKK researcher. New path: EMR databases (Erlangen, Münster) → ETL (new) → i2b2 system (new) → DPKK researcher.]
-
Introduction: Developing the Data Integration Approach
Problem 1: How to align and merge data from independent sources?
EMR in Münster: AGFA ORBIS
EMR in Erlangen: Siemens Soarian
• Different data structures (different vendors)
• Similar information (full prostate cancer documentation in EMR)
-
Introduction: Developing the Data Integration Approach
Problem 2: How to efficiently access and process the huge piles of data elements?
Erlangen as an example: > 42,000 data elements (including data element versioning)
Data:
• Structured
• Uncoded
-
Methods: The “Ontology Layer”
[Figure: three-layer architecture]
• Ontology / Knowledge Base Layer
• Tools / Data Access Layer: COGNOS, Talend, JDBC, ETL, SQL
• Data / Storage Layer: Oracle, SQL Server, MySQL
-
Methods: Approach Overview
• Express the target dataset with a target ontology.
• Express the source system(s) with one or more source ontologies.
• Create a mapping ontology to connect concepts from the target ontology to concepts from the source ontology.
• Some concepts can be mapped directly, others require transformations and/or data filtering, which can be expressed with intermediate nodes.
[Figure: target dataset described by a target ontology (Gleason3, PSA), source system described by a source ontology (Gleason1, Gleason2, PSA_value), and a mapping ontology connecting the two, including an intermediate ADD node.]
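The mapping idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' OWL model: the class names, the `MAPPING` dictionary, and the assumption that Gleason3 is the ADD of Gleason1 and Gleason2 are hypothetical, and the record values in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceConcept:
    """A concept in the source ontology (a data element of the source system)."""
    name: str


@dataclass(frozen=True)
class Operation:
    """An intermediate mapping node that transforms its operands."""
    op: str          # only "ADD" is handled in this sketch
    operands: tuple  # SourceConcept or nested Operation instances


# Mapping ontology: target concept -> source concept or intermediate node.
# Assumption for illustration: Gleason3 = Gleason1 + Gleason2.
MAPPING = {
    "PSA": SourceConcept("PSA_value"),  # direct mapping
    "Gleason3": Operation(
        "ADD", (SourceConcept("Gleason1"), SourceConcept("Gleason2"))
    ),
}


def resolve(node, record):
    """Compute a node's value from one source record; None if data is missing."""
    if isinstance(node, SourceConcept):
        return record.get(node.name)
    values = [resolve(child, record) for child in node.operands]
    if any(v is None for v in values):
        return None
    if node.op == "ADD":
        return sum(values)
    raise ValueError(f"unsupported operation: {node.op}")


def to_target(record):
    """Translate one source record into the target dataset."""
    return {target: resolve(node, record) for target, node in MAPPING.items()}
```

For example, `to_target({"PSA_value": 4.2, "Gleason1": 3, "Gleason2": 4})` yields a target record with `PSA` copied directly and `Gleason3` computed by the intermediate node.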
-
Methods: Approach Overview
• To allow complex data transformations, nodes can be cascaded into “expression trees”.
• OWL-to-SQL translation: for each mapping node, an SQL statement that performs the operation expressed by the node can be constructed automatically.
[Figure: expression tree spanning the target ontology, mapping ontology, and source ontology. Source concepts A to E feed the mapping nodes ADD, SUBTR, GREATER, and IF, which are processed in the order t=1 to t=4 to compute the target concept.]
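A rough sketch of how an expression tree could be turned into an ordered list of SQL statements: a post-order walk guarantees that a node's operands are computed before the node itself. `TEMPTABLE` and the columns `E`, `A`, `V` appear on the slides; the `SOURCE_DATA` table, the node names, the example tree, and the use of a plain inner join are assumptions made for this illustration, and only ADD and SUBTR are covered.

```python
from dataclasses import dataclass, field

SQL_OPERATORS = {"ADD": "+", "SUBTR": "-"}  # subset; IF and GREATER are omitted


@dataclass
class Node:
    name: str
    op: str = ""  # empty for a leaf (source concept)
    children: list = field(default_factory=list)


def post_order(node):
    """Yield operation nodes children-first, so operands exist before use."""
    for child in node.children:
        yield from post_order(child)
    if node.op:
        yield node


def select_for(node):
    """Subquery reading a node's rows: source data for leaves, temp table otherwise."""
    table = "TEMPTABLE" if node.op else "SOURCE_DATA"
    return f"(SELECT E, A, V FROM {table} WHERE A = '{node.name}')"


def to_sql(node):
    """Build the INSERT statement for one binary mapping node.

    Names are interpolated directly, which is acceptable only because they
    come from a trusted ontology in this sketch, not from user input.
    """
    left, right = node.children
    return (
        f"INSERT INTO TEMPTABLE(E, A, V) "
        f"SELECT OP1.E, '{node.name}', OP1.V {SQL_OPERATORS[node.op]} OP2.V "
        f"FROM {select_for(left)} OP1 "
        f"JOIN {select_for(right)} OP2 ON OP1.E = OP2.E"
    )


def translate(root):
    """Return (t, sql) pairs in processing order, starting at t=1."""
    return list(enumerate((to_sql(n) for n in post_order(root)), start=1))


# Hypothetical tree: result = (A + B) - C
tree = Node("result", "SUBTR", [Node("sum_ab", "ADD", [Node("A"), Node("B")]), Node("C")])
statements = translate(tree)
```

Here `statements` holds the ADD statement at t=1 and the SUBTR statement at t=2; the second statement reads the intermediate `sum_ab` rows back from `TEMPTABLE`.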
-
Methods: OWL to SQL Translation
[Figure: the expression tree (source concepts A to E; mapping nodes ADD, SUBTR, GREATER, IF; processing order t=1 to t=4) shown next to the SQL statement for an addition node. Callouts on the statement: “How to process nodes” and “How to access data”.]
INSERT INTO TEMPTABLE(E, A, V)
SELECT OP1.E, 'Sum…', OP1.V + OP2.V
FROM (SELECT E, A, V FROM …) OP1
FULL OUTER JOIN (SELECT E, A, V FROM …) OP2
ON OP1.E = OP2.E
WHERE OP1.V IS NOT NULL AND …
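To make the statement above concrete, the following sketch runs a statement of the same shape against an in-memory SQLite database. Several things are assumptions rather than content from the slides: the `SOURCE_DATA` table, the operand names `C` and `D`, the sample rows, and the completion of the elided WHERE clause with `OP2.V IS NOT NULL`. An inner join stands in for the slide's FULL OUTER JOIN, because older SQLite versions do not support it and, under the assumed WHERE clause, both joins keep the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SOURCE_DATA (E INTEGER, A TEXT, V REAL);
    CREATE TABLE TEMPTABLE (E INTEGER, A TEXT, V REAL);
""")

# Made-up rows: entity 1 has both operands, entity 2 lacks D.
conn.executemany(
    "INSERT INTO SOURCE_DATA (E, A, V) VALUES (?, ?, ?)",
    [(1, "C", 3), (1, "D", 4), (2, "C", 5)],
)

# ADD node: one result row per entity that has both operands.
conn.execute("""
    INSERT INTO TEMPTABLE (E, A, V)
    SELECT OP1.E, 'Sum', OP1.V + OP2.V
    FROM (SELECT E, A, V FROM SOURCE_DATA WHERE A = 'C') OP1
    JOIN (SELECT E, A, V FROM SOURCE_DATA WHERE A = 'D') OP2
      ON OP1.E = OP2.E
    WHERE OP1.V IS NOT NULL AND OP2.V IS NOT NULL
""")

rows = conn.execute("SELECT E, A, V FROM TEMPTABLE ORDER BY E").fetchall()
```

Only entity 1 ends up in `TEMPTABLE`; entity 2 is dropped because one operand is missing.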
-
Results: Developed Prototypical Tools
[Figure: the prototypical tools OntoEdit, QuickMapp, and OntoExport, together with i2b2, placed alongside the mapping example (target ontology: Gleason3, PSA; mapping ontology: ADD; source ontology: Gleason1, Gleason2, PSA_value).]
-
Results: Ontology Mapping for DPKK
• Described i2b2 as a target system in OWL (incl. DPKK dataset)
• Created source ontologies for both EMRs (Soarian and ORBIS)
• Expressed many types of Oracle database operations inside the ontology (as illustrated with the blue “cloud”)
• Preliminary mapping results for Erlangen and Münster (from a total of 166 concepts):
[Figure: Results in the Münster and Erlangen Mapping]
-
Discussion: Limitations and How to Tackle Them (TODOs)
• Re-implementation of the current proof-of-concept
• Client/server architecture to improve administrative aspects
• Integration with other data integration tools (e.g. Talend Open Studio) to simplify access to multiple databases
• Expression of complex and abstract relationships between data elements, e.g. between data elements at different hierarchy levels
• Allow certain calculations between data elements from multiple forms or different source systems (=> handling of temporal aspects)
• Strong desire to export EMR data to other systems besides i2b2: allow description of arbitrary target systems and database models
-
Discussion: Benefits
• Abstraction of data records and database operations to single ontology nodes
• Simplifies data identification, extraction and transformation from source systems (e.g. EMRs)
• Allows re-use of created “ETL knowledge” (compared to SQL code)
• OWL format enables easy linkage to medical ontologies (SNOMED, NCIt, …)
• Adds the missing “easy-to-use” ETL part to i2b2
• Comprehensive solution covers
  1. Access to heterogeneous data
  2. “Future-proof” (ontology-based) data processing
  3. Interactive translational querying of clinical data (i2b2)
=> Step forward in implementing “secondary use / single source”
-
Thank you for your attention!
Sebastian Mate would like to thank the “Arbeitskreis Software-Qualität und -Fortbildung e. V.” (ASQF) and the commission of computer science of the University of Erlangen, who awarded this work the 500€ ASQF diploma prize. Thank you!
-
Backup slides