icsm bioinformatics infrastructures towards a new frontier · 2018-10-29 · ohdsi observational...

Danila Vella, PhD Laboratory of Informatics and Systems Engineering for Clinical Research, Pavia ICSM bioinformatics infrastructures towards a new frontier: large-scale observational research


Page 1: ICSM bioinformatics infrastructures towards a new frontier · 2018-10-29 · OHDSI Observational Health Data Sciences and Informatics •OHDSI: international network of researchers

Danila Vella, PhD

Laboratory of Informatics and Systems Engineering for Clinical Research, Pavia

ICSM bioinformatics infrastructures towards a new frontier:

large-scale observational research


NETTAB, Genova, 2018 · Danila Vella

Outline

Introduction
  Learning Health System Cycle
  FAIR principles

Methods
  Bioinformatics infrastructures at ICSM:
  – REDCap
  – i2b2
  – OHDSI

Results
  Data collections and data processing

Conclusions


Introduction

Learning Healthcare System
Cyclic use of clinical data to improve clinical practice:
• enable data usability
• provide large-scale databases for observational studies

FAIR principles
Findability: persistent identifiers, indexed data
Accessibility: standard, free and shared protocol to retrieve data; authentication procedure
Interoperability: vocabularies, shared language and codes
Reusability: richly described data, source information

FAIR principles concern:
• data
• metadata
• informatics tools and infrastructures leading to data

Methodological challenges:
• common data standards
• multinational collaboration
• compliance with national regulatory laws
• standard methods and tools to process data


Bioinformatics infrastructures at ICSM

[Diagram: REDCap and i2b2 connected by ETL steps, supporting observational studies and health decision support systems]

Clinical Scientific Institute Maugeri (ICSM):
• IRCCS (Institute for Research and Health Care) hospital network
• 18 centers in Italy
• reference point for rehabilitation medicine in Italy

REDCap (Research Electronic Data Capture): handles the data entry process
ETL: Extract, Transform, Load
i2b2 (Informatics for Integrating Biology and the Bedside): enables the building of a data warehouse
OHDSI (Observational Health Data Sciences and Informatics): standardized model for data sharing


REDCap: Research Electronic Data Capture

REDCap is one of the most popular web-based applications to support data capture for research studies and registries.


record_id  redcap_event_name  redcap_data_access_group  id_cod      data_nascita  sesso  diagnosi_eziologica_1
43136      ingresso_arm_1     pavia                     1289917569  01/01/1956    0      437.1
43164      ingresso_arm_1     pavia                     178440      18/01/1939    1      430
43164      dimissione_arm_1   pavia
43195      ingresso_arm_1     pavia                     439742059   29/05/1946    0      434.11
43225      ingresso_arm_1     pavia                     69642       03/10/1953    1      430
43225      dimissione_arm_1   pavia

REDCap FAIRness

F principle: records carry local identifiers; fields containing public identifiers can be added to obtain a publicly shared identifier scheme
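To make the role of the local identifier concrete, here is a minimal Python sketch that groups the rows of such an export by record_id. The comma-separated layout is an assumption (the slide only shows a flattened table), and events_by_record is a hypothetical helper, not part of REDCap.

```python
import csv
import io
from collections import defaultdict

# Rows copied from the slide; the CSV layout itself is an assumption.
EXPORT = """record_id,redcap_event_name,redcap_data_access_group,id_cod,data_nascita,sesso,diagnosi_eziologica_1
43136,ingresso_arm_1,pavia,1289917569,01/01/1956,0,437.1
43164,ingresso_arm_1,pavia,178440,18/01/1939,1,430
43164,dimissione_arm_1,pavia,,,,
"""

KEY_FIELDS = ("record_id", "redcap_event_name")


def events_by_record(text):
    """Group export rows by the local identifier record_id, then by event."""
    grouped = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        # Keep only the non-key fields that were actually filled in.
        filled = {k: v for k, v in row.items() if v and k not in KEY_FIELDS}
        grouped[row["record_id"]][row["redcap_event_name"]] = filled
    return dict(grouped)


records = events_by_record(EXPORT)
print(sorted(records["43164"]))
```

Because record_id is only unique inside one project, a publicly shared identifier would have to be stored as an additional field, as the F principle above suggests.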


record_id  redcap_event_name  redcap_data_access_group  id_cod      data_nascita  sesso  diagnosi_eziologica_1
43136      ingresso_arm_1     pavia                     1289917569  01/01/1956    0      437.1
43164      ingresso_arm_1     pavia                     178440      18/01/1939    1      430
43164      dimissione_arm_1   pavia
43195      ingresso_arm_1     telese                    439742059   29/05/1946    0      434.11
43225      ingresso_arm_1     telese                    69642       03/10/1953    1      430
43225      dimissione_arm_1   telese

A principle: smart user management system; center-specific and user-specific data access; data usability limited by privacy issues
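The effect of center-specific access can be sketched in a few lines of Python. This is only an illustration of the idea, not REDCap's own access-control code; the Row type and the visible_rows function are hypothetical names, and the rows mirror the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    record_id: str
    event: str
    data_access_group: str


ROWS = [
    Row("43136", "ingresso_arm_1", "pavia"),
    Row("43164", "ingresso_arm_1", "pavia"),
    Row("43164", "dimissione_arm_1", "pavia"),
    Row("43195", "ingresso_arm_1", "telese"),
    Row("43225", "ingresso_arm_1", "telese"),
    Row("43225", "dimissione_arm_1", "telese"),
]


def visible_rows(rows, user_group):
    """Return the rows a user may see; None means no group restriction."""
    if user_group is None:
        return list(rows)
    return [r for r in rows if r.data_access_group == user_group]


# A user assigned to the "telese" group only sees that center's records.
telese_ids = sorted({r.record_id for r in visible_rows(ROWS, "telese")})
print(telese_ids)
```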


R principle: metadata consists of many attributes facilitating data understanding and usability


I principle: the 'Text Box' field can restrict input to terms from over 400 different BioPortal ontologies (including the most used ones, such as HL7, ICD9-CM and LOINC)


i2b2: Informatics for Integrating Biology and the Bedside

Objectives: a software infrastructure designed to
1. integrate data from heterogeneous clinical sources (data warehouse)
2. easily query them

Data structure: the CRC (Clinical Research Chart) schema is a set of defined tables containing patients' clinical data

Ontologies: data are mapped into concepts organized in a tree-like structure
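The tree-like organisation of concepts is what makes hierarchical queries easy: selecting a branch selects every concept below it. The Python sketch below assumes concepts are addressed by backslash-separated paths and that facts link a patient number to a concept code; the paths, the "ICD9:" code prefix and the patient numbers are all made up for illustration.

```python
# Hypothetical concept paths mapped to concept codes.
CONCEPTS = {
    "\\Diagnoses\\Circulatory\\Cerebrovascular\\430\\": "ICD9:430",
    "\\Diagnoses\\Circulatory\\Cerebrovascular\\437.1\\": "ICD9:437.1",
    "\\Diagnoses\\Respiratory\\Asthma\\": "ICD9:493",
}

# Hypothetical facts: (patient number, concept code).
FACTS = [(1, "ICD9:430"), (2, "ICD9:437.1"), (3, "ICD9:493")]


def patients_under(path_prefix):
    """Return patients with at least one fact below the given ontology node."""
    codes = {code for path, code in CONCEPTS.items() if path.startswith(path_prefix)}
    return sorted({patient for patient, code in FACTS if code in codes})


print(patients_under("\\Diagnoses\\Circulatory\\"))
```

Querying a higher node such as "\\Diagnoses\\" returns the patients of all its sub-branches without listing individual codes.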


i2b2 FAIRness

• ontology-oriented structure -> richly described data (R)
• use of known standard ontologies/classifications (ATC, SNOMED, …) (I)
• interface and data structure that facilitate running queries (A)
• rich metadata, data indexed in a searchable source (F)


OHDSI: Observational Health Data Sciences and Informatics

• OHDSI: international network of researchers and observational health databases (since 2014)
• collects data from heterogeneous sources
• OHDSI builds on the OMOP CDM (Observational Medical Outcomes Partnership Common Data Model), a standard for storing data in a common database structure

OHDSI Network:
• >200 researchers in academia, industry and government
• >82 databases from 17 countries
• 1.2 billion patient records


Observational Medical Outcomes Partnership Common Data Model (OMOP CDM)


Patient-centric
• patient information is stored in the PERSON table
• its primary key serves as a foreign key in almost all other tables: Drug_exposure, Condition_Occurrence, …
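The patient-centric layout can be sketched with Python's built-in sqlite3 module. The two tables below keep only a small subset of columns, chosen for illustration; add_condition is a hypothetical helper and the inserted values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        year_of_birth INTEGER
    );
    CREATE TABLE condition_occurrence (
        condition_occurrence_id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        condition_source_value TEXT
    );
    """
)
conn.execute("INSERT INTO person VALUES (1, 1956)")
conn.execute("INSERT INTO condition_occurrence VALUES (10, 1, '437.1')")


def add_condition(connection, occurrence_id, person_id, source_value):
    """Insert a condition; return False if person_id has no PERSON row."""
    try:
        connection.execute(
            "INSERT INTO condition_occurrence VALUES (?, ?, ?)",
            (occurrence_id, person_id, source_value),
        )
        return True
    except sqlite3.IntegrityError:
        return False


ok = add_condition(conn, 11, 1, "430")          # person 1 exists
rejected = add_condition(conn, 12, 999, "430")  # person 999 does not
print(ok, rejected)
```

Every clinical event must point back to an existing patient, which is what lets one person_id tie together rows from all the event tables.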


OMOP CDM FAIRness

I principle: OMOP supplies a single standard term (concept_id) when several vocabularies overlap in describing the same concept
A principle: the OMOP model ensures that the same query can be applied consistently to different databases
F principle: unique identifiers and available community tools support data management
R principle: details about the source database are stored in a dedicated table, 'CDM_SOURCE'

Concept_id: OMOP identifier for the standard vocabulary
Source_concept_id: OMOP identifier for the source vocabularies
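The relation between source_concept_id and concept_id can be illustrated with a small lookup. All numeric identifiers below are placeholders invented for this sketch, not real OMOP concept identifiers, and standardize is a hypothetical helper; the convention of using 0 for an unmapped code is assumed here.

```python
# (vocabulary, code) -> source_concept_id; identifiers are placeholders.
SOURCE_CONCEPTS = {
    ("ICD9CM", "430"): 9000001,
    ("ICD10", "I60"): 9000002,
}

# source_concept_id -> standard concept_id; identifiers are placeholders.
MAPS_TO = {
    9000001: 9100001,
    9000002: 9100001,
}


def standardize(vocabulary_id, code):
    """Resolve a source code to its source and standard concept identifiers."""
    source_concept_id = SOURCE_CONCEPTS.get((vocabulary_id, code), 0)
    concept_id = MAPS_TO.get(source_concept_id, 0)
    return {"source_concept_id": source_concept_id, "concept_id": concept_id}


# Two codes from different vocabularies resolve to one standard concept.
print(standardize("ICD9CM", "430"))
print(standardize("ICD10", "I60"))
```

Keeping both identifiers preserves the original coding while letting queries filter on the shared standard concept.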


Results

REDCap registries    Records  Involved structures
Heart Failure        4569     Pavia, Pavia-Boezio, Montescano, Tradate, Lumezzane, Veruno, Telese, Torino, Milano
Stroke               47       Pavia, Boezio, Telese
Respiratory disease  1289     Tradate
Palliative Care      1700     Pavia


i2b2 data warehouse
Clinical areas: Diabetes, Cardiology, Oncology, Respiratory Disease, Nephrology
Data sources: HIS, BioBank, Registries, Discharge letters
Content: 64318 patients, 158819 visits, 8458062 observations


ETL from REDCap to i2b2: Heart Failure registry
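An ETL step from REDCap-style rows to warehouse facts can be sketched as three small functions. This is a simplified illustration, not the pipeline used at ICSM: the function names, the "ICD9:" code prefix, the choice to keep only admission rows and the sample rows are all assumptions.

```python
def extract(rows):
    """Extract: keep admission rows that carry a diagnosis code."""
    return [
        r
        for r in rows
        if r["redcap_event_name"] == "ingresso_arm_1" and r.get("diagnosi_eziologica_1")
    ]


def transform(row):
    """Transform: turn one row into a fact keyed by patient and concept."""
    return {
        "patient_num": int(row["record_id"]),
        "concept_cd": "ICD9:" + row["diagnosi_eziologica_1"],
    }


def load(facts, warehouse):
    """Load: append the facts to the target store and report how many."""
    warehouse.extend(facts)
    return len(facts)


# Illustrative input rows with only the fields the sketch needs.
rows = [
    {"record_id": "43136", "redcap_event_name": "ingresso_arm_1", "diagnosi_eziologica_1": "437.1"},
    {"record_id": "43164", "redcap_event_name": "dimissione_arm_1", "diagnosi_eziologica_1": ""},
]
warehouse = []
loaded = load([transform(r) for r in extract(rows)], warehouse)
print(loaded, warehouse)
```

Keeping the three stages separate makes it easier to swap the target, for example when the same extracted rows later feed a different data model.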


ETL from i2b2 to OMOP: mock database


Conclusions

• ETL pipelines are a valuable basis for designing specific applications that transfer ICSM data from REDCap registries and i2b2 databases to OHDSI, an international network of researchers and observational health databases
• The infrastructures currently in use, REDCap and i2b2, incorporate many FAIR services
• Some data-mapping problems between different database architectures have already been addressed; others remain open (the ETL from i2b2 to OMOP maps 98.72% of the data)
• This architecture helps ICSM implement the Learning Healthcare System cycle
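A mapping percentage like the one quoted above can be computed as the share of items that received a standard concept. The slides do not say which unit was counted (records, codes or concepts), so the sketch below simply assumes a list of concept identifiers in which 0 marks an unmapped item; the sample values are placeholders and do not reproduce the reported figure.

```python
def mapping_coverage(concept_ids):
    """Return the percentage of entries mapped to a non-zero concept id."""
    if not concept_ids:
        raise ValueError("no entries to evaluate")
    mapped = sum(1 for concept_id in concept_ids if concept_id != 0)
    return 100.0 * mapped / len(concept_ids)


# Placeholder identifiers: three mapped entries and one unmapped (0).
sample = [9100001, 9100001, 0, 9100002]
print(f"{mapping_coverage(sample):.2f}%")  # prints 75.00%
```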


NETTAB 2018

Thanks for your attention!