ICIC 2014 Increasing the efficiency of pharmaceutical research through data integration
DESCRIPTION
The pressures of pharmaceutical research and development demand increasing efficiency from scientists. High-quality decisions must be made faster and encompass all available information. At the same time there is a growing desire to better utilize the multi-billion dollar research investment recorded in laboratory notebooks and bioassay databases. Key values for data integration in a data exploration environment include gathering data from disparate E-notebooks and bioassay databases into a single searchable “virtual” system and increased discoverability by accessing data through a system designed for exploration. Key benefits are better chemistry decisions through easier access to broader data and reduced time for preparing patent filings. The ability to interlink in-house and reported assay data with in-house and published chemistry provides a data-rich environment for developing insights and predictive models. We will discuss our experience with integrating information from journals, patents, bio-assay databases, and E-lab notebooks to address these needs.
TRANSCRIPT
INCREASING THE EFFICIENCY OF PHARMACEUTICAL RESEARCH THROUGH DATA INTEGRATION
Dr. Roland Bauer
Project Manager Content Integration & Development, Elsevier Information Systems GmbH, Frankfurt
Matthew Clark, Ph.D.
Consultant, Life Science Services, Elsevier Inc., Philadelphia, PA
ICIC 2014, Heidelberg, 12-15 October 2014
ABOUT ME:
- Babeș-Bolyai University, Cluj-Napoca, Romania
- Max-Planck-Institute for Polymer Research, Mainz, Germany
- Elsevier
AGENDA
- Introduction & Setting the Stage
- Why?
- Content Integration: The Reaxys Case
- Integration Process: Project Overview
INTRODUCTION & SETTING THE STAGE: THE DRUG DISCOVERY INFORMATION LANDSCAPE
INTRODUCTION & SETTING THE STAGE: TENDENCIES IN THE DRUG DISCOVERY INFORMATION LANDSCAPE
WHY?
ISSUE: CHEMICAL INFORMATION ACCESS IS FRAGMENTED
• End users must learn many interfaces
• Different data sources have different capabilities for searching
• Scientists may not search all appropriate data sources
[Diagram labels: Licensed Database (two), Catalog (two), E-Notebook (two), References/Full Text]
INTEGRATION OF DATA PROVIDES BETTER ANSWERS
• Searching multiple sources with one search via a single interface increases efficiency
• Harmonized indexing allows asking similar questions across all sources
[Diagram labels: Easier access, Enhanced usage, Better value for investment, Better decisions, Faster progress]
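As a rough illustration of what harmonized indexing buys, the sketch below renames source-specific fields to one shared schema so that a single query can run over every source. All source names, field names, and records here are invented for the example; nothing is taken from the actual Reaxys index.

```python
from typing import Dict, List

# Hypothetical per-source field names mapped onto one harmonized schema.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "eln_1": {"rxn_yield": "yield_percent", "prod_name": "product"},
    "licensed_db": {"Yield(%)": "yield_percent", "Product": "product"},
}


def harmonize(source: str, record: dict) -> dict:
    """Rename source-specific fields to the shared schema and tag the origin."""
    mapping = FIELD_MAPS[source]
    out = {mapping[key]: value for key, value in record.items() if key in mapping}
    out["source"] = source
    return out


def search(index: List[dict], min_yield: float) -> List[dict]:
    """One query that is asked identically of every harmonized source."""
    return [r for r in index if r.get("yield_percent", 0) >= min_yield]


# Made-up records from two sources with different native field names.
index = [
    harmonize("eln_1", {"rxn_yield": 82.0, "prod_name": "aspirin"}),
    harmonize("licensed_db", {"Yield(%)": 45.0, "Product": "aspirin"}),
    harmonize("licensed_db", {"Yield(%)": 91.0, "Product": "ibuprofen"}),
]
hits = search(index, min_yield=80.0)
```

Without the mapping step, the same question would need one query per source, each written against that source's own field names.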
CONTENT INTEGRATION : THE REAXYS CASE
THE REAXYS DATABASE: CONTAINS INTEGRATED PUBLISHED CHEMISTRY DATA
THE REAXYS DATABASE: …ALONG WITH EXPANDED BIBLIOGRAPHICAL INFORMATION
THE REAXYS TREE : BROWSE CONTENT BY ONTOLOGY
TWO APPROACHES TOWARDS INTEGRATED CONTENT
[Diagram: FEDERATED MODEL and WAREHOUSE MODEL side by side; labels: Analysis system, End-User, Central Storage]
TWO APPROACHES TOWARDS INTEGRATED CONTENT
FEDERATED MODEL
Pros:
- Easy scalability in case of new data sources
- Delivery of short-term "wins"
- Maintenance costs
Cons:
- Lack of normalization and harmonized indexing
- Performance and availability dependent on the source systems
WAREHOUSE MODEL
Pros:
- High data quality through normalization
- Unified queries and filters applicable
Cons:
- Long implementation times & higher starting costs
- Expensive and difficult to accommodate changes in data types
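The trade-off between the two models can be sketched in a few lines. This is only a toy illustration under assumed names: the two sources, their fields, and their records are hypothetical, and in-memory lists stand in for real systems.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

# Two hypothetical sources, each with its own schema.
ELN_DATA: List[Record] = [{"cmpd": "Aspirin", "page": "NB-001"}]
CATALOG_DATA: List[Record] = [{"name": "aspirin", "sku": "C-42"}]


def search_eln(term: str) -> List[Record]:
    return [r for r in ELN_DATA if term.lower() in r["cmpd"].lower()]


def search_catalog(term: str) -> List[Record]:
    return [r for r in CATALOG_DATA if term.lower() in r["name"].lower()]


SOURCES: Dict[str, Callable[[str], List[Record]]] = {
    "eln": search_eln,
    "catalog": search_catalog,
}


def federated_search(term: str) -> Dict[str, List[Record]]:
    """Federated: query each source live; results keep each source's schema."""
    return {name: fn(term) for name, fn in SOURCES.items()}


def build_warehouse() -> List[Record]:
    """Warehouse: load everything once and normalize into one schema."""
    rows = [{"substance": r["cmpd"].lower(), "source": "eln", "ref": r["page"]}
            for r in ELN_DATA]
    rows += [{"substance": r["name"].lower(), "source": "catalog", "ref": r["sku"]}
             for r in CATALOG_DATA]
    return rows


def warehouse_search(warehouse: List[Record], term: str) -> List[Record]:
    """A single unified query over the normalized central storage."""
    return [r for r in warehouse if term.lower() in r["substance"]]


federated = federated_search("aspirin")
unified = warehouse_search(build_warehouse(), "Aspirin")
```

Adding a source to the federated variant means registering one more function, while the warehouse variant needs a new normalization rule up front but then answers every query against one schema.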
REAXYS EXTERNAL CONTENT INTEGRATION
[Diagram labels: Database (RX content); ELN 1, ELN 2, custom in-house reactions source (external content); Indexed Storage; End-User]
REAXYS EXTERNAL CONTENT INTEGRATION: IN HOUSE SCENARIO
[Diagram: same components as on the previous slide (Database, ELN 1, ELN 2, custom in-house reactions source, Indexed Storage, End-User), labeled "Customer Hosted"]
REAXYS EXTERNAL CONTENT INTEGRATION: ELSEVIER HOSTED SCENARIO
[Diagram: same components as on the previous slide, with areas labeled "Customer Hosted" and "Elsevier Hosted"]
REAXYS EXTERNAL CONTENT INTEGRATION: HYBRID HOSTING SCENARIO
[Diagram: same components as on the previous slide, with areas labeled "Customer Hosted" and "Elsevier Hosted"]
REAXYS PROVIDES A UNIFIED INFORMATION PORTAL
• Provides a single powerful interface
• Can integrate several notebook systems
• Links chemistry, structures, sourcing, citations, and full-text of articles
[Diagram labels: Structures, reactions, and Full-Text; Licensed Reaction and Structure Databases; E-Notebook (two); Binding, Properties; Patents]
INTEGRATED SOLUTION SEARCH
• List of integrated sources
• Sources list can include licensed databases and multiple e-notebooks from organizational units
• All e-notebooks can be integrated and searched together
REACTION SEARCH RESULTS SEPARATED BY SOURCE
• Results from each source on separate tab
• Show corresponding substances in …
• Cross link to substance in all other sources where it is found
[Screenshot labels: PubChem, eMolecules, Licensed, PharmaCo e-notebook, PharmaCo2 e-notebook (E-notebooks)]
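A minimal sketch of how results might be grouped into per-source tabs and cross-linked, assuming every hit carries a shared structure identifier. The `structure_key` field, the record IDs, and the matching rule are assumptions for illustration only; the slides do not describe how Reaxys matches substances internally.

```python
from collections import defaultdict
from typing import Dict, List

# Made-up hits; "structure_key" stands in for any normalized structure identifier.
hits: List[dict] = [
    {"source": "PubChem", "structure_key": "K1", "id": "example-1"},
    {"source": "PharmaCo e-notebook", "structure_key": "K1", "id": "example-2"},
    {"source": "eMolecules", "structure_key": "K2", "id": "example-3"},
]


def tabs_by_source(results: List[dict]) -> Dict[str, List[dict]]:
    """Group hits so that each source gets its own result tab."""
    tabs: Dict[str, List[dict]] = defaultdict(list)
    for hit in results:
        tabs[hit["source"]].append(hit)
    return dict(tabs)


def cross_links(results: List[dict], hit: dict) -> List[str]:
    """List the other sources in which the same substance is found."""
    return sorted(
        other["source"]
        for other in results
        if other["structure_key"] == hit["structure_key"]
        and other["source"] != hit["source"]
    )


tabs = tabs_by_source(hits)
links_for_first = cross_links(hits, hits[0])
```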
SUBSTANCE RESULTS
• Results from each source on separate tab, including PubChem and eMolecules
• All filters fully active
INTEGRATION CASE STUDY: ROCHE IN HOUSE HOSTED
Integrated Reaxys with several data sources:
• Medicinal Chemistry E-notebooks
• Development Chemistry E-notebooks
• Several E-notebook systems of acquired organizations
• Licensed Databases
• Current Chemical Reactions
• Several other databases
Links to many more sources
• Roche stockroom availability
• Patent/Literature full text
• Link to original e-notebook pages
Reaxys integrates these e-notebooks with each other, while they are still maintained as separate systems
CASE STUDY: ROCHE KEY DRIVERS
From ACS Presentation by
Michael Kapler, Roche Pharma Research and Early Development
http://abstracts.acs.org/chem/245nm/program/view.php?obj_id=188977
INTEGRATION PROCESS
PROJECT OVERVIEW
PROCESS OVERVIEW FOR AN INTEGRATION PROJECT
Initialisation:
- Evaluation of data sources and needed resources
- Determine hosting scenario
- Commercial and legal framework
Kick-off:
- Requirements harvesting
- Establish milestones and top-down workstreams
- Refine & finalize plans
Execution:
- Implement automatised ETL process
- Implement application customisation
- Install IT infrastructure and interfaces
Delivery:
- BETA release
- Refinement
- GoLive
Sprints: Sprint 1, Sprint 2, Sprint 3, Sprint 4
- Sprint iterations of 3 weeks each
- Number of sprints dependent on complexity (5-…)
Hand over to BAU / Maintenance
DATA MODEL AND USER INTERFACE PROCESS
• Determine data sources (E-notebooks, licensed databases)
• Map to Reaxys Integration Data Model (XML)
• Unit conversions, data cleaning (°C, K; moles, grams)
• Identify URL links to E-notebooks and other resources
• User interface configuration for new fields: design location, nature of displaying the fields, URLs etc.
[Callout: This is a key step for the integration project]
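The unit-conversion step might look roughly like the sketch below, which normalizes temperatures to kelvin and masses to moles. The function names, input field names, and choice of target units are assumptions made for the example; the slides only name the unit pairs (°C and K, moles and grams).

```python
def to_kelvin(value: float, unit: str) -> float:
    """Convert a temperature given in °C or K to kelvin."""
    unit = unit.strip()
    if unit == "K":
        return value
    if unit in ("°C", "C"):
        return value + 273.15
    raise ValueError(f"unsupported temperature unit: {unit!r}")


def grams_to_moles(mass_g: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass in grams to an amount in moles using the molar mass."""
    if molar_mass_g_per_mol <= 0:
        raise ValueError("molar mass must be positive")
    return mass_g / molar_mass_g_per_mol


def clean_record(raw: dict) -> dict:
    """Normalize one hypothetical notebook record to kelvin and moles."""
    return {
        "temperature_K": to_kelvin(raw["temp"], raw["temp_unit"]),
        "amount_mol": grams_to_moles(raw["mass_g"], raw["molar_mass"]),
    }


# Example input: 36.03 g of water (molar mass about 18.015 g/mol) at 25 °C.
cleaned = clean_record(
    {"temp": 25.0, "temp_unit": "°C", "mass_g": 36.03, "molar_mass": 18.015}
)
```

Rejecting unknown units with an error, rather than passing values through, keeps unconverted data from silently entering the index.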
AUTOMATISED FABRICATION PROCESS (ETL)
• Sources: E-notebook 1, E-notebook 2, Bioassay db
• Daily extraction to XML using defined data model …
• Transmit to fabrication server (sftp, scp)
• Fabrication combines data with Reaxys data for production
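The extraction step can be sketched with Python's standard `xml.etree.ElementTree` module. The element and attribute names below are invented placeholders, not the actual Reaxys Integration Data Model, and the transfer step is shown only as a command that a scheduler could run, with a hypothetical host and path.

```python
import xml.etree.ElementTree as ET
from datetime import date
from typing import List


def records_to_xml(source: str, records: List[dict], extracted_on: date) -> str:
    """Serialize one day's records to XML (hypothetical element names)."""
    root = ET.Element(
        "integration", {"source": source, "extracted": extracted_on.isoformat()}
    )
    for rec in records:
        rxn = ET.SubElement(root, "reaction", {"id": rec["id"]})
        for field in ("product", "yield_percent", "url"):
            ET.SubElement(rxn, field).text = str(rec[field])
    return ET.tostring(root, encoding="unicode")


def transfer_command(local_path: str, remote: str) -> List[str]:
    """Build (but do not run) an scp command for the fabrication server."""
    return ["scp", local_path, remote]


# Made-up record, including a link back to the original notebook page.
xml_text = records_to_xml(
    "eln_1",
    [{"id": "R1", "product": "aspirin", "yield_percent": 82.0,
      "url": "https://eln.example.com/pages/R1"}],
    date(2014, 10, 12),
)
command = transfer_command("eln_1.xml", "etl@fabrication.example.com:/incoming/")
```

Keeping serialization separate from transfer lets the daily job validate the XML before anything leaves the source system.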
THANK YOU – QUESTIONS?