Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT World 2016)
Tom Plasterer, PhD
Research & Development Information (RDI)
Director, US Cross-Science
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma
WINNER
The US Cross-Science Team in Research and
Development Information (RDI) is a group of
informaticians, mathematicians, project/program
managers, developers and architects
dedicated to data science: data discovery, data
reuse, data harmonization, analytics and self-describing data,
or Smart Data.
We strive to create tangible digital and social
artefacts used to accelerate delivering
medicines to patients and improving their impact
once in the clinical setting.
These artefacts include web-based software,
community-driven data models, data sharing best
practices, data science communities of practice and strong advocacy of
Smart Data inside and out of AstraZeneca.
R&D | RDI
Sharing & Collaboration: Are you a Data Parasite?
‘A second concern held by some is that a new class of research person will emerge — people who had nothing to do with the design and execution of the study but use another group’s data for their own ends,
possibly stealing from the research productivity planned by the data gatherers, or even use the data to try to disprove what
the original investigators had posited. There is concern among some front-line researchers that the system will be taken
over by what some researchers have characterized as “research parasites.”’
Dan Longo and Jeffrey Drazen, the deputy editor and editor-in-chief, NEJM
‘The condescension implicit in this statement is deeply
troubling. Drazen and Longo are saying, essentially, that only the people who originally collect a
data set can truly understand it, and anyone else who wants to
take a look is a parasite.’
Steven Salzberg, JHU
‘But the science, data, and research results are trapped in silos,
preventing faster progress and greater reach to patients. It’s not
just about developing game-changing treatments — it’s about
delivering them to those who need them.’
Vice President Biden’s Blog
Public Research, Private Results?
‘Payment of 32 dollars is just insane when you need to skim or read tens or hundreds of these papers to do research. I obtained these papers
by pirating them. Later I found there are lots and lots of researchers (not
even students, but university researchers) just like me, especially
in developing countries. They created online communities
(forums) to solve this problem.’
Alexandra Elbakyan, Sci-Hub operator
Sharing Clinical Trial Results
Thousands of clinical trials have not reported their results; some have not even
been registered.
Information on what was done and what was found in these trials could be lost
forever to doctors and researchers, leading to bad treatment decisions, missed
opportunities for good medicine, and trials being repeated.
All trials past and present should be registered, and the full methods and the
results reported.
We call on governments, regulators and research bodies to implement measures to
achieve this.
AllTrials.net Petition (2015)
Sharing Clinical Trial Results
• Maximize the benefits while minimizing the risks of sharing clinical trial data
• Respect individual participants whose data are shared
• Increase public trust in clinical trials and the sharing of trial data
• Conduct the sharing of clinical trial data in a fair manner
IOM Report: Sharing Clinical Trial Data: Maximizing Benefits, Minimizing Risk (2015)
Edge Informatics: Interfaces within the Drug Development Process
Target Discovery: NGS Exome analysis; Pathway Analysis; Structure Analysis
Lead Discovery: RNAi; Assay Development; HTS
Lead Optimization: SAR; In vivo non-human testing; Exploratory PK; Exploratory Tox
Pre-Clinical Development: GLP Tox; Formulation; ADME; PK; Efficacy
Clinical Development: IND; Safety, Tolerability; Phase I-III
Registration: NDA/BLA; MAA
Marketing & Sales: PMR; REMS; PSUR; Observational Research
Pathway Enrichment
Disease Contextualization
Seamless information connectivity (an EDGE) needed across domain NODEs
Integration Quandary: Content Does Not Combine Easily
Fit-for-Purpose to “Standards”
Models
Structured
Triplestores
Semi-Structured
Unstructured
Content
Lack of Compatible Containers → the “Plumbing Problem”
Lack of Compatible Semantics → the “Meaning Problem”
What’s Needed? Linked Data!
LOD Cloud 2014. Schmachtenberg, Bizer, Jentzsch and Cyganiak. http://lod-cloud.net/
“Smart Data” means information that actually makes sense.
Wired Magazine, April 2013
Thanks to: Eric Little, VP Data Science, Osthus
The Emergence of Smart Data: Standards Driven at Container Interfaces
Competitive Intelligence 360 (CI360) Approach: Flexibly Addressing Key Questions
Capture Business Questions and Sources
Domain Expert Concept Map
Build Formal Ontology
Challenge with Linked Data
Examine with a Faceted Browser
Share insights with a Knowledge Base
Capture Business Questions
Capture Business Questions and Sources
Translate Questions into Concepts
Domain Expert Concept Map
“Where are the key clinical studies in NSCLC and who are the principal investigators?”
Challenge with Data
“Where are the key clinical studies in NSCLC and who are the principal investigators?”
(one example)
Challenge with Linked Data
Source: https://clinicaltrials.gov/ct2/show/NCT02027428
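A minimal sketch of how a question like this might be challenged against linked data, assuming trial records have already been expressed as subject-predicate-object triples. Every identifier, predicate name and value below is made up for illustration; none of it is taken from ClinicalTrials.gov or from the CI360 system.

```python
# Illustrative triples only: all identifiers and predicates are hypothetical.
TRIPLES = [
    ("trial:T1", "studies", "condition:NSCLC"),
    ("trial:T1", "hasPrincipalInvestigator", "person:A"),
    ("trial:T1", "hasSite", "site:Boston"),
    ("trial:T2", "studies", "condition:NSCLC"),
    ("trial:T2", "hasPrincipalInvestigator", "person:B"),
    ("trial:T2", "hasSite", "site:London"),
    ("trial:T3", "studies", "condition:BreastCancer"),
    ("trial:T3", "hasPrincipalInvestigator", "person:C"),
    ("trial:T3", "hasSite", "site:Boston"),
]


def objects(subject, predicate, triples=TRIPLES):
    """All objects linked from `subject` by `predicate`, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def subjects(predicate, obj, triples=TRIPLES):
    """All subjects linked to `obj` by `predicate`, sorted."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


def trials_for_condition(condition, triples=TRIPLES):
    """Answer 'where are the studies and who are the PIs?' for one condition."""
    return {
        trial: {
            "sites": objects(trial, "hasSite", triples),
            "investigators": objects(trial, "hasPrincipalInvestigator", triples),
        }
        for trial in subjects("studies", condition, triples)
    }


answer = trials_for_condition("condition:NSCLC")
for trial, facts in answer.items():
    print(trial, facts["sites"], facts["investigators"])
```

Because the question is answered by following edges rather than by joining fixed tables, adding a new predicate (for example a sponsor) would not require changing the existing queries.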
Refine the Answer
Examine with a Faceted Browser
“What are the open trials in metastatic breast cancer and what drugs are being tested?”
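The refinement step can be sketched as faceted filtering: narrow the records by facet values, then count what remains along another facet. This is only an illustration of the idea, not the faceted browser used in the talk; the records, field names and drug labels are hypothetical.

```python
from collections import Counter

# Hypothetical trial records; field names and values are invented.
RECORDS = [
    {"id": "T1", "status": "open", "condition": "metastatic breast cancer", "drug": "drug-A"},
    {"id": "T2", "status": "closed", "condition": "metastatic breast cancer", "drug": "drug-B"},
    {"id": "T3", "status": "open", "condition": "metastatic breast cancer", "drug": "drug-B"},
    {"id": "T4", "status": "open", "condition": "NSCLC", "drug": "drug-A"},
]


def select(records, **facets):
    """Keep the records whose fields match every requested facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in facets.items())]


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(r[facet] for r in records)


open_mbc = select(RECORDS, status="open", condition="metastatic breast cancer")
drugs = facet_counts(open_mbc, "drug")
print([r["id"] for r in open_mbc], dict(drugs))
```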
Share Insights as a Community
“Can a biomarker-defined population be added to a trial record?”
Share insights with a Knowledge Base
Data FAIRport
To be Findable:
F1. (meta)data are assigned a globally unique and persistent identifier
F2. data are described with rich metadata (defined by R1 below)
F3. metadata clearly and explicitly include the identifier of the data it describes
F4. (meta)data are registered or indexed in a searchable resource
To be Accessible:
A1. (meta)data are retrievable by their identifier using a standardized communications protocol
A1.1. the protocol is open, free, and universally implementable
A1.2. the protocol allows for an authentication and authorization procedure, where necessary
A2. metadata are accessible, even when the data are no longer available
To be Interoperable:
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2. (meta)data use vocabularies that follow FAIR principles
I3. (meta)data include qualified references to other (meta)data
To be Reusable:
R1. meta(data) are richly described with a plurality of accurate and relevant attributes
R1.1. (meta)data are released with a clear and accessible data usage license
R1.2. (meta)data are associated with detailed provenance
R1.3. (meta)data meet domain-relevant community standards
Interoperability Investment
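A few of these principles lend themselves to simple automated checks on a metadata record. The sketch below is an assumption-laden illustration, not an official FAIR assessment: the field names (`identifier`, `describes`, `license`, `provenance`) are hypothetical, and testing that an identifier is an HTTP(S) URI does not prove it is globally unique or persistent.

```python
from urllib.parse import urlparse


def is_http_uri(value):
    """True if `value` looks like an absolute http(s) URI."""
    parts = urlparse(value or "")
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical field names: the FAIR principles do not prescribe a record layout.
CHECKS = {
    "F1 identifier": lambda m: is_http_uri(m.get("identifier")),
    "F3 data identifier in metadata": lambda m: is_http_uri(m.get("describes")),
    "R1.1 license": lambda m: bool(m.get("license")),
    "R1.2 provenance": lambda m: bool(m.get("provenance")),
}


def fair_gaps(metadata):
    """Names of the checks that the metadata record fails."""
    return [name for name, check in CHECKS.items() if not check(metadata)]


record = {
    "identifier": "https://example.org/dataset/42/metadata",
    "describes": "https://example.org/dataset/42",
    "license": "",  # missing on purpose
    "provenance": "derived from assay run 7",
}
gaps = fair_gaps(record)
print(gaps)
```

A check like this only flags obvious omissions; principles such as I2 or R1.3 need human or community judgment.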
Naming Things: Use Resolvable URIs
http://purl.uniprot.org/uniprot/P30453
http://www.uniprot.org/uniprot/P30453
http://purl.uniprot.org/uniprot/P30453.ttl
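The URI forms on this slide can be derived from a single accession. The sketch below builds the identifier and its `.ttl` form exactly as they appear above, and prepares (without sending) an HTTP request that asks for Turtle through an `Accept` header. The helper names are hypothetical, and whether the server honours that header is an assumption rather than something the slide states.

```python
from urllib.request import Request

# Base taken from the URIs shown on the slide.
PURL_BASE = "http://purl.uniprot.org/uniprot/"


def uniprot_uris(accession):
    """Build the identifier URI and its Turtle form for one accession."""
    identifier = PURL_BASE + accession
    return {"identifier": identifier, "turtle": identifier + ".ttl"}


def turtle_request(accession):
    """Prepare (but do not send) a request asking for a Turtle representation.

    Assumption: the server supports content negotiation on the identifier URI.
    """
    uri = uniprot_uris(accession)["identifier"]
    return Request(uri, headers={"Accept": "text/turtle"})


uris = uniprot_uris("P30453")
req = turtle_request("P30453")
print(uris["identifier"])
print(uris["turtle"])
print(req.full_url, req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup; that step is left out so the example runs without network access.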
Describing Data: Reuse, Reuse, Reuse (build only if essential)
Finding Data: Vocabulary of Interlinked Datasets (VoID)
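A VoID description is itself a small piece of RDF that tells others where a dataset lives and which vocabularies it uses. The sketch below emits one as Turtle using only string formatting. The dataset URI, title, endpoint and triple count are placeholders; `void:Dataset`, `void:sparqlEndpoint`, `void:triples`, `void:vocabulary` and `dcterms:title` are terms from the VoID and Dublin Core vocabularies.

```python
def turtle_string(text):
    """Quote a Python string as a simple Turtle literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def void_description(uri, title, endpoint, triples, vocabularies):
    """Return a minimal VoID dataset description serialized as Turtle."""
    if not vocabularies:
        raise ValueError("list at least one vocabulary URI")
    vocab_list = " , ".join(f"<{v}>" for v in vocabularies)
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{uri}> a void:Dataset ;",
        f"    dcterms:title {turtle_string(title)} ;",
        f"    void:sparqlEndpoint <{endpoint}> ;",
        f"    void:triples {int(triples)} ;",
        f"    void:vocabulary {vocab_list} .",
    ]
    return "\n".join(lines)


# Placeholder values for illustration only.
ttl = void_description(
    uri="https://example.org/dataset/trials",
    title="Example trial dataset",
    endpoint="https://example.org/sparql",
    triples=1200,
    vocabularies=["http://purl.org/dc/terms/"],
)
print(ttl)
```

In practice an RDF library would handle escaping and validation; plain strings are used here only to keep the sketch dependency-free.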
Cross-BioPharma Data Standardization Efforts
Open Data, Open Science Efforts
Get your plumbing right
• And your data won’t be stuck in a silo
Leverage working public solutions
• Don’t reinvent the wheel
Use Edge Informatics
• Consider handoffs: you don’t know how your data will be used in the future
Invest in Data Stewardship
• Small tax to future-proof your efforts
Data and Collaboration ARE Business Assets