Linked ATC and WHO Drug in a
Semantic Web enabled world
Kerstin Forsberg (@kerfors on Twitter, SlideShare etc.)
Informatics Analyst and Lifetime Learner
AZ IT | R&D Information
Representing and linking data, schemas, models,
data standards and terminologies
Web 1.0 (25 years) and Web 2.0 (the last 10 years)
Web of (Linked) Data
Web of Documents
An Intro To The Semantic Web: Why You Need To
Know About It Sooner Than Later, by Samantha Wong
Image Source: Frederic Martin
Kerstin Forsberg | WHO UMC, Jan 21 2015 AZIT | R&D Information
Web 3.0 (RDF, foundation standard, 15 years)
Web of Data
Web of Documents
subject predicate object
Common Model (“Triples”)
Resource Description Framework
Web of Data
Fine granularity
Web of Data as RDF Triples for “Things”
80+ RDF Triples (here are 4 of them)
describing Ticagrelor
Web of Data view of structured data
in Wikipedia pages
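A minimal sketch of the triple model, using plain Python tuples rather than an RDF library. The `example.org` URIs and the predicate names (`type`, `label`, `atcCode`) are hypothetical placeholders, not the identifiers used on the slide or in DBpedia.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# All URIs below are hypothetical placeholders under example.org.
EX = "http://example.org/"

triples = [
    (EX + "Ticagrelor", EX + "type", EX + "Drug"),
    (EX + "Ticagrelor", EX + "label", "Ticagrelor"),
    (EX + "Ticagrelor", EX + "atcCode", "B01AC24"),
]


def describe(subject, graph):
    """Collect {predicate: [objects]} for every triple about `subject`."""
    description = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, []).append(o)
    return description


about = describe(EX + "Ticagrelor", triples)
for predicate, objects in about.items():
    print(predicate, "->", objects)
```

Because every fact is its own statement, new triples about the same "thing" can be added from any source without changing a schema.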
Linked Data, Semantic Web and Graphs
BBC http://www.bbc.co.uk/things/
Linked Data, Semantic Web and Graphs
Nobel Prize http://www.nobelprize.org/nobel_organizations/nobelmedia/nobelprize_org/developer/
Linked Data, Semantic Web and Graphs
Google Knowledge Graph
http://searchengineland.com/demystifying-knowledge-graph-201976
RDF @ NCBI
Example http://id.nlm.nih.gov/mesh/D015242
for Ofloxacin in MeSH
AstraZeneca engagements
• Public
- IMI Open PHACTS (Innovative Medicine Initiative, Open
Pharmacology Space)
- FDA/PhUSE and CDISC, Semantic technology project
- W3C Health Care and Life Science (HCLS)
• Internal
- iSIM
- i2 Semantic Framework
Semantic Web and Linked Data
The Innovative Medicines
Initiative
• EC funded public-private
partnership for
pharmaceutical research
• Focus on key problems
– Efficacy, Safety,
Education & Training,
Knowledge
Management
The Open PHACTS Project
• Create a semantic integration hub (“Open
Pharmacological Space”)…
• Delivering services to support on-going drug
discovery programs in pharma and public domain
• Not just another project; Leading academics in
semantics, pharmacology and informatics, driven by
solid industry business requirements
• 23 academic partners, 8 pharmaceutical companies,
3 biotechs
• Work split into clusters:
• Technical Build
• Scientific Drive
• Community & Sustainability
• CDISC2RDF started Oct 2012 as a pre-competitive
project with AZ, Roche, W3C et al. to showcase
Semantic Web standards and Linked Data principles.
• FDA meeting Nov 2012: Solutions for Study Data
Exchange Standards Meeting – W3C Semantic Web
presentation.
• June 2013: the Semantic Technology project,
an FDA/PhUSE working group for Emerging
Technologies, with 25+ representatives from FDA,
CDISC, pharma companies, CROs and software vendors.
• Oct 2013 press release: Representing
existing standards (SDTM, CDASH,
SEND, ADaM) in RDF.
• Dec 2014, Public review of CDISC in RDF Guide.
Clinical standards in the Semantic Web
Community building and knowledge sharing
CDISC Interchange Europe 2011 and 2012
presentations from Roche and AstraZeneca
Clinical data standards in the Semantic Web
Example from CDISC SDTM, Adverse Event domain (AE)
RDF triples describing one variable/data element and linking to related standard parts
“Pushing back” – Use standards for standards
AZ Vocabulary Management team shared this with MedDRA MSSO
Courtland Yockey, Informatics Analyst
AstraZeneca R&D Information, USA
A very simple SKOS-rendering
of MedDRA
• term skos:Concept
• hierarchy level
skos:ConceptScheme
• SMQ skos:Collection
Approach should be augmented with
VoID representation of MedDRA
versions and term properties
distinguishing active from inactive
terms.
skos:Collection is likely not sufficient
to support SMQ versioning or the
context of terms in an SMQ (e.g.
weight)
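The mapping above could be sketched as follows, again with plain tuples. The SKOS and RDF namespaces are the real W3C ones; the `example.org/meddra/` namespace and the identifiers `TERM_1`, `TERM_2`, `PT` and `SMQ_1` are hypothetical placeholders, not real MedDRA codes.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MED = "http://example.org/meddra/"  # hypothetical namespace


def level(level_id):
    """A hierarchy level rendered as a skos:ConceptScheme."""
    return [(MED + level_id, RDF_TYPE, SKOS + "ConceptScheme")]


def term(term_id, label, level_id):
    """A term rendered as a skos:Concept inside its level's scheme."""
    return [
        (MED + term_id, RDF_TYPE, SKOS + "Concept"),
        (MED + term_id, SKOS + "prefLabel", label),
        (MED + term_id, SKOS + "inScheme", MED + level_id),
    ]


def smq(smq_id, member_ids):
    """An SMQ rendered as a skos:Collection of member terms."""
    triples = [(MED + smq_id, RDF_TYPE, SKOS + "Collection")]
    # A skos:member triple only says "this term is in the collection";
    # it has no slot for per-member context such as a weight.
    triples += [(MED + smq_id, SKOS + "member", MED + m) for m in member_ids]
    return triples


graph = (
    level("PT")
    + term("TERM_1", "Example term one", "PT")
    + term("TERM_2", "Example term two", "PT")
    + smq("SMQ_1", ["TERM_1", "TERM_2"])
)
print(len(graph), "triples")
```

The comment in `smq` marks the limitation noted above: a plain membership triple cannot carry versioning or weight.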
“Pushing back” – Use standards for standards
AZ Vocabulary Management team created an RDF representation of
ATC codes using the SKOS Schema
Courtland Yockey, Informatics Analyst
AstraZeneca R&D Information, USA
4 example RDF Triples
representing part of an ATC code
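A sketch of what four such triples might look like for one ATC code. This is not the AZ team's actual representation: the `example.org/atc/` namespace is hypothetical, and the choice of `skos:notation`, `skos:prefLabel` and `skos:broader` is an assumption. The parent code is derived from the ATC level lengths (1, 3, 4, 5 and 7 characters).

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ATC = "http://example.org/atc/"  # hypothetical namespace

# Character lengths of the five ATC levels, e.g. B, B01, B01A, B01AC, B01AC24.
LEVEL_LENGTHS = (1, 3, 4, 5, 7)


def atc_triples(code, label):
    """Return SKOS triples for one ATC code, linking it to its parent level."""
    level = LEVEL_LENGTHS.index(len(code))  # ValueError if the length is invalid
    triples = [
        (ATC + code, RDF_TYPE, SKOS + "Concept"),
        (ATC + code, SKOS + "notation", code),
        (ATC + code, SKOS + "prefLabel", label),
    ]
    if level > 0:
        parent = code[: LEVEL_LENGTHS[level - 1]]
        triples.append((ATC + code, SKOS + "broader", ATC + parent))
    return triples


for triple in atc_triples("B01AC24", "ticagrelor"):
    print(triple)
```

The `skos:broader` link is what turns a flat code list into a navigable hierarchy.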
Semantic Web
Standards
A stack of standards to
represent data and semantics
based on Resource
Description Framework
(RDF). RDF is a framework
for creating statements in the
form of so-called triples
OWL and SKOS: RDF-based
standards to represent
vocabularies of terms
representing identified entities
and concepts
SPARQL: query language for
RDF triples
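To illustrate the idea behind a SPARQL query, here is a toy triple-pattern matcher in Python. It is not a SPARQL engine; the prefixed names (`atc:`, `skos:`) are used as plain strings, and the helper names `match` and `labels_under` are made up for this sketch.

```python
# Rough SPARQL equivalent of labels_under("atc:B01AC", GRAPH):
#   SELECT ?label WHERE {
#     ?code skos:broader   atc:B01AC .
#     ?code skos:prefLabel ?label .
#   }
GRAPH = [
    ("atc:B01AC24", "skos:prefLabel", "ticagrelor"),
    ("atc:B01AC24", "skos:broader", "atc:B01AC"),
    ("atc:B01AC04", "skos:prefLabel", "clopidogrel"),
    ("atc:B01AC04", "skos:broader", "atc:B01AC"),
    ("atc:B01AC", "skos:broader", "atc:B01A"),
]


def match(pattern, graph):
    """Yield triples that fit the pattern; None plays the role of a variable."""
    for triple in graph:
        if all(want is None or want == got for want, got in zip(pattern, triple)):
            yield triple


def labels_under(parent, graph):
    """Join two patterns: find codes below `parent`, then look up their labels."""
    children = [s for s, _, _ in match((None, "skos:broader", parent), graph)]
    return sorted(
        o
        for child in children
        for _, _, o in match((child, "skos:prefLabel", None), graph)
    )


print(labels_under("atc:B01AC", GRAPH))
```

A real query engine does the same kind of pattern matching and joining, only over far larger graphs and with indexes.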
Building Linked Data
Applications
Use of Semantic Web
standards and Linked Data
principles enables us to ask
questions and solve business
problems across a
heterogeneous information
landscape spanning open and
closed sources
Capture Business
Questions and Sources
Domain Expert
Concept Map
Build Formal Ontology
Challenge with Linked Open Data
Model Business Questions (SPARQL)
Interact with RDF answer in a Faceted
Browser
Web of Data
Open and Closed
Open data sources applying
the Linked Data principles
and semantic web standards
as a Web of Data
Central is Wikipedia’s
structured content via
DBpedia, used by e.g.
Google’s Knowledge Graph
and IBM’s Watson.
Closed data sources now
also form internal Webs of
Data
Linked Data
Principles
Use URIs (Uniform Resource
Identifiers) as names for
things.
Use HTTP URIs so that
people can look up
(dereference) those names.
When someone looks up a
URI, provide useful
information.
Include links to other URIs so
that they can discover more
things
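The four principles can be simulated without a network. In this sketch a dictionary stands in for the web: looking up a URI returns a small description, and the description links to further URIs. All `example.org` URIs and the `WEB`, `dereference` and `discover` names are hypothetical.

```python
# A dictionary standing in for the web: URI -> description with outgoing links.
WEB = {
    "http://example.org/drug/ticagrelor": {
        "label": "Ticagrelor",
        "links": ["http://example.org/atc/B01AC24"],
    },
    "http://example.org/atc/B01AC24": {
        "label": "B01AC24",
        "links": ["http://example.org/atc/B01AC"],
    },
    "http://example.org/atc/B01AC": {
        "label": "Platelet aggregation inhibitors excl. heparin",
        "links": [],
    },
}


def dereference(uri):
    """Principles 2 and 3: looking up a URI returns useful information."""
    return WEB.get(uri)


def discover(start):
    """Principle 4: follow links from one URI to find more things."""
    seen, queue = [], [start]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        description = dereference(uri)
        if description is None:
            continue  # nothing published at this URI
        seen.append(uri)
        queue.extend(description["links"])
    return seen


for uri in discover("http://example.org/drug/ticagrelor"):
    print(uri, "->", dereference(uri)["label"])
```

On the real web, `dereference` would be an HTTP GET that returns RDF, and the links would be the URIs found in the returned triples.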
Linked Data in One slide