edf2014: paul groth, department of computer science & the network institute, vu university...

A Data Platform for Drug Discovery Paul Groth (@pgroth) http://www.few.vu.nl/~pgroth

Upload: european-data-forum

Post on 10-May-2015


DESCRIPTION

Invited Talk by Paul Groth, Department of Computer Science & The Network Institute, VU University Amsterdam, Netherlands at the European Data Forum 2014, 20 March 2014 in Athens, Greece: Open PHACTS: A Data Platform for Drug Discovery.

TRANSCRIPT

Page 1: EDF2014: Paul Groth, Department of Computer Science & The Network Institute, VU University Amsterdam, Netherlands Open PHACTS: A Data Platform for Drug Discovery

A Data Platform for Drug Discovery

Paul Groth (@pgroth)

http://www.few.vu.nl/~pgroth

Page 2:

1. WHY
2. THE PLATFORM
3. APPS
4. THE FUTURE

Page 3:
Page 4:

Pre-competitive Informatics: Pharma are all accessing, processing, storing & re-processing external research data

Literature

PubChem

Genbank

Patents

Databases

Downloads

Data Integration

Data Analysis

Firewalled Databases

Repeat @ each company

Lowering industry firewalls: pre-competitive informatics in drug discovery. Nature Reviews Drug Discovery (2009) 8, 701-708. doi:10.1038/nrd2944

Page 5:

Number | sum | Nr of 1 | Question
15 | 12 | 9 | All oxidoreductase inhibitors active <100 nM in both human and mouse
18 | 14 | 8 | Given compound X, what is its predicted secondary pharmacology? What are the on- and off-target safety concerns for a compound? What is the evidence and how reliable is that evidence (journal impact factor, KOL) for findings associated with a compound?
24 | 13 | 8 | Given a target find me all actives against that target. Find/predict polypharmacology of actives. Determine ADMET profile of actives.
32 | 13 | 8 | For a given interaction profile, give me compounds similar to it.
37 | 13 | 8 | The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X.
38 | 13 | 8 | Retrieve all experimental and clinical data for a given list of compounds defined by their chemical structure (with options to match stereochemistry or not).
41 | 13 | 8 | A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the compounds known to modulate the target directly? What are the compounds that may modulate the target directly? i.e. return all cmpds active in assays where the resolution is at least at the level of the target family (i.e. PKC) both from structured assay databases and the literature.
44 | 13 | 8 | Give me all active compounds on a given target with the relevant assay data
46 | 13 | 8 | Give me the compound(s) which hit most specifically the multiple targets in a given pathway (disease)
59 | 14 | 8 | Identify all known protein-protein interaction inhibitors

Business Question Driven Approach

http://www.sciencedirect.com/science/article/pii/S1359644613001542

Page 6:

ChEMBL

DrugBank

Gene Ontology

WikiPathways

UniProt

ChemSpider

UMLS

ConceptWiki

ChEBI

TrialTrove

GVKBio

GeneGo

TR Integrity

“Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM”

“What is the selectivity profile of known p38 inhibitors?”

“Let me compare MW, logP and PSA for known oxidoreductase inhibitors”
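
The first question on this slide can be read as a filter over bioactivity records. The following is only a minimal sketch on invented toy data: the record fields (`compound`, `target`, `assay_type`, `value`, `unit`), the compound names, the activity values, the `PATHWAY` set and both helper functions are assumptions made for illustration, and none of it is the Open PHACTS API or data model. "Only functional assays" is interpreted here as "consider functional-assay records only".

```python
from dataclasses import dataclass

# Conversion factors to nanomolar (toy table, extend as needed).
TO_NM = {"nM": 1.0, "uM": 1_000.0, "mM": 1_000_000.0}


@dataclass(frozen=True)
class Activity:
    """One hypothetical bioactivity record."""
    compound: str
    target: str
    assay_type: str  # "functional" or "binding" (hypothetical labels)
    value: float
    unit: str


def potency_nm(activity: Activity) -> float:
    """Normalise a potency value to nanomolar so thresholds are comparable."""
    return activity.value * TO_NM[activity.unit]


def potent_functional_hits(activities, pathway_targets, max_nm=1000.0):
    """Return compounds with a functional-assay potency below max_nm
    (default 1000 nM, i.e. 1 uM) against any target in the pathway."""
    hits = set()
    for a in activities:
        if (
            a.target in pathway_targets
            and a.assay_type == "functional"
            and potency_nm(a) < max_nm
        ):
            hits.add(a.compound)
    return sorted(hits)


# Invented example data; target symbols are used purely as labels.
PATHWAY = {"IKBKB", "RELA"}
DATA = [
    Activity("cmpd-A", "IKBKB", "functional", 0.25, "uM"),  # 250 nM: kept
    Activity("cmpd-B", "IKBKB", "binding", 10, "nM"),       # not a functional assay
    Activity("cmpd-C", "RELA", "functional", 5, "uM"),      # 5000 nM: too weak
    Activity("cmpd-D", "MAPK14", "functional", 50, "nM"),   # target outside the pathway
]

print(potent_functional_hits(DATA, PATHWAY))  # ['cmpd-A']
```

Normalising units before comparing is the point of the sketch: the questions in the talk mix thresholds such as <100 nM and <1 μM, which only compare cleanly once every value is on one scale.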

Page 7:
Page 8:

THE OPEN PHACTS DISCOVERY PLATFORM

Page 9:

[Architecture diagram; labels recovered from the slide]

Apps

Linked Data API (RDF/XML, TTL, JSON)

Domain Specific Services

Core Platform: Semantic Workflow Engine; Data Cache (Virtuoso Triple Store); Identity Resolution Service; Identifier Management Service; Chemistry Registration, Normalisation & Q/C; Indexing

Example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”

Input formats: RDF, Nanopub, Db, with VoID descriptions

Input categories: Public Content, Commercial, Public Ontologies, User Annotations

Page 10:

Data Sources

Compound

Disease (in testing)

Pathway

Target ✔

Page 11:

Play! https://dev.openphacts.org/

Page 12:

Secure Cloud Hosted + Virtualized

Triple Store
- Virtuoso 7 column store
- Scale to > 100 billion triples

Network
- AMX-IS
- Extensive memcache
- Monitored

Hardware (development)
- 2 x Intel Xeon E5-2640
- 384 GB DDR3 1333MHz RAM
- 1.5 TB SSD
- 3TB 7200rpm

Page 13:

Dealing With The Really Tough Parts

John Wilbanks http://del-fi.org/

Data Licensing

Page 14:

Provenance everywhere

Page 15:

It's easy to integrate, difficult to integrate well:

Page 16:

PubChem

DrugBank

ChemSpider

Imatinib

Mesylate

What Is Gleevec?

Page 17:

Strict Relaxed

Analysing Browsing

Dynamic Equality

LinkSet#1 {
  chemspider:gleevec hasParent imatinib ...
  drugbank:gleevec exactMatch imatinib ...
}

chemspider:gleevec drugbank:gleevec
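
One way to picture the "Dynamic Equality" idea on this slide is to decide at query time which link predicates count as equivalence. The sketch below is an assumption-laden illustration, not the Open PHACTS identity resolution service: the `PROFILES` table, the `equivalents` function and the choice of which predicate belongs to which profile are all hypothetical. Only the two links and the Strict/Relaxed labels come from the slide.

```python
# The two links shown in LinkSet#1 on the slide, as (subject, predicate, object).
LINKS = [
    ("chemspider:gleevec", "hasParent", "imatinib"),
    ("drugbank:gleevec", "exactMatch", "imatinib"),
]

# Hypothetical profiles: which predicates are treated as "the same thing".
PROFILES = {
    "strict": {"exactMatch"},                # e.g. for analysis
    "relaxed": {"exactMatch", "hasParent"},  # e.g. for browsing
}


def equivalents(identifier, links, profile):
    """Return every identifier reachable from `identifier` by following,
    in either direction, only the links allowed by the chosen profile."""
    allowed = PROFILES[profile]

    # Build an undirected adjacency map from the allowed links.
    neighbours = {}
    for subject, predicate, obj in links:
        if predicate in allowed:
            neighbours.setdefault(subject, set()).add(obj)
            neighbours.setdefault(obj, set()).add(subject)

    # Depth-first traversal to collect the whole equivalence class.
    seen = {identifier}
    stack = [identifier]
    while stack:
        node = stack.pop()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


print(sorted(equivalents("drugbank:gleevec", LINKS, "strict")))
# ['drugbank:gleevec', 'imatinib']
print(sorted(equivalents("drugbank:gleevec", LINKS, "relaxed")))
# ['chemspider:gleevec', 'drugbank:gleevec', 'imatinib']
```

Under the strict profile the ChemSpider record stays separate because it is linked only through `hasParent`; under the relaxed profile all three identifiers fall into one group. The data is identical in both cases and only the interpretation of the links changes.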

Page 18:

APPS

Page 19:

API Hits (April 2013 – March 2014)

Page 20:
Page 21:

http://explorer.openphacts.org

Page 22:

ChemBioNavigator


Page 23:
Page 24:
Page 25:

THE FUTURE

Page 26:

App Developers

Data Providers

Pharma Companies

Academic Research

Next Gen IT

Life Science Companies

Connecting Communities

Page 27:

Sustaining Impact

“Software is free like puppies are free - they both need money for maintenance”

…and more resource for future development

Page 28:
Page 29:

Pfizer Limited – Coordinator

Universität Wien – Managing entity

Technical University of Denmark

University of Hamburg, Center for Bioinformatics

BioSolveIT GmbH

Consorci Mar Parc de Salut de Barcelona

Leiden University Medical Centre

Royal Society of Chemistry

Vrije Universiteit Amsterdam

Spanish National Cancer Research Centre

University of Manchester

Maastricht University

Aqnowledge

University of Santiago de Compostela

Rheinische Friedrich-Wilhelms-Universität Bonn

AstraZeneca

GlaxoSmithKline

Esteve

Novartis

Merck Serono

H. Lundbeck A/S

Eli Lilly

Netherlands Bioinformatics Centre

Swiss Institute of Bioinformatics

ConnectedDiscovery

EMBL-European Bioinformatics Institute

Janssen

OpenLink

[email protected] @Open_PHACTS Open PHACTS

Page 30:

Backup

Page 31:

Present Content

Page 32:

hTRPV1 2328 ligands from Open PHACTS

HEK293

408 compounds

capsaicin

http://www.openphacts.org

TRPV1

Page 33: