wilkinson bosc2010 moby-to-sadi

Post on 11-Jun-2015

TRANSCRIPT

From BioMoby to SADI

The Quest for the Holy Grail!

BioMoby Stats in a nutshell

• >1800 services worldwide (~1300 “alive” at any given time)
• 4 major installations of the Moby Service registry
  – Genome Canada, SUN Center of Excellence, Calgary
  – Genome España, Barcelona Supercomputing Center
  – International Rice Research Institute, Philippines
  – Max Planck, Cologne
• Canadian service registry brokers ~400,000 requests/month
• Canadian BioMoby services receive ~700,000 uses/month
• Canadian server just had a significant memory upgrade to improve performance

“The report of my death was an exaggeration” -- Mark Twain

Model Organism Bring Your-Own Database Interface Conference

“MOBY-DIC”

Emma Lake, Saskatchewan
Sept 21, 2001

Are we going after The Holy Grail here?

The Holy Grail: (this slide created circa 2002)

Align the promoters of all serine/threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.

Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M | A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.

http://sadiframework.org

Founding partner

Microsoft Research

Holy Grail Demo #1

Imagine there is a “virtual database” containing all of the data from all of the databases, together with the output of every conceivable analysis

How do we query that database?

“SHARE”: Semantic Health And Research Environment

SADI client application

http://biordf.net/cardioSHARE (Pellet)

http://dev.biordf.net/cardioSHARE (Pellet 2)

What pathways does UniProt protein P47989 belong to?

PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX uniprot: <http://lsrn.org/UniProt:>
SELECT ?gene ?pathway
WHERE {
  uniprot:P47989 pred:isEncodedBy ?gene .
  ?gene ont:isParticipantIn ?pathway .
}

Recap: what we just saw

A standard SPARQL query was entered into SHARE, a SADI-aware query engine

The query was interpreted to extract the “triple” patterns (subject, predicate, object) being requested
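The pattern-extraction step can be illustrated with a small sketch. This is not how SHARE parses queries (the slides do not say); it is a deliberately naive, regex-based reading of the UniProt query shown earlier, and the function name `extract_triple_patterns` is invented for illustration. It ignores literals and the `;` / `,` shorthand of full SPARQL.

```python
import re
from typing import List, Tuple

def extract_triple_patterns(query: str) -> List[Tuple[str, str, str]]:
    """Naively pull (subject, predicate, object) patterns out of a WHERE clause."""
    # Take the text between the braces that follow WHERE.
    match = re.search(r"WHERE\s*\{(.*)\}", query, flags=re.DOTALL | re.IGNORECASE)
    if match is None:
        return []
    patterns = []
    # Statements end with a whitespace-delimited "."; anything that is not
    # exactly three whitespace-separated terms is skipped.
    for statement in re.split(r"\s\.(?:\s|$)", match.group(1)):
        terms = statement.split()
        if len(terms) == 3:
            patterns.append((terms[0], terms[1], terms[2]))
    return patterns

QUERY = """
PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX uniprot: <http://lsrn.org/UniProt:>
SELECT ?gene ?pathway WHERE {
  uniprot:P47989 pred:isEncodedBy ?gene .
  ?gene ont:isParticipantIn ?pathway .
}
"""

triple_patterns = extract_triple_patterns(QUERY)
for subject, predicate, obj in triple_patterns:
    print(subject, predicate, obj)
```

Each resulting tuple names a predicate for which some service would have to supply data.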

Triple-patterns are passed to SADI for Web Service discovery

Services capable of generating those triple-patterns are automatically executed, the triples are stored, and the query is resolved.
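The discover, execute, store, resolve loop described in this recap can be sketched as follows. Everything here is an assumption made for illustration: the in-memory `REGISTRY`, the two `fake_*` functions standing in for remote services, and the placeholder identifiers they return (these are not real gene or pathway records). A real SADI client discovers and invokes Web Services; this sketch only mirrors the control flow.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical stand-ins for remote services; return values are placeholders.
def fake_encoded_by(subject: str) -> List[str]:
    return ["gene:EXAMPLE_1"] if subject == "uniprot:P47989" else []

def fake_participant_in(subject: str) -> List[str]:
    if subject == "gene:EXAMPLE_1":
        return ["pathway:EXAMPLE_A", "pathway:EXAMPLE_B"]
    return []

# Hypothetical registry: predicate -> service that can generate such triples.
REGISTRY: Dict[str, Callable[[str], List[str]]] = {
    "pred:isEncodedBy": fake_encoded_by,
    "ont:isParticipantIn": fake_participant_in,
}

def resolve(patterns: List[Triple]) -> Tuple[Set[Triple], List[Dict[str, str]]]:
    """Resolve patterns left to right, calling one service per predicate."""
    store: Set[Triple] = set()
    solutions: List[Dict[str, str]] = [{}]  # start from one empty binding
    for subj, pred, obj in patterns:
        service = REGISTRY.get(pred)  # "discovery" step
        next_solutions: List[Dict[str, str]] = []
        for binding in solutions:
            subject = binding.get(subj, subj)  # substitute a bound variable
            if service is None or subject.startswith("?"):
                continue  # no service found, or no concrete input to send
            for value in service(subject):  # "execution" step
                store.add((subject, pred, value))  # "storage" step
                if obj.startswith("?"):
                    next_solutions.append({**binding, obj: value})
                elif obj == value:
                    next_solutions.append(dict(binding))
        solutions = next_solutions
    return store, solutions

patterns = [
    ("uniprot:P47989", "pred:isEncodedBy", "?gene"),
    ("?gene", "ont:isParticipantIn", "?pathway"),
]
store, solutions = resolve(patterns)
for row in solutions:
    print(row["?gene"], row["?pathway"])
```

Note that the triple store starts empty: every triple used to answer the query is generated on demand by a service call.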

We posed, and answered, a ~complex database query WITHOUT A DATABASE

(in fact, the data didn’t even have to exist...)

Note that there is no centralized ontology

Unlike BioMoby, SADI supports all (OWL) ontologies and does not invent any of its own

Holy Grail Demo #2

Show me the latest Blood Urea Nitrogen and Creatinine levels of patients who appear to be rejecting their transplants

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
SELECT ?patient ?bun ?creat
FROM <http://sadiframework.org/ontologies/patients.rdf>
WHERE {
  ?patient rdf:type patient:LikelyRejecter .
  ?patient l:latestBUN ?bun .
  ?patient l:latestCreatinine ?creat .
}

Start burrowing through the LikelyRejecter OWL class and find that we need a regression model OWL class

Regression models have features like slopes and intercepts, and so on. The class is completely decomposed until a set of required Services is discovered that is capable of creating all these necessary properties
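The decomposition described above can be sketched as a recursive walk over class definitions. The slides do not show the actual LikelyRejecter ontology, so the dictionaries below are invented: the property names (`hasCreatinineModel`, `hasSlope`, `hasIntercept`) and the service name are hypothetical, and real OWL class expressions are far richer than a list of (property, range) pairs.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical, heavily simplified class definitions: each class lists the
# (property, range class) pairs its members must have. All names are invented.
CLASS_DEFINITIONS: Dict[str, List[Tuple[str, Optional[str]]]] = {
    "LikelyRejecter": [("hasCreatinineModel", "RegressionModel")],
    "RegressionModel": [("hasSlope", None), ("hasIntercept", None)],
}

# Hypothetical registry: property -> name of a service that can generate it.
SERVICES: Dict[str, str] = {
    "hasCreatinineModel": "linear-regression-service",
    "hasSlope": "linear-regression-service",
    "hasIntercept": "linear-regression-service",
}

def required_properties(cls: str, seen: Optional[Set[str]] = None) -> List[str]:
    """Recursively collect every property needed to recognise a member of cls."""
    seen = set() if seen is None else seen
    if cls in seen:  # guard against cyclic definitions
        return []
    seen.add(cls)
    props: List[str] = []
    for prop, range_cls in CLASS_DEFINITIONS.get(cls, []):
        props.append(prop)
        if range_cls is not None:  # burrow into the range class
            props.extend(required_properties(range_cls, seen))
    return props

def services_for(cls: str) -> Set[str]:
    """Map the collected properties onto the services able to create them."""
    return {SERVICES[p] for p in required_properties(cls) if p in SERVICES}

print(required_properties("LikelyRejecter"))
print(services_for("LikelyRejecter"))
```

The walk stops when every remaining property has no further class to burrow into, which is the point at which services can be looked up for each one.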

Decomposition of the OWL class uncovers the need for a Linear Regression analysis on the patient blood chemistry data
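As a rough illustration of what a linear regression service might compute, here is a minimal least-squares fit in plain Python. The readings are synthetic, and the membership rule (`slope > threshold`) and its threshold value are assumptions made for this sketch; the slides do not give the actual definition of a Likely Rejecter.

```python
from typing import Sequence, Tuple

def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def is_likely_rejecter(slope: float, threshold: float) -> bool:
    # Hypothetical rule: a rising trend above some threshold counts as a match.
    return slope > threshold

# Synthetic example: measurement day vs. creatinine reading (made-up numbers).
days = [0.0, 1.0, 2.0, 3.0]
creatinine = [1.0, 1.5, 2.0, 2.5]

slope, intercept = fit_line(days, creatinine)
print(slope, intercept, is_likely_rejecter(slope, threshold=0.2))
```

The slope and intercept are the kind of regression-model properties a service would attach to a patient so that class membership can then be decided.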

VOILA!

We just dynamically evaluated if individuals matching a particular high-level concept definition exist

…or can exist

How does SADI + SHARE do that?

Please see other presentations uploaded to SlideShare for a full explanation of SADI functionality

See also the Taverna and Protégé plug-ins for discovering, running and creating services

Taverna

Sentient Knowledge Explorer

The Holy Grail may not yet be in hand, but I think we can at least see it from here!

So… now what?

Mark’s Manifesto

What is my next “Holy Grail”?

Science

Support for the in silico Scientific Method

Reproducibility

Clarity (hypothesis)

Discourse

Disagreement

Clarity (experiment)

The Scientific Method

Discourse: What do you believe? What do I believe?

Disagreement: You’re wrong! And I’m gonna prove it!

Clarity: This is the experiment I am going to do

Reproducibility: This is how I did it (“provenance”)

Clarity: This is my new hypothesis

Workflows (e.g. myExperiment)

Reproducibility

Clarity (hypothesis)

Discourse

Disagreement

Clarity (experiment)

In opposition to the lessons we learnt from Web 2.0

The Semantic Web in Healthcare and Life Sciences

is currently solving the problems of science…

…by forming institutions

Result: Large, centrally-designed and centrally-curated ontologies that enforce “community agreement” about “biological reality”

Science ≠ Consensus

Reproducibility

Clarity (hypothesis)

Discourse

Disagreement

Clarity (experiment)

Reproducibility

Clarity (hypothesis)

Institutions & Consortia

Disagreement

Clarity (experiment)

Reproducibility

Clarity (hypothesis)

Institutions & Consortia

Consensus

Clarity (experiment)

Reproducibility

????

Institutions & Consortia

Consensus

Clarity (experiment)

To bring the “traditions of Science” to in silico Science we need Web 3.0 tools that encourage and facilitate personal opinion and debate

What has this got to do with SADI and SHARE?

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
SELECT ?patient ?bun ?creat
FROM <http://sadiframework.org/ontologies/patients.rdf>
WHERE {
  ?patient rdf:type patient:LikelyRejecter .
  ?patient l:latestBUN ?bun .
  ?patient l:latestCreatinine ?creat .
}

Likely Rejecter

I created a small ontology describing my definition of a Likely Rejecter

… it was MY ontology!

I can re-use it

I can modify it as I change my world-view

Reproducibility

Clarity (hypothesis)

Discourse

Disagreement

Clarity (experiment)

I can publish it for others to use

Reproducibility

Clarity (hypothesis)

Discourse

Disagreement

Clarity (experiment)

Others can modify it and/or compare it to THEIR world-view

Reproducibility

Clarity (hypothesis)

Discourse

Disagreement

Clarity (experiment)

Sharing my ontology also gives opportunities for micro-attribution;

“Citation” of me is transparent and automatic when someone extends my ontology

Using SADI and SHARE, my personal world-view is explicitly expressed and can be dynamically evaluated against global data and knowledge

Ontology development is distributed and personal rather than centralized

no institutions

“an ecosystem of ideas!”

…but there’s more…

“Likely Rejecter”

I made that up! It came out of my head!

What’s another word for a world-view that you make up?

Hypothesis

Reproducibility

Hypotheses

Discourse

Disagreement

Clarity (experiment)

The “Likely Rejecter” OWL Class is an explicitly-expressed hypothesis;

Members of that class may or may not exist!

Reproducibility

Hypotheses

Discourse

Disagreement

Experiment

Ontologically-expressed Hypotheses drive the discovery, assembly, and analysis of data capable of evaluating their validity

[Figure labels: Hypothesis (Blood Pressure, Hypertension, Ischemia); SADI + SHARE; Database 1; Database 2; Analytical Algorithm]

Join us!

SADI and CardioSHARE are Open-Source projects

Come join us – we’re having a lot of fun!!

http://sadiframework.org

#SADIFramework

SADI Semantic Web Services Page

C-BRASS: Canadian Bioinformatics Resources As Semantic Services

together with Michel Dumontier, Chris Baker

~$1M funding to help us deploy SADI services and provide training for new service providers

We can help you get started!

“C-BRASS” is on Facebook!

Credits

Benjamin VanderValk (SADI & SHARE)

Luke McCarthy (SADI & SHARE)

Soroush Samadian (CardioSHARE)

Microsoft Research

Fin

This presentation available on SlideShare: keywords ‘wilkinson’ ‘bosc’
