bio ontologies and semantic technologies

Introduction to Bio Ontologies and The Semantic Web M. Devisscher Biological Databases

Upload: prof-wim-van-criekinge

Post on 16-Apr-2017



TRANSCRIPT

Page 1: Bio ontologies and semantic technologies

Introduction to Bio Ontologies and the Semantic Web

M. Devisscher
Biological Databases

Page 2: Bio ontologies and semantic technologies

Overview

• Bio ontologies
• Semantic technologies

• Practical sessions:
– Protégé and a bio database
– DIY SPARQL endpoint

Page 3: Bio ontologies and semantic technologies

Introduction

• Ontologies: what are ontologies?

• Ontologies in the bio domain: OBO Foundry
• Ontologies in the semantic web

• OBO
• RDF, IRI, TTL, SPARQL, OWL

Page 4: Bio ontologies and semantic technologies

What is  an ontology ?

• Ontology =  a  specification of  a  conceptualization (Gruber 1993)

• In practice: controlled vocabularies
– Disambiguation (e.g. Bank, Running)
– Language/species independence

• Very useful in  biology – complex  hierarchies of  terms

Page 5: Bio ontologies and semantic technologies

Ontologies in  the  bio  Domain

• OBO Foundry: Open Biological and Biomedical Ontologies

• Common principles
• List of ontologies at http://www.obofoundry.org

• OBO is also a data format (.obo)

Page 6: Bio ontologies and semantic technologies

SideTrack – The  Gene  Ontology

• The mother of bio-ontologies: the GO
– Oldest bio-ontology
– Many practical applications:
• Cross-species studies
• Term abundance studies

• GO  is  an OBO  ontology

Page 7: Bio ontologies and semantic technologies

SideTrack – The  Gene  Ontology

• Collection  of  terms

Page 8: Bio ontologies and semantic technologies

SideTrack – The  Gene  Ontology

• Relationships between terms:
– Subsumption: is_a
– Partonomic: part_of

• These relationships are transitive
• Terms form a DAG (directed acyclic graph)
• Some information can be inferred
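Because is_a and part_of are transitive and the terms form a DAG, every ancestor of a term can be derived by walking the edges upward. The sketch below illustrates this in plain Python; the term names and the PARENTS table are invented for illustration and are not real GO identifiers or a real GO API.

```python
# Toy ontology: each term maps to its direct parents via is_a / part_of.
# Term names are illustrative assumptions, not real GO identifiers.
PARENTS = {
    "mitochondrial membrane": {("part_of", "mitochondrion")},
    "mitochondrion": {("is_a", "organelle")},
    "organelle": {("is_a", "cellular component")},
    "cellular component": set(),
}

def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following edges upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

print(sorted(ancestors("mitochondrial membrane")))
```

Something annotated with the most specific term is thereby implicitly linked to all of its ancestors, which is the kind of inference the slide refers to.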

Page 9: Bio ontologies and semantic technologies

SideTrack – The  Gene  Ontology

Page 10: Bio ontologies and semantic technologies

SideTrack – The  Gene  Ontology

Page 11: Bio ontologies and semantic technologies

SideTrack – The  Gene  Ontology

• Know more: www.geneontology.org
• AmiGO: the GO browser

Page 12: Bio ontologies and semantic technologies

Gene  Ontology  Annotation

• Gene Ontology Annotations (GOA) = entities labeled with GO terms
– E.g. UniProt-GOA

Page 13: Bio ontologies and semantic technologies

Semantic Technologies

• The semantic web: Tim Berners-Lee et al., Scientific American, 2001

Page 14: Bio ontologies and semantic technologies

Semantic Technologies

• W3C: a set of specifications
http://www.w3.org/standards/semanticweb/

• A mature toolset
– Dedicated data formats
– Storage
– Query language

Page 15: Bio ontologies and semantic technologies

Semantic Technologies

• Basic data element = a Triple
– A mini sentence
– Contains three Terms:
• Subject Predicate Object

Page 16: Bio ontologies and semantic technologies

Semantic Technologies

• Representation of triples
– All data expressed in RDF (Resource Description Framework)
– Basic data format: RDF/XML
– Several compatible syntaxes: TTL (Turtle, the Terse RDF Triple Language) is the most human-readable

Page 17: Bio ontologies and semantic technologies

Example

Page 18: Bio ontologies and semantic technologies

The  Turtle Syntax

• Basic  Triple

<http://bioinformatics.be/entities#martijn> <http://bioinformatics.be/relations#has_favorite_beer> <http://bioinformatics.be/entities#karmeliet> .

Page 19: Bio ontologies and semantic technologies

The  Turtle Syntax

• Prefix

@prefix b4x: <http://bioinformatics.be/terms#> .
b4x:martijn b4x:has_favorite_beer b4x:karmeliet .
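A prefix is only an abbreviation: a parser replaces the part before the colon with the declared namespace IRI. The plain-Python sketch below mimics that expansion. The PREFIXES table and the expand function are illustrative names, not part of any RDF library.

```python
# Namespace table, as a Turtle parser would build it from @prefix lines.
PREFIXES = {"b4x": "http://bioinformatics.be/terms#"}

def expand(prefixed_name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'b4x:martijn' into a full IRI string."""
    prefix, _, local = prefixed_name.partition(":")
    return prefixes[prefix] + local

print(expand("b4x:martijn"))
# http://bioinformatics.be/terms#martijn
```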

Page 20: Bio ontologies and semantic technologies

The  Turtle Syntax

• Predicate lists

@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
b4x:martijn b4x:has_favorite_beer b4x:karmeliet ;
    foaf:name "Martijn Devisscher" .

Page 21: Bio ontologies and semantic technologies

The  Turtle Syntax

• Object  lists

@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
b4x:martijn b4x:has_favorite_beer b4x:karmeliet ,
        b4x:chimay_blauw ;
    foaf:name "Martijn Devisscher" .

Page 22: Bio ontologies and semantic technologies

IRI’s and Literals

• Terms can be either IRIs, literals or blank nodes
• IRI = Internationalized Resource Identifier
• Unique id – a virtual URI
– Example: http://bioinformatics.be/terms#martijn
– There is no requirement that an IRI resolves
– Now: Open Data initiatives: please do use resolvable URIs (http://linkeddata.org)
– Unique identifiers can be registered on http://identifiers.org

Page 23: Bio ontologies and semantic technologies

Introduction

• Literals: can be typed; allowed types come from the XSD namespace:
– E.g. "This is a string example"^^xsd:string
– E.g. "5"^^xsd:integer

• IRIs are used for entities and attributes
• Literals are used for attribute values that aren't entities

Page 24: Bio ontologies and semantic technologies

The  Turtle Syntax

• Typed literals

@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
b4x:martijn b4x:has_favorite_beer b4x:karmeliet ,
        b4x:chimay_blauw ;
    b4x:length "184"^^xsd:integer ;
    foaf:name "Martijn Devisscher"^^xsd:string .

Page 25: Bio ontologies and semantic technologies

The  Turtle Syntax

• Blank  nodes

@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
b4x:martijn b4x:has_favorite_beer b4x:karmeliet ,
        b4x:chimay_blauw ;
    b4x:length "184"^^xsd:integer ;
    foaf:name "Martijn Devisscher"^^xsd:string ;
    b4x:owns_cat [ b4x:color "Gray" ] .

Page 26: Bio ontologies and semantic technologies

Classes  and Individuals

• rdf:type

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
b4x:martijn rdf:type foaf:Person .

Page 27: Bio ontologies and semantic technologies

Classes  and Individuals

• Shorthand:  a

@prefix b4x: <http://bioinformatics.be/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
b4x:martijn a foaf:Person ;
    foaf:knows b4x:geert .
b4x:geert a foaf:Person .

Page 28: Bio ontologies and semantic technologies

Example

<http://xmpl/entities#martijn> <http://xmpl/relations#has_favorite_beer> <http://xmpl/entities#karmeliet> .

Page 29: Bio ontologies and semantic technologies

Semantic Technologies

• Sets  of  triples form  a  Graph

Page 30: Bio ontologies and semantic technologies

Graphs

• Triples are  building  blocks of  Graphs

• Combining sets  of  triples allows the  construction of  arbitrarily complex  graphs

[Figure: node b4x:martijn connected to node b4x:karmeliet by an edge labelled has_favorite_beer]

Page 31: Bio ontologies and semantic technologies

Add meaning !

• Reuse terms from existing, well-defined vocabularies – ontologies (foaf, dc, go, so)

• Describe new terms = Ontologies

• Contain
– A crisp human definition
– Some machine-readable facts

Page 32: Bio ontologies and semantic technologies

Metadata

• Ontologies are also described in RDF
– RDFS: RDF Schema
– OWL: Web Ontology Language
– Also expressed in RDF

• For clarity, the file extension can be .rdfs or .owl

Page 33: Bio ontologies and semantic technologies

RDFS  Essentials

• Descriptions
– rdfs:label
– rdfs:comment

Page 34: Bio ontologies and semantic technologies

RDFS

• Relationships between properties, classes
– rdfs:Class
– rdfs:subClassOf
– rdf:Property
– rdfs:subPropertyOf
– rdfs:range
– rdfs:domain

Page 35: Bio ontologies and semantic technologies

RDFS:  Example

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix b4x: <http://bioinformatics.be/terms#> .

b4x:karmeliet a b4x:Trappist .
b4x:Beer a rdfs:Class .
b4x:Trappist a rdfs:Class .
b4x:Trappist rdfs:subClassOf b4x:Beer .
b4x:has_favorite_beer a rdf:Property ;
    rdfs:domain foaf:Person ;
    rdfs:range b4x:Beer .

b4x:Beer rdfs:subClassOf b4x:Drink .

Page 36: Bio ontologies and semantic technologies

Analogy

• RDF = database = data
• RDFS/OWL = schema = metadata

• Both  are  described in  RDF,  but  have  a  different  scope

Page 37: Bio ontologies and semantic technologies

Semantic Technologies

• Inference
– Enhance the dataset using knowledge from metadata (e.g. RDFS, OWL)

• Types of inference engines
– RDFS inference
• RDFS entailment regime
– OWL inference
• Under active research
• Engines exist for specific subsets of OWL (OWL-DL)

Page 38: Bio ontologies and semantic technologies

RDFS  Entailment

Page 39: Bio ontologies and semantic technologies

RDFS:  Inference

b4x:kevin b4x:has_favorite_beer b4x:stella .

Q:  What can we  infer from this using RDFS  entailment ?

Page 40: Bio ontologies and semantic technologies

RDFS:  Inference

b4x:kevin b4x:has_favorite_beer b4x:stella .

Inferred triples:
b4x:kevin a foaf:Person . [from domain]
b4x:stella a b4x:Beer . [from range]
b4x:stella a b4x:Drink . [from subClassOf]
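The three inferred triples can be reproduced with a small fixed-point loop. The sketch below is plain Python over tuples, not a real inference engine. It implements only the domain, range and subClassOf rules (the full RDFS entailment regime has more), and the function name rdfs_closure is an assumption made for illustration.

```python
RDF_TYPE = "rdf:type"

# Schema triples taken from the RDFS example slide.
SCHEMA = {
    ("b4x:has_favorite_beer", "rdfs:domain", "foaf:Person"),
    ("b4x:has_favorite_beer", "rdfs:range", "b4x:Beer"),
    ("b4x:Trappist", "rdfs:subClassOf", "b4x:Beer"),
    ("b4x:Beer", "rdfs:subClassOf", "b4x:Drink"),
}
DATA = {("b4x:kevin", "b4x:has_favorite_beer", "b4x:stella")}

def rdfs_closure(triples):
    """Apply the domain, range and subClassOf rules until nothing new appears."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            for s2, p2, o2 in graph:
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, RDF_TYPE, o2))   # subject gets the domain class
                elif p2 == "rdfs:range" and s2 == p:
                    new.add((o, RDF_TYPE, o2))   # object gets the range class
                elif p2 == "rdfs:subClassOf" and p == RDF_TYPE and o == s2:
                    new.add((s, RDF_TYPE, o2))   # instance also belongs to the superclass
        if new <= graph:
            return graph
        graph |= new

inferred = rdfs_closure(SCHEMA | DATA) - SCHEMA - DATA
for triple in sorted(inferred):
    print(triple)
```

The loop needs two passes here: the first derives the Person and Beer types, the second uses the new Beer type to derive Drink.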

Page 41: Bio ontologies and semantic technologies

DuckTyping

• Watch  out  with inference !

Example: You want to express that people can have lengths

b4x:length a rdf:Property ;
    rdfs:domain foaf:Person ;
    rdfs:range xsd:integer .

Page 42: Bio ontologies and semantic technologies

DuckTyping

• Problem:

ex:VW_Transporter b4x:length "600"^^xsd:integer .

• RDFS entailment would infer that VW_Transporter is a Person!
• This is called duck typing

If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.

Page 43: Bio ontologies and semantic technologies

Task

• Find  a  solution:  express  in  rdfs that  people  can  have  lengths

Page 44: Bio ontologies and semantic technologies

Task

• Find  a  solution:  express  in  rdfs that  people  can  have  lengths

b4x:havingLength a rdfs:Class .
b4x:length a rdf:Property ;
    rdfs:domain b4x:havingLength ;
    rdfs:range xsd:integer .

foaf:Person rdfs:subClassOf b4x:havingLength .
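To see why the broader domain class helps, the sketch below compares what a domain-based reasoner would conclude under the naive and the revised schema. It is a simplified plain-Python illustration: infer_types is a hypothetical helper, each class is assumed to have at most one superclass, and only the domain and subClassOf rules are applied.

```python
def infer_types(data, domains, subclass_of):
    """Return {subject: set of classes} inferred from rdfs:domain and rdfs:subClassOf."""
    types = {}
    for subject, prop, _value in data:
        cls = domains.get(prop)
        while cls is not None:
            types.setdefault(subject, set()).add(cls)
            cls = subclass_of.get(cls)   # walk up to the superclass, if any
    return types

DATA = [
    ("ex:VW_Transporter", "b4x:length", 600),
    ("b4x:martijn", "b4x:length", 184),
]

# Naive schema: the domain of b4x:length is foaf:Person.
naive = infer_types(DATA, {"b4x:length": "foaf:Person"}, {})

# Revised schema: the domain is the broader class b4x:havingLength,
# and foaf:Person is declared a subclass of it.
revised = infer_types(
    DATA,
    {"b4x:length": "b4x:havingLength"},
    {"foaf:Person": "b4x:havingLength"},
)

print(naive["ex:VW_Transporter"])    # {'foaf:Person'}
print(revised["ex:VW_Transporter"])  # {'b4x:havingLength'}
```

Under the revised schema, having a length no longer implies being a person. Anything typed as foaf:Person would still be inferred to be a b4x:havingLength through the subclass link.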

Page 45: Bio ontologies and semantic technologies

Storing  RDF

• As an RDF file for download
• In a triplestore
– Database optimised for storing triples
– Examples: BlazeGraph, Fuseki, Sesame

Page 46: Bio ontologies and semantic technologies

Semantic Technologies

• Querying over RDF data: SPARQL
• Cool features:
– Distributed querying = actual distribution of data and computing resources
– SPARQL/Update: modify data

• SPARQL endpoints: SPARQL over HTTP

Page 47: Bio ontologies and semantic technologies

SPARQL  Query  Syntax

• First  example:

SELECT ?subject ?predicate ?object WHERE {
    ?subject ?predicate ?object .
}

(Generally not a good idea, as it will pull down the whole dataset)

Binding  variables

Graph matching

Page 48: Bio ontologies and semantic technologies

?

PREFIX b4x: <http://bioinformatics.be/terms#>
SELECT ?person WHERE {
    ?person b4x:has_favorite_beer b4x:karmeliet .
}
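Conceptually, a SPARQL engine answers such a query by matching the triple pattern against every triple in the graph and binding the variables wherever it fits. The plain-Python sketch below imitates this for a single pattern. The match function and the sample GRAPH are illustrative assumptions and say nothing about how real triplestores are implemented.

```python
# A tiny graph of (subject, predicate, object) triples from the earlier slides.
GRAPH = {
    ("b4x:martijn", "b4x:has_favorite_beer", "b4x:karmeliet"),
    ("b4x:martijn", "b4x:has_favorite_beer", "b4x:chimay_blauw"),
    ("b4x:kevin", "b4x:has_favorite_beer", "b4x:stella"),
}

def match(pattern, graph=GRAPH):
    """Yield one binding dict per triple that matches the pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    for triple in graph:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if bindings.setdefault(term, value) != value:
                    break   # same variable would need two different values
            elif term != value:
                break       # constant term does not match
        else:
            yield bindings

people = [b["?person"] for b in
          match(("?person", "b4x:has_favorite_beer", "b4x:karmeliet"))]
print(people)
# ['b4x:martijn']
```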

Page 49: Bio ontologies and semantic technologies

?

Page 50: Bio ontologies and semantic technologies

SPARQL  Query  Syntax

• Limit  result size :

SELECT ?subject ?predicate ?object WHERE {
    ?subject ?predicate ?object .
} LIMIT 10

Page 51: Bio ontologies and semantic technologies

SPARQL  Query  Syntax

• Find all classes:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
    ?class a rdfs:Class .
    ?class rdfs:label ?label .
}

(This will only retrieve classes that have a label)

Page 52: Bio ontologies and semantic technologies

SPARQL  Query  Syntax

• Find all classes:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
    ?class a rdfs:Class .
    OPTIONAL {
        ?class rdfs:label ?label .
    }
}

Page 53: Bio ontologies and semantic technologies

SPARQL  Query  Syntax

• Find all classes that contain "duck" in the label:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
    ?class a rdfs:Class .
    ?class rdfs:label ?label .
    FILTER( CONTAINS( str(?label), "duck" ) )
}

Page 54: Bio ontologies and semantic technologies

SPARQL  Query  Syntax

• Make  it case  insensitive:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
    ?class a rdfs:Class .
    ?class rdfs:label ?label .
    FILTER( CONTAINS( UCASE(str(?label)), "DUCK" ) )
}

Page 55: Bio ontologies and semantic technologies

SPARQL  Query  Syntax

• Search  in  specific graph:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label
FROM <http://example.org/animals>
WHERE {
    ?class a rdfs:Class .
    ?class rdfs:label ?label .
    FILTER( CONTAINS( UCASE(str(?label)), "DUCK" ) )
}

Page 56: Bio ontologies and semantic technologies

SPARQL  Query  Syntax

• Search  in  specific graph:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
    GRAPH <http://example.org/animals> {
        ?class a rdfs:Class .
        ?class rdfs:label ?label .
        FILTER( CONTAINS( UCASE(str(?label)), "DUCK" ) )
    }
}

Page 57: Bio ontologies and semantic technologies

SPARQL  Query  Syntax

• Can also search  for graphs :

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?g WHERE {
    GRAPH ?g {
        ?class a rdfs:Class .
        ?class rdfs:label ?label .
        FILTER( CONTAINS( UCASE(str(?label)), "DUCK" ) )
    }
}

Page 58: Bio ontologies and semantic technologies

Summary:  Querying RDF  data

[Figure: RDF data and RDFS/OWL metadata feed an inference engine; the resulting RDF data plus inferred triples are exposed through a SPARQL endpoint]

Page 59: Bio ontologies and semantic technologies

Take home Summary

• Basic data element = a Triple
– A mini sentence
– Contains three Terms: Subject Predicate Object

• Example:

<http://xmpl/entities#martijn> <http://xmpl/relations#has_favorite_beer> <http://xmpl/entities#karmeliet> .

Page 60: Bio ontologies and semantic technologies

• Combine triples to represent knowledge

Page 61: Bio ontologies and semantic technologies

• Use terms from ONTOLOGIES
– COMMON VOCABULARIES
– POSSIBLE TO INFER MEANING
• OMIABIS
• OBIB
• SNOMED/ICD
• MESH

Page 62: Bio ontologies and semantic technologies

?

• SPARQL searches for patterns

Page 63: Bio ontologies and semantic technologies

?

Page 64: Bio ontologies and semantic technologies

Interoperability between OBO and Semantic Technologies

• Originated from two separate academic worlds
• Computing applications of OBO: mainly consistency checking and overrepresentation analysis

• Semantic Technologies: much broader toolset

• Interoperability?
– Direct offering in both formats
– Automated mapping

Page 65: Bio ontologies and semantic technologies

Where to find ontologies

• OBO Foundry
• BioPortal (NCBO)
• BioGateway
• Bio2RDF

Page 66: Bio ontologies and semantic technologies

Where to find RDF  data

• Google for "SPARQL endpoint"
• => e.g. EBI databases

• Non  biological:  DBpedia

Page 67: Bio ontologies and semantic technologies

How about Tim Berners-Lee's vision?

• We're not there yet, but for bio data we're getting quite close
– The explicitome
– Crowd sourcing
– Nanopublications

Page 68: Bio ontologies and semantic technologies

SPARQL  in  PRACTICE

Page 75: Bio ontologies and semantic technologies

SPARQL  :  Recap

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label FROM <http://graphName> WHERE {
    ?x rdfs:label ?label .
    FILTER ( CONTAINS(?label, "dimethylalinine") )
} ORDER BY ?label LIMIT 10

• FIND the pattern ?x rdfs:label ?label .
• BIND variables ?label, ?x
• RETRIEVE variable ?label
• PREFIX: expand rdfs:label to <http://www.w3.org/2000/01/rdf-schema#label>
• FILTER results to labels containing "dimethylalinine"
• ORDER results by label, then LIMIT them to the first 10 matches (ORDER BY must precede LIMIT)

Page 76: Bio ontologies and semantic technologies

SPARQL  :  Recap

DESCRIBE <http://rdf.wikipathways.org/Pathway/WP1425_r74390/WP/Interaction/e077e>

• Useful  short  query  to  get  direct  links  from/to  a  given  node

Page 77: Bio ontologies and semantic technologies

SPARQL  REFERENCE

http://www.w3.org/TR/sparql11-overview/

Page 78: Bio ontologies and semantic technologies

Running SPARQL

• From a web interface

Page 79: Bio ontologies and semantic technologies

Running SPARQL

• From a web interface
• Using HTTP
– HTTP GET
– HTTP POST: for larger query strings
– Headers determine response type (JSON, XML, HTML)

http://…/sparql?default-graph-uri=<http://graphName>&query=URLENCODEDQUERYSTRING
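Building the GET URL by hand is error-prone because the query string must be URL-encoded. The sketch below uses only Python's standard library to assemble such a URL. The endpoint address http://example.org/sparql is a made-up placeholder, since the slide elides the real one, and no request is sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://example.org/sparql"   # hypothetical endpoint, not from the slides
GRAPH_IRI = "http://graphName"           # placeholder graph name used on the slides
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o . } LIMIT 10"

def build_get_url(endpoint, query, default_graph=None):
    """Return a SPARQL-over-HTTP GET URL with a URL-encoded query string."""
    params = {"query": query}
    if default_graph:
        # The parameter value is the bare graph IRI.
        params["default-graph-uri"] = default_graph
    return endpoint + "?" + urlencode(params)

url = build_get_url(ENDPOINT, QUERY, GRAPH_IRI)
print(url)

# Decoding the URL again recovers the original query text.
print(parse_qs(urlsplit(url).query)["query"][0])
```

When actually sending the request, an Accept header such as application/sparql-results+json is the usual way to ask for JSON results. For long queries, the same parameters can go in the body of a POST instead.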

Page 80: Bio ontologies and semantic technologies

BIO-ONTOLOGIES

Page 81: Bio ontologies and semantic technologies

BioPortal

Page 82: Bio ontologies and semantic technologies

Access

• From the web interface
• SPARQL endpoint: using an API key, on request
• Running a local copy: download the VM image, on request

Page 83: Bio ontologies and semantic technologies

Exercises

• Find a term
• Find ontologies containing a term
• Browse some ontologies
• Check the NCBO Annotator

Page 84: Bio ontologies and semantic technologies

BIO-DATA

Page 85: Bio ontologies and semantic technologies

EBI  RDF  Resources

Page 86: Bio ontologies and semantic technologies

EBI  RDF  Resources

Page 87: Bio ontologies and semantic technologies

Ensembl

Page 88: Bio ontologies and semantic technologies

Exercise

• From  uniprot find  proteins  that  are  annotated  with  a  given  Gene  Ontology  term

Page 89: Bio ontologies and semantic technologies

PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
SELECT * WHERE {
    ?protein up:classifiedWith obo:GO_0004499 .
    ?protein up:organism taxon:9606 .
}

http://sparql.uniprot.org

Page 90: Bio ontologies and semantic technologies

Exercise

• From Expression Atlas, find proteins that are differentially expressed (P < 1e-12) in Crohn's disease

Page 91: Bio ontologies and semantic technologies

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX efo: <http://www.ebi.ac.uk/efo/>
PREFIX atlas: <http://rdf.ebi.ac.uk/resource/atlas/>
PREFIX atlasterms: <http://rdf.ebi.ac.uk/terms/atlas/>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX biopax3: <http://www.biopax.org/release/biopax-level3.owl#>
SELECT DISTINCT ?protein ?expressionValue ?pvalue WHERE {
    ?factor rdf:type efo:EFO_0000384 .
    ?value atlasterms:hasFactorValue ?factor .
    ?value atlasterms:isMeasurementOf ?probe .
    ?value atlasterms:pValue ?pvalue .
    ?value rdfs:label ?expressionValue .
    ?probe atlasterms:dbXref ?protein .
    FILTER ( ?pvalue < 1e-12 )
    FILTER ( strstarts(str(?protein), "http://purl.uniprot.org/uniprot/") )
}
ORDER BY ASC(?pvalue)

https://www.ebi.ac.uk/rdf/services/atlas/sparql

Page 92: Bio ontologies and semantic technologies

WikiPathways

• Links pathways with genes, terms from the Pathway, Cell line and Disease ontologies, and PubMed references
• Models individual interactions
• Can be downloaded as RDF
• Has an experimental SPARQL endpoint

Page 93: Bio ontologies and semantic technologies

Exercise

• Define a query to find pathways linked to the TNFalpha gene

Page 94: Bio ontologies and semantic technologies

PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?PathwayName WHERE {
    ?geneProduct a wp:GeneProduct .
    ?geneProduct dc:identifier ?GeneID .
    ?geneProduct dcterms:isPartOf ?pathway .
    ?geneProduct rdfs:label ?geneName .
    ?pathway dc:identifier ?pathwayid .
    ?pathway dc:title ?PathwayName .
    FILTER( str(?geneName) = "TNFalpha" )
}

http://sparql.wikipathways.org

Page 95: Bio ontologies and semantic technologies
Page 96: Bio ontologies and semantic technologies
Page 97: Bio ontologies and semantic technologies

Exercise

• Try this, or another query
– Using the web interface
– Using HTTP GET
• Define a simple DESCRIBE
• Use a web tool to URL-encode the query
• Submit the query as a URL parameter

Page 98: Bio ontologies and semantic technologies

DisGeNet

Page 99: Bio ontologies and semantic technologies

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
PREFIX up: <http://purl.uniprot.org/core/>
SELECT DISTINCT ?gene WHERE {
    ?gda sio:SIO_000628 ?gene, ?disease .
    ?gene a ncit:C16612 .
    ?gene skos:exactMatch ?GeneID .
    ?disease a ncit:C7057 .
    ?disease dcterms:title ?DiseaseName .
    ?gda sio:SIO_000216 ?scoreIRI .
    ?scoreIRI sio:SIO_000300 ?score .
    FILTER ( ?score > "0.35"^^xsd:decimal )
    FILTER ( contains(str(?DiseaseName), "Crohn") )
}

http://rdf.disgenet.org/lodestar

Page 100: Bio ontologies and semantic technologies
Page 101: Bio ontologies and semantic technologies

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>

http://rdf.disgenet.org/lodestar

Page 102: Bio ontologies and semantic technologies

SELECT DISTINCT ?PathwayName WHERE {
    ?gda sio:SIO_000628 ?gene, ?disease .
    ?gene a ncit:C16612 .
    ?disease a ncit:C7057 .
    ?disease dcterms:title ?DiseaseName .
    ?gda sio:SIO_000216 ?scoreIRI .
    ?scoreIRI sio:SIO_000300 ?score .
    FILTER ( ?score > "0.35"^^xsd:decimal )
    FILTER ( contains(str(?DiseaseName), "Crohn") )
    SERVICE <http://sparql.wikipathways.org/> {
        ?geneProduct a wp:GeneProduct .
        ?geneProduct dc:identifier ?gene .
        ?geneProduct dcterms:isPartOf ?pathway .
        ?pathway dc:identifier ?pathwayid .
        ?pathway dc:title ?PathwayName .
    }
}

http://rdf.disgenet.org/lodestar/sparql

Page 103: Bio ontologies and semantic technologies