Department of Bioinformatics - BiGCaT 1
RDF creation with scripts
(and Open PHACTS specs)
Egon Willighagen (@egonwillighagen)
17 June 2013, #MIILS2013, Leiden
Scripting?
• Reproducible
• Electronic Lab Notebook
  – Put everything in a Git repository
• Automates
  – Data set creation/conversion
  – Autogeneration of VoID headers
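As an illustration of the last point, a VoID header could be generated from a script along the following lines. This is a minimal sketch in plain Groovy that only builds a Turtle string; the helper name voidHeader, the data set URI, the title, and the triple count are hypothetical, and the Bioclipse managers are not used here.

```groovy
// Sketch: build a minimal VoID header as a Turtle string.
// The helper name and all values passed to it are hypothetical.
def voidHeader(String datasetUri, String title, String created, long tripleCount) {
    def lines = [
        '@prefix void: <http://rdfs.org/ns/void#> .',
        '@prefix dcterms: <http://purl.org/dc/terms/> .',
        '@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .',
        '',
        "<${datasetUri}> a void:Dataset ;",
        "    dcterms:title \"${title}\" ;",
        "    dcterms:created \"${created}\"^^xsd:date ;",
        "    void:triples ${tripleCount} ."
    ]
    // join() turns the mixed String/GString list into one String
    return lines.join('\n')
}

header = voidHeader('http://example.org/dataset/test', 'Test data set', '2013-06-17', 3)
println header
```

Because the header is produced by the same script that creates the data, the triple count and creation date can be filled in from the run itself rather than typed by hand.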
RDF scripting in Bioclipse?
• Access to Jena
• Groovy or JavaScript
• Domain extensions
Try:
http://pele.farmbio.uu.se/jenkins/job/OpenTox_QSAR_DS/
The Bioclipse Workbench
Bioclipse: New project, new file
• File → New → General → Project
  – Enter a name
  – Finish
• File → New → General → File
  – Select project
  – Enter a name, e.g. test.n3
  – Finish
RDF formats
• Save files:
  – rdf.saveRDFN3(..)
  – rdf.saveRDFNTriple(..)
  – rdf.saveRDFXML(..)
• As String:
  – rdf.asRDFN3(..)
  – rdf.asTurtle(..)
Install RDF graph visualizer
Open with → RDF Graph Viewer
JavaScript or Groovy ?
Window → Show View → Other... →
CSV files in Bioclipse
CSV files in Bioclipse: path...
Right click on Project →Properties
CSV files in Groovy
filename = "/home/egonw/workspaces/dataman/Test/test.csv"
line = 0
new File(filename).splitEachLine(",") { fields ->
    line++
    if (line != 1) { // skip the header
        println fields
        println fields[0]
    }
}
CSV files in Groovy: output
[foo, 2, 3]
foo
[bar, 3, 4]
bar
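The step from CSV rows to triples can be sketched in plain Groovy before bringing in the rdf manager. The example below is an illustration only: the inline CSV text and its header names are hypothetical stand-ins for test.csv, and the triples are written as N-Triples strings rather than added to a Bioclipse store.

```groovy
// Sketch: turn CSV rows into N-Triples lines, one rdfs:label per row.
// The inline CSV text and its header names are hypothetical stand-ins for test.csv.
exNS = "http://example.org/"
rdfsNS = "http://www.w3.org/2000/01/rdf-schema#"

csv = """name,x,y
foo,2,3
bar,3,4"""

triples = []
csv.readLines().eachWithIndex { row, index ->
    if (index == 0) return // skip the header
    def fields = row.split(",")
    // subject URI from the first column, capitalized first column as label
    String triple = "<${exNS}${fields[0]}> <${rdfsNS}label> \"${fields[0].capitalize()}\" ."
    triples << triple
}
triples.each { println it }
```

Inside Bioclipse, the line that builds the string would presumably be replaced by a call such as rdf.addDataProperty(..) on a store, as the later slides show.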
Setting up and filling a triples store
s = rdf.createInMemoryStore()
rdf.addPrefix(s, "foo", "http://example.org/")
rdf.addDataProperty(s,
    "http://example.org/foo",
    "http://www.w3.org/2000/01/rdf-schema#label",
    "Foo"
)
rdf.addDataProperty(s,
    "http://example.org/bar",
    "http://www.w3.org/2000/01/rdf-schema#label",
    "Bar"
)
rdf.addObjectProperty(s,
    "http://example.org/foo",
    "http://example.org/interactsWith",
    "http://example.org/bar"
)
rdf.saveRDFN3(s, "/Test/csv.n3")
csv.n3
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foo: <http://example.org/> .
foo:bar
    rdfs:label "Bar" .

foo:foo
    rdfs:label "Foo" ;
    foo:interactsWith foo:bar .
Cleaning your code
exNS = "http://example.org/"
rdfsNS = "http://www.w3.org/2000/01/rdf-schema#"
s = rdf.createInMemoryStore()
rdf.addPrefix(s, "foo", exNS)
rdf.addDataProperty(s, exNS + "foo", rdfsNS + "label", "Foo")
rdf.addDataProperty(s, exNS + "bar", rdfsNS + "label", "Bar")
rdf.addObjectProperty(
    s, exNS + "foo", exNS + "interactsWith", exNS + "bar"
)
rdf.saveRDFN3(s, "/Test/csv.n3")
Help? → man rdf
Other (Bioclipse) managers
bioclipse browser cdk chemspider cml ds gist inchi jcp jcpglobal jmol js matrix molTable opentox opsin owl pellet pubchem qsar rdf signatures ui userManager ws xml
And you can install more features.
Pellet: an OWL reasoner
exNS = "http://example.org/"
owlNS = "http://www.w3.org/2002/07/owl#"
rdfNS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
// s = rdf.createInMemoryStore()
s = pellet.createInMemoryStore()
rdf.addObjectProperty(s, exNS + "hasAncestor", rdfNS + "type", owlNS + "TransitiveProperty");
rdf.addObjectProperty(s, exNS + "Elizabeth", exNS + "hasAncestor", exNS + "Carl");
rdf.addObjectProperty(s, exNS + "Carl", exNS + "hasAncestor", exNS + "Nils");
allAncestors = "SELECT ?ancestor { <" + exNS + "Elizabeth" + "> <" + exNS + "hasAncestor" + "> ?ancestor . }"
rdf.sparql(s, allAncestors)
OWL reasoning
• regular rdf store
[["ancestor"],
["http://example.org/Carl"]
]
• pellet OWL store
[["ancestor"],
["http://example.org/Nils"],
["http://example.org/Carl"]
]
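The difference between the two results comes from hasAncestor being declared an owl:TransitiveProperty. What that declaration implies can be sketched in plain Groovy. The map and the helper name allAncestors below are hypothetical and only mimic the inference; this is not how Pellet works internally.

```groovy
// Sketch: what a transitive property lets a reasoner infer, in plain Groovy.
// Illustration only; the map stands in for the asserted hasAncestor triples.
hasAncestor = [
    Elizabeth: ["Carl"],
    Carl: ["Nils"]
]

// Follow hasAncestor links until no new ancestors are found.
def allAncestors(Map links, String person) {
    def found = []
    def toVisit = [person]
    for (int i = 0; i < toVisit.size(); i++) {
        for (ancestor in (links[toVisit[i]] ?: [])) {
            if (!found.contains(ancestor)) {
                found << ancestor
                toVisit << ancestor
            }
        }
    }
    return found
}

println hasAncestor["Elizabeth"]                 // asserted triples only
println allAncestors(hasAncestor, "Elizabeth")   // with transitivity applied
```

The first lookup corresponds to the regular store, which only returns what was asserted; the second corresponds to the reasoning store, which also returns the inferred ancestor.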
SPARQL for unit testing
results = rdf.sparql(s, unitTest1)
// expect at least 1 row
if (results.getRowCount() < 1) {
    println("FAIL: too little data!")
}
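When several such checks accumulate, the comparison can be wrapped in a small helper. The sketch below is plain Groovy and makes one loud assumption: results are modeled as a list of rows whose first row holds the variable names, mirroring the printed output on the OWL reasoning slide, and not as the actual return type of rdf.sparql(). The helper name checkMinRows is hypothetical.

```groovy
// Sketch: a reusable row-count check for query results.
// Assumption: results is a list of rows with the variable names in the first row;
// this mirrors the printed output, not the real return type of rdf.sparql().
def checkMinRows(List results, int expected, String testName) {
    int rows = results.size() - 1 // do not count the header row
    String verdict = rows >= expected ? "PASS" : "FAIL"
    return "${verdict}: ${testName} returned ${rows} row(s), expected at least ${expected}".toString()
}

results = [["ancestor"], ["http://example.org/Nils"], ["http://example.org/Carl"]]
println checkMinRows(results, 2, "allAncestors")
println checkMinRows(results, 3, "allAncestors")
```

Returning the message instead of printing it inside the helper keeps the check easy to collect into a report at the end of a script.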
Share your script via myExperiment