bio2rdf - make the most of virtuoso open source

Making the most of Virtuoso Opensource Tutorial @ ICBO 2013

Upload: alisoncallahan

Post on 01-Nov-2014


DESCRIPTION

Learn how to deploy Bio2RDF data in a triple store and SPARQL endpoint using Virtuoso Open Source

TRANSCRIPT

Making the most of Virtuoso Opensource

Tutorial @ ICBO 2013

Tutorial roadmap

Triplestores are database management systems for data modeled in RDF

● Optimized for triples, queryable via the SPARQL query language

● There are three types of triplestores:
  ○ Native: persistent storage systems with their own database implementation. E.g. Virtuoso
  ○ In-memory: the RDF graph is stored as triples in RAM. E.g. Jena
  ○ Non-native, non-memory: persistent storage systems set up to run on a third-party DBMS. E.g. Jena SDB
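
Since triplestores are queried via SPARQL, a request to an endpoint can be sketched as a plain HTTP GET. The snippet below is a minimal illustration only: the endpoint URL (a local Virtuoso instance on port 8890 at the `/sparql` path) and the sample query are assumptions, not part of the tutorial.

```python
from urllib.parse import urlencode

# Assumption: a local Virtuoso instance serving SPARQL at /sparql on port 8890.
ENDPOINT = "http://localhost:8890/sparql"


def sparql_get_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Build a SPARQL protocol GET URL that asks for JSON results."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)


# Illustrative query: fetch any five triples.
url = sparql_get_url("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
```

The resulting URL can be opened in a browser or fetched with any HTTP client once an endpoint is running.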

Managing a VOS installation for Bio2RDF

● Each Bio2RDF dataset is loaded into a Virtuoso Triplestore and available at its own SPARQL endpoint

● We have developed a PHP manager script for creating and configuring Virtuoso installations that is publicly available at: https://github.com/micheldumontier/php-lib/blob/master/apps/manager.php

● Allows you to create, start, stop and change the memory requirements of multiple VOS instances
  ○ manager.php is available for download at the URL above

● Requires: a tab-delimited file listing the desired instance names and the HTTP/ISQL ports they use
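
As a sketch of what such an instances file might contain, the snippet below parses tab-delimited lines into records. The column order (instance name, HTTP port, ISQL port), the `#` comment convention, and the sample values are assumptions for illustration; check manager.php for the layout it actually expects.

```python
import csv
import io
from typing import List, NamedTuple


class Instance(NamedTuple):
    name: str
    http_port: int
    isql_port: int


def parse_instances(text: str) -> List[Instance]:
    """Parse tab-delimited lines of: name, HTTP port, ISQL port (assumed order)."""
    instances = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        # Skip blank lines and comment lines starting with '#'.
        if not row or row[0].startswith("#"):
            continue
        name, http_port, isql_port = row[:3]
        instances.append(Instance(name, int(http_port), int(isql_port)))
    return instances


# Hypothetical sample content; names and ports are made up.
SAMPLE = "drugbank\t8891\t1111\nomim\t8892\t1112\n"
instances = parse_instances(SAMPLE)
```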

Use manager.php to configure multiple VOS instances

● Script options
  ○ create: creates a Virtuoso instance binary for the specified instance name and starts it
  ○ start: stops, then starts the corresponding instance of VOS
  ○ stop: stops the specified instance
  ○ refresh: creates a fresh copy of the instance's virtuoso.ini file with default values
  ○ apacheconfig: creates an Apache VirtualHost file
  ○ GB of memory to use: specifies the amount of RAM, in GB, that an instance can use

manager.php cli options

Set up manager.php

● Set the location of your local Virtuoso installation
  ○ $virtuoso_dir = '/usr/local/virtuoso-opensource';

● Set the location of your base directory
  ○ $base_dir = '/media/320/bio2rdf/manager';

● Set the sub-directory where the Virtuoso instances will live
  ○ $instance_dir = $base_dir.'/virtuoso';

● Set the location of the instances.tab file
  ○ $instance_file = $base_dir.'/instances.tab';

Make sure you have appropriate permissions to create the target directories
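
One way to check the permissions before running the manager is sketched below. It is a hypothetical helper, not part of manager.php: it walks up from the target path to the nearest existing ancestor and tests whether that directory is writable by the current user.

```python
import os


def can_create(path: str) -> bool:
    """Return True if `path` is a writable directory, or could be created
    because its nearest existing ancestor is a writable directory."""
    probe = os.path.abspath(path)
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:  # reached the filesystem root without a match
            return False
        probe = parent
    # Creating entries needs write and search (execute) permission on the directory.
    return os.path.isdir(probe) and os.access(probe, os.W_OK | os.X_OK)
```

For example, `can_create('/media/320/bio2rdf/manager/virtuoso')` would report whether the instance directory from the settings above can be created by the current user.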

Live demo -> Using manager.php

Load Bio2RDF data into your VOS endpoint

● Loading data into a VOS endpoint can be done using the web interface (Conductor) or ISQL:
  ○ http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtBulkRDFLoader

● You can also use the Bio2RDF loader script to load RDF into an endpoint
  ○ uses a tab-delimited instances file to specify the VOS instance to load into
  ○ can load single files or recursively load files inside a given directory
  ○ will record loading errors/incorrectly formatted RDF
  ○ will retry a certain number of times to reload data on error
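
For the ISQL route, the bulk loader page linked above describes registering files with `ld_dir` (or `ld_dir_all` for recursive registration), running `rdf_loader_run()`, and finishing with a `checkpoint`. The sketch below only builds those statements and an `isql` command line; it does not run anything. The directory, file mask, graph URI, and port are placeholder assumptions, and depending on the VOS version the bulk loader procedures may need to be installed first.

```python
from typing import List


def sql_quote(value: str) -> str:
    """Quote a string literal for ISQL by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def bulk_load_statements(directory: str, mask: str, graph: str,
                         recursive: bool = False) -> List[str]:
    """Return the ISQL statements for one bulk-load run."""
    register = "ld_dir_all" if recursive else "ld_dir"
    return [
        f"{register}({sql_quote(directory)}, {sql_quote(mask)}, {sql_quote(graph)});",
        "rdf_loader_run();",  # load everything registered above
        "checkpoint;",        # persist the loaded data to disk
    ]


def isql_command(port: int, user: str, password: str,
                 statements: List[str], isql_bin: str = "isql") -> List[str]:
    """Build an argv list of the form: isql PORT USER PASS exec="..."."""
    return [isql_bin, str(port), user, password, "exec=" + " ".join(statements)]


# Placeholder values for illustration only.
stmts = bulk_load_statements("/data/drugbank", "*.nt", "http://example.org/drugbank")
cmd = isql_command(1111, "dba", "dba", stmts)
```

The directory passed to `ld_dir` must be readable by the server, which is what the DirsAllowed setting in virtuoso.ini controls.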

loader.php cli options

● Script options
  ○ file: file to load
  ○ dir: directory of files to load
  ○ graph: name of graph to load into
  ○ instance: instance name from the tab-delimited instances file
  ○ port: ISQL port of VOS instance
  ○ user: username for VOS instance (default: dba)
  ○ pass: password for VOS instance (default: dba)

Using loader.php

● Script options cont'd:
  ○ threads: number of threads to use when loading
  ○ updatefacet: update VOS faceted browser (true|false)
  ○ deletegraph: delete graph before loading (true|false)
  ○ deleteonly: delete graph without loading (true|false)
  ○ setns: set namespaces for faceted browser
  ○ setpassword: set password for VOS conductor
  ○ format: format of RDF being loaded (default: N3)
  ○ ignoreerror: ignore RDF format errors (true|false)
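
The loader is described as recording errors and retrying a limited number of times when a load fails. A minimal sketch of that pattern follows; it is not loader.php's actual code, and the function name, the `max_attempts` default, and the error format are all hypothetical.

```python
import time
from typing import Callable, List


def load_with_retries(load_once: Callable[[], None],
                      max_attempts: int = 3,
                      delay_seconds: float = 0.0) -> List[str]:
    """Call `load_once` until it succeeds or attempts run out.

    Returns the recorded error messages from failed attempts that preceded
    a success; raises RuntimeError if every attempt failed.
    """
    errors: List[str] = []
    for attempt in range(1, max_attempts + 1):
        try:
            load_once()
            return errors
        except Exception as exc:
            # Record the failure so it can be reported later.
            errors.append(f"attempt {attempt}: {exc}")
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise RuntimeError("; ".join(errors))
```

Here `load_once` stands in for whatever performs a single load, for example a function that runs one ISQL command.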

Using loader.php with Bio2RDF DrugBank data

● Live demo -> load DrugBank into your newly installed VOS instance

How to install Virtuoso Opensource

1. Download VOS 6.1.6 from GitHub: https://github.com/openlink/virtuoso-opensource

2. Verify that you have installed the required packages (see the reference below)

3. Uncompress the VOS archive, cd into that directory and run:
  ○ ./autogen.sh (to prepare for building)
  ○ Set the appropriate compiler flags
  ○ ./configure ( > 20 minutes)

4. Install VOS by running:
  ○ sudo make install
    ■ install location: /usr/local/virtuoso-opensource

How to build OpenLink Virtuoso OS: Ubuntu

Reference: http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSMake#Requirements

VOS Post-installation setup

1. Go to your installation directory and copy var/lib/virtuoso/db/virtuoso.ini to bin/virtuoso.ini

2. Allow Virtuoso to read other (or all) directories in your filesystem
  ○ Edit virtuoso.ini and add '/' at the end of DirsAllowed
    ■ see example here

3. Start your Virtuoso instance by running:
  ○ sudo ./virtuoso-t -f & (run from /usr/local/virtuoso-opensource/bin)
  ○ Visit http://localhost:8890/conductor and log in with the default credentials: user: dba, pass: dba

4. Install the Faceted Browser
  ○ Go to System Admin -> Packages, select the fct package and click on Install/Upgrade -> proceed
  ○ Verify that http://localhost:8890/fct responds
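
Step 2 (adding '/' to DirsAllowed) can also be scripted. The sketch below works on the text of virtuoso.ini rather than the file itself, and appends a directory to the comma-separated DirsAllowed value only if it is not already listed. The sample ini fragment is an assumption about what a default file looks like; back up the real file before changing it.

```python
def allow_dir(ini_text: str, new_dir: str = "/") -> str:
    """Return ini_text with new_dir appended to the DirsAllowed list."""
    out = []
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "DirsAllowed":
            dirs = [d.strip() for d in value.split(",") if d.strip()]
            if new_dir not in dirs:
                dirs.append(new_dir)
            # Keep the original key text (including its padding) unchanged.
            line = f"{key}= {', '.join(dirs)}"
        out.append(line)
    return "\n".join(out) + "\n"


# Assumed sample fragment, for illustration only.
SAMPLE_INI = (
    "[Parameters]\n"
    "DirsAllowed = ., /usr/local/virtuoso-opensource/share/virtuoso/vad\n"
)
updated = allow_dir(SAMPLE_INI)
```

Allowing '/' lets the server read from anywhere on the filesystem, so a narrower directory (for example the one holding the RDF dumps) may be preferable outside a tutorial setting.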

Reference: http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main#How Do I...