© 2008 OpenLink Software, All rights reserved.
Virtuoso Product Family
Orri Erling - Program Manager, Virtuoso
Virtuoso Product Categories
Virtual Database Engine
Native Data Management (multi-model covering: SQL, RDF, XML, and Free Text)
Discussion Platform
Mail Proxy Services
Client Connectivity Kit
Virtuoso Universal Server
Virtual Database Engine
RDF, XML, SQL Conceptual Views over:
External ODBC or JDBC accessible SQL Data Sources
External XML based Data Sources
External SOAP or RESTful Web Services
External RDF Data (e.g. Oracle)
Custom Data Sources via Server Extensions API
Virtual Database Engine Contd.
SQL Queries over Remote SQL, RDF, XML, and Web Services based Data Sources
SPARQL Queries over Remote SQL, RDF, XML, and Web Services based Data Sources
XQuery/XPath Queries over Remote RDF, SQL, and XML based Data Sources
Web Services based access to Remote RDF, SQL, XML, and other Web Services based Data Sources
Virtual Database Engine Contd.
Distributed Query Optimization
Locality Sensitive Query Cost Optimization (Collocated Joins, Pass-Through Queries, and Array Parameters)
Deductively Abstracts SQL Dialect Differences (via ODBC and JDBC metadata call exploitation)
Message Latency Factored into Cost Model
Hash Joins Used When Appropriate, Replacing Multiple Remote Lookups with Single Sequential Read
2-Phase Commit for Distributed Transactions
MS DTC for Windows
Tuxedo on Unix
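
The trade-off between many remote lookups and one sequential read plus a local hash join can be illustrated with a small cost comparison. This is only a sketch: the function names and every cost constant are assumptions chosen for illustration, not Virtuoso's actual cost model.

```python
# Illustrative cost comparison only; all names and constants are assumptions.

def lookup_plan_cost(outer_rows, round_trip_cost, row_lookup_cost):
    """One remote index lookup per outer row: a round trip plus a lookup each."""
    return outer_rows * (round_trip_cost + row_lookup_cost)


def hash_join_plan_cost(outer_rows, remote_rows, round_trip_cost,
                        row_scan_cost, hash_probe_cost):
    """One sequential read of the remote table, then local hash probes."""
    build = round_trip_cost + remote_rows * row_scan_cost
    probe = outer_rows * hash_probe_cost
    return build + probe


def choose_plan(outer_rows, remote_rows, round_trip_cost=20.0,
                row_lookup_cost=1.0, row_scan_cost=0.1, hash_probe_cost=0.05):
    """Return the cheaper plan name and its estimated cost."""
    lookups = lookup_plan_cost(outer_rows, round_trip_cost, row_lookup_cost)
    hashed = hash_join_plan_cost(outer_rows, remote_rows, round_trip_cost,
                                 row_scan_cost, hash_probe_cost)
    if hashed < lookups:
        return ("hash_join", hashed)
    return ("remote_lookups", lookups)


# Many outer rows: one sequential read beats thousands of round trips.
big_join = choose_plan(outer_rows=10_000, remote_rows=50_000)
# Very few outer rows: a couple of lookups beat scanning a large table.
tiny_join = choose_plan(outer_rows=2, remote_rows=1_000_000)
```

With these assumed constants the first call picks the hash join and the second keeps the remote lookups, which is the "when appropriate" part of the bullet.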
Virtual Database Engine Contd.
ATTACH TABLE Statement incorporates Remote Table, Indexes and Statistics into Local Virtuoso Schema
Allows Incorporation of SQL Functions and Stored Procedures from Remote Relational Database Engines
Support for Remote XML, Full Text Indexing for Oracle, Microsoft SQL Server
Native Data Management – Relational (RDBMS)
Native SQL 92/2K Engine
Rich Procedure Language (PSM-95 based)
Database Engine Optimized for SMP Performance
Native Full Text Indexing
RDBMS Features - Transactions
Full ACID Properties
Checkpoint + Roll Forward Log, Optional Archiving of Logs
Uncommitted/Read Committed/Repeatable/Serializable Isolations
Non-blocking Read Committed Shows Latest Committed Versions of Uncommitted Updated Rows
Can Work as XA/MS DTC Resource Manager
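
The non-blocking read committed behaviour can be pictured with a toy versioned row. This is a minimal sketch under stated assumptions: the class `VersionedRow` and its methods are hypothetical, and a real engine would keep this state in its lock and page structures rather than in a Python object.

```python
class VersionedRow:
    """Toy row that keeps the last committed value next to a pending update."""

    def __init__(self, committed_value):
        self.committed = committed_value
        self.pending = None  # (txn_id, new_value) while a writer is uncommitted

    def write(self, txn_id, value):
        # Only one uncommitted writer per row in this sketch.
        if self.pending is not None and self.pending[0] != txn_id:
            raise RuntimeError("row is locked by another writer")
        self.pending = (txn_id, value)

    def read_committed(self, txn_id=None):
        # The writer sees its own change; every other reader gets the last
        # committed version immediately instead of waiting for the lock.
        if self.pending is not None and self.pending[0] == txn_id:
            return self.pending[1]
        return self.committed

    def commit(self, txn_id):
        if self.pending is not None and self.pending[0] == txn_id:
            self.committed = self.pending[1]
            self.pending = None

    def rollback(self, txn_id):
        if self.pending is not None and self.pending[0] == txn_id:
            self.pending = None


row = VersionedRow("balance=100")
row.write(txn_id=1, value="balance=80")
seen_by_other = row.read_committed(txn_id=2)   # still the committed version
seen_by_writer = row.read_committed(txn_id=1)  # the writer's own update
row.commit(txn_id=1)
seen_after_commit = row.read_committed(txn_id=2)
```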
RDBMS Features - SQL
Full SQL 92 with many 2K Features
SQLX, XPath, XSLT, XQuery
SQL 2K Objects, Implementation in SQL/Java/.NET
Transparent Mixing of Local and Remote Tables
RDBMS Features – Query Optimization
Cost Based Optimization
On The Fly Sampling of Table/Column/Literal Key Cardinalities
Fixed Statistics for Deterministic Query Plans
Loop/Hash/Merge Join
SQL Options for Explicitly Specifying Query Plan
RDBMS Features - Storage Engine
Rows Stored At Leaves of Primary Key Index Tree
Non PK Indexes Refer to Row By Value of PK
Bitmap Index
Full Text Index
Striping Across Disks, No Separate Files Per Table/Key
Incremental Online Backup
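
The first two storage bullets describe a layout where the row lives with its primary key and every other index stores only the primary key value. A minimal sketch of that idea follows; the `Table` class is hypothetical, and a plain dictionary stands in for the primary key index tree.

```python
from collections import defaultdict


class Table:
    """Toy table: rows are stored under their primary key, and the secondary
    index holds primary key values instead of physical row addresses."""

    def __init__(self, pk, indexed_column):
        self.pk = pk
        self.indexed_column = indexed_column
        self.rows = {}                      # pk value -> full row
        self.secondary = defaultdict(set)   # column value -> set of pk values

    def insert(self, row):
        key = row[self.pk]
        old = self.rows.get(key)
        if old is not None:
            # Replacing a row: drop its stale secondary index entry first.
            self.secondary[old[self.indexed_column]].discard(key)
        self.rows[key] = row
        self.secondary[row[self.indexed_column]].add(key)

    def find_by_column(self, value):
        # Two steps: the secondary index yields pk values, then the
        # primary key structure yields the rows themselves.
        keys = self.secondary.get(value, set())
        return [self.rows[k] for k in sorted(keys)]


people = Table(pk="id", indexed_column="city")
people.insert({"id": 1, "name": "Ann", "city": "Oslo"})
people.insert({"id": 2, "name": "Bo", "city": "Riga"})
people.insert({"id": 3, "name": "Cy", "city": "Oslo"})
oslo_names = [r["name"] for r in people.find_by_column("Oslo")]
```

Because the secondary index never records where a row physically sits, rows can move inside the primary key structure without any secondary index needing an update.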
RDBMS Features - Run Time Hosting
User Defined Type via Java or .NET Objects Hosted in Process
User Defined Types Persisted in LOB Columns
Java/.NET Methods Called Transparently From SQL
C-Based Plugin Mechanism for Adding SQL Functions
RDBMS Features - Security
SQL Role Based Security, Column/Table/View/Procedure Level
Row Level Security With Policy Functions
A Policy Function Can Add Extra Conditions to Queries/Updates Depending on User, Time, Other Considerations
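
A policy function of this kind can be sketched as a hook that returns an extra condition which is then appended to the statement. The example below uses Python's built-in `sqlite3` purely as a stand-in database; `owner_policy`, `apply_policy`, and `run_as` are hypothetical names, and the naive string handling only covers simple single-table statements without `ORDER BY` or `GROUP BY`.

```python
import sqlite3


def owner_policy(user, operation):
    """Hypothetical policy: non-admin users may only touch rows they own.
    Returns (condition, parameters), or None for no restriction."""
    if user == "dba":
        return None
    return ("owner = ?", [user])


def apply_policy(sql, params, user, operation, policy=owner_policy):
    """Append the policy condition to a simple statement."""
    extra = policy(user, operation)
    if extra is None:
        return sql, list(params)
    condition, extra_params = extra
    joiner = " AND " if " where " in sql.lower() else " WHERE "
    return sql + joiner + "(" + condition + ")", list(params) + extra_params


def run_as(conn, user, sql, params=()):
    """Execute a query on behalf of a user with the policy applied."""
    secured_sql, secured_params = apply_policy(sql, params, user, "select")
    return conn.execute(secured_sql, secured_params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, owner TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "alice", 10.0), (2, "bob", 25.0), (3, "alice", 7.5)])

alice_rows = sorted(run_as(conn, "alice", "SELECT id FROM orders"))
dba_rows = sorted(run_as(conn, "dba", "SELECT id FROM orders"))
```

The application query stays the same for every user; only the condition contributed by the policy changes what each user can see.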
Data Center Features - Clustering
Combine Multiple Servers for Massive Scale and Parallelism
All Servers Show the Same SQL/RDF Data and Application Logic, A SQL or Web Client Can Connect to Any for the Same Service
Data Partitioning Specifiable Index by Index
Optional Replicated Storage of Partitions for More Load Balancing, Fault Tolerance
Shared Nothing Architecture, Works With Commodity Hardware and Networks
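
Partitioning with optional replicas can be sketched as a hash of the partitioning key followed by placement on more than one node. The scheme below (CRC32 hashing, replicas on consecutive nodes) is an assumption made for illustration; the slides do not describe Virtuoso's actual placement rules.

```python
import zlib


def partition_of(key, n_partitions):
    """Stable hash partitioning: the same key always maps to the same partition."""
    return zlib.crc32(str(key).encode("utf-8")) % n_partitions


def replicas_for(partition, nodes, copies):
    """Place `copies` replicas of a partition on consecutive nodes.
    Assumes copies <= len(nodes) so that replicas land on distinct nodes."""
    return [nodes[(partition + i) % len(nodes)] for i in range(copies)]


def nodes_for_key(key, nodes, n_partitions, copies=2):
    """All nodes that hold the partition containing `key`."""
    return replicas_for(partition_of(key, n_partitions), nodes, copies)


NODES = ["node1", "node2", "node3"]
holders = nodes_for_key("customer:42", NODES, n_partitions=12)
```

A read for `customer:42` can go to either holder (load balancing), and the partition stays available if one of the two nodes fails (fault tolerance).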
Data Center Features - Query Parallelization
Latency: One Message Round Trip is 20 Single Row Random Lookups
Virtuoso Divides Queries into Collocated Fragments, Ships All Filtering, Aggregation, Joining to Where the Data Is.
Sends Arrays of Hundreds of Operations at a Time, Whenever Possible
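
The arithmetic behind these bullets is easy to check. Taking the slide's figure that one message round trip costs as much as 20 single-row random lookups, the sketch below compares sending operations one at a time with sending them in arrays. The function name and the example batch size of 500 are assumptions.

```python
import math

ROUND_TRIP_IN_LOOKUPS = 20  # figure taken from the slide


def cost_in_lookups(n_operations, batch_size):
    """Total cost, in single-row lookup units, of running n_operations
    remote lookups when batch_size operations travel in each message."""
    messages = math.ceil(n_operations / batch_size)
    return messages * ROUND_TRIP_IN_LOOKUPS + n_operations


one_at_a_time = cost_in_lookups(1000, batch_size=1)    # 1000 * 20 + 1000 = 21000
batched = cost_in_lookups(1000, batch_size=500)        # 2 * 20 + 1000 = 1040
```

Under this model, unbatched execution is dominated by message latency (21000 units against 1040), which is why shipping whole query fragments and arrays of operations to the data matters more than speeding up any single lookup.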
Data Center Features - Transactions
Full ACID Properties
Two Phase Commit with Single Phase Optimization
Detection of Distributed Deadlocks Without Timing Out
Each Cluster Node Keeps Own Transaction Log
No External Monitor, Virtuoso Handles Distributed Recovery Cycle By Itself
Transactions/Logging Can Be Disabled for Bulk Load etc.
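
Two phase commit with a single phase optimization can be sketched as follows. The `Participant` class and `commit_transaction` function are hypothetical, and the sketch leaves out logging to disk, timeouts, and recovery, all of which a real implementation needs.

```python
class Participant:
    """Toy cluster node taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"
        self.log = []  # sequence of protocol messages this node received

    def prepare(self):
        self.log.append("prepare")
        if self.can_commit:
            self.state = "prepared"
        return self.can_commit

    def commit(self):
        self.log.append("commit")
        self.state = "committed"

    def rollback(self):
        self.log.append("rollback")
        self.state = "aborted"


def commit_transaction(participants):
    """Return True if the transaction committed on every participant."""
    if len(participants) == 1:
        # Single phase optimization: with only one node involved there is
        # nobody to agree with, so the prepare round is skipped.
        only = participants[0]
        if only.can_commit:
            only.commit()
            return True
        only.rollback()
        return False
    # Phase 1: every node must vote yes (stops at the first no).
    if all(p.prepare() for p in participants):
        # Phase 2: the decision is commit everywhere.
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


single = Participant("n1")
single_ok = commit_transaction([single])

a, b = Participant("n1"), Participant("n2", can_commit=False)
pair_ok = commit_transaction([a, b])
```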
Data Center Features - Parallel SQL
Transparent Map-Reduce Style Execution of Specified Partitioned SQL Functions/Procedures
PL Extensions for Async Remote Execution of SQL Code, With and Without Transactional Semantics
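
Map-reduce style execution over partitions can be pictured as running the same function next to each partition and merging the partial results. In this sketch threads stand in for cluster nodes, and all names (`run_partitioned`, `local_totals`, `merge_totals`) and the sample data are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def local_totals(rows):
    """Map step: aggregate one partition where its data lives."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return totals


def merge_totals(left, right):
    """Reduce step: combine two partial aggregates."""
    merged = dict(left)
    for region, amount in right.items():
        merged[region] = merged.get(region, 0) + amount
    return merged


def run_partitioned(partitions, map_fn, reduce_fn, initial):
    """Run map_fn on every partition in parallel, then fold the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partials = list(pool.map(map_fn, partitions))
    result = initial
    for partial in partials:
        result = reduce_fn(result, partial)
    return result


PARTITIONS = [
    [{"region": "eu", "amount": 10}, {"region": "us", "amount": 5}],
    [{"region": "eu", "amount": 7}],
    [{"region": "asia", "amount": 3}, {"region": "us", "amount": 1}],
]
totals = run_partitioned(PARTITIONS, local_totals, merge_totals, {})
```

Only the small per-partition dictionaries cross the network in this model; the rows themselves never leave their partition.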
Data Center Features - Futures
Dynamic Deployment, Adding and Removing Cluster Nodes Without Interruption of Service
Keeping Data in Small, Self-Contained, Easily Relocatable Mini-Partitions
SQL Client Connectivity - Data Access Drivers
Cross Platform ODBC 3.0 Drivers
JDBC 2.0 Drivers
OLE-DB Provider
ADO.NET Provider
XMLA Provider
Native Data Management - XML
Native XML Data Type
SQLX + Oracle Compatible XML Functions in SQL
Document Centric Persistence of XML with Special Support in Text Index
XSLT
XQuery
XML Views – XML Mapping Schema based Views of SQL Data Sources
Native RDF Data Management
Native RDF Quad Storage (Physical Quads)
SQL Enhanced With RDF IRI and Typed/Language Tagged Data
Bitmap Indices and Key Compression for Compact Storage
Selectable Index Scheme, Optionally Allows Queries Against Union of All Graphs
Optional Full Text Index of Literals
Reuses SQL Cost Model and Execution Engine With RDF Tailored Statistics
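
Quad storage adds the graph as a fourth component to each triple, and an index that does not lead with the graph makes it possible to query the union of all graphs. A toy version follows; the `QuadStore` class and its single index on predicate and object are assumptions for illustration and say nothing about which indices Virtuoso actually builds.

```python
from collections import defaultdict


class QuadStore:
    """Toy quad store: (graph, subject, predicate, object)."""

    def __init__(self):
        self.quads = set()
        # Index keyed on (predicate, object); the graph is part of the
        # payload, so a lookup can ignore it or filter on it.
        self.by_predicate_object = defaultdict(set)

    def add(self, graph, subject, predicate, obj):
        self.quads.add((graph, subject, predicate, obj))
        self.by_predicate_object[(predicate, obj)].add((graph, subject))

    def subjects(self, predicate, obj, graph=None):
        """Subjects matching the pattern; graph=None means all graphs."""
        hits = self.by_predicate_object.get((predicate, obj), set())
        return sorted({s for g, s in hits if graph is None or g == graph})


store = QuadStore()
store.add("g1", "ex:alice", "rdf:type", "ex:Person")
store.add("g2", "ex:bob", "rdf:type", "ex:Person")
store.add("g2", "ex:acme", "rdf:type", "ex:Company")

people_everywhere = store.subjects("rdf:type", "ex:Person")
people_in_g2 = store.subjects("rdf:type", "ex:Person", graph="g2")
```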
RDF Data Services – Client Connectivity
SPARQL Protocol
Jena Storage Provider
Sesame Storage Provider
Redland Storage Provider
Linq2Rdf Storage Provider
SPASQL
SPARQL execution within SQL Processor
Plethora of Built-In Functions, Stored Procedures, Web Services
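
For the SPARQL Protocol entry, a client request is an HTTP call that carries the query text in a `query` parameter, optionally with `default-graph-uri`. The sketch below only builds such a request and does not send it. The endpoint URL, the graph IRI, and the helper name `build_sparql_request` are assumptions; substitute the address of a real server.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint address, used only for illustration.
ENDPOINT = "http://localhost:8890/sparql"

QUERY = "SELECT ?s WHERE { ?s a <http://xmlns.com/foaf/0.1/Person> } LIMIT 10"


def build_sparql_request(endpoint, query, default_graph_uri=None):
    """Build a SPARQL Protocol GET request without sending it."""
    params = [("query", query)]
    if default_graph_uri:
        params.append(("default-graph-uri", default_graph_uri))
    url = endpoint + "?" + urlencode(params)
    # Ask for the standard JSON serialization of SPARQL results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY,
                               default_graph_uri="http://example.com/graph")
# To execute it: urllib.request.urlopen(request) against a reachable endpoint.
```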
RDF Data Services – SPARQL
Full SPARQL, Language and Protocol Support
Jena Compatible SPARUL for Create Graph, Insert, Update, and Delete
Extensions for Aggregates & Grouping
Nested Queries, SQL-Like Existence and Value Subqueries
Expressions in Result Sets
Path Expressions for Compact Notation, Also in Expressions
Full Text & XPath Magic Predicate Extensions
RDF Data Services – Inference
Backward Chaining Inference Support, No Materialization of Entailed Triples needed for:
Subclass and Subproperty Hierarchies
OWL sameAs for Instances, Classes and Properties
OWL equivalentClass and equivalentProperty
Inference Enabled at Query or Individual Triple Pattern Level
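
Backward chaining over a subclass hierarchy means the entailed type triples are never stored; the hierarchy is walked when a query asks. A minimal sketch, with a made-up class hierarchy and hypothetical function names, where the `infer` flag mimics switching inference on per query:

```python
# Asserted data only; nothing entailed is stored anywhere.
SUBCLASS_OF = {
    "GradStudent": {"Student"},
    "Student": {"Person"},
    "Professor": {"Person"},
}
TYPES = {"alice": {"GradStudent"}, "bob": {"Professor"}, "rex": {"Dog"}}


def superclasses(cls, subclass_of):
    """The class itself plus everything reachable through subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # also protects against cycles in the hierarchy
        seen.add(current)
        stack.extend(subclass_of.get(current, ()))
    return seen


def instances_of(target, types, subclass_of, infer=True):
    """Answer '?x a target' at query time, with or without inference."""
    found = []
    for instance, classes in types.items():
        if infer:
            match = any(target in superclasses(c, subclass_of) for c in classes)
        else:
            match = target in classes
        if match:
            found.append(instance)
    return sorted(found)


with_inference = instances_of("Person", TYPES, SUBCLASS_OF)
without_inference = instances_of("Person", TYPES, SUBCLASS_OF, infer=False)
```

With inference on, alice and bob are returned as persons even though neither is asserted to be one; with it off, the same data yields no matches.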
Linked Data Services - RDF-ization Middleware
Declarative RDF Views (or Covers) over SQL Data
In-Built RDF Middleware (Sponger) for RDF-ization of Harvested Web Content (bulk ingest or “on the fly”)
Extended SPARQL Against Mapped and Stored RDF
RDF-ization Cartridges for 30+ non RDF data sources
Used by SPARQL Processor
Used by in-built Content Crawler
Cache Invalidation based on HTTP Caching Rules
Configurable URI dereferencing via pragmas for node selection and path traversal
Linked Data Services - Deployment
URL Rewrite Rules combined with SPARQL for flexible association of URIs and RDF Data Sets
Proxy (or wrapper) URI construction for materializing Linked Data “on the fly” from existing Web information resources
REST or SOAP based Web Services that expose functionality to Web Clients such as OpenLink Data Explorer, Marbles, Zitgist Data Explorer, DISCO, Tabulator etc.
RDF Data Services – RDF Views over SQL Data Sources
SPARQL Data Definition Statements for RDB Mapping
Declare Correspondences Between Graph/Triple Patterns and SQL Objects
Specify Mapping Between URIs and Keys, Supporting All Data Types, Multipart Keys
Not Restricted to Table per Class and Column per Property
Use Arbitrary Joins, SQL Functions and Search Conditions
Automatically Generate Basic Class per Table, Property per Column Mapping of Given SQL Schema
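
The basic "class per table, property per column" mapping with multipart keys can be sketched in a few lines. This illustrates the correspondence only, not how RDF Views are declared or evaluated; the IRI layout, `row_iri`, and `row_to_triples` are hypothetical.

```python
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def row_iri(base_iri, table, key_columns, row):
    """Build a subject IRI from the (possibly multipart) key of a row."""
    key = "/".join(quote(str(row[c]), safe="") for c in key_columns)
    return f"{base_iri}{table}/{key}"


def row_to_triples(base_iri, table, key_columns, row):
    """Class per table, property per column; NULL columns yield no triple."""
    subject = row_iri(base_iri, table, key_columns, row)
    triples = [(subject, RDF_TYPE, f"{base_iri}schema/{table}")]
    for column, value in row.items():
        if column in key_columns or value is None:
            continue
        triples.append((subject, f"{base_iri}schema/{table}#{column}", value))
    return triples


ROW = {"order_id": 7, "line_no": 2, "product": "widget", "note": None}
triples = row_to_triples("http://example.com/", "order_line",
                         ["order_id", "line_no"], ROW)
```

Because the subject IRI is a pure function of the key columns, an IRI in a query can be parsed back into key values, which is what lets a mapped query be answered in SQL rather than by generating triples up front.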
RDF Data Services - RDF Views Contd.
Evaluate Arbitrary SPARQL Against an RDF View
In One Query, Some Graphs May Come from Views, Others From Stored RDF
RDF Views Generate a Single SQL Statement, The IRI Generation and IRI Parsing is Only in Selection and Constant Expressions
SQL Has Full Optimization Possibilities and the Generated SQL Does not Depend on Virtuoso Specifics
Hence, RDF Views Are Efficient for Querying Remote, non-Virtuoso SQL Data
RDF Data Services - Clustering
Cluster-Optimized RDF Loader and SPARUL
RDF-Aware Data Partitioning
Automatic Statistics Sampling Across Cluster for Best Query Plan
RDF Benchmarks
Bundled With:
TPC-H With SPARQL Extensions and RDF Views
LUBM
Berlin SPARQL Benchmark with Triples and with RDF Views
Web Services Platform – HTTP Services
HTTP/1.1 and HTTPS Server for Static and Dynamic Content
Dynamic Web Pages in PHP, Virtuoso SQL Procedures, ASP.NET, Others
SOAP and REST Web Services in Virtuoso PL, Java, .NET
DAV
Web Services Platform - WebDAV
Documents Stored in Virtuoso Database
ACL Based plus Unix Style Security, SQL User Accounts and Roles Own Documents and Collections
Automatic RDF Metadata Extraction
Optional Full Text Indexing and Versioning
Dynamic Collections for Alternate Views of Directory Hierarchy
Web Services Platform – SOAP & REST
SOAP 1.1/1.2 End Points Exposing SQL Procedures in All SOAP Styles
Automatic WSDL Generation
SQL Extensions for Declaring Full XML Schema Signatures for End Points
Exposing Java and .NET via SOAP
Dynamic Web Pages and XML Functions for REST Services
XMLA for SQL Access over SOAP
Web Services Platform - Dynamic Server Pages
Configure a Virtual Directory as Executable
Publish Dynamic Web Pages in PHP, Virtuoso PL, Ruby, Perl, ASP.NET Without Using External Web Server
Administration Services
Web Interface for Setup of Web End Points, SQL, XML, RDF Functions
SQL Functions for Full Programmatic Admin Access
Simple Tuning, Only Specify File Layout, Number of Threads, and Amount of Memory to Use
Virtuoso RDF Applications
DBpedia
BIO2RDF
Neurocommons
Zitgist, Pingthesemanticweb, Musicbrainz
Product
Open Source and Closed Source Versions, Closed Source Adds Virtual Database and Clustering
All Code, Applications, Samples, Docs in Single Download
Minimal Installation Consists of Single Executable + Config File
Web Admin Interface and Bundled ODS Collaborative Apps Suite
Available for All Linux, Unix, Windows, 32 and 64 bit
Available Preinstalled on Amazon EC2, With Optional Preloaded DBpedia, BIO2RDF, Other RDF Data Sets