Making Semantic Data Federation Work

Post on 20-May-2015

DESCRIPTION

Enterprises are drowning in data that they can't find, access, or use. For many years, they have wrestled with how best to combine all that data into actionable information without building systems that break as schemas evolve. Approaches like warehousing and ETL can be expensive to create and brittle in the face of changing data sources. Data integration at the application level is common, but it adds significant complexity to the code. Data-oriented web services attempt to provide reusable sources of integrated data; however, they have just added another layer of data access that constrains query and access patterns.

This talk will look at how semantic web technologies can make existing data visible and actionable using standards like RDF (data), R2RML (data translation), OWL (schema definition and integration), SPARQL (federated query), and RIF (rules). The semantic web approach takes the data you already have and makes it available for query and use across your existing data sources. This base capability is an excellent platform for building federated analytics.

TRANSCRIPT

Making Semantic Data Federation Work

by Alex Miller

Data Integration Problems

1. Discovery and description

2. Internal integration

3. External integration

4. Nomadic data

5. Inflexible interfaces

1. Discovery and description

• What data do we have?

• What does it mean?

• Who is creating it?

• Who is using it?

2. Internal integration

• Does your order entity have the same fields as my entity?

• Are your codes for order status the same as my codes for order status?

3. External integration

• Does a public source of information exist?

• How do the entities in the public source relate to the entities in my data?

4. Nomadic data

• Where does your data come from?

• Which version of the data are you using?

• Why does your data not match my data?

5. Inflexible interfaces

• Why can't I see all of my data?

• Why does it take months to expose a new data element in my application?

Results

Data → Information → Action (marked with an X on the slide)

Semantic Technologies

• Data model - RDF

• Metadata - RDFS/OWL

• Entailment - OWL, RIF

• Relational data - R2RML

• Query - SPARQL

• Federation - SPARQL Protocol, Federation

Semantic Data Source

• SPARQL Protocol (API)

• SPARQL (query)

• RDFS/OWL (metadata)

• RDF (data model)

Semantic Data Source: Relational Access

• SPARQL Protocol

• SPARQL

• RDFS/OWL

• RDF (virtual), produced by RDB2RDF

• SQL

• Relational Database

Music Database

Musicians:

MID  First  Last       Inst_ID
1    Eddie  Van Halen  10
2    Yo Yo  Ma         20
3    Kenny  G          30

Instruments:

IID  Instrument  Type
10   Guitar      String
20   Cello       String
30   Saxophone   Woodwind

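Musician Schema

music:Musician rdf:type rdfs:Class
music:Instrument rdf:type rdfs:Class

music:firstName, music:lastName, music:plays, music:instName, music:instType rdf:type rdf:Property

music:firstName rdfs:domain music:Musician
music:lastName rdfs:domain music:Musician
music:plays rdfs:domain music:Musician
music:plays rdfs:range music:Instrument
music:instName rdfs:domain music:Instrument
music:instType rdfs:domain music:Instrument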

Triples From Tables

Turn each key into a resource and specify the proper type of each resource:

Musicians:

artist:1 rdf:type music:Musician
artist:2 rdf:type music:Musician
artist:3 rdf:type music:Musician

Instruments:

instrument:10 rdf:type music:Instrument
instrument:20 rdf:type music:Instrument
instrument:30 rdf:type music:Instrument

Triples From Tables

Turn each cell into a triple based on the key, property (mapped per column), and value:

Musicians:

artist:1 music:firstName "Eddie"
artist:1 music:lastName "Van Halen"
artist:2 music:firstName "Yo Yo"
artist:2 music:lastName "Ma"
artist:3 music:firstName "Kenny"
artist:3 music:lastName "G"

Instruments:

instrument:10 music:instName "Guitar"
instrument:10 music:instType "String"
instrument:20 music:instName "Cello"
instrument:20 music:instType "String"
instrument:30 music:instName "Saxophone"
instrument:30 music:instType "Woodwind"

Triples From Tables

Turn each foreign key reference into a relationship between the referencing resource and the referenced resource:

Musicians (Inst_ID refers to Instruments.IID):

artist:1 music:plays instrument:10
artist:2 music:plays instrument:20
artist:3 music:plays instrument:30
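The three steps above (type each key, turn each cell into a triple, turn each foreign key into a link) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row dictionaries are copied from the Musicians table, the `artist:` and `instrument:` prefixes follow the slides, and the function name and tuple representation are hypothetical rather than part of any tool shown in the talk.

```python
# Rows copied from the Musicians table on the Music Database slide.
MUSICIANS = [
    {"MID": 1, "First": "Eddie", "Last": "Van Halen", "Inst_ID": 10},
    {"MID": 2, "First": "Yo Yo", "Last": "Ma", "Inst_ID": 20},
    {"MID": 3, "First": "Kenny", "Last": "G", "Inst_ID": 30},
]

# Column-to-property mapping for literal-valued columns (hypothetical name).
MUSICIAN_PROPERTIES = {"First": "music:firstName", "Last": "music:lastName"}


def musician_triples(rows):
    """Yield (subject, predicate, object) tuples for the Musicians table."""
    for row in rows:
        subject = f"artist:{row['MID']}"
        # Step 1: the key becomes a typed resource.
        yield (subject, "rdf:type", "music:Musician")
        # Step 2: each literal cell becomes a triple.
        for column, prop in MUSICIAN_PROPERTIES.items():
            yield (subject, prop, f'"{row[column]}"')
        # Step 3: the foreign key becomes a link to the instrument resource.
        yield (subject, "music:plays", f"instrument:{row['Inst_ID']}")


TRIPLES = list(musician_triples(MUSICIANS))
for s, p, o in TRIPLES:
    print(s, p, o)
```

Each row yields four triples here, one per step plus one extra for the second name column, so the foreign key link always points at the instrument whose IID matches the row's Inst_ID.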

R2RML Triple Mapping

Database table (Instruments):

IID  Instrument  Type
10   Guitar      String

Domain ontology:

music:instName rdfs:domain music:Instrument
music:instType rdfs:domain music:Instrument

R2RML mapping:

• Triples Map, tied to the Instruments table by rr:tableName

• Subject Map with the template "http://example.com/music/Inst-{iid}", tied to music:Instrument by rr:class

• Predicate Object Map

• Predicate Map, tied to the ontology property by rr:predicate

• Object Map, tied to the table column by rr:column
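To make the roles of the mapping pieces concrete, here is a rough Python sketch that applies a mapping shaped like the one on the slide to the Instruments rows. It is not an R2RML processor and does not parse real R2RML documents. The dictionary layout, the function name, and the lowercased column names (chosen to match the `{iid}` placeholder in the template) are assumptions made for illustration.

```python
# Instruments rows from the slides; column names lowercased (assumption)
# so that they line up with the {iid} placeholder in the subject template.
TABLES = {
    "Instruments": [
        {"iid": 10, "instrument": "Guitar", "type": "String"},
        {"iid": 20, "instrument": "Cello", "type": "String"},
        {"iid": 30, "instrument": "Saxophone", "type": "Woodwind"},
    ]
}

# A plain-dict stand-in for a triples map (hypothetical structure).
MAPPING = {
    "tableName": "Instruments",                      # role of rr:tableName
    "subjectMap": {
        "template": "http://example.com/music/Inst-{iid}",
        "class": "music:Instrument",                 # role of rr:class
    },
    "predicateObjectMaps": [
        {"predicate": "music:instName", "column": "instrument"},
        {"predicate": "music:instType", "column": "type"},
    ],
}


def apply_mapping(mapping, tables):
    """Generate triples for every row of the mapped table."""
    triples = []
    for row in tables[mapping["tableName"]]:
        # The subject map builds one IRI per row from the template.
        subject = "<" + mapping["subjectMap"]["template"].format(**row) + ">"
        triples.append((subject, "rdf:type", mapping["subjectMap"]["class"]))
        # Each predicate-object map pairs a property with a column value.
        for pom in mapping["predicateObjectMaps"]:
            triples.append((subject, pom["predicate"], str(row[pom["column"]])))
    return triples


R2RML_TRIPLES = apply_mapping(MAPPING, TABLES)
for triple in R2RML_TRIPLES:
    print(*triple)
```

The point of the declarative form is that the table, the ontology, and the mapping stay separate, so a change to one column only touches one predicate-object map.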

Registry

• Semantic data sources are self-describing and use a common protocol

• Easy to build into a registry with additional metadata (also described with RDFS/OWL)

Benefits of semantic technology stack

1. Common data model

2. Precise description

3. Uniform access

4. Federation

1. Common data model

• RDF provides common model for both data and descriptions of all kinds

• Very flexible (but also very fine-grained)

2. Precise flexible description

dbp: http://dbpedia.org/resource/
ex: http://example.org/ontology/
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#

dbp:London rdf:type ex:City
dbp:London ex:cityFounded 47

ex:City rdf:type rdfs:Class

ex:cityFounded rdf:type rdf:Property
ex:cityFounded rdfs:domain ex:City
ex:cityFounded rdfs:range xsd:gYear
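One thing a precise description buys you is entailment: under RDFS semantics, a property's rdfs:domain lets a reasoner conclude the type of any subject that uses the property. The sketch below applies just that one rule to the London example. It deliberately leaves out the explicit `dbp:London rdf:type ex:City` triple to show that it can be derived. The function name and the tuple representation are illustrative assumptions, not a real reasoner API.

```python
# Triples from the slide, minus the explicit type of dbp:London.
DESCRIPTION = {
    ("dbp:London", "ex:cityFounded", "47"),
    ("ex:City", "rdf:type", "rdfs:Class"),
    ("ex:cityFounded", "rdf:type", "rdf:Property"),
    ("ex:cityFounded", "rdfs:domain", "ex:City"),
    ("ex:cityFounded", "rdfs:range", "xsd:gYear"),
}


def infer_types_from_domain(triples):
    """Apply the RDFS domain rule: if P has domain C and (s P o), then s is a C."""
    domains = {(s, o) for s, p, o in triples if p == "rdfs:domain"}
    inferred = set()
    for prop, cls in domains:
        for s, p, _ in triples:
            if p == prop:
                inferred.add((s, "rdf:type", cls))
    # Report only triples that were not already stated.
    return inferred - set(triples)


INFERRED = infer_types_from_domain(DESCRIPTION)
print(INFERRED)
```

A full RDFS or OWL reasoner applies many more rules than this one; the sketch only shows why schema triples are more than documentation.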

3. Uniform access

• SPARQL 1.1

• SPARQL Protocol

• HTTP
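Uniform access means every source is queried the same way: a SPARQL query sent over HTTP using the SPARQL Protocol. As a minimal sketch, the code below builds (but does not send) a protocol-style GET request with Python's standard library. The endpoint URL and the `music:` namespace IRI are hypothetical; only the `query` parameter and the results media type come from the SPARQL specifications.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the address of a real semantic data source.
ENDPOINT = "http://example.com/sparql"

# The music: namespace IRI is an assumption made for this example.
QUERY = """
PREFIX music: <http://example.com/music/>
SELECT ?first ?last WHERE {
  ?artist a music:Musician ;
          music:firstName ?first ;
          music:lastName ?last .
}
"""


def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol query-via-GET request without sending it."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


REQUEST = build_sparql_request(ENDPOINT, QUERY)
print(REQUEST.get_method(), REQUEST.full_url)
```

Passing the request to `urllib.request.urlopen` would execute it against a live endpoint; the same function works unchanged for any source that speaks the protocol.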

4. Federation

Semantic Data Sources queried together, including one over a Relational Database and one for DBpedia.
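In SPARQL 1.1, federation is expressed with the SERVICE keyword, which sends part of a query to another endpoint. The sketch below only assembles such a query as a string. It assumes, purely for illustration, that local musicians are linked to DBpedia resources with `owl:sameAs` and that the `music:` namespace is `http://example.com/music/`; neither detail comes from the slides.

```python
def federated_query(remote_endpoint):
    """Return a SPARQL 1.1 query that joins local data with a remote source."""
    return f"""
PREFIX music: <http://example.com/music/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?last ?label WHERE {{
  # Local pattern: answered by the source that receives the query.
  ?artist a music:Musician ;
          music:lastName ?last ;
          owl:sameAs ?remote .
  # Remote pattern: delegated to the other endpoint.
  SERVICE <{remote_endpoint}> {{
    ?remote rdfs:label ?label .
  }}
}}
""".strip()


FEDERATED = federated_query("https://dbpedia.org/sparql")
print(FEDERATED)
```

How efficiently a query like this runs depends on how the engine orders and batches the remote calls, which is the federated query optimization challenge listed later in the talk.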

Data Integration Solutions (with semantics)

1. Discovery and description

2. Internal integration

3. External integration

4. Nomadic data

5. Inflexible interfaces

Challenges

• Relating data domains

• Security

• Unconstrained query access

• Federated query optimization

Thanks!

Visit us at http://revelytix.com or at our booth!
