PICASSO Big Data Expert Group

© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS PICASSO Big Data Expert Group Sören Auer


Post on 02-May-2018


Page 1: PICASSO Big Data Expert Group · PICASSO Big Data Expert Group ... A lightweight approach for Data Interlinking Q: istockphoto.com ... • Integrating Linked Data and Big Data technology

© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme

IAIS

PICASSO Big Data Expert Group

Sören Auer

Page 2:

The three Big Data "Vs": Variety is often neglected

Source: Gesellschaft für Informatik

Page 3:

Semantic Web Layer Cake 2001

http://www.w3.org/2001/10/03-sww-1/slide7-0.html

• Monolithic, based on XML

• Focus on heavyweight semantics (ontologies, logic, reasoning)

Page 4:

The Semantic Web Layer Cake 2015: "A Little Semantics Goes a Long Way"

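[Figure: layer cake diagram. Labels recovered from the slide:]

• Unicode, URIs

• Source formats: XML, JSON, CSV, RDB, HTML

• RDF, with RDF/XML, JSON-LD, CSV2RDF, R2RML, RDFa

• RDF Data Shapes

• RDF-Schema

• Vocabularies

• Ontologies, SKOS thesauri

• Logic, SWRL rules

• SPARQL

• Alongside the stack: (access control), signature, encryption (HTTPS/CERT/DANE)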

• Lingua franca of data integration with many technology interfaces (XML, HTML, JSON, CSV, RDB, …)

• Focus on lightweight vocabularies, rules, thesauri etc.

• Less “invasive”
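
The "lingua franca" idea can be illustrated with a minimal Python sketch that lifts CSV rows into RDF triples in N-Triples syntax. This is not the W3C CSV2RDF procedure; the namespaces, the column names and the helper `csv_to_ntriples` are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical namespaces, chosen only for this illustration.
BASE = "http://example.org/sensor/"
VOCAB = "http://example.org/vocab#"


def csv_to_ntriples(csv_text, key_column):
    """Turn each CSV row into N-Triples lines: one subject per row, one triple per cell.

    Assumes the key column holds values that are safe to append to an IRI.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row[key_column]}>"
        for column, value in row.items():
            if column == key_column or value == "":
                continue  # the key becomes the subject; empty cells yield no triple
            # Escape backslashes and quotes so the literal stays valid N-Triples.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} <{VOCAB}{column}> "{literal}" .')
    return triples


sample = "id,location,unit\ns1,Hall A,celsius\ns2,Hall B,\n"
for line in csv_to_ntriples(sample, "id"):
    print(line)
```

The source file is left untouched and the triples are derived from it, which is one way to read "less invasive".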

Page 5:

INTEGRATING BIG DATA & LINKED DATA

Page 6:

Blueprint of the Data Aggregator Platform

Follows a typical Lambda Architecture

Integrated on top of an existing Big Data distribution

+ Semantic layer (retaining semantics using the Linked Data approach)

[Figure: Lambda architecture of the platform. Labels: input data (stream, spatial, social, statistical, temporal, transactional, imagery); real-time data & transactions …; data storage; batch layer with batch view; speed layer with real-time view; message passing; BDE platform & intelligence; applications & showcases (real-time dashboards, domain-specific BDE apps, Big Data analytics, in-stream mining).]
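
The split between batch layer and speed layer can be sketched in a few lines of Python. This is a toy illustration of the general Lambda pattern, not the platform's implementation; the event format and the function names are assumptions.

```python
from collections import Counter

# Hypothetical event stream: (timestamp, sensor_id) pairs.
events = [(1, "a"), (2, "b"), (3, "a"), (4, "a"), (5, "b")]


def batch_view(master_data, up_to):
    """Batch layer: recompute counts from the full master dataset up to a cut-off."""
    return Counter(key for ts, key in master_data if ts <= up_to)


def realtime_view(stream, after):
    """Speed layer: cover only the events that arrived after the last batch run."""
    return Counter(key for ts, key in stream if ts > after)


def query(key, batch, realtime):
    """Serving step: merge both views to answer a query over all data."""
    return batch[key] + realtime[key]


last_batch_run = 3
batch = batch_view(events, last_batch_run)
recent = realtime_view(events, last_batch_run)
print(query("a", batch, recent))  # 3
```

The batch view may be stale, and the real-time view only has to bridge the gap since the last batch run.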

Page 7:

Adding a Semantic Layer to Data Lakes

Semantic Data Lake

• central place for model, schema and data historization

• combination of scale-out (cost reduction) and semantics (increased control & flexibility)

• grows incrementally (pay-as-you-go)

[Figure: semantic data lake. Labels: Management Accounting, Regulatory Reporting, Risk, Treasury, Accounting; Inbound Data Sources; Inbound Raw Data Store; Data Lake (order of magnitude cheaper, scalable data store); Knowledge Graph for Relationship Definition and Meta Data; Frontend to Access Relationship and KPI Definition / Documentation; Frontend to Access (ad hoc) Reports; Outbound and Consumption; Outbound Data Delivery to Target Systems.]
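
As a rough illustration of a knowledge graph that holds relationship and KPI definitions next to the lake, the following Python sketch stores metadata as triples and traces a KPI back to its inbound sources. All identifiers (`kpi:LiquidityRatio`, `derivedFrom`, `loadedFrom` and so on) are invented for this example and do not come from the slides.

```python
# Metadata triples (subject, predicate, object); every name here is hypothetical.
graph = {
    ("kpi:LiquidityRatio", "derivedFrom", "dataset:balances"),
    ("kpi:LiquidityRatio", "derivedFrom", "dataset:obligations"),
    ("kpi:LiquidityRatio", "usedIn", "report:regulatory"),
    ("dataset:balances", "loadedFrom", "source:core_banking"),
    ("dataset:obligations", "loadedFrom", "source:treasury_system"),
}


def objects(graph, subject, predicate):
    """Return all objects linked to a subject by a predicate, sorted for stable output."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def lineage(graph, kpi):
    """Trace a KPI to the lake datasets it is derived from and their inbound sources."""
    return {
        dataset: objects(graph, dataset, "loadedFrom")
        for dataset in objects(graph, kpi, "derivedFrom")
    }


print(lineage(graph, "kpi:LiquidityRatio"))
```

New KPIs or sources are added as further triples, which fits the pay-as-you-go growth named above.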

[1] Wrobel, Voss, Köhler, Beyer, Auer: Big Data, Big Opportunities - Anwendungssituation und Forschungsbedarf. Informatik

[2] Debattista, Lange, Scerri, Auer: Linked 'Big' Data. IEEE/ACM Big Data Computing BDC 2015: 92-98

JSON-LD, CSVW, R2RML, XML2RDF

Page 8:

INDUSTRIAL DATA SPACE

Page 9:

Vocabulary-based Integration facilitates Data-driven Businesses

[Figure label: Vocabulary]

Page 10:

The work on the Industrial Data Space is complementary to, and closely interlocked with, the Plattform Industrie 4.0

[Figure labels: Retail 4.0; Bank 4.0; Insurance 4.0; …; Industrie 4.0 (focus on the manufacturing industry); Smart Services; transmission, networks; real-time systems; Industrial Data Space (focus on data); data]

Page 11:

The Industrial Data Space Initiative

Community of more than 30 large German and European companies

Pre-competitive, publicly funded innovation project involving 11 Fraunhofer institutes to develop the IDS reference architecture

Current signatories of the MoU to support the Industrial Data Space Association

Page 12:

Semantic Data Linking for Enterprise Data Value Chains

                         | Data Lake                 | Industrial Data Space                        | Pure Internet
                         | centralized, monopolistic | federated, secure, "trusted", standard-based | completely decentralized, open, insecure
Data management          | Central repository        | Decentralized                                | Decentralized
Data ownership           | Central                   | Decentralized                                | Decentralized
Data linking             | Single provider           | Federated, on demand                         | Missing
Data security            | Bilateral                 | Certified system                             | Bilateral
Market structure         | Central provider          | Role system                                  | Unstructured
Transport infrastructure | Internet                  | Internet                                     | Internet

Page 13:

Basic principles of the Industrial Data Space

• On-demand interlinking

• Linked Light Semantics

• Security with Industrial Data Container

• Certified roles

Page 14:

Industrial Data Space: On Demand Interlinking

[Figure labels: Service A to Service G; Enterprise 1 to Enterprise 6]

All data stays with its owners, where it is controlled and secured. Data is shared only on request for a service. There is no central platform.
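
The principle that data stays with its owner and is released only on request can be sketched as follows. This is a simplified Python illustration, not the IDS connector protocol; the class `Enterprise`, the allow-list policy and the field names are assumptions.

```python
class Enterprise:
    """Keeps its data locally and decides per request what to release."""

    def __init__(self, name, records, allowed_services):
        self.name = name
        self._records = records                  # stays with the owner
        self._allowed = set(allowed_services)    # hypothetical sharing policy

    def request(self, service, field):
        """Return one field to an approved service, otherwise refuse."""
        if service not in self._allowed:
            raise PermissionError(f"{self.name} does not share data with {service}")
        return self._records[field]


def run_service(service, owners, field):
    """A service asks each owner at request time; nothing is pooled centrally."""
    result = {}
    for owner in owners:
        try:
            result[owner.name] = owner.request(service, field)
        except PermissionError:
            continue  # this owner keeps its data private for this service
    return result


e1 = Enterprise("Enterprise 1", {"stock": 120}, ["Service A"])
e2 = Enterprise("Enterprise 2", {"stock": 75}, ["Service B"])
print(run_service("Service A", [e1, e2], "stock"))  # {'Enterprise 1': 120}
```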

Page 15:

--- CONFIDENTIAL ---

Linked Light Semantics

A lightweight approach for Data Interlinking

Classical enterprise systems

• Fixed data schema

• Global enforcement

• Closed

• Manual transformation

• High cost

Linked Light Semantics

• Reference vocabularies

• Bridge between local representations

• Intelligently and structurally interlinked

• Automatic translation/mapping

• Lightweight

Internet / WWW

• Web pages

• Only links

• Completely open

• Lack of standardization

• No structure
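
How a reference vocabulary can bridge local representations is sketched below in Python. Each party maintains one mapping from its own field names to shared terms, and records are translated through the shared terms. The `ref:` terms and the field names are invented for this example.

```python
# Hypothetical mappings from local field names to reference-vocabulary terms.
MAPPING_A = {"cust_no": "ref:customerId", "strasse": "ref:street"}
MAPPING_B = {"customerNumber": "ref:customerId", "addressLine": "ref:street"}


def to_reference(record, mapping):
    """Lift a local record to reference terms; unmapped local fields are dropped."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def from_reference(record, mapping):
    """Lower a reference-vocabulary record into a local representation."""
    inverse = {term: local for local, term in mapping.items()}
    return {inverse[k]: v for k, v in record.items() if k in inverse}


def translate(record, source_mapping, target_mapping):
    """Automatic translation between two local schemas via the reference vocabulary."""
    return from_reference(to_reference(record, source_mapping), target_mapping)


record_a = {"cust_no": "4711", "strasse": "Schloss Birlinghoven", "internal_flag": "x"}
print(translate(record_a, MAPPING_A, MAPPING_B))
```

With this design each participant writes one mapping to the shared vocabulary instead of one transformation per partner, and no participant has to change its local schema.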

Page 16:

--- CONFIDENTIAL ---

Industrial Data Space

IDS Architecture Overview

[Figure labels: Industrial Data Space Broker; clearing; registry; index; Industrial Data Space App Store; apps; vocabulary; Company A with internal IDS connector; Company B with internal IDS connector; external IDS connectors; third-party cloud provider; Internet; upload / download / search]

Page 17:

Industry 4.0

Semantic Models as Bridge between Shop & Office Floor

Page 18:

Semantic Administrative Shell & Reference Architecture for Industry 4.0 (RAMI4.0)

Administrative Shell (Verwaltungsschale) provides a digital identity for arbitrary Industry 4.0 components (e.g. sensors, actuators/robots), exposing data covering the whole life-cycle

Reference Architecture for Industry 4.0 (RAMI4.0) provides a conceptual framework for implementing comprehensive Industry 4.0 scenarios

We have implemented both concepts, along with a number of IEC and ISO standards, in a comprehensive information model ready to be implemented in production environments
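
The idea of a digital identity that exposes life-cycle data can be pictured with a small Python data structure. This is an illustrative stand-in only; it follows neither the official Administration Shell metamodel nor the information model mentioned on the slide, and all class and field names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class LifecycleEvent:
    phase: str        # e.g. "production", "operation", "maintenance"
    timestamp: str    # ISO 8601 string
    description: str


@dataclass
class AdministrationShell:
    """Hypothetical, simplified shell for one Industry 4.0 component."""
    identifier: str                                   # globally unique ID, here a URI
    component_type: str                               # e.g. "sensor", "robot"
    properties: dict = field(default_factory=dict)    # static descriptive data
    history: list = field(default_factory=list)       # life-cycle events

    def record(self, phase, timestamp, description):
        """Append a life-cycle event to the component's history."""
        self.history.append(LifecycleEvent(phase, timestamp, description))

    def events_in(self, phase):
        """Return all recorded events for one life-cycle phase."""
        return [e for e in self.history if e.phase == phase]


shell = AdministrationShell("http://example.org/asset/sensor-42", "sensor", {"unit": "celsius"})
shell.record("production", "2016-01-10T08:00:00Z", "assembled")
shell.record("operation", "2016-03-01T12:30:00Z", "installed in Hall A")
print([e.description for e in shell.events_in("operation")])
```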

Page 19:

Summary

Challenges and Opportunities - Interoperability and Standardization

• Adding a semantic layer to Big Data technology

• Integrating Linked Data and Big Data technology

• Towards Enterprise Knowledge Graphs and Data Spaces

• Applications, e.g. in manufacturing, cultural heritage, finance