Dive deep into your Data Pools

Welcome to this webinar! Andreas Blumauer CEO of Semantic Web Company

Upload: semantic-web-company

Post on 29-Jul-2015


TRANSCRIPT

Page 1: Dive deep into your Data Pools

Welcome to this webinar!

Andreas Blumauer, CEO of Semantic Web Company

Page 2: Dive deep into your Data Pools

About Semantic Web Company (SWC)

SWC was founded in 2001 and is headquartered in Vienna

25 experts in Linked Data technologies

PoolParty Suite based on RDF Graph Data Model

Serving customers from all over the world

EU- & US-based consulting services

Page 3: Dive deep into your Data Pools

Our Ecosystem: Customers & Partners

Some of our Customers

● Credit Suisse
● Boehringer Ingelheim
● Roche
● adidas
● The Pokémon Company
● Canadian Broadcasting Corporation (CBC)
● Red Bull Media House
● Wolters Kluwer
● TC Media
● Techtarget
● BMJ Publishing Group
● CafePress
● Pearson - Always Learning
● Education Services Australia
● American Physical Society
● Healthdirect Australia
● World Bank Group
● Inter-American Development Bank (IADB)
● Renewable Energy Partnership
● Wood MacKenzie
● Development Initiatives
● International Atomic Energy Agency (IAEA)

Finance / Automotive / Publisher / Health Care / Public Administration / Energy / Education

Selected Partners

● PwC
● EPAM Systems
● iQuest
● EBCONT
● Gravity Zero
● MarkLogic
● OpenLink Software
● Ontotext
● Wolters Kluwer
● Data to Value
● Digirati
● Term Management
● Altotech

We are all working on the replacement of data chaos by networking information

● Norwegian Directorate of Immigration
● Ministry of Finance (A)
● Council of the E.U.
● Australian National Data Service

Page 4: Dive deep into your Data Pools

PoolParty Core Modules

Bain Capital is a venture capital company based in Boston, MA. Since inception it has invested in hundreds of companies including AMC Entertainment, Brookstone, and Burger King. The company was co-founded by Mitt Romney.

Taxonomy & Ontology Server

Entity Extraction & Text Mining

Semantic Search, Analytics & Visualization

Page 5: Dive deep into your Data Pools

Why Graph Databases?

Page 6: Dive deep into your Data Pools

The Enterprise Perspective: The End of the Document

“Life is no longer as simple as making PDF documents.”

John Walker, Business Analyst at NXP Semiconductors

Page 7: Dive deep into your Data Pools

The Enterprise Perspective: Graph Databases are Smart Data Lakes

“Data in a large corporation is often scattered over various tools, comes in different formats and with different levels of quality.”

Fabian Heinemann, Data Scientist at Roche

Page 8: Dive deep into your Data Pools

The NPO Perspective: Using common Definitions and Standards

“Very few datasets tell a story in isolation.”

The Data Manifesto, Development Initiatives

Page 9: Dive deep into your Data Pools

The warehouse approach seems to be broken in a complex world

Data Warehouse

- structures and categories predefine the kind of analysis that is possible
- excludes data to simplify the data model
- does not efficiently handle new types of data
- supports efficient indexing
- enforces consistency

Data Lake

- includes all data that may be used and even data that may never be used
- all data regardless of source and structure is kept
- data kept in its raw form and only transformed when used
- handles structured and unstructured data
- data models emerge with usage over time

Page 10: Dive deep into your Data Pools

The Analyst’s Perspective: Data Lakes don’t fix the problem of lacking semantics

“Organizations should focus on semantic consistency and performance in upstream applications and data stores instead of information consolidation in a data lake.”

Gartner, Beware of the Data Lake Fallacy

Page 11: Dive deep into your Data Pools

Data Lakes have all the information to answer complex queries, but…

Country   GDP     Pop
AUS       1,560   23.14
SVE       580     9.60

WITH A COMBINED NUMBER of 357,100 registered asylum claims in 2013, Germany, the United States of America, France, Sweden and Turkey were the top five receiving countries, together accounting for nearly six out of ten asylum claims submitted in the 44 industrialized countries covered by this report.

Place       Asylum seekers   Year
Australia   24,300           2013
Sweden      54,300           2013

Show me all reports, in which EU member countries are mentioned with regards to their asylum politics, which have more than 10 asylum-seekers per 1,000 inhabitants.

Page 12: Dive deep into your Data Pools

...taxonomies link constantly changing data sources while analytic needs are evolving

Countries
  European Union
    Sweden (SVE)
    France (FRA)
    Austria (AUT)
  Oceania

Country   GDP     Pop
AUS       1,560   23.14
SVE       580     9.60

Place       Asylum seekers   Year
Australia   24,300           2013
Sweden      54,300           2013

WITH A COMBINED NUMBER of 357,100 registered asylum claims in 2013, Germany, the United States of America, France, Sweden and Turkey were the top five receiving countries, together accounting for nearly six out of ten asylum claims submitted in the 44 industrialized countries covered by this report.
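The linking idea on this slide can be sketched in a few lines of Python. This is only an illustration, not PoolParty code: the dictionary layout and the function name are made up for this example, the population column is assumed to be in millions, and placing Australia (AUS) under Oceania is an assumption, since the slide shows only the Oceania node. The taxonomy carries both the country name and the country code, so the two tables can be joined even though they use different keys.

```python
# Taxonomy: each country concept has a code label and a broader concept.
# Hypothetical structure for illustration only.
TAXONOMY = {
    "Sweden": {"code": "SVE", "broader": "European Union"},
    "France": {"code": "FRA", "broader": "European Union"},
    "Austria": {"code": "AUT", "broader": "European Union"},
    "Australia": {"code": "AUS", "broader": "Oceania"},  # assumption
}

# Table 1 is keyed by country code; population is assumed to be in millions.
POPULATION_MILLIONS = {"AUS": 23.14, "SVE": 9.60}

# Table 2 is keyed by place name (figures for 2013).
ASYLUM_SEEKERS_2013 = {"Australia": 24_300, "Sweden": 54_300}


def seekers_per_1000(region):
    """Asylum seekers per 1,000 inhabitants for the countries in `region`."""
    result = {}
    for name, concept in TAXONOMY.items():
        if concept["broader"] != region:
            continue
        population = POPULATION_MILLIONS.get(concept["code"])
        seekers = ASYLUM_SEEKERS_2013.get(name)
        if population is None or seekers is None:
            continue  # one of the tables has no row for this country
        # millions of inhabitants * 1000 = thousands of inhabitants
        result[name] = seekers / (population * 1000)
    return result


eu = seekers_per_1000("European Union")
above_threshold = [country for country, rate in eu.items() if rate > 10]
print(eu, above_threshold)
```

With only the two sample rows shown on the slide, Sweden is the one EU country present in both tables, and its rate stays below the threshold of 10 used in the example question.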

Page 13: Dive deep into your Data Pools

Linked Data Warehouses are Smart Data Lakes

Data Lake

Data Warehouse

- supports efficient indexing
- enforces consistency

- handles structured & unstructured data

- data models emerge with usage over time

- standards-based
- unified data model
- powerful query language

Page 14: Dive deep into your Data Pools

What if questions emerge when one starts analyzing the data?

Page 15: Dive deep into your Data Pools

The power of knowledge graphs: Agility, flexibility, complexity

[Figure: traditional approach and graph-based approach side by side. Query: "Show me all documents about European countries". In both approaches the documents are tagged with Norway, France, Austria and Canada.]

Page 16: Dive deep into your Data Pools

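The power of knowledge graphs: Agility, flexibility, complexity

[Figure: traditional approach and graph-based approach side by side. Query: "Show me all documents about European countries". Traditional approach: the document tags become "Europe, Norway", "Europe, France", "Europe, Austria" and "America, Canada". Graph-based approach: the document tags stay Norway, France, Austria and Canada, and a single "Europe" node is added.]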

Page 17: Dive deep into your Data Pools

The power of knowledge graphs: Agility, flexibility, complexity

[Figure: traditional approach and graph-based approach side by side. Queries: "Show me all documents about European countries" and, newly added, "Show me all documents about EU member countries". Traditional approach: document tags "Europe, Norway", "Europe, France", "Europe, Austria" and "America, Canada". Graph-based approach: document tags Norway, France, Austria and Canada, plus a single "Europe" node.]

Page 18: Dive deep into your Data Pools

The power of knowledge graphs: Agility, flexibility, complexity

[Figure: traditional approach and graph-based approach side by side. Queries: "Show me all documents about European countries" and "Show me all documents about EU member countries". Traditional approach: document tags "Europe, Norway", "E.U, Europe, France", "E.U, Europe, Austria" and "America, Canada". Graph-based approach: document tags Norway, France, Austria and Canada, plus the nodes "Europe" and "E.U".]

Page 19: Dive deep into your Data Pools

The power of knowledge graphs: Agility, flexibility, complexity

[Figure: traditional approach and graph-based approach side by side. Queries: "Show me all documents about European countries", "Show me all documents about EU member countries" and "French-speaking?". Traditional approach: document tags "Europe, Norway", "French, EU, Europe, France", "EU, Europe, Austria" and "French, America, Canada". Graph-based approach: document tags Norway, France, Austria and Canada, plus the nodes "Europe", "EU" and "French-speaking".]

Page 20: Dive deep into your Data Pools

The power of knowledge graphs: Agility, flexibility, complexity

[Figure: traditional approach and graph-based approach side by side. Queries: "Show me all documents about European countries", "Show me all documents from EU member countries" and "French-speaking?". Traditional approach: document tags "Europe, Norway", "French, EU, Europe, France", "EU, Europe, Austria" and "French, America, Canada". Graph-based approach: document tags Norway, France, Austria and Canada, plus the nodes "Europe", "EU" and "French-speaking".]

Metadata per document

1. No or little network effects
2. No reuse of metadata
3. Metadata resides in silos
4. Data quality hard to measure
5. Not machine-readable

Knowledge about metadata

1. Explicit knowledge models
2. Reusable and measurable
3. Metadata is machine-processable
4. Standards-based metadata
5. Linkable metadata opens silos
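The contrast between the two approaches can be illustrated with a small Python sketch. It is not taken from the webinar: the document IDs, the dictionary layout and the function name are assumptions, while the country tags and their links (Europe, EU, America, French-speaking) follow the slides. Each document keeps a single country tag, and new questions are answered by adding knowledge to the graph once rather than re-tagging every document.

```python
# Documents carry only a country tag; everything else lives in the graph.
# The document IDs are hypothetical.
DOCUMENTS = {
    "doc1": "Norway",
    "doc2": "France",
    "doc3": "Austria",
    "doc4": "Canada",
}

# Knowledge about the tags, maintained once and reused by every document.
GRAPH = {
    "Norway": {"Europe"},
    "France": {"Europe", "EU", "French-speaking"},
    "Austria": {"Europe", "EU"},
    "Canada": {"America", "French-speaking"},
}


def documents_about(concept):
    """Return the documents whose country tag is linked to `concept`."""
    return sorted(
        doc
        for doc, country in DOCUMENTS.items()
        if concept in GRAPH.get(country, set())
    )


print(documents_about("Europe"))
print(documents_about("EU"))
print(documents_about("French-speaking"))
```

Supporting the "French-speaking" question only required extending two entries in `GRAPH`; the document metadata did not change.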

Page 21: Dive deep into your Data Pools

Better Together: Unstructured and Structured Data.

Page 22: Dive deep into your Data Pools

Towards a Linked Data based search

Page 24: Dive deep into your Data Pools

PoolParty GraphSearch = Semantic Search + Analytics

Page 25: Dive deep into your Data Pools

Complex Queries based on SPARQL and Linked Data

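PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# The dbpedia: namespace is not given on the slide; the DBpedia ontology namespace is assumed here.
PREFIX dbpedia: <http://dbpedia.org/ontology/>

SELECT DISTINCT ?personname ?picture ?countryname ?hdi
WHERE {
  # Labels of the person and of the related country
  ?person skos:prefLabel ?personname .
  ?country skos:prefLabel ?countryname .
  ?person a dbpedia:Person .
  ?country a dbpedia:Country .
  ?person skos:related ?country .
  # Keep only countries with a Human Development Index below 0.6
  ?country <http://dbpedia.org/property/hdi> ?hdi .
  FILTER (?hdi < 0.6)
  # A picture is returned when available, but is not required
  OPTIONAL {
    ?person foaf:depiction ?picture .
  }
}
ORDER BY DESC(?hdi)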

I want to explore medical research trends in relation to regional prosperity.
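A query of this kind can be sent to any store that implements the SPARQL 1.1 Protocol. The following Python sketch builds such a request using only the standard library. The endpoint URL is a placeholder, the function name is made up for this example, and the shortened query is an illustration rather than the query from the slide.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; replace it with the SPARQL endpoint of your RDF store.
ENDPOINT = "http://example.org/sparql"

# Shortened illustration: countries with a Human Development Index below 0.6.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?countryname ?hdi
WHERE {
  ?country skos:prefLabel ?countryname .
  ?country <http://dbpedia.org/property/hdi> ?hdi .
  FILTER (?hdi < 0.6)
}
ORDER BY DESC(?hdi)
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
# To execute it, pass `request` to urllib.request.urlopen() and parse the
# response body with json.load().
```

The request is only constructed here, not sent, so the sketch runs without network access.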

Page 26: Dive deep into your Data Pools

Organizing data in graphs using links

Graph nervous_system_diseases-abstracts

Graph en.dbpedia.org

Graph www.nlm.nih.gov/mesh

Graph www.geonames.org

Page 27: Dive deep into your Data Pools

PoolParty Semantic Integrator: System Architecture

Classified documents + Linked taxonomies + Knowledge graphs

● Dynamic filter criteria
● BI-like interface
● Large scale RDF store
● Fully RDF compatible
● All queries via SPARQL


Page 28: Dive deep into your Data Pools

UnifiedViews as part of PoolParty Semantic Integrator

UnifiedViews differs from other ETL frameworks by natively supporting RDF data and ontologies.

UnifiedViews has a graphical user interface for the administration, debugging, and monitoring of the ETL process.
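To give an idea of what an RDF-producing transformation step does, here is a minimal Python sketch that turns one row of a country table into N-Triples lines. It does not use or imitate the UnifiedViews API: the function name, the example.org namespace and the property names are all assumptions made for this illustration, and the units of the values are not stated in the source.

```python
# Hypothetical namespace for the generated resources and properties.
BASE = "http://example.org/"
XSD_DECIMAL = "<http://www.w3.org/2001/XMLSchema#decimal>"


def row_to_ntriples(country_code, gdp, population):
    """Serialize one table row as two N-Triples statements.

    The values are passed as strings so that they are written out unchanged.
    """
    subject = f"<{BASE}country/{country_code}>"
    return [
        f'{subject} <{BASE}schema/gdp> "{gdp}"^^{XSD_DECIMAL} .',
        f'{subject} <{BASE}schema/population> "{population}"^^{XSD_DECIMAL} .',
    ]


for line in row_to_ntriples("SVE", "580", "9.60"):
    print(line)
```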

Page 29: Dive deep into your Data Pools

Use Cases

Page 30: Dive deep into your Data Pools

Success story: Healthdirect Australia

Over 120 information partners and sources

Great variety of category and metadata systems

One central vocabulary hub: Australian Health Thesaurus (AHT)

Single point of access incl. harmonized search facets:

http://www.healthdirect.gov.au/

Page 31: Dive deep into your Data Pools

Clean Energy Data - Country Profiles

Page 33: Dive deep into your Data Pools

Complex queries with SPARQL

PREFIX mrv-schema: <http://gbpn.org/mrv-schema/>
PREFIX qb: <http://purl.org/linked-data/cube#>

SELECT DISTINCT * WHERE {
  GRAPH <http://gbpn.org/mrv> {
    ?observation mrv-schema:year ?year.
    ?observation mrv-schema:region ?region.
    ?observation mrv-schema:region <http://gbpn.org/mrv-thes/region/India>.
    ?observation mrv-schema:scenario ?scenario.
    ?observation mrv-schema:scenario <http://gbpn.org/mrv-thes/scenario/deep-efficiency>.
    {
      ?observation mrv-schema:urbanizationType ?urbanizationType.
      ?observation mrv-schema:urbanizationType <http://gbpn.org/mrv-thes/urbanization-type/urban>.
      ?observation mrv-schema:buildingType ?buildingType.
      ?observation mrv-schema:buildingType <http://gbpn.org/mrv-thes/building-type/MF>.
      ?observation mrv-schema:publicBuildingType ?publicBuildingType.
      ?observation mrv-schema:publicBuildingType <http://gbpn.org/mrv-thes/public-building-type/NO>.
    } UNION {
      ?observation mrv-schema:urbanizationType ?urbanizationType.
      ?observation mrv-schema:urbanizationType <http://gbpn.org/mrv-thes/urbanization-type/urban>.
      ?observation mrv-schema:buildingType ?buildingType.
      ?observation mrv-schema:buildingType <http://gbpn.org/mrv-thes/building-type/Slums>.
      ?observation mrv-schema:publicBuildingType ?publicBuildingType.
      ?observation mrv-schema:publicBuildingType <http://gbpn.org/mrv-thes/public-building-type/NO>.
    } UNION { …….
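The UNION branches shown in this query differ only in the building type, so an application would typically generate them from the selected filter values instead of writing them by hand. The Python sketch below shows one way this could look. The function names are hypothetical, only the two building types visible on the slide are used, and the triples that bind the ?urbanizationType, ?buildingType and ?publicBuildingType variables are left out for brevity.

```python
# Thesaurus namespace as it appears in the query on the slide.
THES = "http://gbpn.org/mrv-thes/"


def branch(building_type, urbanization="urban", public_building="NO"):
    """Build one { ... } group that fixes the three dimension values."""
    return (
        "{ "
        f"?observation mrv-schema:urbanizationType "
        f"<{THES}urbanization-type/{urbanization}> . "
        f"?observation mrv-schema:buildingType "
        f"<{THES}building-type/{building_type}> . "
        f"?observation mrv-schema:publicBuildingType "
        f"<{THES}public-building-type/{public_building}> . "
        "}"
    )


def union_of(building_types):
    """Join one branch per building type with the UNION keyword."""
    return " UNION ".join(branch(bt) for bt in building_types)


pattern = union_of(["MF", "Slums"])
print(pattern)
```

SPARQL 1.1 also provides the VALUES keyword, which can express a list of alternative values inside the query itself.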

Page 35: Dive deep into your Data Pools

PoolParty 5.1

Page 36: Dive deep into your Data Pools

Highly precise entity extraction

Domain-specific extraction, highly performant, language-agnostic, disambiguation rules, REST API
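As a rough illustration of thesaurus-based entity extraction, the Python sketch below looks up the labels of a small vocabulary in a text and returns the matching concepts. It is a simple label lookup and not PoolParty's extraction method: the thesaurus contents, the example.org URIs and the function name are assumptions, and disambiguation is not covered. The sample text is shortened from the example on the "PoolParty Core Modules" slide.

```python
import re

# Hypothetical thesaurus: preferred label -> concept URI (placeholder URIs).
THESAURUS = {
    "Bain Capital": "http://example.org/concept/bain-capital",
    "Boston": "http://example.org/concept/boston",
    "Burger King": "http://example.org/concept/burger-king",
    "Mitt Romney": "http://example.org/concept/mitt-romney",
}


def extract_entities(text, thesaurus):
    """Return (label, concept URI, offset) for each thesaurus label in text."""
    # Longest labels first, so multi-word labels win over shorter ones.
    labels = sorted(thesaurus, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, labels)) + r")\b")
    return [
        (match.group(1), thesaurus[match.group(1)], match.start())
        for match in pattern.finditer(text)
    ]


TEXT = (
    "Bain Capital is a venture capital company based in Boston, MA. "
    "The company was co-founded by Mitt Romney."
)
entities = extract_entities(TEXT, THESAURUS)
for label, uri, offset in entities:
    print(offset, label, uri)
```

Because the lookup is case-sensitive, the lowercase word "capital" in "venture capital" is not mistaken for part of the "Bain Capital" label.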

Page 37: Dive deep into your Data Pools

Providing context in the knowledge graph

Page 38: Dive deep into your Data Pools

Activating disambiguation

Page 39: Dive deep into your Data Pools

Semantic Records Management: Integration with Confluence Blueprints

⇒ Solution for Semantic Records Management

Page 40: Dive deep into your Data Pools

Fully integrated web crawler

Make use of text corpus analysis: Retrieve documents from various sources, like RSS or from websites

Page 41: Dive deep into your Data Pools

Web Crawler extracts candidate terms from any website

Page 42: Dive deep into your Data Pools

Extended ontology management & semantic reasoning

From SKOS taxonomies to full-blown ontologies: PoolParty supports various levels of knowledge modeling

Page 43: Dive deep into your Data Pools

Publishing custom schemes

Page 44: Dive deep into your Data Pools

Further extension of PoolParty API

● API method for skos:notes

● API method for skosxl:labels

● API methods for skos:collections

● API method to collect custom properties, attributes and types

● API method to R/W workflow status

● Retrieve history API method

● Retrieve SKOS subtree

Developer

Page 45: Dive deep into your Data Pools

Get started with PoolParty. Try it out now!

Get your PoolParty 5.1 Thesaurus Server & Entity Extractor trial:

http://www.poolparty.biz/test-demo/

Page 46: Dive deep into your Data Pools

Contact points & further information

Andreas Blumauer, MSc IT

[email protected]

https://www.linkedin.com/in/andreasblumauer

Semantic Web Company GmbH

Mariahilfer Strasse 70/8, A-1070 Vienna

+43-1-4021235

http://www.semantic-web.at

http://www.poolparty-software.com

Social Media Channels

http://slideshare.net/semwebcompany

http://youtube.com/semwebcompany

https://www.linkedin.com/groups?home=&gid=4059165