Dive deep into your data pools

Post on 29-Jul-2015

Welcome to this webinar!

Andreas Blumauer, CEO of Semantic Web Company

About Semantic Web Company (SWC)

SWC was founded in 2001 and is headquartered in Vienna

25 experts in Linked Data technologies

PoolParty Suite based on RDF Graph Data Model

Serving customers from all over the world

EU- & US-based consulting services

Our Ecosystem: Customers & Partners

Some of our Customers

● Credit Suisse
● Boehringer Ingelheim
● Roche
● adidas
● The Pokémon Company
● Canadian Broadcasting Corporation (CBC)
● Red Bull Media House
● Wolters Kluwer
● TC Media
● Techtarget
● BMJ Publishing Group
● CafePress
● Pearson - Always Learning
● Education Services Australia
● American Physical Society
● Healthdirect Australia
● World Bank Group
● Inter-American Development Bank (IADB)
● Renewable Energy Partnership
● Wood MacKenzie
● Development Initiatives
● International Atomic Energy Agency (IAEA)

Finance / Automotive / Publisher / Health Care / Public Administration / Energy / Education

Selected Partners

● PwC
● EPAM Systems
● iQuest
● EBCONT
● Gravity Zero
● MarkLogic
● OpenLink Software
● Ontotext
● Wolters Kluwer
● Data to Value
● Digirati
● Term Management
● Altotech

We are all working on the replacement of data chaos by networking information

● Norwegian Directorate of Immigration
● Ministry of Finance (A)
● Council of the E.U.
● Australian National Data Service

PoolParty Core Modules

Bain Capital is a venture capital company based in Boston, MA. Since inception it has invested in hundreds of companies including AMC Entertainment, Brookstone, and Burger King. The company was co-founded by Mitt Romney.

Taxonomy & Ontology Server

Entity Extraction & Text Mining

Semantic Search, Analytics & Visualization

Why Graph Databases?

The Enterprise Perspective: The End of the Document

“Life is no longer as simple as making PDF documents.”

John Walker, Business Analyst at NXP Semiconductors

The Enterprise Perspective: Graph Databases are Smart Data Lakes

“Data in a large corporation is often scattered over various tools, comes in different formats and with different levels of quality.”

Fabian Heinemann, Data Scientist at Roche

The NPO Perspective: Using common Definitions and Standards

“Very few datasets tell a story in isolation.”

The Data Manifesto, Development Initiatives

The warehouse approach seems to be broken in a complex world

Data Warehouse
- structures and categories predefine the kind of analysis that is possible
- excludes data to simplify the data model
- does not efficiently handle new types of data
- supports efficient indexing
- enforces consistency

Data Lake
- includes all data that may be used and even data that may never be used
- all data regardless of source and structure is kept
- data kept in its raw form and only transformed when used
- handles structured and unstructured data
- data models emerge with usage over time

The Analyst’s Perspective: Data Lakes don’t fix the problem of lacking semantics

“Organizations should focus on semantic consistency and performance in upstream applications and data stores instead of information consolidation in a data lake.”

Gartner, Beware of the Data Lake Fallacy

Data Lakes have all the information to answer complex queries, but...

Country  GDP    Pop
AUS      1,560  23.14
SVE      580    9.60

WITH A COMBINED NUMBER of 357,100 registered asylum claims in 2013, Germany, the United States of America, France, Sweden and Turkey were the top five receiving countries, together accounting for nearly six out of ten asylum claims submitted in the 44 industrialized countries covered by this report.

Place      Asylum seekers  Year
Australia  24,300          2013
Sweden     54,300          2013

Show me all reports that mention EU member countries with more than 10 asylum seekers per 1,000 inhabitants in connection with their asylum policies.

...taxonomies link constantly changing data sources while analytic needs are evolving

Countries
  European Union
    Sweden (SVE)
    France (FRA)
    Austria (AUT)
  Oceania
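How a taxonomy can link the two tables shown on the slides is sketched below in Python. This is an illustration only, not PoolParty code: the figures are copied from the slides, but the population unit (millions), the placement of Australia under Oceania, and every name in the code are assumptions.

```python
# Illustrative sketch only. Figures come from the slides; the units
# (population in millions) and all identifiers are assumptions.

# Taxonomy: one concept per country, holding the labels each source uses
# and the groupings the concept belongs to.
TAXONOMY = {
    "Sweden":    {"labels": {"Sweden", "SVE"},    "broader": {"European Union"}},
    "France":    {"labels": {"France", "FRA"},    "broader": {"European Union"}},
    "Austria":   {"labels": {"Austria", "AUT"},   "broader": {"European Union"}},
    "Australia": {"labels": {"Australia", "AUS"}, "broader": {"Oceania"}},
}

# Source 1: statistics keyed by country code.
STATS = {
    "AUS": {"gdp": 1560.0, "pop_millions": 23.14},
    "SVE": {"gdp": 580.0, "pop_millions": 9.60},
}

# Source 2: asylum seekers keyed by country name and year.
ASYLUM = {("Australia", 2013): 24300, ("Sweden", 2013): 54300}


def concept_for(label):
    """Resolve a source-specific label to its taxonomy concept (or None)."""
    for concept, entry in TAXONOMY.items():
        if label in entry["labels"]:
            return concept
    return None


def seekers_per_1000(year):
    """Join both sources through the taxonomy and return {concept: rate}."""
    population = {concept_for(code): row["pop_millions"] * 1_000_000
                  for code, row in STATS.items()}
    rates = {}
    for (name, row_year), seekers in ASYLUM.items():
        concept = concept_for(name)
        if row_year == year and concept in population:
            rates[concept] = seekers / population[concept] * 1000
    return rates


def countries_above(group, threshold, year):
    """Concepts in `group` with more than `threshold` seekers per 1,000."""
    return sorted(concept for concept, rate in seekers_per_1000(year).items()
                  if group in TAXONOMY[concept]["broader"] and rate > threshold)
```

Under these assumptions, `countries_above("European Union", 10, 2013)` returns an empty list, because Sweden comes out at roughly 5.7 asylum seekers per 1,000 inhabitants. The point of the sketch is that neither table had to change for the question to be answerable.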

Linked Data Warehouses are Smart Data Lakes

Data Warehouse
- supports efficient indexing
- enforces consistency

Data Lake
- handles structured & unstructured data
- data models emerge with usage over time

Linked Data Warehouse
- standards-based
- unified data model
- powerful query language

What if questions emerge when one starts analyzing the data?

The power of knowledge graphs: Agility, flexibility, complexity

[Diagram, built up over six slides: the same set of documents tagged under the traditional approach and under the graph-based approach, queried with three successive questions]

Questions
1. Show me all documents about European countries
2. Show me all documents about EU member countries
3. French-speaking?

Traditional approach: tags are stored on each document and grow with every new question
- Europe, Norway
- French, EU, Europe, France
- EU, Europe, Austria
- French, America, Canada

Graph-based approach: documents are tagged with countries only (Norway, France, Austria, Canada); Europe, EU and French-speaking are nodes in the graph that link to those countries

Metadata per document
1. No or little network effects
2. No reuse of metadata
3. Metadata resides in silos
4. Data quality hard to measure
5. Not machine-readable

Knowledge about metadata
1. Explicit knowledge models
2. Reusable and measurable
3. Metadata is machine-processable
4. Standards-based metadata
5. Linkable metadata opens silos
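The difference between the two approaches can be sketched in a few lines of Python. The document ids, the dictionary layout and the function names are hypothetical; only the countries and groupings come from the slides.

```python
# Hypothetical data modelled on the slides; the document ids are made up.
# Each document carries a single country tag and nothing else.
DOC_TAGS = {"doc1": "Norway", "doc2": "France", "doc3": "Austria", "doc4": "Canada"}

# Knowledge about the tags lives in one graph, not on the documents.
GRAPH = {
    "Norway":  {"Europe"},
    "France":  {"Europe", "EU"},
    "Austria": {"Europe", "EU"},
    "Canada":  {"America"},
}


def documents_about(concept):
    """Return ids of documents whose tag is the concept or links to it."""
    return sorted(doc for doc, tag in DOC_TAGS.items()
                  if tag == concept or concept in GRAPH.get(tag, set()))


def add_link(country, concept):
    """Record new knowledge about a country in the graph."""
    GRAPH.setdefault(country, set()).add(concept)


print(documents_about("Europe"))  # ['doc1', 'doc2', 'doc3']
print(documents_about("EU"))      # ['doc2', 'doc3']

# A new question arrives: extend the graph, leave the documents untouched.
add_link("France", "French-speaking")
add_link("Canada", "French-speaking")
print(documents_about("French-speaking"))  # ['doc2', 'doc4']
```

Answering the "French-speaking?" question took two new links in the graph. With per-document metadata, every affected document would have needed a new tag.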

Better Together: Unstructured and Structured Data.

Towards a Linked Data based search

PoolParty GraphSearch = Semantic Search + Analytics

Complex Queries based on SPARQL and Linked Data

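PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# Assumption: the slide omits its prefixes; dbpedia: is taken here to be the DBpedia ontology namespace
PREFIX dbpedia: <http://dbpedia.org/ontology/>

SELECT DISTINCT ?personname ?picture ?countryname ?hdi
WHERE
{
  ?person skos:prefLabel ?personname .
  ?country skos:prefLabel ?countryname .
  ?person a dbpedia:Person .
  ?country a dbpedia:Country .
  # Link each person to a related country
  ?person skos:related ?country .
  ?country <http://dbpedia.org/property/hdi> ?hdi .
  # Keep only countries with a Human Development Index below 0.6
  FILTER ( ?hdi < 0.6 )
  # The picture is optional, so people without one are still returned
  OPTIONAL
  {
    ?person foaf:depiction ?picture .
  }
} ORDER BY DESC(?hdi)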
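A query like the one above is typically sent to a SPARQL endpoint over HTTP. The following Python sketch uses only the standard library and follows the SPARQL 1.1 Protocol (GET with a `query` parameter) and the SPARQL JSON results format. The endpoint URL in the usage note is a placeholder, and the function names are made up for illustration; nothing here is specific to PoolParty.

```python
import json
import urllib.parse
import urllib.request


def build_sparql_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})


def rows_from_results(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts.

    Variables that are unbound in a solution (for example an OPTIONAL
    ?picture) are absent from that row.
    """
    document = json.loads(payload)
    return [{var: cell["value"] for var, cell in binding.items()}
            for binding in document["results"]["bindings"]]
```

A call such as `urllib.request.urlopen(build_sparql_request("http://example.org/sparql", query))` would then return a response whose body can be passed to `rows_from_results`. The request is built separately from sending it so that the URL and headers can be inspected first.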

I want to explore medical research trends in relation to regional prosperity.

Organizing data in graphs using links

Graph nervous_system_diseases-abstracts

Graph en.dbpedia.org

Graph www.nlm.nih.gov/mesh

Graph www.geonames.org

PoolParty Semantic Integrator: System Architecture

Classified documents + Linked taxonomies + Knowledge graphs

● Dynamic filter criteria
● BI-like interface
● Large scale RDF store
● Fully RDF compatible
● All queries via SPARQL

UnifiedViews as part of PoolParty Semantic Integrator

UnifiedViews differs from other ETL frameworks by natively supporting RDF data and ontologies.

UnifiedViews has a graphical user interface for the administration, debugging, and monitoring of the ETL process.
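What an RDF-producing ETL step does can be sketched as follows. This is not UnifiedViews code and does not use its API; it is a plain Python illustration that turns the rows of the Country/GDP/Pop table from the earlier slide into N-Triples. The `http://example.org/` namespace and the property names are hypothetical.

```python
# Hypothetical namespace and property names; the figures are taken from
# the Country/GDP/Pop table on the slides.
BASE = "http://example.org/"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

ROWS = [
    {"Country": "AUS", "GDP": "1,560", "Pop": "23.14"},
    {"Country": "SVE", "GDP": "580", "Pop": "9.60"},
]


def row_to_triples(row):
    """Turn one table row into N-Triples lines, one per value column."""
    subject = f"<{BASE}country/{row['Country']}>"
    triples = []
    for column in ("GDP", "Pop"):
        value = row[column].replace(",", "")  # drop the thousands separator
        predicate = f"<{BASE}schema/{column.lower()}>"
        triples.append(f'{subject} {predicate} "{value}"^^<{XSD_DECIMAL}> .')
    return triples


def transform(rows):
    """Apply the row mapping to every row and return all triples."""
    return [triple for row in rows for triple in row_to_triples(row)]


for line in transform(ROWS):
    print(line)
```

Each row becomes one subject with one triple per column, which is the shape a graph store can load and later link to taxonomy concepts.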

Use Cases

Success story: Healthdirect Australia

Over 120 information partners and sources

Great variety of category and metadata systems

One central vocabulary hub: Australian Health Thesaurus (AHT)

Single point of access incl. harmonized search facets:

http://www.healthdirect.gov.au/

Clean Energy Data - Country Profiles

Complex queries with SPARQL

PREFIX mrv-schema: <http://gbpn.org/mrv-schema/>
PREFIX qb: <http://purl.org/linked-data/cube#>

SELECT DISTINCT * WHERE {
  GRAPH <http://gbpn.org/mrv> {
    ?observation mrv-schema:year ?year.
    ?observation mrv-schema:region ?region.
    ?observation mrv-schema:region <http://gbpn.org/mrv-thes/region/India>.
    ?observation mrv-schema:scenario ?scenario.
    ?observation mrv-schema:scenario <http://gbpn.org/mrv-thes/scenario/deep-efficiency>.
    {
      ?observation mrv-schema:urbanizationType ?urbanizationType.
      ?observation mrv-schema:urbanizationType <http://gbpn.org/mrv-thes/urbanization-type/urban>.
      ?observation mrv-schema:buildingType ?buildingType.
      ?observation mrv-schema:buildingType <http://gbpn.org/mrv-thes/building-type/MF>.
      ?observation mrv-schema:publicBuildingType ?publicBuildingType.
      ?observation mrv-schema:publicBuildingType <http://gbpn.org/mrv-thes/public-building-type/NO>.
    } UNION {
      ?observation mrv-schema:urbanizationType ?urbanizationType.
      ?observation mrv-schema:urbanizationType <http://gbpn.org/mrv-thes/urbanization-type/urban>.
      ?observation mrv-schema:buildingType ?buildingType.
      ?observation mrv-schema:buildingType <http://gbpn.org/mrv-thes/building-type/Slums>.
      ?observation mrv-schema:publicBuildingType ?publicBuildingType.
      ?observation mrv-schema:publicBuildingType <http://gbpn.org/mrv-thes/public-building-type/NO>.
    } UNION { …….

PoolParty 5.1

Highly precise entity extraction

Domain-specific extraction, highly performant, language-agnostic, disambiguation rules, REST API
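The basic idea behind thesaurus-driven extraction can be illustrated with a small label matcher. This is a generic sketch, not PoolParty's extraction algorithm or its REST API, and it leaves out disambiguation entirely. The concept URIs and labels are hypothetical; the sample sentence is the one used on the Core Modules slide.

```python
import re

# Hypothetical mini-thesaurus: concept URI -> labels (preferred label first).
THESAURUS = {
    "http://example.org/concept/bain-capital": ["Bain Capital"],
    "http://example.org/concept/boston": ["Boston"],
    "http://example.org/concept/venture-capital": ["venture capital", "VC"],
}


def extract_concepts(text):
    """Return (concept URI, matched text, offset) for every label found."""
    label_to_uri = {label.lower(): uri
                    for uri, labels in THESAURUS.items() for label in labels}
    # Try longer labels first so multi-word labels win over shorter ones.
    alternatives = sorted(label_to_uri, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(label) for label in alternatives) + r")\b",
        re.IGNORECASE)
    return [(label_to_uri[match.group(1).lower()], match.group(1), match.start())
            for match in pattern.finditer(text)]


sample = "Bain Capital is a venture capital company based in Boston, MA."
for uri, surface, offset in extract_concepts(sample):
    print(offset, surface, uri)
```

Because matches resolve to concept URIs and not to strings, alternative labels such as "VC" end up on the same concept as "venture capital".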

Providing context in the knowledge graph

Activating disambiguation

Semantic Records Management: Integration with Confluence Blueprints

⇒ Solution for Semantic Records Management

Fully integrated web crawler

Make use of text corpus analysis: Retrieve documents from various sources, such as RSS feeds or websites

Web Crawler extracts candidate terms from any website

Extended ontology management & semantic reasoning

From SKOS taxonomies to full-blown ontologies: PoolParty supports various levels of knowledge modeling

Publishing custom schemes

Further extension of PoolParty API

● API method for skos:notes

● API method for skosxl:labels

● API methods for skos:collections

● API method to collect custom properties, attributes and types

● API method to read and write workflow status

● Retrieve history API method

● Retrieve SKOS subtree

Developer

Get started with PoolParty. Try it out now!

Get your PoolParty 5.1 Thesaurus Server & Entity Extractor trial:

http://www.poolparty.biz/test-demo/

Contact points & further information

Andreas Blumauer, MSc IT

a.blumauer@semantic-web.at

https://www.linkedin.com/in/andreasblumauer

Semantic Web Company GmbH

Mariahilfer Strasse 70/8, A-1070 Vienna

+43-1-4021235

http://www.semantic-web.at

http://www.poolparty-software.com

Social Media Channels

http://slideshare.net/semwebcompany

http://youtube.com/semwebcompany

https://www.linkedin.com/groups?home=&gid=4059165
