dive deep into your data pools
Welcome to this webinar!
Andreas Blumauer, CEO of Semantic Web Company
About Semantic Web Company (SWC)
SWC was founded in 2001 and is headquartered in Vienna
25 experts in Linked Data technologies
PoolParty Suite based on RDF Graph Data Model
Serving customers from all over the world
EU- & US-based consulting services
Our Ecosystem: Customers & Partners
Some of our Customers
● Credit Suisse
● Boehringer Ingelheim
● Roche
● adidas
● The Pokémon Company
● Canadian Broadcasting Corporation (CBC)
● Red Bull Media House
● Wolters Kluwer
● TC Media
● Techtarget
● BMJ Publishing Group
● CafePress
● Pearson - Always Learning
● Education Services Australia
● American Physical Society
● Healthdirect Australia
● World Bank Group
● Inter-American Development Bank (IADB)
● Renewable Energy Partnership
● Wood MacKenzie
● Development Initiatives
● International Atomic Energy Agency (IAEA)
Finance / Automotive / Publisher / Health Care / Public Administration / Energy / Education
Selected Partners
● PwC
● EPAM Systems
● iQuest
● EBCONT
● Gravity Zero
● MarkLogic
● OpenLink Software
● Ontotext
● Wolters Kluwer
● Data to Value
● Digirati
● Term Management
● Altotech
We are all working on the replacement of data chaos by networking information
● Norwegian Directorate of Immigration
● Ministry of Finance (A)
● Council of the E.U.
● Australian National Data Service
PoolParty Core Modules
Bain Capital is a venture capital company based in Boston, MA. Since inception it has invested in hundreds of companies including AMC Entertainment, Brookstone, and Burger King. The company was co-founded by Mitt Romney.
Taxonomy & Ontology Server
Entity Extraction & Text Mining
Semantic Search, Analytics & Visualization
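To make the entity extraction module more concrete, here is a minimal sketch of thesaurus-driven extraction applied to the sample sentence above. It is not PoolParty's algorithm or API: the `THESAURUS` dictionary, the concept types and the function name are assumptions made for illustration only.

```python
import re

# Hypothetical mini-thesaurus: preferred label -> concept type.
# The labels come from the sample sentence; the types are assumptions.
THESAURUS = {
    "Bain Capital": "Organization",
    "AMC Entertainment": "Organization",
    "Brookstone": "Organization",
    "Burger King": "Organization",
    "Boston": "Place",
    "Mitt Romney": "Person",
}


def extract_entities(text, thesaurus):
    """Return (label, type, offset) for every thesaurus label found in text."""
    hits = []
    for label, concept_type in thesaurus.items():
        # Word boundaries avoid matching a label inside a longer word.
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text):
            hits.append((label, concept_type, match.start()))
    # Report hits in reading order.
    return sorted(hits, key=lambda hit: hit[2])


TEXT = (
    "Bain Capital is a venture capital company based in Boston, MA. "
    "Since inception it has invested in hundreds of companies including "
    "AMC Entertainment, Brookstone, and Burger King. "
    "The company was co-founded by Mitt Romney."
)

entities = extract_entities(TEXT, THESAURUS)
for label, concept_type, offset in entities:
    print(f"{offset:4d}  {concept_type:13s} {label}")
```

A controlled vocabulary keeps the matches tied to known concepts; disambiguation and language handling would need considerably more than this lookup.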
Why Graph Databases?
The Enterprise Perspective: The End of the Document
“Life is no longer as simple as making PDF documents.”
John Walker, Business Analyst at NXP Semiconductors
The Enterprise Perspective: Graph Databases are Smart Data Lakes
“Data in a large corporation is often scattered over various tools, comes in different formats and with different levels of quality.”
Fabian Heinemann, Data Scientist at Roche
The NPO Perspective: Using common Definitions and Standards
“Very few datasets tell a story in isolation.”
The Data Manifesto, Development Initiatives
The warehouse approach seems to be broken in a complex world
Data Warehouse
- structures and categories predefine the kind of analysis that is possible
- excludes data to simplify the data model
- does not efficiently handle new types of data
- supports efficient indexing
- enforces consistency
Data Lake
- includes all data that may be used and even data that may never be used
- all data regardless of source and structure is kept
- data kept in its raw form and only transformed when used
- handles structured and unstructured data
- data models emerge with usage over time
The Analyst’s Perspective: Data Lakes don’t fix the problem of lacking semantics
“Organizations should focus on semantic consistency and performance in upstream applications and data stores instead of information consolidation in a data lake.”
Gartner, Beware of the Data Lake Fallacy
Data Lakes have all the information to answer complex queries, but...
Country GDP Pop
AUS 1,560 23.14
SVE 580 9.60
WITH A COMBINED NUMBER of 357,100 registered asylum claims in 2013, Germany, the United States of America, France, Sweden and Turkey were the top five receiving countries, together accounting for nearly six out of ten asylum claims submitted in the 44 industrialized countries covered by this report.
Place      Asylum seekers  Year
Australia  24,300          2013
Sweden     54,300          2013
Show me all reports, in which EU member countries are mentioned with regards to their asylum politics, which have more than 10 asylum-seekers per 1,000 inhabitants.
...taxonomies link constantly changing data sources while analytic needs are evolving
Countries
  European Union
    Sweden (SVE)
    France (FRA)
    Austria (AUT)
  Oceania
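The linking role of the taxonomy can be sketched in a few lines of Python. The figures are copied from the two tables; everything else is an assumption for illustration: the units (population in millions), the dictionary layout, the function names, and the placement of Australia under Oceania, which the slide does not spell out.

```python
# Source 1: statistics keyed by country code (figures from the slide;
# population is assumed to be in millions).
stats_by_code = {
    "AUS": {"gdp": 1560, "pop_millions": 23.14},
    "SVE": {"gdp": 580, "pop_millions": 9.60},
}

# Source 2: asylum figures keyed by place name (figures from the slide).
asylum_by_place = {
    "Australia": {"seekers": 24300, "year": 2013},
    "Sweden": {"seekers": 54300, "year": 2013},
}

# Taxonomy: one concept per country, holding the code used in source 1,
# the name used in source 2, and the broader concept.
# Australia under Oceania is an assumption, not stated on the slide.
taxonomy = {
    "Sweden": {"code": "SVE", "broader": "European Union"},
    "Australia": {"code": "AUS", "broader": "Oceania"},
}


def seekers_per_1000(place):
    """Join both sources through the taxonomy concept for `place`."""
    code = taxonomy[place]["code"]
    inhabitants = stats_by_code[code]["pop_millions"] * 1_000_000
    return asylum_by_place[place]["seekers"] / inhabitants * 1000


def members_above(group, threshold):
    """Places under `group` whose rate per 1,000 inhabitants exceeds threshold."""
    return [
        place
        for place, concept in taxonomy.items()
        if concept["broader"] == group and seekers_per_1000(place) > threshold
    ]


for place in taxonomy:
    print(place, round(seekers_per_1000(place), 2))
print(members_above("European Union", 10))
```

Neither source needs to change when a new grouping is required; only the taxonomy gains a concept or a relation.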
Linked Data Warehouses are Smart Data Lakes
Data Warehouse
- supports efficient indexing
- enforces consistency
Data Lake
- handles structured & unstructured data
- data models emerge with usage over time
Linked Data Warehouse
- standards-based
- unified data model
- powerful query language
What if questions emerge when one starts analyzing the data?
The power of knowledge graphs: Agility, flexibility, complexity
[Diagram, built up over six slides: documents about Norway, France, Austria and Canada, handled with a traditional approach versus a graph-based approach]
Questions that emerge one after another:
- Show me all documents about European countries
- Show me all documents about EU member countries
- French-speaking?
Traditional approach (tags stored on each document):
- doc: Europe, Norway
- doc: French, EU, Europe, France
- doc: EU, Europe, Austria
- doc: French, America, Canada
Graph-based approach (concepts linked to the countries Norway, France, Austria, Canada):
- Europe
- EU
- French-speaking
Metadata per document
1. No or little network effects
2. No reuse of metadata
3. Metadata resides in silos
4. Data quality hard to measure
5. Not machine-readable
Knowledge about metadata
1. Explicit knowledge models
2. Reusable and measurable
3. Metadata is machine-processable
4. Standards-based metadata
5. Linkable metadata opens silos
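The contrast between the two approaches can be sketched as follows. The country relations are the ones shown in the diagram; the document identifiers and the function name are hypothetical. Documents only point at a country, and everything known about the country lives once in the graph.

```python
# Each document links to the country it is about (doc ids are hypothetical).
doc_about = {
    "doc1": "Norway",
    "doc2": "France",
    "doc3": "Austria",
    "doc4": "Canada",
}

# Knowledge about the countries is stored once, not on every document.
country_groups = {
    "Norway": {"Europe"},
    "France": {"Europe", "EU", "French-speaking"},
    "Austria": {"Europe", "EU"},
    "Canada": {"America", "French-speaking"},
}


def docs_about(group):
    """Follow doc -> country -> group links instead of reading per-document tags."""
    return sorted(
        doc for doc, country in doc_about.items()
        if group in country_groups[country]
    )


print(docs_about("Europe"))
print(docs_about("EU"))
print(docs_about("French-speaking"))
```

Answering a new kind of question means adding one relation per country to `country_groups`; no document has to be re-tagged.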
Better Together: Unstructured and Structured Data.
Towards a Linked Data based search
Bringing structure to text: PoolParty GraphSearch
PoolParty GraphSearch = Semantic Search + Analytics
Complex Queries based on SPARQL and Linked Data
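PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# Assumed namespace for the dbpedia: prefix, which the slide does not declare
PREFIX dbpedia: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?personname ?picture ?countryname ?hdi
WHERE
{
  ?person skos:prefLabel ?personname .
  ?country skos:prefLabel ?countryname .
  ?person a dbpedia:Person .
  ?country a dbpedia:Country .
  ?person skos:related ?country .
  ?country <http://dbpedia.org/property/hdi> ?hdi .
  FILTER ( ?hdi < 0.6 )
  OPTIONAL
  {
    ?person foaf:depiction ?picture .
  }
} ORDER BY DESC(?hdi)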
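For readers who want to send such a query to a store, the following sketch builds an HTTP GET request in the style of the SPARQL 1.1 Protocol, where the query travels in a `query` parameter. The endpoint URL and the short query are placeholders, not a PoolParty address or API, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; replace with the SPARQL endpoint of your own store.
ENDPOINT = "http://example.org/sparql"

# A deliberately small query used only to demonstrate the request format.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?country ?name
WHERE { ?country skos:prefLabel ?name . }
LIMIT 5
"""


def build_request(endpoint, query):
    """Build a GET request that carries the query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url)

# To execute against a real endpoint (network access required):
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```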
I want to explore medical research trends in relation to regional prosperity.
Organizing data in graphs using links
Graph nervous_system_diseases-abstracts
Graph en.dbpedia.org
Graph www.nlm.nih.gov/mesh
Graph www.geonames.org
PoolParty Semantic Integrator: System Architecture
Classified documents + Linked taxonomies + Knowledge graphs
● Dynamic filter criteria
● BI-like interface
● Large scale RDF store
● Fully RDF compatible
● All queries via SPARQL
UnifiedViews as part of PoolParty Semantic Integrator
UnifiedViews differs from other ETL frameworks by natively supporting RDF data and ontologies.
UnifiedViews has a graphical user interface for the administration, debugging, and monitoring of the ETL process.
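As a rough illustration of what an ETL step that produces RDF does, the sketch below turns tabular rows into N-Triples lines. It does not use or imitate the UnifiedViews API; the namespace, the property names and the function name are assumptions, and the two rows reuse the country figures shown earlier in this deck.

```python
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
BASE = "http://example.org/"  # hypothetical namespace

# Extract: rows as they might arrive from a spreadsheet or CSV export.
rows = [
    {"code": "AUS", "gdp": "1560", "pop": "23.14"},
    {"code": "SVE", "gdp": "580", "pop": "9.60"},
]


def row_to_ntriples(row):
    """Transform one row into N-Triples lines with typed literals."""
    subject = f"<{BASE}country/{row['code']}>"
    return [
        f'{subject} <{BASE}schema/gdp> "{row["gdp"]}"^^<{XSD_DECIMAL}> .',
        f'{subject} <{BASE}schema/population> "{row["pop"]}"^^<{XSD_DECIMAL}> .',
    ]


# Load: here the triples are only collected; a real pipeline would write
# them to an RDF store.
triples = [line for row in rows for line in row_to_ntriples(row)]
for line in triples:
    print(line)
```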
Use Cases
Success story: Healthdirect Australia
Over 120 information partners and sources
Great variety of category and metadata systems
One central vocabulary hub: Australian Health Thesaurus (AHT)
Single point of access incl. harmonized search facets:
http://www.healthdirect.gov.au/
Clean Energy Data - Country Profiles
sOnr webMining for Confluence
Complex queries with SPARQL
PREFIX mrv-schema: <http://gbpn.org/mrv-schema/>
PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT DISTINCT * WHERE {
  GRAPH <http://gbpn.org/mrv> {
    ?observation mrv-schema:year ?year.
    ?observation mrv-schema:region ?region.
    ?observation mrv-schema:region <http://gbpn.org/mrv-thes/region/India>.
    ?observation mrv-schema:scenario ?scenario.
    ?observation mrv-schema:scenario <http://gbpn.org/mrv-thes/scenario/deep-efficiency>.
    {
      ?observation mrv-schema:urbanizationType ?urbanizationType.
      ?observation mrv-schema:urbanizationType <http://gbpn.org/mrv-thes/urbanization-type/urban>.
      ?observation mrv-schema:buildingType ?buildingType.
      ?observation mrv-schema:buildingType <http://gbpn.org/mrv-thes/building-type/MF>.
      ?observation mrv-schema:publicBuildingType ?publicBuildingType.
      ?observation mrv-schema:publicBuildingType <http://gbpn.org/mrv-thes/public-building-type/NO>.
    } UNION {
      ?observation mrv-schema:urbanizationType ?urbanizationType.
      ?observation mrv-schema:urbanizationType <http://gbpn.org/mrv-thes/urbanization-type/urban>.
      ?observation mrv-schema:buildingType ?buildingType.
      ?observation mrv-schema:buildingType <http://gbpn.org/mrv-thes/building-type/Slums>.
      ?observation mrv-schema:publicBuildingType ?publicBuildingType.
      ?observation mrv-schema:publicBuildingType <http://gbpn.org/mrv-thes/public-building-type/NO>.
    } UNION { …….
More PoolParty Applications & Demos
● Thesaurus Publishing
● Business Intelligence
● Content Recommendation
● Semantic Expert Finder
● Web Mining
● Semantic Search
● Linked Data Visualization
● Symptom Checker
PoolParty 5.1
Highly precise entity extraction
Domain-specific extraction, highly performant, language-agnostic, disambiguation rules, REST API
Providing context in the knowledge graph
Activating disambiguation
Semantic Records Management: Integration with Confluence Blueprints
⇒ Solution for Semantic Records Management
Fully integrated web crawler
Make use of text corpus analysis: Retrieve documents from various sources, like RSS or from websites
Web Crawler extracts candidate terms from any website
Extended ontology management & semantic reasoning
From SKOS taxonomies to full-blown ontologies: PoolParty supports various levels of knowledge modeling
Publishing custom schemes
Further extension of PoolParty API
● API method for skos:notes
● API method for skosxl:labels
● API methods for skos:collections
● API method to collect custom properties, attributes and types
● API method to R/W workflow status
● Retrieve history API method
● Retrieve SKOS subtree
Developer
Get started with PoolParty. Try it out now!
Get your PoolParty 5.1 Thesaurus Server & Entity Extractor trial:
http://www.poolparty.biz/test-demo/
Contact points & further information
Andreas Blumauer, MSc IT
https://www.linkedin.com/in/andreasblumauer
Semantic Web Company GmbH
Mariahilfer Strasse 70/8, A-1070 Vienna
+43-1-4021235
http://www.semantic-web.at
http://www.poolparty-software.com
Social Media Channels
http://slideshare.net/semwebcompany
http://youtube.com/semwebcompany
https://www.linkedin.com/groups?home=&gid=4059165