How to survive the document & data tsunami?
Lambda Verdonckt, Business Analyst, TenForce
We know how to handle large data, regardless of the technology used.
1
Semantic Technology
The only purpose-built technology to survive a tsunami of documents and data.
2
Semantic Technology
Leveraging information in old systems, with no need to change the current way of working.
3
How did we end up here in the first place?
Semantic Technology
Turns the web of documents into a web of data.
Turns the web as a virtual library into a virtual database.
TenForce applies these technologies in corporate environments.
How to survive the document & data tsunami?
Semantic Technology
1. State-of-the-art
2. Examples
3. Future
Semantic Technology
The meaning of the data is encoded separately
The only purpose-built technology for handling a tsunami of data, in a flexible way.
data:
(JohnDoe, type, Customer)
(JohnDoe, owns, Account123)
(Account123, type, BankingAccount)
model:
[Diagram: Customer, Person and Account, connected by the relations "type" and "owns"]
=> ontology, thesaurus, taxonomy etc.
Software understands the data and can reason about it
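The split between data and model can be illustrated with a minimal Python sketch. It is not taken from the slides: the `subClassOf` relation, the `MODEL` contents and the helper names `infer_types` and `owners_of` are assumptions made for illustration. Only the three data triples come from the slide.

```python
# Data: the three triples from the slide, as (subject, predicate, object) tuples.
DATA = {
    ("JohnDoe", "type", "Customer"),
    ("JohnDoe", "owns", "Account123"),
    ("Account123", "type", "BankingAccount"),
}

# Model: kept separate from the data. The subClassOf links are an
# assumption for this sketch; the slide only names the classes.
MODEL = {
    ("Customer", "subClassOf", "Person"),
    ("BankingAccount", "subClassOf", "Account"),
}


def infer_types(data, model):
    """Return the data plus every 'type' triple implied by the model."""
    triples = set(data)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        for (s, p, o) in list(triples):
            if p != "type":
                continue
            for (sub, rel, sup) in model:
                implied = (s, "type", sup)
                if rel == "subClassOf" and sub == o and implied not in triples:
                    triples.add(implied)
                    changed = True
    return triples


def owners_of(triples, account_class):
    """List every subject that owns something of the given class."""
    return sorted(
        s for (s, p, o) in triples
        if p == "owns" and (o, "type", account_class) in triples
    )


inferred = infer_types(DATA, MODEL)
print(("JohnDoe", "type", "Person") in inferred)
print(owners_of(inferred, "Account"))
```

The data never states that JohnDoe is a Person or that Account123 is an Account. Both facts follow from the model, which is the sense in which software can "reason about" the data.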
Semantic Technology Standards
A set of standards & tools to work with large data sets
Semantic Technology Architectures
TenForce Semantic Offering
Consultancy, Projects, Training, Products
Semantic Technology
Assessment
Architectures
Modeling
Validation
Standards compliance
End-to-end projects
mixed teams
research projects
EU framework
Unique Training Offer
Introduction
Modeling
Programming
and many others…
How to survive the document & data tsunami?
Semantic Technology
1. State-of-the-art
2. Examples
3. Future
Semantic Technology Solutions
The ‘semantic web’ is an application of semantic technology
Corporate solutions built with semantic technology include:
• Knowledge Bases
• Automatic Categorization & Archiving
• Natural Language Processing in documents
• …
Semantic Technology Solutions
TenForce projects
• Publications Office of the EU – a thesaurus of European activities
• Wolters Kluwer Globally – building a multilingual publishing bus
• DG Employment of the EC – a taxonomy of European Skills, Competences & Occupations
Semantic Technology Solutions
Advanced examples
• New York Times – automatic categorization & archiving with Linked Data
• Amdocs – telecom solutions for pro-active decision support
• Audi – modeling behaviour to make testing less error-prone
How to survive the document & data tsunami?
Semantic Technology
1. State-of-the-art
2. Examples
3. Future
Industry Analysts
Gartner: high benefit rating (2010)
“Semantic technologies offer … options that now are difficult or impossible”
HP: top 10 trend in BI (2010)
“New approaches are needed, and semantic technologies hold part of the solution.”
A vision of the data web
LOD2 – a European FP7 project
• Build the infrastructure for the web of data
• Opportunities & challenges for all of us!
Future
We know the tsunami is coming; the question is: who will be ready to survive?
BACK-UP SLIDES
Semantic Technology Solutions
Knowledge Bases
• Knowledge is captured in a model, making the DB a KB
• Allows knowledge to be managed & shared instead of merely stored
  >50% of companies indicate the need to share stored knowledge (VALUE-IT)
• Better & faster retrieval of information for decision support
• Human-readable: typical CRM with search functionality
• Machine-readable: expert systems, incl. reasoning, e.g. clinical decision support
=> Rules are part of the data instead of hard-coded: more readily adaptable to changing needs, while interoperable with existing DBs
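A minimal sketch of what "rules are part of the data" can mean in practice, written in Python. The rule format, the clinical facts and the function name `apply_rules` are all hypothetical and chosen only for illustration. The point is that the engine stays fixed while the rule list changes.

```python
# Each rule is plain data: ((predicate, object) to match,
#                           (predicate, object) to add for the same subject).
# All names below are hypothetical illustrations.
RULES = [
    (("hasSymptom", "Fever"), ("suggestedAction", "CheckTemperature")),
]


def apply_rules(facts, rules):
    """Generic engine: derive new facts until nothing more can be added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for (cond_p, cond_o), (new_p, new_o) in rules:
            for (s, p, o) in list(facts):
                if p == cond_p and o == cond_o:
                    derived = (s, new_p, new_o)
                    if derived not in facts:
                        facts.add(derived)
                        changed = True
    return facts


facts = {("Patient1", "hasSymptom", "Fever")}
print(sorted(apply_rules(facts, RULES)))

# Adapting to a changing need means adding a rule, not editing the engine.
extended = RULES + [(("suggestedAction", "CheckTemperature"), ("notify", "Nurse"))]
print(sorted(apply_rules(facts, extended)))
```

Because the rules are ordinary records, they could be stored next to the facts in an existing database, which is one way to read the interoperability claim above.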
Semantic Technology Solutions
Automatic Categorization & Archiving
Categorization based on controlled vocabularies (taxonomies, thesauri, ontologies)
=> makes content more searchable: better!
=> eliminates cost of labour-intensive processes: cheaper!
vs. user-driven categorization & tagging (web 2.0)
Remark: Look at Evri as an online example!
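As a rough illustration of categorization against a controlled vocabulary, the Python sketch below tags a text with every concept whose labels occur in it. The vocabulary contents and the function name `categorize` are assumptions for this example. A real thesaurus would be far larger and would typically be maintained in a dedicated tool.

```python
import re

# Hypothetical controlled vocabulary: concept -> accepted labels (lowercase).
VOCABULARY = {
    "Taxation": ["tax", "taxes", "vat"],
    "Employment": ["job", "jobs", "employment", "occupation"],
}


def categorize(text, vocabulary):
    """Return the sorted concepts that have at least one label in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        concept for concept, labels in vocabulary.items()
        if words & set(labels)
    )


print(categorize("New VAT rules affect jobs in Europe", VOCABULARY))
```

Unlike free user tagging, every document ends up with terms from the same fixed list, so a search for "Taxation" finds texts that only mention "VAT".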
Semantic Technology Solutions
Natural Language Processing
Software that analyzes the structure and meaning of textual information
• analyze texts
• identify terms & concepts
• extract information
• understand meaning
=> Automatic categorization & archiving based on NLP
Tools: Alchemy, OpenCalais, PoolParty
Multilingual publishing system in an EU context for Legal, Tax & Regulatory
Wolters Kluwer Global
2010 TenForce
[Diagram: BEFORE vs. AFTER. Multiple CMSs and portals, with INTEGRATION, PRODUCTION and ORIENTATION layers; RDF, XHTML, SKOS Product Definition]
ESCO, a taxonomy of European Skills, Competences & Occupations
DG Employment of the EU Commission
[Diagram: Thesaurus Management System, ESCO Portal, Import and Integration, Web, Back Office Portal, ESCO users, Job Mobility Portal]
A Semantic Job Portal to leverage the information in ESCO and other information on the web
DG Employment of the EU Commission
Advanced examples
Publishing
New York Times
• in-house developed vocabulary
• automatic categorization & archiving
• published as Linked Data (open to the world!)
http://data.nytimes.com/
Advanced examples
Telecom
Amdocs
Knowing why a customer is calling saves 3 minutes per call (or €0.30)!
[Diagram: billing, social fora, call center logs, ... as RDF; advanced inference; pro-active decision support]
Advanced examples
Manufacturing
Audi (Ontoprise)
Testing electronic systems in cars using simulations
=> huge amounts of data are recorded
=> to be collected and analyzed
=> time-consuming & error-prone
Need for a standardized way to describe
• desired system behaviour
• known error-cases
Solution: ontology-driven & visualized