The company, product and service names used in this web site are for identification purposes only.
© Cognitum 2014. All trademarks and registered trademarks are the property of their respective owners.
Ask Data Anything
How to make the most of your data using Semantic Technologies
Overview
Why do semantics matter?
Ontologies and Ask Data Anything dimensions
Data exploration through Ask Data Anything
Data mining for attributes (if time permits)
Why do semantics matter?
Flat data is void. The types of queries you can perform over flat data are restricted to the vocabulary contained in that data.
[Figure: flat data table whose column names (Item, City, Date, Inv. stat) make up the vocabulary]
Why do semantics matter? (cont.)
What about relational data? For example, how can we make complex relational queries that involve simple relations, such as the country to which a city belongs?
[Figure: two tables, one with columns Item, City, Date, Inv. stat and one with columns Item, Country]
Why do semantics matter? (cont.)
The key feature offered by semantics, and by Ask Data Anything in particular, is the ability to add layers on top of the data that are not explicitly in the data itself, e.g., asking for results over cities, countries, or continents when the data only contains information about cities.
One way to achieve this is to define a structure of knowledge (a taxonomy) for the domain in question, with:
nouns representing classes of objects
verbs representing relations between objects
Ontologies & Ada dimensions (Taxonomy embodiment)
ada dimensions
  localization
    city
    country
    continent
    …
  temporal
    day
    month
    year
    …
  hierarchical
    goods
      clothing
        shoe
          sport-shoes
          high-heel-shoes
        pants
          jeans
          shorts
    …
Poland is a country. Germany is a country.
Krakow is a city. Warsaw is a city.
Krakow is-located-in Poland. Warsaw is-located-in Poland.
Hamburg is a city. Berlin is a city.
Hamburg is-located-in Germany. Berlin is-located-in Germany.
Every column in the input data is marked with one of the data dimensions and with a concept (e.g., localization and city).
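The facts above can be read mechanically. The following is a minimal sketch, assuming only the two sentence shapes that appear on this slide ("X is a Y." and "X relation Y."); the helper `parse_facts` is hypothetical and is not the Ontorion or Fluent Editor parser.

```python
import re
from collections import defaultdict

# Facts copied from the slide.
FACTS = """
Poland is a country. Germany is a country.
Krakow is a city. Warsaw is a city.
Krakow is-located-in Poland. Warsaw is-located-in Poland.
Hamburg is a city. Berlin is a city.
Hamburg is-located-in Germany. Berlin is-located-in Germany.
"""

def parse_facts(text):
    """Hypothetical helper: split simple sentences into class
    memberships ("X is a Y") and binary relations ("X rel Y")."""
    instance_of = {}               # instance -> class, e.g. "Krakow" -> "city"
    relations = defaultdict(dict)  # relation name -> {subject: object}
    for sentence in re.split(r"\.\s*", text.strip()):
        words = sentence.split()
        if len(words) == 4 and words[1] == "is" and words[2] in ("a", "an"):
            instance_of[words[0]] = words[3]
        elif len(words) == 3:
            subject, relation, obj = words
            relations[relation][subject] = obj
    return instance_of, relations

instance_of, relations = parse_facts(FACTS)
located_in = relations["is-located-in"]
cities_in_poland = sorted(
    city for city, country in located_in.items() if country == "Poland"
)
print(cities_in_poland)  # ['Krakow', 'Warsaw']
```

With these two lookups, a column marked as (localization, city) can be related to countries even though the input data never mentions them.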
Taxonomy modeling in Fluent Editor
Every location is an ada-dimension. Every country is a location. Every city is a location. Every continent is a location.
Every hierarchical is an ada-dimension. Every brand is hierarchical. Every status is a hierarchical. Every vendor is hierarchical. Every good is hierarchical.
Every temporal is an ada-dimension. Every second is temporal. Every month is temporal. Every year is temporal.
OWL 2.0 full compliance.
Uses Ontorion controlled natural language (OCNL) | OWL2 --> OWL2 + SWRL
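To illustrate how "Every X is a Y" statements form a hierarchy, here is a minimal sketch that walks from a concept up to its dimension. The regular expression and the helpers `parse_subsumptions` and `dimension_of` are assumptions made for this illustration; they cover only the sentence shape used on this slide and are not part of Fluent Editor.

```python
import re

# A subset of the taxonomy sentences from the slide.
TAXONOMY = """
Every location is an ada-dimension. Every country is a location.
Every city is a location. Every continent is a location.
Every temporal is an ada-dimension. Every month is temporal. Every year is temporal.
"""

def parse_subsumptions(text):
    """Hypothetical helper: map each concept to its direct parent."""
    pattern = re.compile(r"Every (\S+) is (?:an? )?(\S+?)\.")
    return {child: parent for child, parent in pattern.findall(text)}

def dimension_of(concept, parent, root="ada-dimension"):
    """Climb the hierarchy until the concept directly below the root is found.
    Assumes the hierarchy is acyclic; returns None for unknown concepts."""
    while concept in parent:
        if parent[concept] == root:
            return concept
        concept = parent[concept]
    return None

parent = parse_subsumptions(TAXONOMY)
print(dimension_of("city", parent))   # location
print(dimension_of("month", parent))  # temporal
```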
Ask Data Anything
Technically, Ask Data Anything is capable of performing projection, sub-setting and aggregation operations, providing answers for queries involving the following information:
What: which quantitative field to use?
How: how is the output to be displayed?
Where (optional): where to restrict the results? Uses the containment relation.
By (optional): by what to aggregate? Requires an aggregating operation to be provided.
Aggregating operation (optional): tied to the type (e.g., double, string) of What.
When? (optional)
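The parts of a query listed above can be pictured as a small record. This is only a sketch: the class `AdaQuery`, its field names, and the default display value `"table"` are hypothetical and do not describe the actual Ask Data Anything internals.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class AdaQuery:
    """Hypothetical container for the parts of a query."""
    what: str                        # quantitative field, e.g. "quantity"
    how: str = "table"               # output display; the default is an assumption
    where: Optional[str] = None      # restriction, resolved through containment
    by: Optional[str] = None         # concept to aggregate by
    aggregate: Optional[str] = None  # aggregating operation, e.g. "sum"
    when: Optional[str] = None       # temporal restriction

    def __post_init__(self):
        # "By" needs an aggregating operation to be provided.
        if self.by is not None and self.aggregate is None:
            raise ValueError("'by' needs an aggregating operation")

# "Sum quantity in Poland by city"
query = AdaQuery(what="quantity", where="Poland", by="city", aggregate="sum")
print(query)
```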
Ask Data Anything (Architecture)
Data and models are tightly coupled: the models provide the interface for querying the data, so everything that is to be queried must be modeled in the underlying ontology.
Ask Data Anything
Let us consider the following data source (extract):
| Invoice | Item | Vendor | Status | Quantity | City |
| TID-0001 | Cafepress Men's Bass Guitar Guy Graphic Tee | Vendor-A | Rejected | 133 | Berlin |
| | | | | | Warsaw |
| TID-0002 | Shirt-1 | Vendor-C | Accepted | 114 | Berlin |
| TID-0003 | Cafepress Men's 2 Daughters Graphic Tee | Vendor-A | Pending | 34 | Munchen |
| TID-0004 | Faded Glory Women's Knit Polo 2-Pack | Vendor-D | Accepted | 110 | Munchen |
| TID-0005 | Cafepress Men's Biceps Graphic Tee | Vendor-B | Pending | 112 | Berlin |
| TID-0006 | Faded Glory Men's Long Sleeve 2 Pocket Flannel Shirt | Vendor-B | Rejected | 37 | Krakow |
| TID-0007 | Aqua Blues Women's Twist Back Top | Vendor-A | Paid | 63 | Berlin |
Vendor-A is a vendor. Vendor-B is a vendor. Vendor-C is a vendor. Vendor-D is a vendor.
Rejected is a status. Accepted is a status. Pending is a status. Paid is a status.
Every status is a hierarchical. Every vendor is hierarchical.
Example queries in ADA
Sum quantity in Poland by city
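As a rough illustration of what this query computes, the sketch below restricts the rows of the extract to cities located in Poland and sums the quantity per city. The function name is hypothetical, the Item column is omitted for brevity, and the stray "Warsaw" cell of the extract is left out because it has no invoice attached. The `LOCATED_IN` mapping repeats the is-located-in facts stated earlier.

```python
from collections import defaultdict

# (invoice, vendor, status, quantity, city), taken from the extract.
ROWS = [
    ("TID-0001", "Vendor-A", "Rejected", 133, "Berlin"),
    ("TID-0002", "Vendor-C", "Accepted", 114, "Berlin"),
    ("TID-0003", "Vendor-A", "Pending", 34, "Munchen"),
    ("TID-0004", "Vendor-D", "Accepted", 110, "Munchen"),
    ("TID-0005", "Vendor-B", "Pending", 112, "Berlin"),
    ("TID-0006", "Vendor-B", "Rejected", 37, "Krakow"),
    ("TID-0007", "Vendor-A", "Paid", 63, "Berlin"),
]

# is-located-in facts from the ontology slide.
LOCATED_IN = {
    "Krakow": "Poland", "Warsaw": "Poland",
    "Hamburg": "Germany", "Berlin": "Germany",
}

def sum_quantity_by_city(rows, located_in, country):
    """Hypothetical helper: keep rows whose city is located in `country`
    (the Where part), then sum quantity per city (the By part)."""
    totals = defaultdict(int)
    for _invoice, _vendor, _status, quantity, city in rows:
        if located_in.get(city) == country:
            totals[city] += quantity
    return dict(totals)

result = sum_quantity_by_city(ROWS, LOCATED_IN, "Poland")
print(result)  # {'Krakow': 37}
```

The data itself never mentions Poland; the restriction works only because the containment relation is modeled in the ontology.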
Example queries in ADA (cont.)
Sum quantity by country on map
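The roll-up from city to country can be sketched as follows. The helper name is hypothetical, the map rendering is not covered, and the entry placing Munchen in Germany is an assumption added here, since the facts shown earlier only list Hamburg and Berlin for Germany.

```python
from collections import defaultdict

# (quantity, city) pairs from the extract.
QUANTITY_BY_CITY = [
    (133, "Berlin"), (114, "Berlin"), (34, "Munchen"), (110, "Munchen"),
    (112, "Berlin"), (37, "Krakow"), (63, "Berlin"),
]

LOCATED_IN = {
    "Krakow": "Poland", "Warsaw": "Poland",
    "Hamburg": "Germany", "Berlin": "Germany",
    "Munchen": "Germany",  # assumption: not among the facts on the slides
}

def sum_quantity_by_country(rows, located_in):
    """Hypothetical helper: map each city to its country, then sum."""
    totals = defaultdict(int)
    for quantity, city in rows:
        totals[located_in.get(city, "unknown")] += quantity
    return dict(totals)

by_country = sum_quantity_by_country(QUANTITY_BY_CITY, LOCATED_IN)
print(by_country)  # {'Germany': 566, 'Poland': 37}
```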
Example queries in ADA (cont.)
Summarize status by vendor on piechart
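One plausible reading of this query, sketched below, is a count of invoices per status for each vendor. That "summarize" means counting is an assumption, the helper name is hypothetical, and the pie chart rendering is not shown.

```python
from collections import Counter, defaultdict

# (vendor, status) pairs from the extract.
VENDOR_STATUS = [
    ("Vendor-A", "Rejected"), ("Vendor-C", "Accepted"), ("Vendor-A", "Pending"),
    ("Vendor-D", "Accepted"), ("Vendor-B", "Pending"), ("Vendor-B", "Rejected"),
    ("Vendor-A", "Paid"),
]

def status_counts_by_vendor(pairs):
    """Hypothetical helper: count how often each status occurs per vendor."""
    counts = defaultdict(Counter)
    for vendor, status in pairs:
        counts[vendor][status] += 1
    return {vendor: dict(statuses) for vendor, statuses in counts.items()}

summary = status_counts_by_vendor(VENDOR_STATUS)
print(summary["Vendor-A"])  # {'Rejected': 1, 'Pending': 1, 'Paid': 1}
```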
Material sub-setting (Data mining for attributes)
For CSV data, the Ontorion Text Mining AddIn for Excel extracts meaningful information from raw data, which helps in the taxonomy creation process.
It also allows attribute-based sub-setting.
Thank you!
Jesus Nuñez
Junior Data Engineer at Cognitum
Cognitum Sp. z o.o.
Wal Miedzeszynski 630, Warsaw, Poland
http://www.cognitum.eu/, [email protected]
Windows Azure Circle Partner
MEMBER