Denodo DataFest 2016: Metadata and Data: Search and Exploration

OCTOBER 18, 2016 SAN FRANCISCO BAY AREA, CA #DenodoDataFest RAPID, AGILE DATA STRATEGIES For Accelerating Analytics, Cloud, and Big Data Initiatives.

Upload: denodo

Post on 11-Jan-2017


TRANSCRIPT

OCTOBER 18, 2016 SAN FRANCISCO BAY AREA, CA

#DenodoDataFest

RAPID, AGILE DATA STRATEGIES
For Accelerating Analytics, Cloud, and Big Data Initiatives.

Metadata and Data: Search & Exploration

Pablo Álvarez Yáñez

Principal Engineer, Denodo

Search and Find the Right Data

- IDC, The High Cost of Not Finding Information

Knowledge workers spend from 15% to 35% of their time searching for information.

Searchers are successful in finding what they seek 50% of the time.

Different Scenarios

“I know the data model and I’m fluent in SQL”

“I know my data but I don’t want to re-build a query”

“Do we even have that data?”

Different Paradigms

SQL Query: SELECT * FROM Customer WHERE City = 'Palo Alto'

Catalog search: What views/reports do we have about customers?

Keyword search: Do we have data about “Palo Alto”?
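The three paradigms can be contrasted on a toy database. The following is a minimal sketch using Python's built-in sqlite3 module, not Denodo itself: the Customer table comes from the slide, while the sample rows, the customer_by_city view, and the function names are assumptions made for illustration.

```python
import sqlite3

# Toy in-memory database; rows and the view are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Name TEXT, City TEXT);
    INSERT INTO Customer VALUES ('Acme Corp', 'Palo Alto'), ('Globex', 'Madrid');
    CREATE VIEW customer_by_city AS
        SELECT City, COUNT(*) AS n FROM Customer GROUP BY City;
""")

def sql_query(city):
    """SQL paradigm: the user knows the model and writes the query."""
    return conn.execute(
        "SELECT * FROM Customer WHERE City = ?", (city,)
    ).fetchall()

def catalog_search(term):
    """Catalog paradigm: search the metadata (table and view names)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND lower(name) LIKE ? "
        "ORDER BY name",
        (f"%{term.lower()}%",),
    ).fetchall()
    return [r[0] for r in rows]

def keyword_search(keyword):
    """Keyword paradigm: look for a value without knowing where it lives."""
    hits = []
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # Column 1 of PRAGMA table_info is the column name.
        columns = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]
        for col in columns:
            count = conn.execute(
                f'SELECT COUNT(*) FROM "{table}" WHERE "{col}" LIKE ?',
                (f"%{keyword}%",),
            ).fetchone()[0]
            if count:
                hits.append((table, col, count))
    return hits
```

This keyword search scans every column on each call, which is only workable for toy data; a real engine would answer it from a prebuilt index.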

How?: SQL tools


How?: Catalogs


How?: Google (Indexes + Search Engine + Hyperlinks)


How?: Data Virtualization?

• DV is a central repository to access all your data

• DV abstracts the underlying technology of the data sources

• DV enables the definition of a semantic data model

• DV offers a metadata-rich catalog

• DV offers multiple access methods:

  • SQL based

  • Keyword based search (using an index)

  • RESTful navigation (hyperlinks)

VIRTUAL DATA MODEL

Let’s see an example

“Steel prices are on the rise, what is the impact for our organization?”

Architecture

[Architecture diagram. Labels: Financial Web Service, Global Warehouse Status, CRM, Vendor Product, SAP, Bill Of Materials, Warehouses Databases, Purchasing History, Hive, Product Expected Needs, Price]

Benefits

• Multiple sources and technologies accessed from a common place
  • No need to learn new technologies or request multiple credentials

• Catalog with models, descriptions, relationships and data lineage
  • Increases reusability and improves consistency

• Google-like keyword based search
  • Allows end users to search for data without understanding the underlying data model

• Built-in query builder
  • Supports personalization of filters, aggregations, projections, etc.

• Links to related data
  • Enables complex drill-down and jumps across different sources

• Integrated security model
  • Role based security applied to all access methods
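The last benefit, one role-based security model applied to every access method, can be pictured as a single permission check shared by both entry points. The sketch below is an assumption-laden illustration in plain Python: the roles, grants, views, and function names are all hypothetical and do not reflect Denodo's actual security API.

```python
# Hypothetical grants: role -> set of views the role may read.
GRANTS = {
    "analyst": {"customer", "product"},
    "buyer": {"product"},
}

# Hypothetical toy data standing in for virtual views.
VIEWS = {
    "customer": [{"name": "Acme Corp", "city": "Palo Alto"}],
    "product": [{"name": "Steel beam", "vendor": "Acme Corp"}],
}

def allowed(role, view):
    """Single source of truth for permissions."""
    return view in GRANTS.get(role, set())

def query_view(role, view):
    """SQL-style access: fails loudly when the role lacks the grant."""
    if not allowed(role, view):
        raise PermissionError(f"{role} cannot read {view}")
    return VIEWS[view]

def keyword_search(role, keyword):
    """Keyword access: silently skips views the role cannot read."""
    kw = keyword.lower()
    return [
        (view, row)
        for view, rows in VIEWS.items()
        if allowed(role, view)
        for row in rows
        if any(kw in str(value).lower() for value in row.values())
    ]
```

Because both paths call the same allowed() check, a search for "acme" by the buyer role returns only the product row, and a direct query of the customer view by that role is rejected.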


“Citizen Integrators”

Denodo’s Information Self Service tool allows non-technical end users to access data within proper guardrails

• Secured

  • Fine-grained RBAC to protect confidential data

• Data is consistent and trustworthy

  • Encourage use of “certified data models”

• Controlled

  • Denodo Resource Manager used to set plans and rules to protect resources and ensure SLAs

How does it work?

Denodo Scheduler

ARN: Crawling and Indexing

Information Self Service

SQL queries

Search, Browse & Explore

Integrated Security
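The flow in this diagram (a scheduled job crawls the views and builds an index, and the self-service tool answers keyword searches from that index) can be sketched with a small inverted index. This is a generic illustration in plain Python, not the Denodo Scheduler or its index format; the sample views and the crawl and search function names are assumptions.

```python
import re
from collections import defaultdict

# Hypothetical sample views; in the talk these would be virtual views.
SAMPLE_VIEWS = {
    "customer": [
        {"name": "Acme Corp", "city": "Palo Alto"},
        {"name": "Globex", "city": "Madrid"},
    ],
    "bill_of_materials": [
        {"product": "Bridge kit", "material": "Steel"},
    ],
}

def tokenize(text):
    """Lowercase word tokens."""
    return re.findall(r"\w+", str(text).lower())

def crawl(views):
    """Crawling and indexing step: token -> set of (view, row number)."""
    index = defaultdict(set)
    for view, rows in views.items():
        for row_number, row in enumerate(rows):
            for value in row.values():
                for token in tokenize(value):
                    index[token].add((view, row_number))
    return index

def search(index, query):
    """Search step: rows that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    postings = [index.get(token, set()) for token in tokens]
    return sorted(set.intersection(*postings))
```

Running crawl once (for example on a schedule) and then calling search(index, "steel") answers the question from the index alone, without touching the underlying sources at search time.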

Best Practices

• Document your views

  • It really helps end users

• Create associations between views

  • Makes discovery much easier

• Don’t include fact tables in your crawling/indexing

  • They only have numbers and are not useful for keyword-based searches

• Set up proper guardrails

  • Security

  • Resource manager
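The advice to leave fact tables out of crawling can be approximated with a simple rule: index only views that have at least one non-numeric column. The sketch below assumes a hypothetical catalog given as a dictionary of column types; the type names, the catalog contents, and the function names are illustrative and not taken from Denodo.

```python
# Assumed set of numeric type names; adjust to the type system in use.
NUMERIC_TYPES = {"int", "integer", "bigint", "decimal", "float", "double"}

# Hypothetical catalog: view name -> {column name: column type}.
SAMPLE_CATALOG = {
    "customer": {"name": "text", "city": "text"},
    "product": {"id": "int", "description": "text"},
    "sales_fact": {"customer_id": "int", "product_id": "int", "amount": "decimal"},
}

def has_searchable_text(columns):
    """True if at least one column is not numeric."""
    return any(col_type.lower() not in NUMERIC_TYPES
               for col_type in columns.values())

def views_to_crawl(catalog):
    """Names of the views worth indexing for keyword search, sorted."""
    return sorted(view for view, columns in catalog.items()
                  if has_searchable_text(columns))
```

With the sample catalog, views_to_crawl returns the customer and product views and skips sales_fact, which contains only numbers. Treating every non-numeric type (dates included) as searchable is a simplification of this sketch.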


Q&A

Thank you!

© Copyright Denodo Technologies. All rights reserved. Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies.
