Denodo DataFest 2016: Big Data Virtualization in the Cloud

OCTOBER 18, 2016, SAN FRANCISCO BAY AREA, CA #DenodoDataFest RAPID, AGILE DATA STRATEGIES For Accelerating Analytics, Cloud, and Big Data Initiatives.

Upload: denodo

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Denodo DataFest 2016: Big Data Virtualization in the Cloud

OCTOBER 18, 2016, SAN FRANCISCO BAY AREA, CA

#DenodoDataFest

RAPID, AGILE DATA STRATEGIES for Accelerating Analytics, Cloud, and Big Data Initiatives.

Page 2: Denodo DataFest 2016: Big Data Virtualization in the Cloud

BIG DATA

VIRTUALIZATION

IN THE CLOUD

Avinash Deshpande

Principal, Big Data and Advanced Analytics


Page 3: Denodo DataFest 2016: Big Data Virtualization in the Cloud

THE LOGITECH STORY

Logitech designs products that have an everyday place in people's lives, connecting them to the digital experiences they care about. Over 30 years ago, Logitech started connecting people through computers; now it’s designing products that bring people together through music, gaming, video and computing.

In 1981, Logitech was founded in the village of Apples, Switzerland. The start-up was based in a farm building – the Swiss equivalent of a Silicon Valley garage. Shortly after, another office was opened in the U.S. at 165 University Avenue, Palo Alto. This address has become famous over the years as a lucky one for start-ups. It’s where Logitech started, as well as Danger, Inc., PayPal and Google.

At the heart of Logitech’s success lies its ability to design product experiences that tap into genuine consumer needs. Under a number of different brands, the company offers PC peripherals; cases and keyboards for tablets; equipment for gamers; mobile speakers and earphones for music and sports enthusiasts; devices to make video collaboration simple in the workplace; and entertainment and control products for the home.

Page 4: Denodo DataFest 2016: Big Data Virtualization in the Cloud
Page 5: Denodo DataFest 2016: Big Data Virtualization in the Cloud

LOGITECH DATA USE CASES

[Chart: use cases plotted by data type (Structured, Semi-Structured, Unstructured) against data velocity (Batch to Real-Time)]

Use cases shown:

• Social Media Sentiment Analysis

• Predictive Analytics

• Demand Forecasting

• Price Violations on Retail Sites

• Data Warehousing

• Text Mining

• Security Video Analysis

• Retail Data Scraping

• Machine Learning

• IoT

• Multi-Site ERP

• Marketing Funnel

• Sales Channel Mgmt

• Smart Home

Page 6: Denodo DataFest 2016: Big Data Virtualization in the Cloud

JOURNEY TO CLOUD

Cloud empowers IT organizations to redefine the way data services are produced and delivered.

Scalable

• Infrastructure scales up and down on the fly (elastic)

• Focus on simplicity, security, robustness, and scalability

Efficient

• Infrastructure costs are pay-as-you-use

Reliable

• AWS managed services

Managed & Governed

• Transparency on usage patterns

• Breadth of services offered, pricing, performance and flexibility

Page 7: Denodo DataFest 2016: Big Data Virtualization in the Cloud

NEED FOR DATA VIRTUALIZATION

Abstract access to disparate data sources

A single semantic repository

Optimized data availability in real-time to consumers

Centralized, governed and secured data layer

Page 8: Denodo DataFest 2016: Big Data Virtualization in the Cloud

DATA VIRTUALIZATION OVER DATA FEDERATION

• Federated Approach

o Queries are sent to the data sources without much intelligence about the overall query or the cost of the individual parts of the federated query.

o Each underlying data source performs its portion of the workload as best it can and returns the results.

o The various parts are combined and additional post-processing is performed if necessary, for example to sort the combined result set.

• DV / Denodo Approach

o Denodo considers the cost of each part of the query, evaluates the trade-offs, and decides on the best way to execute the SQL.
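
The difference between the two approaches can be illustrated with a minimal Python sketch. It is not Denodo's optimizer: the `Source` class, the cost formulas, the `key_cost` parameter and the plan names are all hypothetical, chosen only to show what "considering the cost of each part of the query" might look like for a two-source join.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A hypothetical data source with made-up statistics."""
    name: str
    row_count: int         # estimated rows the query touches in this source
    transfer_cost: float   # assumed cost of moving one row to the DV layer


def pull_both_cost(left: Source, right: Source) -> float:
    """Federated style: fetch both inputs in full and join centrally."""
    return (left.row_count * left.transfer_cost
            + right.row_count * right.transfer_cost)


def push_keys_cost(small: Source, large: Source,
                   selectivity: float, key_cost: float) -> float:
    """Fetch the small side, ship its join keys to the large source,
    and bring back only the matching fraction of the large side."""
    fetch_small = small.row_count * small.transfer_cost
    ship_keys = small.row_count * key_cost
    fetch_matches = large.row_count * selectivity * large.transfer_cost
    return fetch_small + ship_keys + fetch_matches


def choose_plan(left: Source, right: Source,
                selectivity: float, key_cost: float = 0.5):
    """Cost every candidate plan and return the cheapest (name, cost)."""
    small, large = sorted((left, right), key=lambda s: s.row_count)
    candidates = {
        "pull-both-and-join": pull_both_cost(left, right),
        "push-keys-to-larger-source": push_keys_cost(
            small, large, selectivity, key_cost),
    }
    best = min(candidates, key=candidates.get)
    return best, candidates[best]


if __name__ == "__main__":
    crm = Source("crm", 1_000, 1.0)
    warehouse = Source("warehouse", 10_000_000, 1.0)
    print(choose_plan(crm, warehouse, selectivity=0.001))
```

With a small source joined to a very large one, the sketch prefers shipping keys to the large source; with two small sources and a non-selective join, pulling both sides is cheaper. A plain federated engine, as described on the slide, would always take the first plan.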

Page 9: Denodo DataFest 2016: Big Data Virtualization in the Cloud

REFERENCE ARCHITECTURE

[Architecture diagram; the labels below are recovered from the slide, and the grouping of the middle layers is approximate]

Cross-cutting: Metadata Management, Data Governance, Data Security; Cost and Usage Pattern

Data Sources: Sensor Data, Machine Data Logs, Social Data, Clickstream Data, Internet Data, Image and Video, Cloud Applications, Enterprise Applications

Data Collection: Real-Time Data Access (On-Demand / Streaming), CDC, ETL

Storage and processing: EDW, ODS, Cloud DW, NoSQL, Data Warehouse, File Storage (S3), Batch DW, Spark SQL, Search, Big Data, In-Memory, Analytical Appliances

Data Virtualization

Data Insights: Self-Service / Data Discovery, Reporting, Predictive Analytics, Statistical Analytics, Sentimental Analytics, Text Analytics, Data Mining, Real-Time Decision Support, Alerts, Scorecards/Dashboards

Page 10: Denodo DataFest 2016: Big Data Virtualization in the Cloud

SOLUTION ARCHITECTURE

[Architecture diagram; labels recovered from the slide]

Amazon Web Services: AWS Glacier, AWS S3, AWS Redshift, AWS RDS, EMR, Spark, Python / R, CloudWatch, SNS, IAM, CloudTrail

Pentaho DI, Pentaho Operations Mart

Denodo Data Virtualization

Tableau, Pentaho BA, Data Interfaces, Web Services, OBIEE, Cubes

Page 11: Denodo DataFest 2016: Big Data Virtualization in the Cloud

VIRTUALIZATION BENEFITS

• Logical model can be predefined for the data

• Eliminates load processes and the need to update the data

• Uses the security and governance system already in place

• Collects and maintains statistics and determines optimal query execution

• Uses caching and query pushdown for optimal performance

• Array of connection options, from structured to unstructured data

• Business layer that enables data consistency through a single object serving multiple consumers

• Rapid prototyping

• Data audits
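
As an illustration of the cache mechanism mentioned above, here is a minimal Python sketch of a time-limited result cache sitting in front of a slow source. It does not reflect how Denodo implements caching; the `CachedView` class, its `ttl_seconds` parameter and the `fetch` callback are hypothetical names used only for this example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachedView:
    """Hypothetical virtual view that caches query results for a fixed time."""

    def __init__(self, fetch: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # function that runs the query at the source
        self._ttl = ttl_seconds      # how long a cached result stays valid
        self._clock = clock          # injectable clock, handy for testing
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.source_hits = 0         # number of times the source was queried

    def query(self, sql: str) -> Any:
        now = self._clock()
        entry = self._entries.get(sql)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # fresh cached result, source untouched
        self.source_hits += 1
        result = self._fetch(sql)    # cache miss or expired entry
        self._entries[sql] = (now, result)
        return result


if __name__ == "__main__":
    view = CachedView(lambda sql: f"rows for: {sql}", ttl_seconds=60)
    view.query("SELECT region, SUM(sales) FROM orders GROUP BY region")
    view.query("SELECT region, SUM(sales) FROM orders GROUP BY region")
    print(view.source_hits)
```

Several consumers issuing the same query within the time limit share one round trip to the source, which is the effect the "single object, multiple consumers" bullet relies on.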

Page 12: Denodo DataFest 2016: Big Data Virtualization in the Cloud

DENODO INFORMATION SELF SERVICE

• Catalog exploration

o Graphical representation of data model

o Data lineage

o Integrated catalog search

• Data Discovery

o User-friendly query wizards with drill-down capabilities

o Export to CSV, Excel and Tableau Data Extracts

• Secure

o Leverages Denodo’s security model and access control

o Available via SSL/TLS

Page 13: Denodo DataFest 2016: Big Data Virtualization in the Cloud


DATA VIRTUALIZATION – NIRVANA

Page 14: Denodo DataFest 2016: Big Data Virtualization in the Cloud

CLOUD AND DV BENEFITS

• Proactive – IT has embraced cloud as a model for achieving innovation through increased efficiency, reliability and agility

• Reusability and template development

• Rapid innovation within governance structure, balanced costs, risks and service levels

• Greater efficiency and reliability, enabling broader audience to consume IT services via self-service

Page 15: Denodo DataFest 2016: Big Data Virtualization in the Cloud

LESSONS LEARNT

Pros

• Reduced spend

• Live migration

• Flexible and cost effective

• Better business continuity

• Speed to deliver

• Easier to manage

• More efficient IT operations

Cons

• Upfront hardware costs

• Software license costs

• Possible learning curve

• Accountability

• Getting all vendors to gel well

Page 16: Denodo DataFest 2016: Big Data Virtualization in the Cloud
Page 17: Denodo DataFest 2016: Big Data Virtualization in the Cloud

Panel

MODERATED BY:

Avinash Deshpande

Principal, Big Data and Advanced Analytics, Logitech

Kurt Jackson

Platform Lead, Autodesk

Dan Young

Chief Data Architect, Indiana University

Paul Moxon

Head of Product Management, Denodo