Denodo DataFest 2016: Big Data Virtualization in the Cloud
TRANSCRIPT
OCTOBER 18, 2016 SAN FRANCISCO BAY AREA, CA
#DenodoDataFest
RAPID, AGILE DATA STRATEGIES
For Accelerating Analytics, Cloud, and Big Data Initiatives.
BIG DATA VIRTUALIZATION IN THE CLOUD
Avinash Deshpande
Principal, Big Data and Advanced Analytics
THE LOGITECH STORY
Logitech designs products that have an everyday place in people's lives, connecting them to the digital experiences they care about. Over 30 years ago, Logitech started connecting people through computers; now it’s designing products that bring people together through music, gaming, video and computing.
In 1981, Logitech was founded in the village of Apples, Switzerland. The start-up was based in a farm building – the Swiss equivalent of a Silicon Valley garage. Shortly after, another office was opened in the U.S. at 165 University Avenue, Palo Alto. This address has become famous over the years as a lucky one for start-ups. It’s where Logitech started, as well as Danger, Inc., PayPal and Google.
At the heart of Logitech’s success lies its ability to design product experiences that tap into genuine consumer needs. Under a number of different brands, the company offers PC peripherals; cases and keyboards for tablets; equipment for gamers; mobile speakers and earphones for music and sports enthusiasts; devices to make video collaboration simple in the workplace; and entertainment and control products for the home.
LOGITECH DATA USE CASES
Structured, Semi-Structured, Unstructured
Data Velocity: Batch, Real-Time
Social Media Sentiment Analysis
Predictive Analytics
Demand Forecasting
Price violations on Retail sites
Data Warehousing
Text Mining
Security Video Analysis
Retail Data Scraping
Machine Learning
IoT
Multi-site ERP
Marketing Funnel
Sales Channel Mgmt
Smart Home
JOURNEY TO CLOUD
Cloud empowers IT organizations to redefine the way data services are produced and delivered
Scalable
• Infrastructure scaled up and down on the fly (elastic)
• Focus on simplicity, security, robustness, and scalability
Efficient
• Infrastructure costs are pay-per-use
Reliable
• AWS managed services
Managed & Governed
• Transparency on usage patterns
• Breadth of services offered, pricing, performance and flexibility
NEED FOR DATA VIRTUALIZATION
Abstract access to disparate data sources
A single semantic repository
Optimized data availability in real-time to consumers
Centralized, governed and secured data layer
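The idea of one abstract access layer over disparate sources can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `VirtualLayer` class, the `orders` and `products` sources, and the sample rows are hypothetical and are not part of Denodo or of the Logitech deployment.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): a relational database holding orders.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (order_id INTEGER, sku TEXT, qty INTEGER)")
db.executemany("INSERT INTO orders VALUES (?, ?, ?)",
               [(1, "M-100", 2), (2, "K-200", 1), (3, "M-100", 5)])

# Source 2 (hypothetical): a flat file holding product names.
PRODUCT_CSV = "sku,name\nM-100,Mouse\nK-200,Keyboard\n"


class VirtualLayer:
    """Maps logical view names to functions that fetch rows on demand."""

    def __init__(self):
        self._views = {}

    def register(self, name, fetch):
        self._views[name] = fetch

    def rows(self, name):
        # Data is read from the source at query time; nothing is copied ahead.
        return list(self._views[name]())


def fetch_orders():
    cur = db.execute("SELECT order_id, sku, qty FROM orders ORDER BY order_id")
    return ({"order_id": o, "sku": s, "qty": q} for o, s, q in cur)


def fetch_products():
    return csv.DictReader(io.StringIO(PRODUCT_CSV))


layer = VirtualLayer()
layer.register("orders", fetch_orders)
layer.register("products", fetch_products)


def fetch_order_details():
    # A derived view defined once in the layer and reused by every consumer.
    names = {p["sku"]: p["name"] for p in layer.rows("products")}
    return ({**o, "name": names[o["sku"]]} for o in layer.rows("orders"))


layer.register("order_details", fetch_order_details)
details = layer.rows("order_details")
```

Consumers only see view names such as `order_details`; whether a row came from a database or a file stays hidden behind the layer.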
DATA VIRTUALIZATION OVER DATA FEDERATION
• Federated Approach
o Queries are sent to the data sources with little knowledge of the overall query or of the cost of its individual parts.
o Each underlying data source performs its portion of the workload as best it can and returns the results.
o The various parts are combined and additional post-processing is performed if necessary, for example to sort the combined result set.
• DV / Denodo Approach
o Denodo tools consider the cost of each part of the query, evaluate the trade-offs, and decide on the best way to execute the SQL.
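The contrast between plain federation and cost-aware planning can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Denodo's optimizer: the two in-memory SQLite databases stand in for remote sources, the `federated` and `cost_based` functions are hypothetical, and the "cost" is simply the number of rows moved out of the sources.

```python
import sqlite3

# Two hypothetical sources, each an independent in-memory database.
stores = sqlite3.connect(":memory:")
stores.execute("CREATE TABLE stores (store_id INTEGER, region TEXT)")
stores.executemany("INSERT INTO stores VALUES (?, ?)",
                   [(i, "West" if i < 2 else "East") for i in range(10)])

sales = sqlite3.connect(":memory:")
sales.execute("CREATE TABLE sales (store_id INTEGER, amount INTEGER)")
sales.executemany("INSERT INTO sales VALUES (?, ?)",
                  [(i % 10, 1) for i in range(1000)])


def federated(region):
    """Pull every row from both sources, then join and filter centrally."""
    store_rows = stores.execute("SELECT store_id, region FROM stores").fetchall()
    sale_rows = sales.execute("SELECT store_id, amount FROM sales").fetchall()
    wanted = {sid for sid, reg in store_rows if reg == region}
    total = sum(amt for sid, amt in sale_rows if sid in wanted)
    return total, len(store_rows) + len(sale_rows)


def cost_based(region):
    """Use row-count statistics to pick a plan, then push work to the sources."""
    n_stores = stores.execute("SELECT COUNT(*) FROM stores").fetchone()[0]
    n_sales = sales.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
    if n_stores >= n_sales:
        # Filtering the stores first would not reduce the transfer; fall back.
        return federated(region)
    # Cheaper plan: filter the small table at its source, then send the
    # matching keys to the large source and let it filter and aggregate.
    ids = [r[0] for r in stores.execute(
        "SELECT store_id FROM stores WHERE region = ?", (region,))]
    if not ids:
        return 0, 0
    marks = ", ".join("?" * len(ids))
    total = sales.execute(
        f"SELECT COALESCE(SUM(amount), 0) FROM sales WHERE store_id IN ({marks})",
        ids).fetchone()[0]
    return total, len(ids) + 1  # keys fetched plus one aggregated row


naive_total, naive_moved = federated("West")
smart_total, smart_moved = cost_based("West")
```

Both plans return the same total, but the cost-aware plan moves only the matching store keys and one aggregated row instead of every row from both sources.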
REFERENCE ARCHITECTURE
Metadata Management, Data Governance, Data Security
Cost and Usage Pattern
Data Sources: Sensor Data, Machine Data Logs, Social Data, Clickstream Data, Internet Data, Image and Video, Cloud Applications, Enterprise Applications
Data Collection: Real-Time Data Access (On-Demand / Streaming), CDC, ETL
EDW, ODS, Cloud DW, NoSQL, Data Warehouse, File Storage (S3), Batch DW, Spark SQL, NoSQL, Search, Big Data, In-Memory, Analytical Appliances
Data Virtualization
Data Insights: Self-Service / Data Discovery, Reporting, Predictive Analytics, Statistical Analytics, Sentiment Analytics, Text Analytics, Data Mining, Real-Time Decision Support, Alerts, Scorecards/Dashboards
SOLUTION ARCHITECTURE
Amazon Web Services
AWS Glacier, AWS S3, AWS Redshift
Pentaho DI
Pentaho Operations Mart
CloudWatch, SNS, IAM, CloudTrail, EMR, Spark, Python / R
AWS RDS
Denodo Data Virtualization
Tableau, Pentaho BA, Data Interfaces, Web Services, OBIEE, Cubes
VIRTUALIZATION BENEFITS
• Logical model can be predefined for the data
• Eliminates load processes and the need to update the data
• Uses the security and governance system already in place
• Collects and maintains statistics and determines optimal query execution
• Uses caching and query pushdown for optimal performance
• Array of connection options from structured to unstructured data
• Business layer that enables data consistency through a single object serving multiple consumers
• Rapid prototyping
• Data audits
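The caching benefit listed for virtualization can be sketched as a view that answers from memory while its data is fresh and goes back to the source only after a time-to-live expires. This is a minimal illustration under assumptions: the `CachedView` class, the 60-second TTL, and the sample row are hypothetical, and Denodo's own cache is configured differently.

```python
import time


class CachedView:
    """Serves a view from cache while the entry is fresh, else hits the source."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._rows = None
        self._loaded_at = None
        self.source_hits = 0

    def rows(self):
        now = self._clock()
        expired = self._loaded_at is None or now - self._loaded_at >= self._ttl
        if expired:
            self._rows = list(self._fetch())
            self._loaded_at = now
            self.source_hits += 1
        return self._rows


# Hypothetical source and a fake clock so the example is deterministic.
fake_now = [0.0]
view = CachedView(lambda: [("M-100", 7)], ttl_seconds=60, clock=lambda: fake_now[0])

view.rows()          # first call loads from the source
view.rows()          # served from cache
fake_now[0] = 61.0   # advance past the TTL
view.rows()          # entry expired, so the source is queried again
```

The `source_hits` counter shows the trade-off: a longer TTL means fewer trips to the source but staler data for consumers.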
DENODO INFORMATION SELF SERVICE
• Catalog exploration
o Graphical representation of data model
o Data lineage
o Integrated catalog search
• Data Discovery
o User-friendly query wizards with drill-down capabilities
o Export to CSV, Excel and Tableau Data Extracts
• Secure
o Leverages Denodo’s security model and access control
o Available via SSL/TLS
10/16/2016 LOGITECH CONFIDENTIAL: NOT FOR DISTRIBUTION
DATA VIRTUALIZATION – NIRVANA
CLOUD AND DV BENEFITS
• Proactive – IT has embraced cloud as a model for achieving innovation through increased efficiency, reliability and agility
• Reusability and template development
• Rapid innovation within governance structure, balanced costs, risks and service levels
• Greater efficiency and reliability, enabling broader audience to consume IT services via self-service
LESSONS LEARNT
Pros
• Reduced spend
• Live migration
• Flexible and cost effective
• Better business continuity
• Speed to deliver
• Easier to manage
• More efficient IT operations
Cons
• Upfront hardware costs
• Software license costs
• Possible learning curve
• Accountability
• Getting all vendors to gel well
Panel
MODERATED BY:
Avinash Deshpande
Principal, Big Data and Advanced Analytics, Logitech
Kurt Jackson
Platform Lead, Autodesk
Dan Young
Chief Data Architect, Indiana University
Paul Moxon
Head of Product Management, Denodo