deep dive - dataversitycontent.dataversity.net/rs/656-wmw-918/images/arcadiadm... · 2020-03-04 ·...
TRANSCRIPT
DEEP DIVE
Are Data Lakes for Business Users?
www.dmradio.biz
Featured Speakers
“I only believe in science!”
- Numbers don’t lie, but…
- Words are metaphors
- Communication is hard
- Complexity is blinding
‘Data Science’ Probably Isn’t What You Think
‘Science’ is a discipline
The ‘Scientific Method’ is key
It’s a methodology!
Science is not infallible
Scientists change their minds!
Eggs are bad for you? Really?
When Will the Lava Flow Stop?
Credit: https://www.cnn.com/2018/05/07/us/hawaii-volcano-by-the-numbers/index.html
Words to the Wise
• Data will always require analysis to be of value
• Understanding what we know takes time and communication
• Leveraging what we know takes savvy and moxie
• Analytics and Big Data can be very helpful if we understand what we know, and know (roughly) what we’re doing!
© Eckerson Group 2017 Twitter: @weckerson www.eckerson.com
Data Lake Value for Business Users
Wayne W. Eckerson
May 16, 2018
Premise of the Data Lake
Great for data scientists and analysts who want access to raw data
Address limitations of data warehouses
But what about regular users?
Who don’t know SQL, Python, or Java?
Who require clean, curated, aggregated data?
Who need sub-second performance and information tailored to their needs?
Who need a graphical interface to analyze data?
Let’s find out!!
Online Assessment on Data Lake Value
• 22 questions, 5 minutes to complete
• Respondents get custom report (right)
• Runs March to June, 2018
https://eckerson.ratemydata.com/s/data-lake-value-for-business-users
Demographics (as of April 20…)
• 199 started assessment, 162 completed
• 93 have data lake in production
• 74% from North America
• 50% have 10,000+ employees
How Do Most Users Query the Data Lake?
Data Lake Infrastructure
Data Lake Deployment by Age
Can Business Users Explore Data to Get Views They Want?
Our Data Lake Fosters Better Decisions
42%
Our Data Lake Provides Consistent, Fast Performance
Business Users Trust the Accuracy of Analytics in Our Data Lake
Exploration by Company
Conclusions
• Most data lakes run on Hadoop on premises
• Data lakes are not “data swamps”
• Data lakes are not just for data scientists
• Graphical BI tools provide fast performance for queries and exploration
• The quality of data in data lakes is suitable for regular business users
Arcadia Data. Proprietary and Confidential
Giving Business Users Value from Data Lakes
Steve Wooledge
Enterprises Today Need Two Separate BI Standards
Data Warehouse BI Architecture
Diagram labels: BI Server; Analytic Process; Optimize Physical; Semantic Layer; Secure Data; Load Data; Data Warehouse (RDBMS)
Big Data Requirements: Native Connection, Semi-Structured, Parallel, Real-time
Data Lake BI Architecture – The Native BI and Analytics Way
Diagram labels: BI Server; Analytic Process; Optimize Physical; Semantic Layer; Secure Data; Load Data; Data Warehouse (RDBMS); Data Lake (*DFS, Cloud Object Storage)
Big Data Requirements: Native Connection, Semi-Structured, Parallel, Real-time
Arcadia Data was built from inception to run natively within data lakes
The Result: Faster BI Analytics and Higher User Concurrency
Customer Benchmark of a Legacy BI Tool Accelerated by Arcadia Data On a Data Lake
Data Lake BI Architecture – The Native BI Way. Plus There’s More!
Arc Viz
Streams/Topics
Real-Time Data
Data Warehouse (RDBMS)
Data Lake (HDFS, Cloud Object Storage)
Arcadia Data was built from inception to run natively within data lakes
Query acceleration for scale, performance, and concurrency
Smart Acceleration Leverages What Is Learned during Data Discovery
Ad hoc queries
Arcadia Enterprise makes recommendations – build these with a click.
Data Lake Cluster
• Fast query responses
• Minimal modeling
• Live acceleration (no downtime)
All Granular Data
Analytical Views
Accelerated application queries
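The "analytical views" on this slide resemble the general technique of pre-aggregation: a summary is built once from the granular data, and later application queries read the small summary instead of rescanning every row. A minimal sketch of that general idea, assuming hypothetical names (`granular_rows`, `build_analytical_view`, `query_granular`, `query_view`) and made-up in-memory data; it is not Arcadia Enterprise's implementation or API:

```python
from collections import defaultdict

# Hypothetical granular data: (region, product, sales_amount)
granular_rows = [
    ("east", "widget", 10.0),
    ("east", "gadget", 5.0),
    ("west", "widget", 7.5),
    ("east", "widget", 2.5),
]


def build_analytical_view(rows):
    """Pre-aggregate total sales per region in one pass over the granular rows."""
    view = defaultdict(float)
    for region, _product, amount in rows:
        view[region] += amount
    return dict(view)


def query_granular(rows, region):
    """Ad hoc query: scans every granular row on each call."""
    return sum(amount for r, _product, amount in rows if r == region)


def query_view(view, region):
    """Accelerated query: a single lookup in the pre-aggregated view."""
    return view.get(region, 0.0)


analytical_view = build_analytical_view(granular_rows)
```

Both query paths return the same totals; the difference is that `query_view` does constant work per call, while `query_granular` grows with the number of rows. The trade-off in any such scheme is that the summary has to be rebuilt or updated when the granular data changes.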
BI Native to Data Lakes Provides Faster Time to Value
Visual Analytics and BI Native to Data Lakes
Data Warehouse or Data Lake
Traditional BI Server or Middleware Cubes
One security model
No movement of data
AI-driven performance modeling
Time to Insight/Value in Days
Time to Value Delayed Weeks
Time to Insight/Value in Weeks or Months
Extract and Secure
Land / Secure Data
Build Semantic Layer
Analytical Discovery
AI-driven Performance Modeling
Production
Land / Secure Data
Performance Modeling – Cubes / Aggregates
Analytical Discovery
Production
Transform 3NF or Star Schema
Build Semantic Layer
Performance Modeling (both places)
Data Movement
Demo: See It in Action
Social media: @arcadiadata
arcadiadata.com
Thank You
Self Assessment: Data Lake Value for Business Users
http://bit.ly/RateMyData
See Arcadia Enterprise in Action
https://www.youtube.com/watch?v=APPpgGNP5Gs
Download Arcadia Instant
arcadiadata.com/instant