webinar 1 ppt
TRANSCRIPT
![Page 1: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/1.jpg)
Enterprise Data Warehouse Optimization with Hadoop Big Data
© 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555
@Pentaho #BigDataWebSeries
![Page 2: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/2.jpg)
Your Hosts Today
Dave Henry SVP Enterprise Solutions
Davy Nys VP EMEA & APAC
![Page 3: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/3.jpg)
Source/copyright: The Human Face of Big Data
![Page 4: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/4.jpg)
Pentaho Webinar Series
Sign-up at: pentaho.com
![Page 5: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/5.jpg)
Goals for Today
To understand:
• Challenges with the current EDW architecture
• Trends and shifts in data processing
• How Hadoop can help
• How to leverage Hadoop with Pentaho Visual MapReduce
![Page 6: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/6.jpg)
Complete Analytics and Visual Data Management
Hadoop
NoSQL Databases
Data Discovery & Visualization
Enterprise & Ad Hoc Reporting
Predictive Analytics & Machine Learning
Data Ingestion, Manipulation & Integration
Analytic Databases
![Page 7: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/7.jpg)
Traditional Data Warehouse Architecture
Source data acquisition / Ingestion Initial consolidation as required
Cleansing Transformation Change Data Capture Data Warehouse Management
Extract, Transform, Load
Dashboard
Report
Analyzer
Structured Data
Unstructured Data
Data Mart(s) / Warehouse
Metadata
![Page 8: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/8.jpg)
Trends with Data Processing
Data Load
• Data volumes from existing sources are steadily increasing
• Requirement to make data available for longer periods of time (3 years -> 30 years)
• New sources of data are desired for analysis – machine-generated or external/3rd-party data
“ELTL” Approach to Data Load
• Extract data from source systems
• Load it (in its raw form) into the EDW
• Transform it via SQL, creating new tables
• Load the new tables into the “official” data warehouse
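A minimal sketch of the four ELTL steps, assuming an in-memory SQLite database as a stand-in for the EDW. The table names (`raw_calls`, `calls_transformed`, `dw_calls`), the column layout and the sample rows are hypothetical and are not taken from the webinar. The point it illustrates is that the raw load, the SQL transform and the final load all run inside the warehouse and consume its capacity.

```python
import sqlite3


def run_eltl(source_rows):
    """Run Extract, Load, Transform, Load against a stand-in warehouse."""
    edw = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the EDW

    # Extract + Load: land the source rows, in raw form, in a staging table.
    edw.execute("CREATE TABLE raw_calls (call_ts TEXT, source_phone TEXT)")
    edw.executemany("INSERT INTO raw_calls VALUES (?, ?)", source_rows)

    # Transform: SQL inside the warehouse derives a new table from the raw one.
    edw.execute(
        """
        CREATE TABLE calls_transformed AS
        SELECT date(call_ts) AS call_date,
               substr(source_phone, 1, 3) AS area_code
        FROM raw_calls
        """
    )

    # Load: copy the new table into the "official" warehouse table.
    edw.execute("CREATE TABLE dw_calls (call_date TEXT, area_code TEXT)")
    edw.execute(
        "INSERT INTO dw_calls SELECT call_date, area_code FROM calls_transformed"
    )
    edw.commit()
    return edw.execute(
        "SELECT call_date, area_code FROM dw_calls ORDER BY call_date"
    ).fetchall()


# Hypothetical sample rows: (timestamp, 10-digit source phone number).
SAMPLE_ROWS = [
    ("2013-03-01 18:05:00", "4155550100"),
    ("2013-03-02 09:30:00", "2125550199"),
]

if __name__ == "__main__":
    for row in run_eltl(SAMPLE_ROWS):
        print(row)
```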
![Page 9: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/9.jpg)
Challenges with Traditional Approaches
EDW can’t handle increasing data and workloads, so companies must:
• Reduce the volume of data
• Restrict end-user access (# of users or access windows) to accommodate longer batch processing windows
• Purchase additional capacity (hardware / licenses), which can be as much as $100K / TB
Then, companies are faced with the following challenges:
• The compromise itself
• The incremental outlay of capital required to expand the EDW or purchase more proprietary ETL tool capacity
• The inability of the incumbent ETL vendor to work with Hadoop
![Page 10: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/10.jpg)
Solution Architecture with Hadoop
Data Integration Source data acquisition / Ingestion Initial consolidation as required
ETL ETL Metadata
Dashboard
Report
Analyzer
Structured Data
Unstructured Data
Data Integration Cleansing Transformation Change Data Capture Data Warehouse Management
Data Mart(s) / Warehouse
![Page 11: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/11.jpg)
Core Benefits
1. Improve performance – Meet critical data processing SLAs
2. Retain all data for analysis
3. Lower costs of data management, growth
4. Extend existing EDW capacity – Increase ROI from current investments
Costs
Time
Flexibility
![Page 12: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/12.jpg)
Challenges with Hadoop: Scripting and Coding
Costs
Time
Flexibility
![Page 13: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/13.jpg)
Pentaho: Quickest, Most Complete Solution for Big Data
Design, develop and deploy 15x faster:
• Full continuity from data access to decisions – complete data integration & business analytics platform for any big data store
• Faster development, faster runtime – visual development, distributed execution
• Instant and interactive analysis – no coding, no ETL required
![Page 14: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/14.jpg)
Solution Architecture & Demo
![Page 15: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/15.jpg)
Data Warehouse Optimization
Data Sources Big Data Architecture
Data Warehouse (Master & Transactional Data)
ERP
CRM
CDR
Analytic Data Mart(s)
Analytic Data Mart(s)
Analytic Data Mart(s)
Logs
Other Data
Raw Data
Parsed Data
Analytic Datasets
Master Data
Tape Archive
![Page 16: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/16.jpg)
ORCHESTRATE
ERP DW
Processing
CRM
Pig, Oozie, Flume, Hive, HBase, Sqoop
Raw Data
Parsed Data
Analytic Datasets
Pentaho for Hadoop – Data Integration + Analytics
Master Data
Analysis & Reporting
ANALYZE
VISUAL MAP REDUCE
Data Integration Analytics
INGEST
Ingestion
Structured Data
Unstructured Data
![Page 17: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/17.jpg)
Example – Call Record Processing
• What are the top 10 states for outbound calls on Fridays, Saturdays and Sundays?
• Data available:
  – Call records: date/timestamp & source phone #
  – Reference data: area code by country, state & time zone (North American Numbering Plan)
• Goal:
  – Parse, enrich and filter the data
  – Load the data into Postgres for analysis
• Challenge:
  – Prepare the data without impacting the EDW (no ELT)
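Once the prepared data has been loaded, the question on this slide reduces to a single aggregate query. The sketch below is only illustrative: it uses an in-memory SQLite database as a stand-in for Postgres, and the table `weekend_calls`, its columns and the sample counts are hypothetical values chosen for the example, not figures from the webinar.

```python
import sqlite3

# Top 10 states by total weekend call count; the state name breaks ties.
TOP_STATES_SQL = """
    SELECT state, SUM(call_count) AS calls
    FROM weekend_calls
    GROUP BY state
    ORDER BY calls DESC, state
    LIMIT 10
"""


def top_states(rows):
    """Load (state, day_of_week, call_count) rows and return the top 10 states."""
    db = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Postgres
    db.execute(
        "CREATE TABLE weekend_calls (state TEXT, day_of_week TEXT, call_count INTEGER)"
    )
    db.executemany("INSERT INTO weekend_calls VALUES (?, ?, ?)", rows)
    return db.execute(TOP_STATES_SQL).fetchall()


# Made-up sample rows, for illustration only.
SAMPLE_ROWS = [
    ("CA", "Friday", 120),
    ("CA", "Saturday", 80),
    ("NY", "Friday", 150),
    ("TX", "Sunday", 90),
]

if __name__ == "__main__":
    for state, calls in top_states(SAMPLE_ROWS):
        print(state, calls)
```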
![Page 18: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/18.jpg)
Raw Data
Hadoop Data Processing Scenario
Master Data
Ingestion Structured Data
Unstructured Data
INGEST
![Page 19: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/19.jpg)
Processing
Raw Data
Parsed Data
Analytic Datasets
Visual MapReduce
Master Data
VISUAL MAP REDUCE
1. MapReduce Input – calling data
2. Calculate Month, Day, Day of Week
3. Extract 3-digit area code
4. Lookup geo master data in HDFS
5. Filter for weekend and US-only calls
6. Create “Value” field for Key-Value Pair
7. Create “Key” field for Key-Value Pair
8. MapReduce Output – Key-Value Pair
Java Programming
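The eight steps above can be sketched in plain Python to show the logic that the visual transformation replaces. This is not Pentaho's implementation. Several details are assumptions made for illustration: the input layout `timestamp,source_phone`, the small `GEO_MASTER` dictionary standing in for the geo master data in HDFS, the choice of state as the key and 1 as the value, and the reading of "weekend" as Friday, Saturday and Sunday (taken from the call record example).

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical stand-in for the geo master data: area code -> (country, state).
GEO_MASTER = {
    "415": ("US", "CA"),
    "212": ("US", "NY"),
    "416": ("CA", "ON"),
}
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")
WEEKEND_DAYS = {"Friday", "Saturday", "Sunday"}


def map_call(line):
    """Return a (key, value) pair for a qualifying call record, else None."""
    # Step 1: input record, assumed to be "timestamp,source_phone".
    timestamp, phone = line.strip().split(",")
    # Step 2: derive month, day and day of week (month and day are computed
    # as on the slide but are not used further in this sketch).
    ts = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    month, day, day_of_week = ts.month, ts.day, DAY_NAMES[ts.weekday()]
    # Step 3: the first three digits of a 10-digit number are the area code.
    area_code = phone[:3]
    # Step 4: look up the geo master data.
    geo = GEO_MASTER.get(area_code)
    # Step 5: keep only US calls placed on the weekend days.
    if geo is None or geo[0] != "US" or day_of_week not in WEEKEND_DAYS:
        return None
    # Steps 6-8: build the value and key fields and emit the pair.
    value = 1
    key = geo[1]
    return key, value


def reduce_counts(pairs):
    """Sum the values for each key, as a reducer would."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(lines):
    """Apply the mapper to every line, drop filtered records, then reduce."""
    pairs = (map_call(line) for line in lines)
    return reduce_counts(pair for pair in pairs if pair is not None)


# Hypothetical sample input.
SAMPLE_LINES = [
    "2013-03-01 18:05:00,4155550100",  # Friday, US (CA)
    "2013-03-02 09:30:00,2125550199",  # Saturday, US (NY)
    "2013-03-03 11:00:00,4155550123",  # Sunday, US (CA)
    "2013-03-04 08:15:00,4155550100",  # Monday, filtered out
    "2013-03-02 12:00:00,4165550111",  # Saturday, not US, filtered out
]

if __name__ == "__main__":
    print(run_job(SAMPLE_LINES))
```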
![Page 20: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/20.jpg)
Solution Architecture & Demo
End of Demo
![Page 21: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/21.jpg)
Leveraging Hadoop with Pentaho
• OEM – Flexibility, Extensibility, Architected to Embed
• Pricing – One of top reasons customers choose us
• Community/Open Source Cache – Similar to Hadoop
• Data Management Platform – Visual Map Reduce, Orchestration, Connectivity
  – Fusion of all data sources & processing
  – Control/Manage/Optimize flow of data
• Hybrid – Leverages non-Hadoop infrastructure
![Page 22: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/22.jpg)
Overall Benefits
Business Benefits
• You can defer upgrades to expensive EDW hardware
• You can offload batch processing from the EDW and make it more available to end-users (improve performance / comply with SLAs)
• With better performance you may need smaller cluster sizes
• This is a low-risk use case that lets you get familiar with Hadoop while creating business value
• It’s easy to evaluate – you don’t need to modify your cluster and risk disrupting the configuration
Technical Benefits
You should keep your EDW, but use Hadoop and Pentaho to optimize data processing
![Page 23: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/23.jpg)
Solution Architecture & Demo
Q & A
![Page 24: Webinar 1 PPT](https://reader038.vdocuments.us/reader038/viewer/2022102821/577cce5d1a28ab9e788ddd6f/html5/thumbnails/24.jpg)
Contact Us or Sign-up at: pentaho.com