Complement Your Existing Data Warehouse with Big Data & Hadoop

© 2013 Datameer, Inc. All rights reserved.

Upload: datameer

Post on 06-May-2015

DESCRIPTION

To view the full webinar, please go to: http://info.datameer.com/Slideshare-Complement-Your-Existing-EDW-with-Hadoop-OnDemand.html

With 40% yearly growth in data volumes, traditional data warehouses have become increasingly expensive and challenging. Many of today's new data sources are unstructured, making the structured data warehouse an unsuitable platform for analysis. As a result, organizations now look at Hadoop as a data platform to complement existing BI data warehouses, and as a scalable, flexible, and cost-effective solution for data storage and analysis. Join Datameer and Cloudera in this webinar to discuss how Hadoop and big data analytics can help to:

- Get all the data your business needs quickly into one environment
- Shorten the time to insight from months to days
- Extend the life of your existing data warehouse investments
- Enable your business analysts to ask and answer bigger questions

TRANSCRIPT

Page 1: Complement Your Existing Data Warehouse with Big Data & Hadoop


Page 2: Complement Your Existing Data Warehouse with Big Data & Hadoop

View Recording

▪ You can view the recording of this webinar at:
▪ http://info.datameer.com/Slideshare-Complement-Your-Existing-EDW-with-Hadoop-OnDemand.html

Page 3: Complement Your Existing Data Warehouse with Big Data & Hadoop

About our Speakers

Karen Hsu

–  Karen is Senior Director, Product Marketing at Datameer. With over 15 years of experience in enterprise software, she has co-authored 4 patents and worked in a variety of engineering, marketing, and sales roles.

–  Most recently she came from Informatica, where she worked with the start-ups Informatica purchased to bring data quality, master data management, B2B, and data security solutions to market.

–  Karen has a Bachelor of Science degree in Management Science and Engineering from Stanford University.

Page 4: Complement Your Existing Data Warehouse with Big Data & Hadoop

About our Speakers

Jeff Bean

–  Jeff Bean has been at Cloudera since 2010. He has helped several of Cloudera's most important customers and partners through their adoptions of Hadoop and HBase, including cluster sizing, deployment, operations, application design, and optimization.

–  Jeff has also spent time on Cloudera's training team, where he focused on partner enablement, training hundreds of field personnel in Hadoop, its usage, and its position in the market. Jeff currently does partner engineering at Cloudera, where he handles field support, certifications, and joint engagements with partners such as Datameer.

Page 5: Complement Your Existing Data Warehouse with Big Data & Hadoop

How Big Data Analytics and Hadoop Complement Your Existing Data Warehouse

Jeff Bean, Cloudera
Karen Hsu, Datameer

Page 6: Complement Your Existing Data Warehouse with Big Data & Hadoop

Agenda

•  Why optimize?
•  What to optimize?
•  How to optimize?
•  Who has optimized already?
•  Conclusion

Page 7: Complement Your Existing Data Warehouse with Big Data & Hadoop

Data Has Changed in the Last 30 Years

[Chart: data growth, 1980 to 2013. Labels: end-user applications, the Internet, mobile devices, sophisticated machines. Structured data: 10%; unstructured data: 90%.]

Page 8: Complement Your Existing Data Warehouse with Big Data & Hadoop

EDW Expansion: A Vicious Cycle

▪ Increasing numbers of users
▪ Growing volumes of data
▪ Additional data sources
▪ New use cases
▪ Degraded quality of service and inability to meet SLAs
▪ Constant pressure to purchase additional capacity

[Diagram: Enterprise Data Warehouse]

Page 9: Complement Your Existing Data Warehouse with Big Data & Hadoop

Hadoop vs. Data Warehouse: Freeing up Capacity for High Value Workloads

Today: All growth accommodated by incremental investment in DW

▪ Data Warehouse: 100 TB at $20,000 - $100,000 / TB
▪ 100% data growth: + 100 TB more capacity in Data Warehouse
▪ Incremental Spend: $2 to $10 Million
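
The incremental spend quoted on this slide follows from multiplying the added capacity by the per-terabyte cost. A minimal sketch in Python, assuming the 100 TB of added capacity and the $20,000 to $100,000 per TB range given above; the function name `incremental_spend` is illustrative and does not come from the webinar:

```python
def incremental_spend(added_tb, cost_per_tb_low, cost_per_tb_high):
    """Return the (low, high) spend in dollars for adding `added_tb` terabytes.

    The per-TB costs are inputs; the values used below are the ones
    quoted on the slide for a data warehouse.
    """
    return added_tb * cost_per_tb_low, added_tb * cost_per_tb_high


# 100% growth on a 100 TB warehouse means 100 TB of new capacity.
low, high = incremental_spend(100, 20_000, 100_000)
print(f"${low:,} to ${high:,}")  # prints "$2,000,000 to $10,000,000"
```

The same function can be called with other per-TB rates to compare platforms, provided the rates cover the same cost components.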

Page 10: Complement Your Existing Data Warehouse with Big Data & Hadoop

Hadoop vs. Data Warehouse: Freeing up Capacity for High Value Workloads

Future: Hadoop offloads data and workloads to defer/avoid incremental spend and reduce data management TCO

Keep the Right Data in the Data Warehouse System
• Operational Analytics
• Reporting
• Business Analytics

Use Hadoop for Everything Else
• Historical Data
• Data Processing
• Ad Hoc Exploratory
• Transformation / Batch
• Data Hub

Cloudera / Datameer (Total Cost of Cluster): $1,000 - $2,000 / TB
Incremental Spend: $240,000 - $300,000 ACV
Savings: $1.85 to 9.8 MM

[Diagram labels: Lower Value Data, 50 TB, 100 TB; High Value Data, 100 TB, 50 TB]

Page 11: Complement Your Existing Data Warehouse with Big Data & Hadoop

Agenda

•  Why optimize?
•  What to optimize?
•  How to optimize?
•  Who has optimized already?
•  Conclusion

Page 12: Complement Your Existing Data Warehouse with Big Data & Hadoop

Assessing Workloads and Data

[Diagram: Data Warehouse. Workloads: Operational Business Intelligence, Analytics, Self-Service BI, Data Processing (ELT). Data: Staged Data, Operational Data, Archival Data.]

▪ Data Processing (ELT)
–  Staged data, to be processed
–  Temp tables, BLOB/CLOB types, …

▪ Analytics / Machine Learning
–  Deep and broad data sets, within and beyond the warehouse

▪ Self-Service BI (Ad-Hoc Query)
–  Operational data, actively used for BI
–  Archival data, inactively used for BI

Page 13: Complement Your Existing Data Warehouse with Big Data & Hadoop

Offload Data Processing (ELT)

What?
▪ High-scale batch data processing

Key Capabilities
▪ Integrate any type of data with pre-built connectors
▪ High availability, disaster recovery, downtime-less upgrades
▪ Low-latency SQL processing

Benefits of Cloudera and Datameer
▪ Over 2X the performance at 1/10th the cost
▪ 96% reduction in ETL time

Page 14: Complement Your Existing Data Warehouse with Big Data & Hadoop

Offload Analytics / Machine Learning

What?
▪ Training & scoring predictive models
▪ Deep and broad data sets

Key Capabilities
▪ Drag-and-drop Data Mining and Machine Learning for a business analyst
▪ Automated support for Clustering, Recommendations, Decision Tree, and Column Dependencies
▪ Ability to run SAS, R natively on the same cluster

Benefits of Cloudera and Datameer
▪ Greater flexibility at 1/10th the cost
▪ Expand data mining and machine learning to analysts

Page 15: Complement Your Existing Data Warehouse with Big Data & Hadoop

Offload Self-Service Business Intelligence

Workload
▪ Self-Service BI, Exploratory BI, Data Discovery
▪ Unknown Questions

Key Capabilities
▪ 250+ prebuilt analytics functions
▪ Transparency and governance
▪ Open source interactive SQL

Benefits of Cloudera and Datameer
▪ Better flexibility at 1/10th the cost
▪ Reduce analysis time from 4 weeks to 3 days

Page 16: Complement Your Existing Data Warehouse with Big Data & Hadoop

Complementing the Data Warehouse

[Diagram labels: OLTP; Enterprise Applications; Business Intelligence; Data Warehouse; Query (High $/Byte); CLOUDERA / DATAMEER; ETL; Load; Archive; Operational BI; Archival Data, Exploration, Analytics; Batch Process; Storage; Search; Analyze; Integrate; Vis]

Page 17: Complement Your Existing Data Warehouse with Big Data & Hadoop

Agenda

•  Why optimize?
•  What to optimize?
•  How to optimize?
•  Who has optimized already?
•  Conclusion

Page 18: Complement Your Existing Data Warehouse with Big Data & Hadoop

Process

▪ Define
▪ Integrate
▪ Prepare and Analyze
▪ Visualize and Validate
▪ Deploy

[Diagram labels: Ad Hoc, Production]

Page 19: Complement Your Existing Data Warehouse with Big Data & Hadoop

Define

Profile and Assess
▪ Workloads in EDW
▪ Ability to migrate
▪ Size of data set

Prioritize
▪ Constraints
▪ Portability
▪ Disruption

Identify
▪ Use cases
▪ Return on investment

Page 20: Complement Your Existing Data Warehouse with Big Data & Hadoop

Integrate

Codeless Integration
▪ ELT, not ETL
▪ 50+ Datameer connectors, plug-in API

Migration
▪ Data ingest paths
▪ Map EDW workload to Cloudera

Page 21: Complement Your Existing Data Warehouse with Big Data & Hadoop

Prepare and Analyze

Interactive Data Preparation
▪ Ensure data quality
▪ Enrich data

Interactive + Smart Analytics
▪ 250+ built-in functions
▪ Automated machine learning

Transparency + Governance
▪ Visual data lineage
▪ Complete audit trail
▪ Metadata catalog

Page 22: Complement Your Existing Data Warehouse with Big Data & Hadoop

Visualize and Validate

Validate
▪ Verify results
▪ Tune

Visualization Anywhere
▪ Infographic or dashboard
▪ Run on tablets and smartphones

Page 23: Complement Your Existing Data Warehouse with Big Data & Hadoop

Deploy

Scheduling
▪ Dependency triggers
▪ Data synchronization
▪ External scheduling integration

Monitoring
▪ Monitoring system, jobs, performance, throughput
▪ Error handling
▪ Log management

Security
▪ LDAP / Active Directory
▪ Role-based access control
▪ Support for Kerberos

Page 24: Complement Your Existing Data Warehouse with Big Data & Hadoop

Role: Responsibilities

Admin: Set up and maintain environment
Business Analyst: Work with partners to define requirements and define goals
Deployment Team: Set up monitoring and scheduling
ETL Architect: Prepare and cleanse data

Page 25: Complement Your Existing Data Warehouse with Big Data & Hadoop

Roles Mapped to Process

▪ Define (BA): Define goals, results, sources, requirements
▪ Integrate (Admin): Source data, secure for ad hoc
▪ Prepare & Analyze (BA / Arch.): Cleanse, combine, enrich data; create analysis
▪ Visualize (BA): Create infographics, dashboards
▪ Deploy (Admin / Deploy. Team): Business: validate with end users. Technical: secure, monitor, schedule

Page 26: Complement Your Existing Data Warehouse with Big Data & Hadoop

Use Cases

▪ Operational
▪ Customer
▪ Fraud and Compliance

Page 27: Complement Your Existing Data Warehouse with Big Data & Hadoop

Customer

Reduce customer acquisition costs by 30%

Page 28: Complement Your Existing Data Warehouse with Big Data & Hadoop

[Illustration: a name tag, a 7-ELEVEN sign, and sample transaction amounts]

Location Data, Transactions, Authorizations, POS Reports

Identify $2B in fraudulent transactions

Page 29: Complement Your Existing Data Warehouse with Big Data & Hadoop

Structured Logs

Network Data

Unstructured Logs

Doubling in size every 15 months

Improve customer service, development, sales
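
A doubling time of 15 months implies compound growth that can be projected forward. A minimal sketch in Python, assuming smooth exponential growth between doublings (an assumption, since the slide states only the doubling period); the function names are illustrative:

```python
def projected_size(initial_size, months, doubling_period_months=15.0):
    """Size after `months`, assuming smooth exponential growth (assumption)."""
    return initial_size * 2.0 ** (months / doubling_period_months)


def annual_growth_factor(doubling_period_months=15.0):
    """Multiplier applied to the data volume over 12 months."""
    return 2.0 ** (12.0 / doubling_period_months)


if __name__ == "__main__":
    # Start from 1 unit of log data; the unit is arbitrary (for example 1 TB).
    for months in (15, 30, 60):
        print(months, "months:", round(projected_size(1.0, months), 2))
    print("annual factor:", round(annual_growth_factor(), 3))
```

Under this assumption, doubling every 15 months corresponds to growth of roughly 74% per year.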

Page 30: Complement Your Existing Data Warehouse with Big Data & Hadoop

Calculating ROI is a process

Page 31: Complement Your Existing Data Warehouse with Big Data & Hadoop

Apply ROI to Multiple Projects

Page 32: Complement Your Existing Data Warehouse with Big Data & Hadoop

Calculating Return

Page 33: Complement Your Existing Data Warehouse with Big Data & Hadoop

Business Benefits

▪ Funnel Optimization: Increase customer conversion by 3x
▪ Behavioral Analytics: Increase revenue by 2x
▪ Fraud Prevention: Identify $2B in potential fraud
▪ Customer Segmentation: Lower customer acquisition costs by 30%

Page 34: Complement Your Existing Data Warehouse with Big Data & Hadoop

EDW Optimization

Enterprise Data Warehouse

Discover fraud in less time – from 2 days to 2 hours, save $30M on DR

Avoid tens of millions in expansion purchases

Offload 90% of all data

Shrank EDW footprint by 4PB, 20x performance boost

Page 35: Complement Your Existing Data Warehouse with Big Data & Hadoop

Call to Action

▪ ROI and Solution Development Consultation
▪ Join us at Hadoop World
▪ Contacts
– Jeff Bean [email protected]
– Karen Hsu [email protected]

Page 36: Complement Your Existing Data Warehouse with Big Data & Hadoop