Big Data/Hadoop/HANA Basics

How Big Data Technologies Provide Solutions for Big Data Problems

John Choate – PMMS SIG Chair

David Burdett – Strategic Technology Advisor, SAP

Henrik Wagner, Global SAP Lead-Alliances, EMC Corp

Post on 12-Sep-2014

DESCRIPTION

Summary of three national webinars: the three V's, the market, functional areas showing the most traction, hot revenue/ROI areas, architecture options, and using use cases to overcome objections.

TRANSCRIPT

Page 2: Big data/Hadoop/HANA Basics

The Challenge of Big Data

Figure labels: Customer; IT Developer Analyst; LOB User; Data; Decision-Maker

Page 3: Big data/Hadoop/HANA Basics

The 5 Part Series

Webinar 1: Why Big Data matters, how it can fit into your Business and Technology Roadmap, and how it can enable your business!

Webinar 2: How Big Data technologies provide Solutions for Big Data problems

Webinar 3: Using Hadoop in an SAP Landscape with HANA

Webinar 4: Leveraging Hadoop with SAP HANA smart data access

Webinar 5: Using SAP Data Services with Hadoop and SAP HANA

Resources …

Webinar Registration

1. Go to www.saphana.com

2. Search “ASUG Big Data Webinar”

3. Registration links in blog …

Big Data, Hadoop and Hana – How they Integrate and How they Enable your Business!

Info on SAP and Big Data – go to www.sapbigdata.com

Page 4: Big data/Hadoop/HANA Basics

AREAS TO COVER

SETTING THE STAGE

MARKET

TECHNOLOGY

USE CASES

SUMMARY

Page 5: Big data/Hadoop/HANA Basics

How did we get here?

Timeline: 1990, 2000, 2005, 2010, 2013, 2015

DATABASE (CIRCA 1980)

ANALYTICS (CIRCA 1980)

PREDICTIVE ANALYTICS (CIRCA 1980)

SEMANTIC ANALYTICS (CIRCA 1980)

REAL TIME

1,000,000+ SOLD

WWW

3,000,000 people had access to the internet worldwide

B2B / B2C

MOBILE

More people have mobile phones than electricity or safe drinking water

Facebook: 1 billion users; 600 million mobile users; more than 42 million pages and 9 million apps

YouTube: 4 billion views per day

Google+: 400 million registered users

Skype: 250 million monthly connected users

SOCIAL

BIG DATA

PERSONAL COMPUTER AND CLIENT SERVER

Page 6: Big data/Hadoop/HANA Basics

How big is Big Data?

In 2011, the amount of data surpassed 1.8 zettabytes

90% of the world's data today has been created in the last two years alone!

Today we measure available data in zettabytes (1 trillion gigabytes)

Eight 32GB iPads per person alive in the world
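
The iPad comparison can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 zettabyte = 10^21 bytes, 1 GB = 10^9 bytes) and a 2011 world population of roughly 7 billion; the population figure is an assumption, not a number from the slides.

```python
# Rough check of the "eight 32GB iPads per person" comparison.
# Assumptions (not from the slides): decimal units and ~7 billion people in 2011.
ZETTABYTE_IN_BYTES = 10**21
GIGABYTE_IN_BYTES = 10**9
WORLD_POPULATION_2011 = 7_000_000_000  # assumed round figure

total_bytes = 1.8 * ZETTABYTE_IN_BYTES
gb_per_person = total_bytes / GIGABYTE_IN_BYTES / WORLD_POPULATION_2011
ipads_per_person = gb_per_person / 32

print(f"{gb_per_person:.0f} GB per person, about {ipads_per_person:.1f} iPads (32GB each)")
```

Under these assumptions the result lands close to eight devices per person, which suggests the slide's comparison was derived in a similar way.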

Page 7: Big data/Hadoop/HANA Basics

Big Data Simplified

Definition

“Big data” is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making

Gartner

Three Key Parts

Part One: 3V’s – Volume, Velocity, Variety

Part Two: Cost-Effective, Innovative Forms of Information Processing

Part Three: Enhanced insight for “Real Time” decision making

Page 8: Big data/Hadoop/HANA Basics

The 7 Key Drivers Behind the Big Data Movement *

Business

Opportunity to enable innovative new business models

Potential for new insights that drive competitive advantage

Technical

Data collected and stored continues to grow exponentially

Data is increasingly everywhere and in many formats

Traditional solutions are failing under new requirements

Financial

Cost of data systems, as a percentage of IT spend, continues to grow

Cost advantages of commodity hardware & open source software

* http://hortonworks.com/blog/7-key-drivers-for-the-big-data-market/

Page 9: Big data/Hadoop/HANA Basics

Today's Key Challenges in Big Data

Information Strategy

1. Which investments will deliver most business value and ROI?

2. Governance – New expectations for data quality and management

3. Talent – How will you assemble the right teams and align skills?

Data Analytics

1. Data Capture & Retention – What data should be kept and why

2. Behavioral Analytics – Understanding and leveraging customer behavior

3. Predictive Analytics – Using new data types (sentiment, clickstream, video, image and text) to predict future events

Enterprise Information Management (EIM)

1. User expectations – Making “Big Data” accessible for the end user in “real-time”

2. Costs – How to provide access to big data in a rapid and cost-effective way to support better decision-making?

3. Tools – Have you identified the processes, tools and technologies you need to support big data in your enterprise?

Page 10: Big data/Hadoop/HANA Basics

PRESENTATION CONTENT

SETTING THE STAGE

MARKET

TECHNOLOGY

USE CASES

SUMMARY

Page 11: Big data/Hadoop/HANA Basics

The RAPIDLY GROWING Market

“By 2015, 4.4 million IT jobs globally will be created to support big data, generating 1.9 million IT jobs in the United States”

Peter Sondergaard, Senior Vice President at Gartner and global head of Research

http://www.gartner.com/newsroom/id/2207915

“The Global big data market is estimated to be $14.87 billion in 2013 and expected to grow to $46.34 billion … an estimated Compounded Annual Growth Rate (CAGR) of 25.52% from 2013 to 2018”

http://www.marketsandmarkets.com/PressReleases/big-data.asp

“IDC expects the Big Data technology and services market to grow at a 31.7% compound annual growth rate through 2016”

http://www.idc.com/getdoc.jsp?containerId=238746
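
The MarketsandMarkets figures can be cross-checked against the compound annual growth rate formula. This is a minimal sketch; it assumes the $46.34 billion figure refers to 2018, which the quote implies but the elided text does not state outright.

```python
# Check the quoted CAGR from the start and end market sizes.
# Assumption: the $46.34 billion figure refers to 2018, i.e. five years of growth.
start_value = 14.87   # USD billions, 2013
end_value = 46.34     # USD billions, assumed to be the 2018 figure
years = 2018 - 2013

# CAGR = (end / start) ** (1 / years) - 1
cagr = (end_value / start_value) ** (1 / years) - 1

# Going the other way: compound the start value at the quoted 25.52%.
projected = start_value * (1 + 0.2552) ** years

print(f"Implied CAGR: {cagr:.2%}")
print(f"$14.87B compounded at 25.52% for {years} years: ${projected:.2f}B")
```

Both directions agree with the quoted numbers to within rounding, so the three figures in the quote are internally consistent.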

Page 12: Big data/Hadoop/HANA Basics

Products and Services under the Umbrella of Big Data

Hadoop software and related hardware

NoSQL database software and related hardware

Next-generation data warehouses/analytic database software and related hardware

Non-Hadoop Big Data platforms, software, and related hardware

In-memory – both DRAM and flash – databases as applied to Big Data workloads

Data integration and data quality platforms and tools as applied to Big Data deployments

Advanced analytics and data science platforms and tools

Application development platforms and tools as applied to Big Data use cases

Business intelligence and data visualization platforms and tools as applied to Big Data use cases

Analytic and transactional applications as applied to Big Data use cases

Big Data support, training, and professional services

Page 13: Big data/Hadoop/HANA Basics

WHO IS SPENDING $$$ ON BIG DATA?

COMPANIES

Median = $10M

25% spend less than $2.5M

15% spend greater than $100M

7% spend greater than $500M

INDUSTRIES

MOST

Banking

High Tech

Telecommunications

Travel

LEAST

Energy/Resources

Life Sciences

Retail

Source: 2012 Tata Consultancy Services (TCS) Global Study

Page 16: Big data/Hadoop/HANA Basics

10 Big Data Trends Changing the Face of Business

1. Machine Data and the Internet of Things Takes Center Stage

2. Compound Applications That Combine Data Sets to Create Value

3. Explosion of Innovation Built on Open Source Big Data Tools

4. Companies Taking a Proactive Approach to Identifying Where Big Data Can Have an Impact

5. There Are More Actual Production Big Data Projects

6. Large Companies Are Increasingly Turning to Big Data

7. Most Companies Spend Very Little, A Few Spend A Lot

8. Investments Are Geared Toward Generating and Maintaining Revenue

9. The Greatest ROI of Big Data Is Coming from the Logistics and Finance Functions

10. The Biggest Challenges Are as Much Cultural as Technological

Page 17: Big data/Hadoop/HANA Basics

PRESENTATION CONTENT

SETTING THE STAGE

MARKET

TECHNOLOGY

USE CASES

SUMMARY

Page 18: Big data/Hadoop/HANA Basics

Aspect of Time Value of Data

“HOT” Data may be better suited for “In Memory” HANA residency. This data is largely derived from structured SAP sources.

“WARM” and “COLD” Data may be better suited for Hadoop residency. This data is largely unstructured in nature and may present very large data sets (multi PB).

The business value reflected in use cases may come from queries and data structures enabled in one of three ways:

Enabled by SAP HANA

Enabled by Hadoop

Enabled by HANA and Hadoop simultaneously
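
One way to picture the hot/warm/cold split is as a routing rule. The sketch below is purely illustrative: the `DataSet` type, the `choose_residency` function, and the 30-day "hot" threshold are hypothetical and are not SAP or EMC guidance.

```python
from dataclasses import dataclass

# Hypothetical threshold: data accessed within the last 30 days counts as "hot".
HOT_MAX_AGE_DAYS = 30


@dataclass
class DataSet:
    name: str
    structured: bool             # e.g. derived from structured SAP sources
    days_since_last_access: int


def choose_residency(ds: DataSet) -> str:
    """Return 'HANA' for hot structured data, otherwise 'Hadoop'."""
    is_hot = ds.days_since_last_access <= HOT_MAX_AGE_DAYS
    if is_hot and ds.structured:
        return "HANA"
    return "Hadoop"


# Made-up example data sets.
datasets = [
    DataSet("current_sales_orders", structured=True, days_since_last_access=1),
    DataSet("clickstream_archive", structured=False, days_since_last_access=400),
    DataSet("old_sales_orders", structured=True, days_since_last_access=900),
]
placement = {ds.name: choose_residency(ds) for ds in datasets}
print(placement)
```

A use case that queries both `current_sales_orders` and `clickstream_archive` would fall into the third category on the slide, enabled by HANA and Hadoop together.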

EMC Corporation

Page 19: Big data/Hadoop/HANA Basics

SAP’s Technology Use Case View

EMC Corporation

Page 20: Big data/Hadoop/HANA Basics

Big Data High Level Software Architecture

Big Data Storage holds the data in memory or on SSD/HDD

Big Data Database Software manages data in the Big Data Storage. Includes SQL and NoSQL DBMS.

Processing Engines are software that can process / manipulate data in the Big Data Storage

Analytic Software analyzes data using the Processing Engines or Big Data DB Software

Big Data Applications provide solutions for specific business problems

Development Software is used to build Big Data Applications

Visualization Software presents the results to end users from Analytic Software or Big Data Applications

Data Capture Software on-boards and manages data from multiple Data Sources

Management Software handles the operation of the Big Data implementation / solution
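
The relationships between these components can be written down as a small dependency map. This is a sketch of the slide's wording only; the dictionary layout and the `reaches` helper are illustrative choices, and the entries marked "assumed" are not stated on the slide.

```python
# Each component maps to the components it reads from or builds on.
USES = {
    "Data Capture Software": ["Data Sources"],
    "Big Data Storage": ["Data Capture Software"],      # assumed: capture feeds storage
    "Big Data Database Software": ["Big Data Storage"],
    "Processing Engines": ["Big Data Storage"],
    "Analytic Software": ["Processing Engines", "Big Data Database Software"],
    "Big Data Applications": ["Analytic Software"],     # assumed: not stated on the slide
    "Visualization Software": ["Analytic Software", "Big Data Applications"],
}


def reaches(component: str, target: str) -> bool:
    """True if `component` depends on `target`, directly or indirectly."""
    stack = list(USES.get(component, []))
    seen = set()
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(USES.get(current, []))
    return False


print(reaches("Visualization Software", "Data Sources"))
```

Tracing a path like this is a quick way to see that what an end user views depends on every layer beneath it, which is why a product covering only one layer handles only part of the problem.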

Figure: architecture diagram showing Data Sources, Data Capture Software, Big Data Storage (HDD, SSD, In-memory), Big Data Database Software, Processing Engines Software, Analytic Software, Big Data Applications, Visualization Software, Management Software, and Development Software.

Page 21: Big data/Hadoop/HANA Basics

Big Data Software Other Solutions

Big Data Software solutions only handle part of the problem

Figure: the architecture diagram repeated three times, under the headings Hadoop, Cassandra, and MongoDB. Product labels on the diagrams: Hadoop HDFS; Hive/HBase; Mahout/Giraph, etc.; Cassandra; MongoDB.

Page 22: Big data/Hadoop/HANA Basics

Big Data Software Architecture and HANA

ANALYZE – Analytics!

Analyze and visualize Big Data using tools that best serve your business needs.

Reduce delays associated with complex analysis of large data sets using in-memory analytics.

Identify new opportunities and expose hidden risks using algorithms, R integration, and predictive analysis.

Enable business users to access and visualize insight using charts, graphs, maps, and more.

Uncover hidden value from unstructured data with text analytics.

ACCELERATE – “Real Time” Visibility

Increase business speed with cost-performance data processing options

From in-memory processing with SAP HANA to massively parallel processing with the SAP Sybase IQ database

Distributed processing of large data sets with Hadoop.

ACQUIRE – Meet the Expanding Data Demand

Acquire and store large volumes of data from a variety of data sources.

Flexible data management capabilities delivered via the SAP HANA platform.

Choose the best option based on business requirements for accessibility, complexity of analytics, processing speed, and storage costs.

See: http://www.sapbigdata.com/platform/

Figure: the architecture diagram annotated with SAP products. Product labels on the diagram: SAP HANA; Sybase IQ; Hadoop HDFS; HANA / Sybase IQ “R” Engine, Text Analytics, etc.; SAP BI Tools; SAP Lumira; HANA Studio; SAP Data Services; SAP Landscape Management.

Page 23: Big data/Hadoop/HANA Basics

PRESENTATION CONTENT

SETTING THE STAGE

MARKET

TECHNOLOGY

USE CASES

SUMMARY

Page 24: Big data/Hadoop/HANA Basics

Looking for Big Data Potential in your Company

ACQUIRE – Meet the Expanding Data Demand

1. Acquire and store large volumes of data from a variety of data sources.

2. Flexible data management capabilities delivered via the SAP HANA platform.

3. Choose the best option based on business requirements for accessibility, complexity of analytics, processing speed, and storage costs.

ACCELERATE – “Real Time” Visibility

1. Increase business speed with cost-performance data processing options

2. From in-memory processing with SAP HANA to massively parallel processing with the SAP Sybase IQ database

3. Distributed processing of large data sets with Hadoop.

ANALYZE – Analytics!

1. Analyze and visualize Big Data using tools that best serve your business needs.

2. Reduce delays associated with complex analysis of large data sets using in-memory analytics.

3. Identify new opportunities and expose hidden risks using algorithms, R integration, and predictive analysis.

4. Enable business users to access and visualize insight using charts, graphs, maps, and more.

5. Uncover hidden value from unstructured data with text analytics.

Page 25: Big data/Hadoop/HANA Basics

OVERCOMING OBJECTIONS – USE CASES

1. Big Data Projects are too expensive

2. Big Data is Technology in search of a Business Problem to solve!

3. Big Data is an IT project, we don’t need to involve the business.

4. Big Data is just the new Buzzword phrase, just like Cloud! Soon another trend and new buzzword will come along.

5. We don’t have the skills to use Big Data Solutions.

Page 26: Big data/Hadoop/HANA Basics

Big Data and Competitive Advantage

Utilize your data to gain a competitive advantage!

Figure: Competitiveness of fact-finders vs. fumblers (chart labels: Laggards, Leaders, Fumblers, Fact-finders)

Leading businesses can outpace the competition because they can:

• Base decisions on the latest, granular multi-structured data

• Make decisions on analytics rather than intuition

• Frequently reassess forecasts and plans

• Utilize analytics to support a spectrum of strategic, operational and tactical decision making

• Rapidly evaluate alternative scenarios

n=1,002 Source: IDC’s SAP HANA Market Assessment, August 2011

Page 27: Big data/Hadoop/HANA Basics

Soliciting Allies

REVENUE

Sales

Marketing

Customer Service

R&D/NPI

IT

Finance

HR

ROI

Finance

Logistics

Marketing

Sales

Greater than 25%

Source: 2012 Tata Consultancy Services (TCS) Global Study

Page 28: Big data/Hadoop/HANA Basics

T-Mobile USA, Inc. Telecom – Optimize Marketing Campaigns Effectiveness

Product: Agile Datamart

Business Challenges

Proliferation of offers/micro-offers increasingly strategic in a highly competitive market

Marketing Operations needs to collect, analyze and report on results of campaigns/offers very quickly and with great flexibility

Current and future campaigns have to be fine tuned to improve customer adoption and profitability

Technical Challenges

Data for 33M customers required a lot of time to be explored and analyzed in detail with previous technology

Benefits

Dynamic read outs on the upsell/cross sell performance of store and call centers

Easy, fast access to the performance of all campaigns (e.g. by geo, by store, etc.)

Quicker forecast of the financial impact of marketing campaigns

“Based on the rapid analytics that we’re performing on SAP HANA, we are now able to quickly fine tune our current and future campaigns to improve the customer adoption rate, reduce churn and increase profit”

Alison Bessho, Director, Enterprise Systems Business Solutions, T-Mobile USA

56x faster analysis

5 Billion+ records for 33M customers; report executed in 9 seconds

Page 29: Big data/Hadoop/HANA Basics

University of Kentucky Higher Education – Student Retention

“SAP HANA offers an effective real-time data driven system which is essential to giving immediate performance feedback and increase retention rate of students, increasing millions in revenue for the University every year.”

Vince Kellen, CIO University of Kentucky

Business Challenges

Enable the University to increase student retention and thus increase the Graduation Rate from 60% to 70% over a 10 Year period

Huge costs and longer turnaround time for student classification to improve student satisfaction and the retention rate

Technical Challenges

Lack of speed, accuracy and visibility into data analysis

Handling Big data efficiently: SAP ECC V6 production system is 1.5 TB and SAP BW V7 and Oracle Data Warehouse combined is 4 TB

Benefits

Increased Student Retention Rate; new information related to student interactions and various student behaviors is collected quickly

Reduced IT Infrastructure Costs and increased IT FTE productivity

Allow the University to retire several systems including Informatica, BI Web Focus (IBI), and Oracle (DB)

$1.1M increase in revenue with 1% increase in retention rate

420x improvement in reporting speed: reports took 2-3 seconds, compared with 15-20 minutes on the competing Oracle DW

15x improvement in query load time
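
The 420x reporting figure is consistent with the quoted timings. A quick check, taking the midpoints of the two ranges (using midpoints is an assumption about how the figure was derived, not something the slide states):

```python
# Reporting time ranges quoted on the slide, converted to seconds.
hana_seconds = (2, 3)
oracle_seconds = (15 * 60, 20 * 60)

# Most conservative and most optimistic readings of the two ranges.
slowest_ratio = oracle_seconds[0] / hana_seconds[1]   # 15 min vs 3 s
fastest_ratio = oracle_seconds[1] / hana_seconds[0]   # 20 min vs 2 s

# Midpoint of each range: 17.5 minutes vs 2.5 seconds.
midpoint_ratio = (sum(oracle_seconds) / 2) / (sum(hana_seconds) / 2)

print(f"Speedup range: {slowest_ratio:.0f}x to {fastest_ratio:.0f}x")
print(f"Midpoint estimate: {midpoint_ratio:.0f}x")
```

The quoted improvement sits inside the range implied by the raw timings, so the headline number and the supporting detail agree.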

Page 30: Big data/Hadoop/HANA Basics

Hardware Preventative Maintenance

Business Challenges

A computer server manufacturer wants to implement effective preventative maintenance by identifying problems as they arise, then taking prompt action to prevent the problem from occurring at other customer sites

Technical Challenges

Identifying problems by analyzing text data from call centers and customer questionnaires, together with server logs generated by their hardware

Combining results with CRM, sales and manufacturing data to predict which servers are likely to have problems in the future

Solution

Use SAP Data Services to analyze call center data and questionnaires stored in Hadoop and identify potential problems

Use HANA to merge results from Hadoop with server logs to identify indicators in those logs of potential problems

Combine with CRM, bill of material and production/manufacturing data to identify cases where preventative maintenance would help
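
The flow of this solution can be sketched in plain Python. Everything here is illustrative: the sample records, the keyword list, the log event names, and the function names are hypothetical stand-ins for what SAP Data Services, Hadoop and HANA would do at scale, and no SAP API is used.

```python
# Hypothetical sample data standing in for call center text, server logs and CRM records.
call_notes = [
    {"server_model": "X100", "text": "Fan failure after firmware update"},
    {"server_model": "X100", "text": "Customer asked about invoice"},
    {"server_model": "Z9", "text": "Disk overheating reported twice"},
]
server_logs = [
    {"server_id": "s1", "server_model": "X100", "event": "FAN_RPM_LOW"},
    {"server_id": "s2", "server_model": "X100", "event": "OK"},
    {"server_id": "s3", "server_model": "Z9", "event": "TEMP_HIGH"},
]
crm = {"s1": "Acme Corp", "s2": "Globex", "s3": "Initech"}

PROBLEM_KEYWORDS = ("failure", "overheating")   # assumed keyword list
WARNING_EVENTS = {"FAN_RPM_LOW", "TEMP_HIGH"}   # assumed log indicators


def models_with_reported_problems(notes):
    """Step 1: simple text analysis to find models mentioned with a problem keyword."""
    return {
        n["server_model"]
        for n in notes
        if any(k in n["text"].lower() for k in PROBLEM_KEYWORDS)
    }


def servers_at_risk(notes, logs):
    """Step 2: merge text results with logs to find servers showing warning indicators."""
    flagged = models_with_reported_problems(notes)
    return [
        log["server_id"]
        for log in logs
        if log["server_model"] in flagged and log["event"] in WARNING_EVENTS
    ]


def maintenance_list(notes, logs, customers):
    """Step 3: combine with CRM data to see which customers to contact."""
    return {sid: customers[sid] for sid in servers_at_risk(notes, logs)}


result = maintenance_list(call_notes, server_logs, crm)
print(result)
```

The point of the sketch is the order of operations: unstructured text narrows the search, machine logs confirm which units show the indicator, and business data turns that into an action list.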

Page 31: Big data/Hadoop/HANA Basics

Data Warehouse Migration

Business Challenges

A high tech company with a major web presence uses non-SAP software for its data warehouse to analyze the activity on their web site properties and combine it with data in SAP Business Suite

They want to both reduce the cost and improve the responsiveness of their data warehouse solutions by moving to a combination of SAP HANA and Hadoop

Technical Challenges

How to complete the migration without disrupting existing reporting processes

Solution – this was a four step process

Step 1. Replicate Data in Hadoop. SAP Data Services is used to replicate in Hadoop all data from web logs and SAP Business Suite being captured by the current Data Warehouse

Step 2. Aggregate Data in Hadoop. The aggregation process in the existing Data Warehouse is re-implemented in Hadoop and the aggregate results fed back to the existing Data Warehouse significantly reducing its workload.

Step 3. Copy the Aggregate Data to HANA. The aggregate data created by Hadoop is also copied to HANA together with historical aggregate data already in the existing Data Warehouse. The result is that eventually HANA has a complete copy of the data in the existing Data Warehouse.

Step 4. Replace Reporting by SAP HANA. New reports are developed in HANA to replace reports in the original Data Warehouse. Once complete, the original Data Warehouse will be decommissioned.

The end result is a faster, more responsive and lower cost Data Warehouse built on HANA and Hadoop.
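
Steps 2 and 3 can be illustrated with a toy aggregation. This is a minimal sketch with made-up rows; plain dictionaries stand in for Hadoop, the existing Data Warehouse and HANA, and the final consistency check is an assumed precondition rather than something the case study describes.

```python
from collections import defaultdict

# Hypothetical raw web log rows, as they might sit in Hadoop after step 1.
raw_web_logs = [
    {"date": "2014-01-01", "page": "/home", "views": 3},
    {"date": "2014-01-01", "page": "/home", "views": 2},
    {"date": "2014-01-01", "page": "/pricing", "views": 1},
    {"date": "2014-01-02", "page": "/home", "views": 4},
]


def aggregate_daily_views(rows):
    """Step 2: aggregate raw rows to one total per (date, page)."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["date"], row["page"])] += row["views"]
    return dict(totals)


# Step 3: the aggregates are fed back to the existing warehouse and copied to HANA.
aggregates = aggregate_daily_views(raw_web_logs)
existing_warehouse = dict(aggregates)
hana = dict(aggregates)

# Assumed precondition for step 4: switch reporting only once both copies agree.
ready_to_switch_reporting = hana == existing_warehouse
print(aggregates, ready_to_switch_reporting)
```

Keeping the old warehouse fed with the same aggregates during the transition is what lets existing reports keep running while their replacements are built.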

Page 32: Big data/Hadoop/HANA Basics

PRESENTATION CONTENT

SETTING THE STAGE

MARKET

TECHNOLOGY

CASE STUDY

SUMMARY

Page 33: Big data/Hadoop/HANA Basics

SUMMARY

1. The Big Data Market Is Not Going Away!

2. There are 3 Distinct Components of the BD Market

3. It's Not a New Trend but a Way for Technology To Enable Your Business

4. Case Studies HELP to visualize your own Company's BD Opportunities – Benchmark & Assess!

5. Don’t go on the Journey Alone – There are many resources available to make your Journey Successful!

Page 34: Big data/Hadoop/HANA Basics

Q&A

Questions?

Page 36: Big data/Hadoop/HANA Basics

THANK YOU FOR PARTICIPATING.

SESSION CODE:

Learn more year-round at www.asug.com