sas on hadoop - analytics, business intelligence and … on hadoop hadoop, big data & analytics...

Company Confidential - For Internal Use Only. Copyright © 2013, SAS Institute Inc. All rights reserved. SAS ON HADOOP: HADOOP, BIG DATA & ANALYTICS. 12TH AUGUST 2014. NANG CHING TECK, HEAD OF TECHNOLOGY SOLUTIONS, SAS MALAYSIA

Upload: dothuy

Post on 01-Apr-2018


Page 1

SAS ON HADOOP: HADOOP, BIG DATA & ANALYTICS

12TH AUGUST 2014

NANG CHING TECK, HEAD OF TECHNOLOGY SOLUTIONS, SAS MALAYSIA

Page 2

Copyright © 2012, SAS Institute Inc. All rights reserved.

Hadoop “Comes of Age”

Forrester urges companies to consider Hadoop as “an in-database analytics approach where multivariate statistical analysis, data mining, predictive modeling, sentiment analysis, and content analytics are executed in parallel across MPP clusters”

Chart legend: “Hadoop” word search (blue line); “Big data” word search (red line)

Chart: Mentions of Hadoop in job postings over a 5-year period

Page 3

WHY HADOOP? EXPANDING DATA REQUIRES A NEW APPROACH

1980s: Bring Data to Compute

Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only

Now: Bring Compute to Data

Information-centric businesses use all data: multi-structured, internal & external data of all types

[Diagram: relative size & complexity of Data and Compute in each era]

Page 4

WHY HADOOP? THE OLD WAY: BRINGING DATA TO COMPUTE

Complex Architecture
• Many special-purpose systems
• Moving data around
• No complete views

Missing Data
• Leaving data behind
• Risk and compliance
• High cost of storage

Time to Data
• Up-front modeling
• Transforms slow
• Transforms lose data

Cost of Analytics
• Existing systems strained
• No agility
• “BI backlog”

Diagram labels: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE

Data sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES

Page 5

WHY HADOOP? THE NEW WAY: BRINGING COMPUTE TO DATA

Diverse Analytic Platform
• Bring applications to data
• Combine different workloads on common data (e.g. SQL + Search)
• True analytic agility

Active Compliance Archive
• Full fidelity original data
• Indefinite time, any source
• Lowest cost storage

Persistent Staging
• One source of data for all analytics
• Persist state of transformed data
• Significantly faster & cheaper

Self-Service Exploratory BI
• Simple search + BI tools
• “Schema on read” agility
• Reduce BI user backlog requests

Diagram labels: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE

Data sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES
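As a rough illustration of the “schema on read” idea on this slide, the sketch below stores raw records untouched and applies a schema only when a query reads them. It is plain Python, not SAS or Hadoop code; the sample records, the field names, and the `read_with_schema` helper are hypothetical and chosen only for illustration.

```python
import csv
import io

# Raw data is stored as-is; no schema is enforced at write time.
# (Hypothetical clickstream records, invented for this sketch.)
RAW_LOG = """2014-08-12,alice,/home,200
2014-08-12,bob,/cart,500
2014-08-13,alice,/cart,200
"""

def read_with_schema(raw_text, schema):
    """Apply a schema (list of (name, converter) pairs) while reading."""
    rows = []
    for fields in csv.reader(io.StringIO(raw_text)):
        # zip stops at the shorter input, so a short schema skips trailing fields.
        rows.append({name: convert(value)
                     for (name, convert), value in zip(schema, fields)})
    return rows

# Two readers can impose different schemas on the same raw text.
ops_schema = [("day", str), ("user", str), ("path", str), ("status", int)]
bi_schema = [("day", str), ("user", str)]

errors = [r for r in read_with_schema(RAW_LOG, ops_schema) if r["status"] >= 500]

users_per_day = {}
for r in read_with_schema(RAW_LOG, bi_schema):
    users_per_day.setdefault(r["day"], set()).add(r["user"])

print(errors)
print({day: sorted(users) for day, users in users_per_day.items()})
```

Because the schema lives with the reader rather than the stored data, a new question only needs a new schema list, not a reload of the raw records.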

Page 6

WHY HADOOP? WHY IS HADOOP IMPORTANT FOR SAS?

• Low Cost
• Computing Power
• Scalability
• Storage Flexibility
• Data Protection and Self-Healing

Page 7

SAS & HADOOP: SAS ON HADOOP, YES WE CAN DO THAT

• SAS/Access to Hadoop - Push some SAS processing to Hadoop with Hive
• SAS Embedded Process - Push SAS Data Step (DS & DS2), Data Cleansing processing to Hadoop with Map Reduce
• In-Memory Analytics - Process in Memory, use Hadoop for Storage persistence and commodity computing

Diagram labels: SAS, Hive, Impala, Pig, MapR, Scoring Accel., Code Accel., DQ Accel., HPA, In-Memory, Hadoop Platform

Page 8

WHERE DOES HADOOP FIT? HADOOP AS A “NEW DATA” STORE

Diagram labels: Operational Data Sources, EDW, Data Mart, Analytic Mart, BI and Analytics

Page 9

WHERE DOES HADOOP FIT? HADOOP AS AN ADDITIONAL INPUT TO THE EDW

Diagram labels: Operational Data Sources, EDW, Data Mart, Analytic Mart, Data & Analytic Mart, BI and Analytics

Page 10

WHERE DOES HADOOP FIT? HADOOP DATA PLATFORM AS A BASIS FOR BI AND ANALYTICS

Diagram labels: Operational Data Sources, EDW, Data Mart, Analytic Mart, Data & Analytic Mart, BI and Analytics

Page 11

WHERE DOES HADOOP FIT? HADOOP DATA PLATFORM AS A “STAGING LAYER” AS PART OF A “DATA LAKE”

Downstream stores could be Hadoop, data appliances or an RDBMS.

Diagram labels: Operational Data Sources, EDW, Data Mart, Analytic Mart, BI and Analytics

Page 12

SAS & HADOOP: SAS DATA MANAGEMENT

• SAS supports native Apache Hadoop components: HDFS, Map Reduce and Pig
• SAS offers SAS/ACCESS to Hadoop (Hive) and SAS/ACCESS to Impala
• SAS provides SASHDAT tables and SPDE in Hadoop; SPDS is coming soon
• In-Database Process
• Scoring Accelerator
• Code Accelerator
• Data Quality Accelerator
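To make the Map Reduce model named in the first bullet more concrete, here is a minimal single-process sketch in plain Python. It uses no SAS or Hadoop APIs: the partitions, the `map_partition` and `reduce_counts` helpers, and the sample records are all hypothetical. Each partition is summarized where it lives (the map step), and only the small per-partition summaries are combined (the reduce step), which is the “bring compute to data” idea from the earlier slides.

```python
from collections import Counter
from functools import reduce

# Hypothetical data blocks, standing in for file blocks spread across nodes.
PARTITIONS = [
    ["hive", "pig", "hive"],
    ["impala", "hive"],
    ["pig", "pig", "impala"],
]

def map_partition(records):
    """Map step: summarize one partition locally, without moving its records."""
    return Counter(records)

def reduce_counts(left, right):
    """Reduce step: merge two partial summaries into one."""
    return left + right

# On a real cluster each map call would run on the node holding that block;
# here they simply run one after another in a single process.
partials = [map_partition(p) for p in PARTITIONS]
totals = reduce(reduce_counts, partials, Counter())

print(dict(totals))
```

Only the three small `Counter` objects cross the (imagined) network here, rather than the raw records themselves.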

Page 13

SAS & HADOOP: FEDERATE HADOOP DATA

• Enables secure data access & audit
• Delivers virtual view of data across many sources
• Quicker access to data sources
• Standardize central administration and configuration
• Seamless, managed access to data

Diagram labels: Hadoop, RDBMS, Other, Governance, Cached Views

Page 14

SAS & HADOOP: DATA LOADER FOR HADOOP

Page 15

SAS & HADOOP: SAS VISUAL ANALYTICS

• Interactive exploration, dashboards and reporting
• Auto-charting automatically picks the best graph
• Forecasting, scenario analysis, Decision Trees and other analytic visualizations
• Text analysis and content categorization
• Feature-rich mobile apps for iPad® and Android

Page 16

SAS & HADOOP: SAS VISUAL STATISTICS

• Interactive, visual application for statistical modeling and classification
• Multiple methods: Logistic, Regression, GLM, Trees, Forest, Clustering and more
• Model comparison and assessment
• Group BY processing

Page 17

SAS & HADOOP: SAS VISUAL SCENARIO DESIGNER

• Interactive, data-driven, temporal window building
• Interactive Decision Engine
• Decision Tables and Trees
• Simulation & Deployment
• Integrated with SAS Visual Analytics and SAS Event Stream Processing

Page 18

SAS & HADOOP: SAS IN-MEMORY STATISTICS FOR HADOOP

• SAS® In-Memory Statistics for Hadoop provides a single interactive programming environment for the entire analytical life cycle.
• It enables users to perform data manipulation, variable transformation, exploratory analysis, statistical modeling and machine learning techniques, integrated modeling comparison and scoring, all inside the Hadoop environment.

Page 19

This slide is for video use only.

Page 20

SAS & HADOOP: SAS® WITHIN THE HADOOP ECOSYSTEM

[Diagram: SAS and Hadoop components arranged by layer]

Layers: User Interface, Metadata, Data Access, Data Processing, File System

Users: SAS® User, Next-Gen SAS® User

Hadoop components: Hive, Pig, Map Reduce, HDFS

SAS components: Base SAS, SAS Metadata, SAS/ACCESS® to Hadoop™, SAS/ACCESS® to Impala, In-Memory Data Access, SAS Embedded Process, SAS® LASR™ Analytic Server, SAS® Data Management, SAS® Visual Analytics, SAS® Visual Statistics, SAS® Enterprise Miner™, SAS® Studio, SAS® In-Memory Statistics for Hadoop

Page 21

SAS & HADOOP: WHY SAS ON HADOOP?

• Bring superior analytics to Hadoop for more precise insights.
• Manage data in order to promote reuse and to comply with IT policies and procedures.
• Maximize the value of Hadoop across the enterprise with data-to-decision lifecycle support.

Page 22

sas.com

THANK YOU