Philip Russom TDWI Research Director for Data Management April 5, 2012 Big Data and Your Data Warehouse


Page 1: Big Data and Your Data Warehouse (source: download.101com.com/pub/tdwi/Files/Oracle040512.pdf)

Philip Russom

TDWI Research Director for Data Management

April 5, 2012

Big Data and Your Data Warehouse

Page 2

Sponsor

Page 3

Speakers

Philip Russom, Research Director, Data Management, TDWI

Peter Jeffcock, Director, Oracle Product Marketing

Page 4

Today’s Agenda

• Big Data

– Was a problem; now it’s an opportunity

– The opportunities of Big Data Analytics

• Definitions and Comparisons

– Old Big Data versus New Big Data

– Big Data Analytics, Data Warehouses, Analytic Databases

• Big Data and Analytics affect Your Data Warehouse

– Scalability, Workloads, Data Types, Architectures

– Analytic tool choices, Data integration/quality best practices

• Use Cases for Your Data Warehouse & Big Data Analytics

• Recommendations

Page 5

Big Data: Problem or Opportunity?

• Only 30% of survey respondents consider big data a problem.

– Oddly enough, big data was a serious problem just a few years ago.

– Storage and CPUs developed greater capacity, speed, intelligence

• They also fell in price.

– New database mgt systems (DBMSs) arrived, designed for big data analytics.

• Data warehouse appliances and analytic DBMSs are relatively affordable.

• The vast majority (70%) considers big data an opportunity.

– The recent economic recession forced deep changes in most businesses.

– Big data analytics reveals change’s root cause, so you can stop or leverage it.

Survey question: In your organization is big data considered mostly a problem or mostly an opportunity?

– Problem, because it's hard to manage from a technical viewpoint: 30%

– Opportunity, because it yields detailed analytics for business advantage: 70%

Source: TDWI. Survey of 325 respondents, June 2011

Page 6

Definition of Big Data Analytics

• It’s where advanced analytic techniques operate on big data sets

• It’s about two things: big data AND advanced analytics

– The two have teamed up to leverage big data

– The combo turns big data into an opportunity

• Big Data isn’t new. Advanced Analytics isn’t new.

– Their successful combination is new

– Hundreds of terabytes of data just for analytics is new

[Figure labels: Big Data; Advanced Analytics; Big Data Analytics]

Page 7

Opportunities for Big Data Analytics

• Anything involving customers benefits from big data analytics

– better-targeted social-influencer marketing (61%)

– customer-base segmentation (41%)

– recognition of sales/market opportunities (38%)

• BI, in general, benefits from big data analytics

– more numerous and accurate business insights (45%)

– understanding business change (30%)

– better planning and forecasting (29%)

– identification of root causes of cost (29%)

• Specific analytics applications are likely beneficiaries

– detection of fraud (33%)

– quantification of risks (30%)

– market sentiment trending (30%)

Source: TDWI. Survey of 325 respondents, June 2011

Page 8

Traditional Big Data versus New Generation Big Data

Traditional Big Data | New Generation Big Data
Problem | Opportunity
Hoarded | Leveraged
Tens of terabytes, sometimes more | Hundreds of terabytes, soon to be measured in petabytes
Mostly structured and relational data | Mixture of structured, semi-structured, and unstructured data
Data mostly from traditional enterprise applications: ERP, CRM, etc. | Also from Web logs, clickstreams, e-commerce, sensors, mobile devices
Common in large companies: mainstream today | Common in Internet companies: will eventually go mainstream
Real-time for operational BI, etc. | Real-time for streaming data

Page 9

Traditional DWs versus New Generation Analytic DBs

Traditional Data Warehouse | New Generation Analytic Database
Excels with Traditional Big Data | Excels with New Generation Big Data
Killer platform for standard reports, dashboards, performance mgt, OLAP | Killer platform for discovery-oriented advanced analytics
Mostly aggregates and calculated metrics in time-series and multidimensional models | Mostly detailed source data
All the best practices of data management apply, including DI, DQ, MDM | Best practices of data mgt are suspended to preserve data nuggets for discovery
Well-understood data for reporting and OLAP | Less understood data for discovery analytics
Single version of the truth for facts that must be tracked over time | Excellent source for deducing unknown facts and relationships
Well-organized history of corporate performance | Recent information, for studying change

Page 10

The Location of Big Data and Analytic Workloads affects Your DW Architecture and Design

• Where do you store and manage big data for analytics?

• Where do you process big data with analytic tools?

– In the data warehouse proper?

– In a standalone edge system that integrates with the DW?

– Both?

• These are critical design and architectural decisions that adopters of big data and advanced analytics must make.

[Diagram labels: EDW; Federated Data Marts; Real Time ODS; Customer Mart or ODS; No-SQL Database; Hadoop Distributed File Sys; Data Staging Area; Metrics for Performance Mgt; OLAP Cubes; Multidimensional Data Models; Detailed Source Data; Analytic Sand Box; DW Appliance; Columnar DBMS; Map Reduce; Data Mining Cache; Star or Snowflake Schema]

Page 11

Analytic Workloads affect Your DW Architecture

• Monolithic DW Architecture

– A single DBMS instance (or grid) hosts multiple database workloads & datasets

• Requires hefty DW platform to handle multiple, diverse workloads & data types

• Distributed DW Architecture

– Users deploy multiple platforms, so each is optimized for a specific workload type

• If not controlled, data marts and ODSs may proliferate. Complexity is a problem, due to the large number of “edge systems.”

• Hybrid DW Architecture

– A monolith manages all core reporting/OLAP data & most workloads

– But a few workloads are deployed on separate systems
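
As a rough illustration of the hybrid option, the sketch below routes workloads to platforms: everything defaults to the monolith, and a short list of exceptions goes to separate systems. The platform names and workload labels are hypothetical and do not come from the presentation.

```python
# Hypothetical platform and workload names; none of these come from the slides.
MONOLITH = "edw"

# In a hybrid design only a few workloads are carved out of the monolith.
EDGE_SYSTEMS = {
    "discovery_analytics": "analytic_sandbox",
    "clickstream_processing": "hadoop_cluster",
}


def platform_for(workload: str) -> str:
    """Return the platform that should host a workload.

    Core reporting/OLAP and anything not listed as an exception
    stays on the monolithic warehouse.
    """
    return EDGE_SYSTEMS.get(workload, MONOLITH)


def edge_system_count() -> int:
    """Count distinct edge systems, a crude measure of added complexity."""
    return len(set(EDGE_SYSTEMS.values()))
```

Keeping the exception list short is the point of the hybrid approach. A long list would drift toward the distributed architecture and its proliferation of edge systems.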

Page 12

Big Data even affects Your DW’s Data Staging Area

• Data staging areas evolved to do more than stage data

– Now they must evolve again to accommodate big data

• Originally data staging areas were temporary holding bins

– In that spirit, some are good for “analytic sandboxes”

• Most data staging areas are optimized for detailed source data

– Can manage detailed source data as found in transient big data

• Data is regularly processed while managed in the staging area

– E.g., sort prior to a DW load. SQL temp tables held in staging for later merging

– Some analytic workloads (especially columnar) may run well in a staging area

• Data staging must scale to big data’s volume, which comes and goes

– A cloud could be an elastic platform for unpredictable data staging volumes

[Diagram labels: Big Data; Data Staging; DW]
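
The staging-area processing mentioned on this slide (sorting before a DW load, holding rows in a SQL temp table for a later merge) can be sketched with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the table names `stage_sales` and `dw_sales` and their columns are invented for illustration, and an in-memory SQLite database stands in for a real warehouse platform.

```python
import sqlite3


def load_via_staging(rows):
    """Stage raw rows in a temp table, then merge them into a DW table in sorted order."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical warehouse table (names and columns are assumptions).
    cur.execute(
        "CREATE TABLE dw_sales (sale_date TEXT, customer_id INTEGER, amount REAL)"
    )

    # The temp table plays the role of the staging area: detailed source rows.
    cur.execute(
        "CREATE TEMP TABLE stage_sales (sale_date TEXT, customer_id INTEGER, amount REAL)"
    )
    cur.executemany("INSERT INTO stage_sales VALUES (?, ?, ?)", rows)

    # Process the data while it sits in staging: sort, then merge into the DW.
    cur.execute(
        """
        INSERT INTO dw_sales (sale_date, customer_id, amount)
        SELECT sale_date, customer_id, amount
        FROM stage_sales
        ORDER BY sale_date, customer_id
        """
    )
    conn.commit()

    result = cur.execute(
        "SELECT sale_date, customer_id, amount FROM dw_sales "
        "ORDER BY sale_date, customer_id"
    ).fetchall()
    conn.close()
    return result
```

Calling `load_via_staging` with a few unsorted `(sale_date, customer_id, amount)` tuples returns them as they would sit in the warehouse table after the merge.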

Page 13

Analytic Tools for Big Data affect Your DW

• There’s a crossroads where you choose an analytic method – or multiple methods!

1. Online Analytic Processing (OLAP)

2. Extreme SQL

3. Statistical Analysis

4. Data Mining

5. Other: Natural Language Processing (NLP), Artificial Intelligence (AI)

• Each analytic method has requirements for data and analytic tool types.

– Multiple analytic methods can lead to multiple data stores, DBMSs, DW arch. components – and multiple analytic tools

Page 14

Big Data Analytics affects Data Mgt for Your DW

• Analyze data first

– Later, improve it for a more polished analysis

• Analytic discovery depends on data nuggets

– Both query-based and predictive analytics need:

• Big data, raw data

• Data quality for analytic databases

– Do discovery work before addressing data anomalies and standardization

• E.g., fraud is often revealed via non-standard or outlier data

• Data modeling for analytic databases

– Modeling data can speed up queries and enable multidimensional views

• But it loses details & limits queries

• Do only what’s required, like flattening and binning

• Data for post-analysis use in BI

– Apply best practices of DI, DQ, modeling
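
The "flattening and binning" named above as the minimum modeling step can be illustrated in a few lines of Python. This is a sketch, not the presenters' method: the record layout, the dotted-key convention, and the bin edges and labels are all assumptions chosen for the example.

```python
from bisect import bisect_right


def flatten(record, parent_key="", sep="."):
    """Flatten a nested dict into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def bin_value(value, edges, labels):
    """Map a number to a labeled bin.

    edges are ascending cut points; a value equal to a cut point
    falls in the higher bin. len(labels) must be len(edges) + 1.
    """
    # bisect_right counts how many cut points are <= value, which is the bin index.
    return labels[bisect_right(edges, value)]


# Hypothetical detailed source record (structure invented for illustration).
event = {"customer": {"id": 7, "region": "west"}, "order": {"total": 250.0}}
flat = flatten(event)
flat["order.total_bin"] = bin_value(flat["order.total"], [100, 1000], ["low", "mid", "high"])
```

The raw value `order.total` is kept next to its bin, so the detail that discovery analytics depends on is not lost.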


Page 15

Use Cases for Your DW and Big Data Analytics

• Big Data enables exploratory analytics. Discover new:

– Customer base segments

– Customer behaviors and their meaning

– Forms of churn and their root causes

– Relationships among customers and products

• Analyze big data you’ve hoarded. Finally understand:

– Web site visitor behavior

– Product quality based on robotic data from manufacturing

– Product movement via RFID in retail

• Use tools that handle human language for visibility into:

– Claims process in insurance

– Medical records in healthcare

– Call center applications in any industry

• Big data improves data samples for older analytic apps:

– Fraud detection

– Risk management

– Actuarial calculations

– Anything involving statistics or data mining

• Big data can add more granular detail to analytic datasets:

– Broaden 360-degree views of customers and other entities, from hundreds of attributes to thousands

Page 16

CONCLUSIONS – Plan ahead, because:

Big Data affects Your Data Warehouse

• Scaling up or out to big data volumes

– Affects choice of computing architectures, platforms, hardware, DBMSs…

• Incorporating new data types and new data sources

– Semi- and un-structured data. Web, sensor, and social data

• Operating in real-time and streaming big data

– Real-time is a biz requirement. Many new sources stream big data.

• Supporting multiple database workloads in single DW environment

– Big data analytics introduces new analytic workloads & affects architecture

• Adjusting best practices in data management

– These still apply to big data analytics, but in a different order & priority

• Updating or replacing your DW models or architecture

– To accommodate all the above…

Page 17

Big Data Analysis: Airline Industry

What do they think about me?

Page 18

Data Sources

Customer records; Social Media; Email; Phone Logs; Weblogs

Page 19

Apache Hadoop

Distributed file system; Map/Reduce programming
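
For readers unfamiliar with the Map/Reduce programming model named on this slide, here is a single-process sketch of its three steps (map, shuffle, reduce) in plain Python. It does not use Hadoop or any Hadoop API. The two-field log format and the function names are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit (key, 1) pairs; here the key is the requested path in a log line."""
    parts = line.split()
    # Assumed toy log format: "<client> <path>". Malformed lines emit nothing.
    if len(parts) == 2:
        yield parts[1], 1


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce over an iterable of log lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real cluster the map and reduce functions run in parallel on many nodes over data held in the distributed file system. The sketch only shows the shape of the computation.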

Page 20

Oracle In-Database Analytics


Statistical Analysis

Data Mining

Text

Graph

Spatial

Semantic

Page 21

Oracle R Enterprise; Oracle Advanced Analytics

In-database; Large data sets; Same code

Page 22

Oracle Big Data Platform

[Diagram labels: ACQUIRE; ORGANIZE; DECIDE; ANALYZE; Big Data Appliance; Exadata; Exalytics]

Page 23

Oracle Big Data Appliance

• Optimized and Complete

• Integrated with Oracle Exadata

• Easy to Deploy

• Single Vendor Support

Page 24

Available For Replay

To register, go to: www.oracle.com/bigdata

Page 25


Questions?

Page 26

Contacting Speakers

• If you have further questions or comments:

Philip Russom

[email protected]

Peter Jeffcock

[email protected]