Page 1:

Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |

Oracle Data Integration with Hadoop

Jeff Pollock Vice President, Oracle Data Integration Product Management and Strategy Madhu Raviendran Nair Marketing Director, Oracle Data Integration

Introducing the Big Data Reservoir

Page 2:

Big Data: Too much of a good thing

Oracle Confidential – Internal/Restricted/Highly Restricted


Page 3:

Datafication is leading to Data Explosion

Smart Device Growth: 1.3 Billion today, 12.5 Billion in 2020

Data Production Increase: 22× (2011-2016)

Page 4:

The Big Data Paralysis

Produce Data / Use Data

12%: Executives who feel they understand the impact data will have on their organizations

Page 5:

Big Data Reservoir To Drive Results

• Create a Data Reservoir
• Get Fast Answers to New Questions
• Predict More, More Accurately
• Accelerate Data-Driven Action

Page 6:

Why the word “Reservoir”?


https://blogs.oracle.com/bigdata/entry/big_data_and_analytic_top

Page 7:

Big Data, Data Integration and Data Reservoir


Page 8:

Business Value of a Reservoir Architecture

COST: Lower TCO for the Data Warehouse
• Control the costs of the Data Warehouse
• Massive value multipliers for Teradata and Netezza customers
• Put an end to the annual upgrade cycle

SPEED: Faster Line-of-Business (LoB) Access to Analytic Data
• Give analytics to the business earlier in the data lifecycle
• Empower IT to focus data modeling and report design on the highest-value analytics
• Run BI queries faster

VALUE: New Types of Analytics for All Data
• Support Exploratory Analytics directly from the Hadoop cluster
• Run Streaming Analytics on big data (Storm, Flume, etc.)
• Drive new business solutions (telematics data, machine data, log data, unstructured data)

Page 9:

The Hadoop Opportunity for Big Data Reservoir

[Diagram labels: Data Flow; DW]

Data Discovery: Support for exploratory analytics without time-consuming modeling
• Data is staged and merged in Hadoop to provide a single place to explore and discover data

Data Preparation: Lower-cost data staging and data preparation
• External data staging and long-running batch jobs run in Hadoop to make the most of the DB

Deep Data Storage: Lower-cost storage for questionable business data
• Store more raw detail data for less cost while keeping aggregates in the DB

Page 10:

Oracle Differentiation in Data Integration


• Native Capture: Deeply integrated capture from the #1 database with ~50% market share; OGG will be the preferred choice

• Hadoop Agnostic: Generate transformation code into popular Hadoop frameworks/languages using KMs; other ETL vendors must recompile their engine

• Real-time Delivery: Dominant market share for OGG and battle-hardened robustness

• E-LT Engine: Dominant market share for ODI; large-scale E-LT use cases on the DW translate into battle-hardened robustness for Hadoop E-LT

Differentiated Oracle Data Integration Features

Differentiated Data Integration “Know How” and Core Capabilities
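The "Hadoop Agnostic" point above (generating transformation code from templates instead of recompiling an engine) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the template text, the `generate` function, and the mapping fields are hypothetical and are not actual ODI Knowledge Module syntax.

```python
from string import Template

# Hypothetical templates, one per target language.
# These are NOT real ODI Knowledge Modules; they only illustrate the idea
# that one logical mapping can be rendered into several Hadoop languages.
TEMPLATES = {
    "hiveql": Template(
        "INSERT INTO TABLE $target\n"
        "SELECT $columns\n"
        "FROM $source\n"
        "WHERE $predicate"
    ),
    "pig": Template(
        "src = LOAD '$source' USING org.apache.hive.hcatalog.pig.HCatLoader();\n"
        "flt = FILTER src BY $predicate;\n"
        "out = FOREACH flt GENERATE $columns;\n"
        "STORE out INTO '$target' USING org.apache.hive.hcatalog.pig.HCatStorer();"
    ),
}


def generate(mapping, language):
    """Render one logical mapping into the requested target language."""
    return TEMPLATES[language].substitute(
        source=mapping["source"],
        target=mapping["target"],
        columns=", ".join(mapping["columns"]),
        predicate=mapping["predicate"],
    )


# Hypothetical mapping: table and column names are made up for this sketch.
mapping = {
    "source": "stg_orders",
    "target": "dw_orders",
    "columns": ["order_id", "amount"],
    "predicate": "amount > 0",
}

if __name__ == "__main__":
    print(generate(mapping, "hiveql"))
    print(generate(mapping, "pig"))
```

Supporting a new framework in this sketch means adding a template entry, not changing the `generate` function, which is the design point the slide attributes to template-driven generation.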

Page 11:

Big Data Reservoir for EDW: Continuous Data Delivery and Pushdown ELT Transformations

[Diagram labels: Sources; Data Replication; Data Synchronization; Data Reservoir; Staging; Detail; Hadoop Data Transformation (Pig, HiveQL); Fast load]

Page 12:

Oracle for Big Data Reservoir: Continuous Data Delivery and Pushdown ELT Transformations

[Diagram labels: Sources; Data Replication; Data Synchronization; Data Reservoir; Staging; Detail; Hadoop Data Transformation (Pig, HiveQL); Fast load; Oracle GoldenGate; Oracle Data Integrator]

Page 13:

Reference Architecture – Logical View


Virtualisation & Query Federation

Enterprise Performance Management

Pre-built & Ad-hoc BI Assets

Information Services

Data Ingestion

Information Interpretation

Access & Performance Layer: Past, current and future interpretation of enterprise data. Structured to support agile access & navigation.

Foundation Data Layer: Immutable modelled data. Business Process Neutral form. Abstracted from business process changes.

Raw Data Reservoir: Immutable raw data reservoir. Raw data at rest is not interpreted.

Discovery Lab Sandboxes: Project based data stores to support specific discovery objectives.

Rapid Development Sandboxes: Project based data stored to facilitate rapid content / presentation delivery.

Data Science

Data Sources
• Data Engines & Poly-structured sources
• Content: Docs, Web & Social Media, SMS
• Structured Data Sources: Operational Data, COTS Data, Streaming & BAM
• Master & Reference Data Sources

Page 14:

Oracle Data Integration Can Help Right Now

[Diagram labels: Any Sources; Oracle GoldenGate; Staging; Prod; Detail; MR; Oracle Data Integrator; Fast Load; Transformation]

#1 – Tools not Spaghetti
• “ETL 101”: avoid complex, costly custom coding

#2 – Non-invasive Capture and Staging
• Move data without inefficient batch extracts

#3 – Processing is Taken to the Data
• No separate ETL engine needed
• Eliminate unnecessary data movement
• Reclaim latency and time from network overhead

#4 – Native Hadoop Execution
• Choose the right Hadoop language for your use case
• HiveQL, Pig, Spark, Storm, Java/MR2, etc.
• Template-driven code generation keeps pace with change on the Hadoop platform

#5 – Native SQL Pushdown
• Optimize some join types within the Data Warehouse

#6 – Oracle Optimized
• OGG and ODI certified to run on the Oracle Appliances
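Points #3 and #5 describe the E-LT pattern: load raw data first, then run the transformation inside the target engine instead of in a separate ETL server. A minimal sketch follows, assuming an in-memory SQLite database as a stand-in for the data warehouse or Hadoop SQL engine. The table names and the `load_then_transform` function are hypothetical and chosen only for illustration.

```python
import sqlite3


def load_then_transform(rows):
    """Load raw rows into staging, then aggregate inside the engine (E-LT)."""
    # SQLite stands in for the target engine; this is an assumption of the sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stg_sales (region TEXT, amount REAL)")
    conn.execute("CREATE TABLE agg_sales (region TEXT PRIMARY KEY, total REAL)")

    # Extract + Load: raw detail rows go into the staging table unchanged.
    conn.executemany("INSERT INTO stg_sales VALUES (?, ?)", rows)

    # Transform: a single set-based statement executed by the engine itself,
    # so detail rows never leave the database for a separate ETL process.
    conn.execute(
        "INSERT INTO agg_sales (region, total) "
        "SELECT region, SUM(amount) FROM stg_sales GROUP BY region"
    )

    result = dict(conn.execute("SELECT region, total FROM agg_sales"))
    conn.close()
    return result


if __name__ == "__main__":
    print(load_then_transform([("east", 10.0), ("west", 5.0), ("east", 2.5)]))
```

The only data that crosses the client boundary after loading is the small aggregate result, which is the data-movement saving that point #3 refers to.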

Page 15:

Dunnhumby

• No more sampling
• From 2 weeks to 2 minutes
• Complex custom analysis

Page 16:

Improving Banking Service Quality

• Customer 360 Vision
• CEO driven initiative
• Adaptability and time to market

Page 17:

Cable TV

• Capture true user preference
• Model behavior
• Refine marketing

Page 18:

Myth Busters: ETL Workload Offloading versus ETL Technology

Dominant Perception:
1. Hadoop will replace the Data Warehouse
2. Hadoop is mainly for Unstructured Data
3. Hadoop is a Data Integration solution

Reality:
1. Hadoop is a supplement to the Data Warehouse
2. Hadoop is for both Structured and Unstructured Data
3. Hadoop is not a Data Integration Solution

ETL workloads are a critical Hadoop use case!

Page 19:

Bridging Big Data and Enterprise Data

Oracle Big Data Platform

Data Warehouse + Data Reservoir

[Diagram labels: Oracle Big Data Connectors; Oracle Data Integrator; Oracle GoldenGate; Oracle Event Processing; Oracle Advanced Analytics; Oracle Database; Oracle Spatial & Graph; Oracle NoSQL Database; Cloudera Hadoop; Oracle R Distribution; Oracle Industry Models]

Page 20:

Data Flow View – Data Factory and Discovery Lab

Embedding Big Data in Corporate DNA

[Diagram labels: Data Streams; Structured Enterprise Data; Other Data; Events & Data; Event Engine; Data Reservoir; Data Factory; Enterprise Information Store; Reporting; Discovery Lab; Discovery Output; Actionable Events; Actionable Information; Actionable Insights; Execution; Innovation]

Page 21:

Oracle for Data Integration with Hadoop

RISK: Proven Technology
• Unlike custom coding, a tools-based approach is proven to result in lower-cost long-term operations
• Oracle GoldenGate is the industry standard for Data Replication
• Oracle invented E-LT Pushdown processing and is 10x more widely deployed than competitors

SCALE: Better Architecture
• Oracle GoldenGate provides the most scalable, native integration for database replication
• Oracle Data Integrator provides ultimate scalability and choice for Hadoop data transformations
• Consistent agent-based architecture avoids having multiple, incompatible engines (e.g., INFA and IBM)

COMPLETE: Best for Oracle
• Exadata: OGG and ODI are deeply integrated and are the only Replication and ETL processes certified to run on the appliance
• Big Data Appliance: deeply integrated technology, part of the core reference architecture
• Big Data Connectors: ODI included with core connector technologies for Hadoop

Page 22:

Oracle Simplifies Big Data Integration

Open Comprehensive Big Data Platform

Appliance w/Hadoop Cluster

Analytic Tools

DI Tools and Connectors

Heterogeneous & Best of Breed

Differentiated and powerful DI capabilities for Teradata, Netezza, Microsoft, DB2, Sybase, etc.

Faster Time to Value

Flexible configurations

OOTB performance with DI

Unified Mgmt - EM Plug-ins for Appliance and DI Tools

Single Support Contact – Hardware/Software/Networking and ASR


Page 23:

Safe Harbor Statement

The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

