rdbms_realtime_into_hadoop

Upload: mich-talebzadeh-phd

Post on 15-Apr-2017

Page 1: RDBMS_realtime_into_Hadoop

RDBMS Silos: Real-time transactional data delivery from RDBMS to Hadoop Ecosystem

Mich Talebzadeh

Version 1, April 2015

[Architecture diagram. Labelled components, in the order they appear:]

· Table data
· RSSD DB
· Metastore DB
· Replication Server
· Oracle Agent
· Hadoop Storage Layer: DataNodes 1, 2, 3, 4
· Hive Metastore service
· HiveServer2
· Hadoop Platform Layer
· Replication Server generated batch data files for Hive
· Replicated data

ETL Process

· Identification

· Filtering

· Validation

· Data Pruning

· Transformation
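
The ETL steps listed above can be sketched as a chain of small functions. This is only an illustrative sketch, not the pipeline used in the slide: the slide names the stages but does not define them, so the meaning given to each stage here is an assumption. The record layout (`table`, `op`, `id`, `amount`, `op_time`), the table name `trades`, and the rule of keeping only inserts and updates are all hypothetical.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, str]

# Hypothetical names, for illustration only (not taken from the slide).
TARGET_TABLE = "trades"
KEEP_COLUMNS = ["id", "amount", "op_time"]


def identify(rows: Iterable[Row]) -> Iterator[Row]:
    """Identification: pick out rows that belong to the table being delivered."""
    return (r for r in rows if r.get("table") == TARGET_TABLE)


def filter_rows(rows: Iterable[Row]) -> Iterator[Row]:
    """Filtering: keep only the change types to deliver (assumed: inserts and updates)."""
    return (r for r in rows if r.get("op") in {"insert", "update"})


def validate(rows: Iterable[Row]) -> Iterator[Row]:
    """Validation: skip rows with a missing required column or a non-numeric amount."""
    for r in rows:
        if not all(r.get(c) for c in KEEP_COLUMNS):
            continue
        try:
            float(r["amount"])
        except ValueError:
            continue
        yield r


def prune(rows: Iterable[Row]) -> Iterator[Row]:
    """Data pruning: drop columns the target table does not need."""
    return ({c: r[c] for c in KEEP_COLUMNS} for r in rows)


def transform(rows: Iterable[Row]) -> Iterator[str]:
    """Transformation: render each row as one comma-delimited text line."""
    return (
        ",".join([r["id"], f"{float(r['amount']):.2f}", r["op_time"]])
        for r in rows
    )


def run_etl(rows: Iterable[Row]) -> List[str]:
    """Run the five stages in the order the slide lists them."""
    return list(transform(prune(validate(filter_rows(identify(rows))))))


sample = [
    {"table": "trades", "op": "insert", "id": "1", "amount": "10.5",
     "op_time": "2015-04-01 10:00:00", "note": "x"},
    {"table": "trades", "op": "delete", "id": "2", "amount": "3",
     "op_time": "2015-04-01 10:01:00"},
    {"table": "quotes", "op": "insert", "id": "3", "amount": "7",
     "op_time": "2015-04-01 10:02:00"},
    {"table": "trades", "op": "insert", "id": "4", "amount": "abc",
     "op_time": "2015-04-01 10:03:00"},
]

print(run_etl(sample))  # ['1,10.50,2015-04-01 10:00:00']
```

Each stage is a generator over the previous one, so rows stream through without an intermediate copy. The delimited lines produced at the end stand in for the batch data files that the diagram shows being generated for Hive.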

RDBMS Heap table

Hive Table