rdbms_realtime_into_hadoop
![Page 1: RDBMS_realtime_into_Hadoop](https://reader038.vdocuments.us/reader038/viewer/2022100723/58f1b6ce1a28abc3348b457b/html5/thumbnails/1.jpg)
RDBMS Silos
Real-time transactional data delivery from RDBMS to Hadoop Ecosystem
Mich Talebzadeh
Version 1, April 2015
Architecture diagram components:
· Table data
· RSSD DB
· Metastore DB
· Replication Server
· Oracle Agent
· Hadoop Storage Layer
· DataNodes 1, 2, 3, 4
· Hive Metastore service
· HiveServer2
· Hadoop Platform Layer
· Replication Server generated batch data files for Hive
· Replicated data
ETL Process
· Identification
· Filtration
· Validation
· Data Pruning
· Transformation
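
The five ETL stages listed above can be sketched as a chain of small functions applied to one replicated batch file. This is only an illustrative sketch: the slide names the stages but does not define them, so the batch layout (`op`, `id`, `amount`, `comment`), the table name `sales`, and the rule applied at each stage are all assumptions made for the example.

```python
import csv
import io

# Hypothetical batch file: operation code, key, amount, free-text comment.
SAMPLE_BATCH = """op,id,amount,comment
I,1,10.50,first
U,2,abc,bad amount
D,3,7.00,deleted row
I,4,3.25,second
"""

def identify(rows):
    """Identification: tag each row with its (assumed) source table."""
    for row in rows:
        yield {**row, "source_table": "sales"}

def filter_rows(rows):
    """Filtration: keep inserts and updates only (an assumed rule)."""
    return (r for r in rows if r["op"] in ("I", "U"))

def validate(rows):
    """Validation: drop rows whose amount is not numeric."""
    for row in rows:
        try:
            float(row["amount"])
        except ValueError:
            continue
        yield row

def prune(rows):
    """Data pruning: discard columns the target table does not need."""
    keep = ("id", "amount", "source_table")
    return ({k: r[k] for k in keep} for r in rows)

def transform(rows):
    """Transformation: cast strings to the target column types."""
    for r in rows:
        yield {
            "id": int(r["id"]),
            "amount": float(r["amount"]),
            "source_table": r["source_table"],
        }

def run_etl(text):
    """Run the five stages in the order the slide lists them."""
    rows = csv.DictReader(io.StringIO(text))
    return list(transform(prune(validate(filter_rows(identify(rows))))))

if __name__ == "__main__":
    for record in run_etl(SAMPLE_BATCH):
        print(record)
```

Each stage is a generator, so rows stream through the chain one at a time rather than being held in memory per stage, which suits batch files of arbitrary size.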
RDBMS Heap table
Hive Table