Post on 09-Sep-2018

Title: Hadoop Big Data Automation Infographic - ActiveBatch IT Automation
Author: ActiveBatch Product Marketing
Subject: Apache Hadoop

Automating Hadoop Without ActiveBatch

[Diagram: the manual workflow spans four systems, System 1 through System 4.]

Data is extracted from an Oracle database

A Perl script is run that generates a .diff file

The .diff file is transferred to another server via SFTP

The data is merged into a dataset

The data is copied to HDFS

A Hive script generates a summary file

Data is then imported into SQL Server
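
The seven manual steps above could be scripted roughly as follows. This is a minimal sketch, not the infographic's own method: every file name, host, and helper script (export.sql, make_diff.pl, upload.batch, merge_dataset.sh, and so on) is a hypothetical placeholder, credentials are omitted, and the sketch pretends all tools are reachable from one host, whereas the source describes four separate systems.

```python
import shlex
import subprocess


def build_manual_pipeline(
    export_sql="export.sql",          # hypothetical SQL*Plus script (must CONNECT itself)
    diff_script="make_diff.pl",       # hypothetical Perl script that writes the .diff file
    sftp_batch="upload.batch",        # hypothetical sftp batch file containing "put" commands
    sftp_target="etl@system2",        # hypothetical user@host
    merge_script="merge_dataset.sh",  # hypothetical merge step
    merged_file="merged.dat",         # hypothetical merged dataset
    hdfs_dir="/landing",              # hypothetical HDFS landing directory
    hive_script="summary.hql",        # hypothetical Hive script
    jdbc_url="jdbc:sqlserver://system4:1433;databaseName=reports",  # hypothetical
    table="summary",                  # hypothetical SQL Server table
    export_dir="/landing/summary",    # hypothetical HDFS directory holding the summary
):
    """Return the ordered command lines for the seven manual steps."""
    return [
        # 1. Extract data from the Oracle database.
        ["sqlplus", "-S", "/nolog", f"@{export_sql}"],
        # 2. Run the Perl script that generates the .diff file.
        ["perl", diff_script],
        # 3. Transfer the .diff file to another server via SFTP.
        ["sftp", "-b", sftp_batch, sftp_target],
        # 4. Merge the data into a dataset.
        ["sh", merge_script],
        # 5. Copy the data to HDFS.
        ["hdfs", "dfs", "-put", "-f", merged_file, hdfs_dir],
        # 6. Run the Hive script that generates the summary file.
        ["hive", "-f", hive_script],
        # 7. Export the summary into SQL Server.
        ["sqoop", "export", "--connect", jdbc_url,
         "--table", table, "--export-dir", export_dir],
    ]


def run_pipeline(commands, runner=subprocess.run):
    """Run each command in order; with the default runner, stop at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in build_manual_pipeline():
        print(shlex.join(cmd))
```

Passing a different `runner` keeps the step order testable without touching any real system. Even in this simplified form, each step has its own tool and its own switches to remember.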

Requires scripting to make the database connection and query.

Requires ensuring that the file was properly generated.

Requires coordinating among 3 systems, and authentication is still required.

Requires elevated credentials to move data, and resources must be coordinated between systems.
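
As an illustration of the second requirement above (ensuring the file was properly generated), a hand-written check might look like the sketch below. The function name and the `min_bytes` threshold are assumptions made for this example; an empty .diff file may be perfectly valid when nothing changed between exports, in which case `min_bytes=0` would be the appropriate setting.

```python
from pathlib import Path


def diff_file_is_usable(path, min_bytes=1):
    """Return True if the file exists, is a regular file, and has at least min_bytes bytes."""
    p = Path(path)
    return p.is_file() and p.stat().st_size >= min_bytes
```

A script would call this before the SFTP step and abort the run if it returns False.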

Automating Hadoop With ActiveBatch

[Diagram: the same four systems, System 1 through System 4, with the job steps listed below.]

Oracle_Connect

ExportRowsFromTableSpace

Perl - Diff - This Export vs Last Export

UploadFiles_SFTP

HDFS_Login

Copy_TargetFilestoHDFSLanding

ForEach_HiveStatements_Merge

PigScript

Sqoop_ExportToSQL

ActiveBatch allows users to:
- manage credentials across various systems in a secure and encrypted manner
- quickly know which systems are up or down
- reduce the amount of scripting needed
- easily coordinate workflows across various systems
- remove the need to memorize or research command-line utility switches
- create event-based triggers across various systems