Automating Hadoop Without ActiveBatch
[Workflow diagram spanning four systems: System 1 through System 4]
Data is extracted from an Oracle database
A Perl script is run that generates a .diff file
The .diff file is transferred to another server via SFTP
The data is merged into a dataset
The data is copied to HDFS
A Hive script generates a summary file
Data is then imported into SQL Server
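Each of the steps above is typically a hand-written script. As an illustration only, here is a minimal shell sketch of the export-and-diff stage; the sample data, directory, and host name are hypothetical, and the Oracle and SFTP calls are shown as comments rather than executed:

```shell
#!/bin/sh
# Sketch of the manual export -> diff -> transfer stage.
# All paths, data, and host names below are hypothetical placeholders.
set -e
EXPORT_DIR=$(mktemp -d)

# 1. Extract rows from Oracle. In the real job a script would run, e.g.:
#      sqlplus -s user/pass@db @export_rows.sql > this_export.csv
#    Here two fake exports stand in for illustration.
printf 'id,qty\n1,10\n2,20\n' > "$EXPORT_DIR/last_export.csv"
printf 'id,qty\n1,10\n2,25\n' > "$EXPORT_DIR/this_export.csv"

# 2. Generate a .diff of this export against the last one
#    (the role the Perl script plays in the workflow above).
diff "$EXPORT_DIR/last_export.csv" "$EXPORT_DIR/this_export.csv" \
    > "$EXPORT_DIR/export.diff" || true   # diff exits 1 when files differ

# 3. Transfer the .diff to the Hadoop edge node via SFTP (not run here):
#      echo "put $EXPORT_DIR/export.diff" | sftp hadoop@edge-node
```

Every failure mode here (an empty export, a dropped SFTP session, stale credentials) has to be detected and handled by hand in each script, which is exactly the coordination burden described below.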
Requires scripting to make the database connection and run the query.
Requires verifying that the file was properly generated.
Requires coordinating between three systems, each with its own authentication.
Requires highly privileged credentials to move data, and resources must be coordinated across systems.
Automating Hadoop With ActiveBatch
[Workflow diagram: the same four systems, now driven by ActiveBatch job steps]
Oracle_Connect
ExportRowsFromTableSpace
Perl - Diff - This Export vs Last Export
UploadFiles_SFTP
HDFS_Login
Copy_TargetFilestoHDFSLanding
ForEach_HiveStatements_Merge
PigScript
Sqoop_ExportToSQL
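The job steps above wrap standard Hadoop-side tooling. As a rough, hypothetical sketch of the commands a few of those steps correspond to (the cluster paths, table names, and SQL Server host are invented, and the script only prints a notice when no Hadoop client is installed):

```shell
#!/bin/sh
# Hypothetical Hadoop-side equivalents of a few ActiveBatch job steps.
set -e
if ! command -v hdfs >/dev/null 2>&1; then
    STATUS=skipped
    echo "No Hadoop CLI found; commands are shown for illustration only."
else
    STATUS=ran
    # Copy_TargetFilestoHDFSLanding: stage the diff file into HDFS
    hdfs dfs -mkdir -p /landing/oracle
    hdfs dfs -put -f export.diff /landing/oracle/

    # ForEach_HiveStatements_Merge: load the new rows for the merge
    hive -e "LOAD DATA INPATH '/landing/oracle/export.diff' INTO TABLE staging;"

    # Sqoop_ExportToSQL: push the summarized data into SQL Server
    sqoop export \
        --connect 'jdbc:sqlserver://sqlhost:1433;databaseName=reporting' \
        --table summary --export-dir /warehouse/summary
fi
```

ActiveBatch's value as presented here is not replacing these commands, but supplying the credentials, sequencing, and error handling around them.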
ActiveBatch allows users to:
- manage credentials across various systems in a secure and encrypted manner
- quickly know which systems are up or down
- reduce the amount of scripting needed
- easily coordinate workflows across various systems
- remove the need to memorize or research command-line utility switches
- create event-based triggers across various systems