How to Automate Offloading ETL Processes to Hadoop

Post on 11-Apr-2017

Category: Technology

TRANSCRIPT

  • Confidential

    OPERATIONAL EXCELLENCE FOR BIG DATA APPS

  • TRUSTED by over 10,000 companies as their big data app platform

    BACKED by top Silicon Valley investors True Ventures, Rembrandt VP, and Bain Capital

    FOUNDED in 2008, with headquarters in San Francisco

  • PERFORMANCE MANAGEMENT FOR BIG DATA APPLICATIONS

    MONITOR your big data apps

    COLLABORATE to resolve issues faster

    MANAGE big data apps more effectively

  • WE ARE THE DEVELOPERS BEHIND CASCADING

    Java, Scala (Scalding), SQL

    SIMPLE: Ensure best practices at any scale thanks to easy-to-learn design principles

    FLEXIBLE: Leverage existing Java, Scala, and SQL skills and easily adapt to new systems

    RELIABLE: Always get optimal performance and reliability for big data applications

  • MIGRATING TO HADOOP FOR ETL AT ENTERPRISE SCALE

    Use Hadoop for ETL / ELT: Cascading

    Ensure quality and manageability of our ETL / ELT applications: Driven

    Translate existing ETL work to Hadoop: ?

    GUI ETL tool for developers that don't know Java, Scala, SQL: ?

  • TODAY'S SPEAKERS

    Shahab Kamal, Vice President at BitWise Inc.

    Shahab is responsible for strategy, growth and client relations. He works with client executives on IT strategy for Business Intelligence, Big Data, Data Warehousing and Enterprise Applications. Shahab has worked at Ford Motors, Aon Hewitt and Tribune Company on their PeopleSoft ERP implementation and support. His expertise is in retrofitting data from legacy applications without loss of data integrity.

    Mark Castillo, Driven, Inc.

    Mark is a Solutions Architect with 15+ years of software engineering background. He has worked in the finance, security, healthcare, streaming music, marketing, and social networking industries. His technical knowledge and skills are focused on distributed systems, data processing, networking, Linux appliances and Big Data.

  • Data Migration: Seamless Transition to Hadoop. Shahab Kamal & Mark Castillo

  • About Bitwise

    Founded in 1996 with HQ in Chicago, IL

    Located in offices in India & Australia

    ISO 9001:2008 & ISO 27001:2005 Certified

    Backed by Fortune 500 customers

    Proprietary Technology: suite of Accelerators that reduce the expense, time and complexity of large-scale data projects.

  • [Diagram labels: Data Lake (Stage, Transform, Archive); Data Mart (Reporting, Data Mining); Analytics (Reporting, Mining, Analytics, Exploratory Discovery, Search)]

  • Bitwise Migration Solution Approach

    Phases: Inventory, Deep Dive, Migration Design, Migration, Validation

    1. Assessment Automation: ~60% effort saving

    2. Migration Automation: ~70% effort saving

    3. Test Automation: ~30% effort saving
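The Validation phase and test automation named on this slide usually come down to confirming that the migrated data matches the source. What follows is only a minimal sketch of that idea, assuming both the legacy source and the Hadoop target can be read as rows of plain values. The function names `table_fingerprint` and `validate_migration` are hypothetical and are not part of any Bitwise tool described here.

```python
import hashlib
from typing import Dict, Iterable, Tuple


def table_fingerprint(rows: Iterable[Tuple]) -> Tuple[int, str]:
    """Return (row_count, order-independent checksum) for a set of rows."""
    count = 0
    combined = 0
    for row in rows:
        # repr() keeps None, "" and "None" distinct; the unit separator keeps
        # ("ab", "c") from hashing the same as ("a", "bc").
        text = "\x1f".join(repr(value) for value in row)
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Adding digests modulo 2**256 ignores row order but still reflects
        # duplicated or missing rows.
        combined = (combined + int.from_bytes(digest, "big")) % (1 << 256)
        count += 1
    return count, format(combined, "064x")


def validate_migration(source_rows: Iterable[Tuple],
                       target_rows: Iterable[Tuple]) -> Dict[str, object]:
    """Compare source and target by row count and checksum."""
    src_count, src_sum = table_fingerprint(source_rows)
    tgt_count, tgt_sum = table_fingerprint(target_rows)
    return {
        "row_count_match": src_count == tgt_count,
        "checksum_match": src_sum == tgt_sum,
        "source_rows": src_count,
        "target_rows": tgt_count,
    }


if __name__ == "__main__":
    # Illustrative data only: same rows, different order.
    legacy = [(1, "alice", 10.5), (2, "bob", None)]
    hadoop = [(2, "bob", None), (1, "alice", 10.5)]
    print(validate_migration(legacy, hadoop))
```

Because the comparison uses `repr`, a value that changes type during migration (for example `1` becoming `1.0`) is reported as a mismatch. Whether that is desirable depends on the migration rules, so a real check would normalize types first.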

  • Big Data Processing Platform

    COMPUTATION FABRIC: Local In-Memory, MapReduce & Tez, Other, Custom

    CASCADING: Enterprise Data Application

    BitWise Big Data Processing Platform

    ELT Development, ETL Migration, QualiDI, Data Quality Framework

    Development, Migration Engine, Testing, Checks & Balances

  • Case Study

    [Diagram labels]

    Recovery Application Data Sources, RDBMS, Analytics, Reporting

    Developer UI, XML, Custom Code, Execution Service, Cascading Framework, ETL Application

    Automated ETL Migration, Data Quality Monitoring, ETL Testing

    On Execution: Generate Cascading Flow, Launch MapReduce Jobs

  • Bitwise ELT Tool Architecture

    ETL Migration, QualiDI

    Data Quality Framework

    Developer UI

    XML, Custom Code

    Execution Service

    Cascading Framework

    Development Environment
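The architecture on this slide passes a job authored in the Developer UI, saved as XML plus custom code, to an execution service that builds a flow on the Cascading framework. The sketch below illustrates only that translation step and runs it in local memory. The XML element names (`job`, `filter`, `rename`) and the functions `build_flow` and `run_flow` are hypothetical; the actual tool's XML schema is not shown in the slides, and the real execution service generates Cascading flows in Java and launches MapReduce jobs, which this does not do.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]
Step = Callable[[Iterable[Record]], Iterator[Record]]

# Hypothetical job definition, standing in for what a developer UI might save.
JOB_XML = """
<job name="recovery_accounts">
  <filter field="status" equals="OPEN"/>
  <rename from="acct_id" to="account_id"/>
</job>
"""


def make_filter(field: str, value: str) -> Step:
    """Keep only records whose `field` equals `value`."""
    def step(records: Iterable[Record]) -> Iterator[Record]:
        return (r for r in records if r.get(field) == value)
    return step


def make_rename(old: str, new: str) -> Step:
    """Rename one field, leaving all other fields untouched."""
    def step(records: Iterable[Record]) -> Iterator[Record]:
        for r in records:
            yield {(new if key == old else key): val for key, val in r.items()}
    return step


def build_flow(xml_text: str) -> List[Step]:
    """Translate each XML element into one pipeline step, in document order."""
    steps: List[Step] = []
    for node in ET.fromstring(xml_text.strip()):
        if node.tag == "filter":
            steps.append(make_filter(node.attrib["field"], node.attrib["equals"]))
        elif node.tag == "rename":
            steps.append(make_rename(node.attrib["from"], node.attrib["to"]))
        else:
            raise ValueError(f"unsupported step: {node.tag}")
    return steps


def run_flow(steps: List[Step], records: Iterable[Record]) -> List[Record]:
    """Chain the steps lazily and materialize the final result."""
    for step in steps:
        records = step(records)
    return list(records)


if __name__ == "__main__":
    rows = [
        {"acct_id": "1", "status": "OPEN"},
        {"acct_id": "2", "status": "CLOSED"},
    ]
    print(run_flow(build_flow(JOB_XML), rows))
```

Keeping the job definition declarative is what lets a single definition target different computation fabrics: only the step builders would need to change, not the XML.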

  • Key Features

    EASY: Increases ETL developer productivity on Hadoop by up to 50%

    EFFECTIVE: Ports majority of existing ETL processes to Hadoop with little to no changes

    ECONOMICAL: Optimizes ETL performance by choosing the right computation fabric

    OPERATIONAL VISIBILITY: Views ETL processes in real time for service level management

  • Benefits of Bitwise Migration Solution

    SAVES TIME: Up to 60% reduction during Assessment Phase with Dark Data Discovery Framework

    ECONOMICAL: Up to 70% touch-free migration

    INCREASES PRODUCTIVITY: Up to 40% increase in developer productivity

    QUICKER VALIDATION: Up to 30% effort savings in data validation

    SAVES EFFORT: Up to 75%-90% effort saved for test compliance reports

  • Axes UI

  • Accelero Demo & UI

  • Concurrent Cascading and Driven

    COMPUTATION FABRIC: Local In-Memory, MapReduce & Tez, Other, Custom

    CASCADING: Enterprise Data Application

  • Bitwise helped a large Fortune 500 company save millions of dollars and an estimated 30-50% of the time in ETL development through utilization of the Bitwise proprietary ETL migration accelerator, offloading from a costly legacy platform to Hadoop. It began when the client expressed interest in moving to Hadoop / Big Data by migrating their existing Recovery Ab Initio ETLs. Bitwise came up with a phased approach to prove, validate and convert the existing ETLs.

    Taking the partnership further, Bitwise proposed, as a next step, adding a GUI to the ELT tool to act as a developer IDE based on Eclipse.

    Stage 1: Proofing the Technology Stack

    Stage 2: Validation of the Bitwise Hadoop ELT Stack

    Stage 3: ETL Migration using Accelero Conversion Engine

    Stage 4: Partnership in Accelero Development Engine

    Stage 5: Partnership in Accelero GUI Development

  • Bitwise has been working with a Fortune 500 company to move data from a data lake to Hadoop and identify risks that need to be addressed. The primary focus is on developing templates and a framework for data ingestion and testing after the data is transferred, and on building reports on the offloaded data.

    Previously, Bitwise helped the client with Data Integration migration through utilization of the Bitwise proprietary Data Integration migration accelerator, Accelero, offloading from a costly legacy platform to Hadoop and saving 30-50% of the time in ETL migration.

    Stage 1: Data Ingestion into Hadoop Proof of Concept

    Stage 2: Taking the entire Proof of Concept ahead, with the data lake moving to Hive

    Stage 3: Build optimized reports running off the offloaded data on Hadoop

    Stage 4: Conversion of proprietary ETL to Accelero ELT using Cascading and Driven

  • Thank You

  • ADDITIONAL RESOURCES

    Bitwise website: http://www.bitwiseglobal.com/

    Driven website: http://www.driven.io/

    Speakers contact information:

    - Bob Taylor: bobt@driven.io

    - Shahab Kamal: Shahab.Kamal@bitwiseglobal.com

    - Mark Castillo: mark@drive.io