Hadoop Framework Demo

Upload: sapapoonlinetraining

Post on 02-Jun-2018


  • 8/10/2019 Hadoop Framework Demo

    Hadoop Framework

    by Praveen K


    Agenda

    What is Hadoop?

    Origin of Hadoop

    Why Hadoop?

    Users of Hadoop

    Hadoop Features

    Hadoop Architecture

    Basic Hadoop Work Flow.

    Hadoop Components.

    Q & A


    What is Hadoop?

    Hadoop is a software framework for distributed storing and processing of large datasets across large clusters of computers.

    Large datasets: terabytes or petabytes of data

    Large clusters: hundreds or thousands of nodes

    Hadoop is based on a simple programming model called MapReduce.

    Hadoop is based on a simple data model: any data will fit.
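The MapReduce programming model can be illustrated with a word count. The following is a minimal single-process sketch in plain Python, not Hadoop's own API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and a real cluster would run the map and reduce steps in parallel on many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Run the three phases one after another (a cluster would parallelize them).
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["Hadoop stores data", "Hadoop processes data"])
print(counts)
```

The programmer writes only the map and reduce functions; grouping by key is the part a framework like Hadoop takes care of.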


    [Figure: Hadoop cluster diagram. HDFS layer: Name Node, Data Nodes. MapReduce layer: Job Tracker, Task Trackers.]


    What is Hadoop? (cont.)

    The Hadoop framework consists of two main layers:

    Distributed file system (HDFS)

    Execution engine (MapReduce)


    Origin of Hadoop

    Google       Hadoop
    GFS          HDFS
    MapReduce    MapReduce
    BigTable     HBase

    Hadoop was created by Doug Cutting and Mike Cafarella in 2…


    Why Hadoop?

    Reading 1 TB of Data

    Traditional    Hadoop

    1 machine

    4 hard disks


    Terabyte Sort Benchmark (July 2008)

    One of Yahoo's Hadoop clusters sorted 1 terabyte of data in 209 seconds, which beat the previous record of 297 seconds in the annual general purpose (Daytona) terabyte sort benchmark. The sort benchmark specifies the input data (10 billion 1…
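To put the July 2008 result in perspective, sorting 1 terabyte in 209 seconds versus the earlier 297 seconds can be turned into a throughput and a speedup figure. This is only back-of-the-envelope arithmetic, assuming a decimal terabyte (10^12 bytes); the variable names are illustrative.

```python
TERABYTE_BYTES = 10**12   # assumption: decimal terabyte
hadoop_seconds = 209      # Hadoop run quoted on the slide
previous_seconds = 297    # previous record quoted on the slide

# Aggregate sort throughput of the whole cluster, in gigabytes per second.
throughput_gb_per_s = TERABYTE_BYTES / hadoop_seconds / 10**9

# How many times faster the Hadoop run was than the previous record.
speedup = previous_seconds / hadoop_seconds

print(f"{throughput_gb_per_s:.2f} GB/s, {speedup:.2f}x faster")
```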


    Mins Terabyte Sort Benchmark (2013)

    One of Yahoo's Hadoop clusters sorted 1.42 terabytes of data in 60 seconds; there is no previous record for this.


    Advantages

    1. Volume, Variety, Velocity

    2. Hardware Commoditization

    3. Cloud Computing


    Users


    Features of Hadoop

    Here's what makes it especially useful:

    Scalable: It can reliably store and process petabytes.

    Economical: It distributes the data and processing across clusters of commonly available computers (in thousands).

    Efficient: By distributing the data, it can process it in parallel on the nodes where the data is located.

    Reliable: It automatically maintains multiple copies of data and automatically redeploys computing tasks based on failures.
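The reliability point (multiple copies of each piece of data) can be sketched with a toy block-placement model. This is an illustration only: the round-robin placement and the helper names `place_blocks` and `surviving_replicas` are assumptions made for the example and do not reproduce HDFS's actual placement policy. The replication factor of 3 matches the HDFS default.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin (toy policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for block in range(num_blocks):
        placement[block] = [
            nodes[(block + offset) % len(nodes)] for offset in range(replication)
        ]
    return placement


def surviving_replicas(placement, failed_node):
    """Return the replicas of each block that remain after one node fails."""
    return {
        block: [node for node in replicas if node != failed_node]
        for block, replicas in placement.items()
    }


nodes = ["node1", "node2", "node3", "node4"]
placement = place_blocks(num_blocks=6, nodes=nodes, replication=3)
after_failure = surviving_replicas(placement, failed_node="node2")

# Every block is still readable because at least one replica survives.
all_readable = all(len(replicas) >= 1 for replicas in after_failure.values())
print(all_readable)
```

With three copies on distinct nodes, losing any single node still leaves at least two copies of every block, which is what lets the framework rerun failed tasks elsewhere.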


    Hadoop Architecture

    Master                 Slave
    Name Node              Data Node
    Job Tracker            Task Tracker
    Secondary Name Node
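The master/slave split on the storage side can be sketched as follows: the master keeps only metadata about where blocks live, while the slaves hold the block contents. This is a toy in-memory model under stated assumptions, not HDFS code; the classes, the `allocate`/`locate` methods, the 8-character block size, and the round-robin assignment are all hypothetical.

```python
class DataNode:
    """Slave: stores the actual block contents."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        return self.blocks[block_id]


class NameNode:
    """Master: keeps only metadata (which blocks form a file and where they live)."""

    def __init__(self, data_nodes):
        self.data_nodes = data_nodes
        self.file_blocks = {}  # file name -> list of (block_id, DataNode)

    def allocate(self, filename, num_blocks):
        # Toy policy: hand out data nodes round-robin.
        entries = [
            (f"{filename}#{index}", self.data_nodes[index % len(self.data_nodes)])
            for index in range(num_blocks)
        ]
        self.file_blocks[filename] = entries
        return entries

    def locate(self, filename):
        return self.file_blocks[filename]


def write_file(name_node, filename, content, block_size=8):
    """Client: ask the master where to put blocks, then send data to the slaves."""
    chunks = [content[i:i + block_size] for i in range(0, len(content), block_size)]
    for (block_id, node), chunk in zip(name_node.allocate(filename, len(chunks)), chunks):
        node.store(block_id, chunk)


def read_file(name_node, filename):
    """Client: look up block locations on the master, then read from the slaves."""
    return "".join(node.read(block_id) for block_id, node in name_node.locate(filename))


nodes = [DataNode("dn1"), DataNode("dn2"), DataNode("dn3")]
master = NameNode(nodes)
write_file(master, "log.txt", "hadoop stores files as blocks")
restored = read_file(master, "log.txt")
print(restored)
```

The same division of labor applies on the processing side, where the Job Tracker schedules work and the Task Trackers execute it.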
