big data hadoop training by easylearning guru


Upload: easylearning

Post on 14-Jul-2015


Welcome to the World of Big Data & Hadoop

www.easylearning.guru

Agenda

What is Big Data?

Different Kinds of Big Data

Big Data Global Market

Hadoop Global Job Trends

What is Hadoop?

What is Big Data?

Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.


Types of Big Data

A traditional RDBMS deals only with structured data.

Big data needs a technology that handles semi-structured and unstructured data as well as structured data.

[Figure: types of data (surviving label: Semi-Structured Data)]
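To make the three kinds of data concrete, here is a minimal Python sketch. The sample records, names, and the helper `field_sets` are invented for illustration and do not come from the slides.

```python
import csv
import io
import json

# Structured: fixed columns, as in an RDBMS table (sample rows are made up).
structured = io.StringIO("id,name,city\n1,Asha,Delhi\n2,Ravi,Pune\n")
rows = list(csv.DictReader(structured))

# Semi-structured: self-describing JSON where fields can vary per record.
semi_structured = [
    json.loads('{"id": 1, "name": "Asha", "tags": ["admin"]}'),
    json.loads('{"id": 2, "name": "Ravi", "address": {"city": "Pune"}}'),
]

# Unstructured: free text with no schema at all.
unstructured = "Asha wrote: the cluster was slow again this morning."


def field_sets(records):
    """Return the set of field names used by each record (hypothetical helper)."""
    return [set(r.keys()) for r in records]


print(field_sets(rows))             # every row shares the same fields
print(field_sets(semi_structured))  # fields differ between records
print(len(unstructured.split()))    # only generic operations, such as word counts, apply
```

The structured rows all share one schema, the JSON records carry their own and may differ, and the free text has none, which is why it cannot be queried by column.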


The 3 Vs of Big Data: Volume, Velocity, and Variety

Sources of Data

Social Media & Networks

(All of us are generating data)

Mobile Devices

(Tracking all the objects all the time)

Sensor Technology & Networks

(Measuring all kinds of data)

Scientific Instruments

(Collecting all sorts of data)


Where is Big Data Used?

Facebook Scenario

On average, Facebook generates about 70,000 MB of data per minute.

1 hour = 70,000 MB × 60 = 4.2 million MB

1 day = 4.2 million MB × 24 = 100.8 million MB ≈ 98,438 GB

1 week = 98,438 GB × 7 ≈ 689,000 GB ≈ 673 TB

4 weeks = 673 TB × 4 = 2,692 TB ≈ 2.63 PB

52 weeks = 2.63 PB × 13 ≈ 34.2 PB

(Conversions use 1 GB = 1,024 MB, and likewise for TB and PB.)

And that's a lot of data!
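The arithmetic can be checked with a short script. This is a sketch that takes the 70,000 MB per minute figure quoted on the slide as given and assumes binary units (1 GB = 1,024 MB); the unit convention and the helper name `scale` are assumptions, not part of the original deck.

```python
MB_PER_MINUTE = 70_000  # figure quoted on the slide, not independently verified


def scale(mb, unit):
    """Convert megabytes to GB, TB or PB using binary (1,024-based) steps."""
    steps = {"MB": 0, "GB": 1, "TB": 2, "PB": 3}[unit]
    return mb / (1024 ** steps)


mb_per_hour = MB_PER_MINUTE * 60
mb_per_day = mb_per_hour * 24
mb_per_week = mb_per_day * 7
mb_per_4_weeks = mb_per_week * 4
mb_per_52_weeks = mb_per_week * 52

print(f"1 hour   = {mb_per_hour:,} MB")
print(f"1 day    = {mb_per_day:,} MB = {scale(mb_per_day, 'GB'):,.0f} GB")
print(f"1 week   = {scale(mb_per_week, 'TB'):,.1f} TB")
print(f"4 weeks  = {scale(mb_per_4_weeks, 'PB'):.2f} PB")
print(f"52 weeks = {scale(mb_per_52_weeks, 'PB'):.1f} PB")
```

Note that 52 weeks is 13 four-week periods, so the yearly figure is 13 times the four-week figure, not 52 times.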


Various Big Data Technologies

Big Data Global Market

Sources: Dice, LinkedIn.

[Chart: Big Data Implementation (Implemented Big Data vs. Yet to Implement Big Data)]

[Chart: Big Data Growth (in USD Billions), 2012 to 2017]

Filled/Vacancy (%)

Role                         Filled   Unfilled
Big Data Analyst             50       50
Big Data Architect           43       57
Big Data Engineer            44       56
Big Data Research Analyst    31       69
Big Data Visualizer          23       77
Data Scientist               18       82

Hadoop Global Job Trends

Top Hadoop Technology Companies

Sources : Dice, LinkedIn.

More than 17,000 employees with Hadoop skills across these companies.

Hadoop Global Job Trends

[Chart: Demand for Big Data in Cities, as of February 2014 (shares: 2%, 2%, 3%, 4%, 8%, 8%, 10%, 11%, 14%, 38%)]

[Chart: Salary (USD p.a., in thousands)]

Sources: Dice, LinkedIn.

What is Hadoop?

Hadoop was created by Doug Cutting and Mike Cafarella.

Hadoop provides a reliable shared storage and analysis system.

It is designed to scale from a single server to thousands of machines, with a high degree of fault tolerance.

Hadoop History


Hadoop Core Components

Core Hadoop has two main systems:

• Hadoop Distributed File System (HDFS): a distributed file system that holds large amounts of data across multiple nodes in a cluster.

• MapReduce: a distributed programming paradigm used to analyze the data stored in HDFS.

Hadoop Distributed File System (HDFS)

A given file is broken down into blocks (default = 64 MB in Hadoop 1.x, 128 MB from Hadoop 2.x onward), and each block is then replicated across the cluster (default replication factor = 3).

Optimized for throughput.

HDFS allows you to put/get/delete files.

Follows the philosophy "Write Once and Read Multiple Times".

Block replication provides:

- Durability, high availability, and throughput.
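The effect of block size and replication on storage can be sketched in a few lines of Python. This is a simplified model, not Hadoop code: the function name `hdfs_footprint` is hypothetical, the defaults are the ones quoted on the slide, and the model assumes that a partial last block occupies only the space it actually needs.

```python
import math


def hdfs_footprint(file_size_mb, block_size_mb=64, replication=3):
    """Estimate block count and raw storage for one file (simplified model).

    Returns (blocks, block_replicas, raw_storage_mb).
    """
    blocks = math.ceil(file_size_mb / block_size_mb)  # last block may be partial
    block_replicas = blocks * replication             # copies spread over the cluster
    raw_storage_mb = file_size_mb * replication       # partial block is not padded
    return blocks, block_replicas, raw_storage_mb


blocks, replicas, raw_mb = hdfs_footprint(1000)
print(blocks, replicas, raw_mb)
```

With these defaults, a 1,000 MB file needs 16 blocks, is stored as 48 block replicas, and consumes about 3,000 MB of raw disk across the cluster.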


MapReduce Flow


MapReduce Framework

MapReduce works by breaking the processing into two phases: the Map phase and the Reduce phase.
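The two phases can be illustrated with a word count that runs in a single Python process. This is only a sketch of the programming model, not the Hadoop API: the function names are hypothetical, and the `shuffle` step stands in for the grouping by key that the framework performs between the two phases.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data hadoop", "hadoop stores big data", "hadoop"])
print(counts)
```

On a real cluster, each map call would run on the node holding the relevant block, and the reduce calls would run in parallel on different keys.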


What we offer…


Syllabus

Introduction

a) Big Data

b) Hadoop

Hadoop

a) HDFS

b) MapReduce

Pig

a) Pig 1

b) Pig 2

Hive

a) Hive 1

b) Hive 2

HBase

ZooKeeper

Sqoop

YARN

Project Class

Thank you for watching the Live Demo for Hadoop.

Your queries are always welcome.

You can contact us at:

Phone : +91 124 4763660 (India)

Email : [email protected]

Skype Id : easylearning.guru

Website : www.easylearning.guru
