easylearning guru online hadoop class

Welcome to the World of Big Data & Hadoop www.easylearning.guru

Upload: easylearning

Post on 10-May-2015

Category:

Education


DESCRIPTION

easylearning Guru provides online training on Hadoop, in which you can learn Hadoop easily.

TRANSCRIPT

Page 1: Easylearning Guru online Hadoop class

Welcome to the World of Big Data & Hadoop

www.easylearning.guru

Page 2: Easylearning Guru online Hadoop class

Agenda

What is Big Data?

Different Kinds of Big Data

Big Data Global Market

Hadoop Global Job Trends

What is Hadoop?

Page 3: Easylearning Guru online Hadoop class

What is Big Data?

Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.

Page 4: Easylearning Guru online Hadoop class

Types of Big Data

A traditional RDBMS deals only with structured data.

A technology is needed that handles semi-structured and unstructured data as well as structured data.

Semi-Structured Data
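
To make the three kinds of data concrete, here is a small Python sketch. The sample records, names, and the helper `field_names` are made up for illustration; they are not from the slides.

```python
import csv
import io
import json

# Structured: every row follows the same fixed schema (as in an RDBMS table).
# The sample values are hypothetical.
structured = io.StringIO("id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n")
rows = list(csv.DictReader(structured))

# Semi-structured: self-describing records whose fields can differ per record.
semi_structured = [
    json.loads('{"id": 1, "name": "Asha", "tags": ["hadoop"]}'),
    json.loads('{"id": 2, "city": "Delhi"}'),
]

# Unstructured: free text with no schema; any structure has to be extracted.
unstructured = "Asha from Pune posted that she is learning Hadoop this week."


def field_names(records):
    """Return the distinct sets of field names seen across the records."""
    return {frozenset(record.keys()) for record in records}


if __name__ == "__main__":
    print(len(field_names(rows)))             # one shared schema
    print(len(field_names(semi_structured)))  # more than one shape
    print(len(unstructured.split()), "words of free text")
```

The structured rows all share one set of columns, while the semi-structured records carry their own field names and need not agree with each other.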

Page 5: Easylearning Guru online Hadoop class

The 3 Vs of Big Data: Volume, Velocity, Variety

Page 6: Easylearning Guru online Hadoop class

Sources of Data

Social Media & Networks (all of us are generating data)

Mobile Devices (tracking all the objects all the time)

Sensor Technology & Networks (measuring all kinds of data)

Scientific Instruments (collecting all sorts of data)

Page 7: Easylearning Guru online Hadoop class

Where is Big Data used?

Page 8: Easylearning Guru online Hadoop class

Facebook Scenario

Facebook generates, on average, about 70 thousand MB of data per minute.

Taking 1 GB = 1,024 MB (and likewise for TB and PB):

1 hour = 70,000 MB × 60 = 4.2 million MB

1 day = 4.2 million MB × 24 = 100.8 million MB ≈ 98,438 GB

1 week = 98,438 GB × 7 ≈ 689,063 GB ≈ 672.9 TB

4 weeks = 672.9 TB × 4 ≈ 2,692 TB ≈ 2.6 PB

52 weeks = 672.9 TB × 52 ≈ 34,991 TB ≈ 34.2 PB

And that's a lot of data!
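
A quick way to check this kind of arithmetic is to let a short script do the scaling. The sketch below takes the 70,000 MB per minute rate from the slide and assumes binary units (1 GB = 1,024 MB, and so on); the function names are illustrative only.

```python
MB_PER_MINUTE = 70_000  # rate quoted on the slide

# Number of minutes in each period used on the slide.
MINUTES = {
    "1 hour": 60,
    "1 day": 60 * 24,
    "1 week": 60 * 24 * 7,
    "4 weeks": 60 * 24 * 7 * 4,
    "52 weeks": 60 * 24 * 7 * 52,
}

UNITS = ["MB", "GB", "TB", "PB"]


def volume_mb(period: str) -> int:
    """Total data generated over the period, in MB."""
    return MB_PER_MINUTE * MINUTES[period]


def humanize(megabytes: float) -> str:
    """Scale an MB count to the largest unit that keeps the value below 1,024."""
    value = float(megabytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:,.1f} {unit}"
        value /= 1024


if __name__ == "__main__":
    for period in MINUTES:
        print(f"{period:>8}: {humanize(volume_mb(period))}")
```

Note that a year is 52 times the weekly figure, not 52 times the four-week figure, which is an easy slip when chaining the steps by hand.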

Page 9: Easylearning Guru online Hadoop class

Various Big Data Technologies

Page 10: Easylearning Guru online Hadoop class

Big Data Global Market

Sources : Dice, LinkedIn.

Big Data Implementation

[Chart: Implemented Big Data vs. Yet to Implement Big Data]

[Chart: Big Data Growth (in USD billions), 2012 to 2017]

Filled/vacancy (%)

Role                          Filled  Unfilled
Big Data Analyst                  50        50
Big Data Architect                43        57
Big Data Engineer                 44        56
Big Data Research Analyst         31        69
Big Data Visualizer               23        77
Data Scientist                    18        82

Page 11: Easylearning Guru online Hadoop class

Hadoop Global Job Trends

Top Hadoop Technology Companies

Sources : Dice, LinkedIn.

More than 17,000 employees with Hadoop skills across these companies

Page 12: Easylearning Guru online Hadoop class

Hadoop Global Job Trends

Demand for Big Data in Cities (as of February 2014)

City         Share
Vizag           2%
Noida           2%
Delhi           3%
Ahmedabad       4%
Chennai         8%
Mumbai          8%
Gurgaon        10%
Pune           11%
Hyderabad      14%
Bangalore      38%

[Chart: Salary (USD p.a., in thousands) by skill: Hadoop, Unix, Teradata, SAP, JavaScript, C++, IBM Mainframe, VB.NET, MySQL, VMware]

Sources: Dice, LinkedIn.

Page 13: Easylearning Guru online Hadoop class

What is Hadoop?

Hadoop was created by Doug Cutting and Mike Cafarella.

Hadoop provides a reliable shared storage and analysis system.

It is designed to scale up from a single server to thousands of machines, with a high degree of fault tolerance.

Page 14: Easylearning Guru online Hadoop class

Hadoop History

Page 15: Easylearning Guru online Hadoop class

Hadoop Core Components

Core Hadoop has two main systems:

• Hadoop Distributed File System (HDFS): a distributed file system that holds large amounts of data across multiple nodes in a cluster.

• MapReduce: a distributed programming paradigm used to analyze the data stored in HDFS.

Page 16: Easylearning Guru online Hadoop class

Hadoop Distributed File System (HDFS)

A given file is broken down into blocks (default 64 MB in Hadoop 1.x; 128 MB from Hadoop 2.x), and the blocks are then replicated across the cluster (default replication factor 3).

Optimized for throughput.

HDFS allows you to put, get, and delete files.

Follows the philosophy "write once, read many times".

Block replication provides durability, high availability, and throughput.
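
The block and replication arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope model, not HDFS code: the function name `hdfs_footprint` and its return keys are hypothetical, and the defaults are the 64 MB block size and replication factor of 3 quoted on the slide.

```python
def hdfs_footprint(file_size_mb, block_size_mb=64, replication=3):
    """Estimate how a file is split into blocks and how much raw storage it uses.

    Illustrative model only; it does not talk to a real cluster.
    """
    if file_size_mb < 0 or block_size_mb <= 0 or replication <= 0:
        raise ValueError("sizes and replication must be positive")

    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    block_sizes = [block_size_mb] * int(full_blocks)
    if remainder:
        # The final block only occupies the space its data actually needs.
        block_sizes.append(remainder)

    return {
        "blocks": block_sizes,
        "block_replicas": len(block_sizes) * replication,
        "raw_storage_mb": file_size_mb * replication,
    }


if __name__ == "__main__":
    print(hdfs_footprint(200))
```

For a 200 MB file this gives three full 64 MB blocks plus one 8 MB block, 12 block replicas in total, and 600 MB of raw storage across the cluster.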

Page 17: Easylearning Guru online Hadoop class

MapReduce Flow

Page 18: Easylearning Guru online Hadoop class

MapReduce Framework

MapReduce works by breaking the processing into two phases: the Map phase and the Reduce phase.
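
The two phases can be illustrated with the classic word count, simulated here in a single Python process. This is a sketch of the idea only and does not use the Hadoop API; the function names are illustrative, and the grouping step in the middle stands in for the shuffle that the framework performs between map and reduce.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all values by key, as the framework does between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big hadoop"]))
```

On a real cluster the map calls run in parallel on the nodes that hold the input blocks, and each reducer receives all the values for its keys.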

Page 19: Easylearning Guru online Hadoop class

Page 20: Easylearning Guru online Hadoop class

What we offer…

Page 21: Easylearning Guru online Hadoop class

Page 22: Easylearning Guru online Hadoop class

Syllabus

Introduction
  a) Big Data
  b) Hadoop

Hadoop
  a) HDFS
  b) MapReduce

Pig
  a) Pig 1
  b) Pig 2

Hive
  a) Hive 1
  b) Hive 2

HBase

ZooKeeper

Sqoop

YARN

Project Class

Page 23: Easylearning Guru online Hadoop class

Thank you for watching the live demo for Hadoop. Your queries are always welcome; you can contact us at:

Phone: +91 124 4763660 (India)

Email: [email protected]

Skype ID: easylearning.guru

Website: www.easylearning.guru