Big Data Hadoop Tutorial by Easylearning Guru

Welcome to the World of Big Data & Hadoop www.easylearning.guru

Upload: easylearning

Post on 14-Jul-2015

Category:

Engineering


TRANSCRIPT

Page 1: Big Data Hadoop Tutorial by Easylearning Guru

Welcome to the World of Big Data & Hadoop

www.easylearning.guru

Page 2: Big Data Hadoop Tutorial by Easylearning Guru

Agenda

What is Big Data?

Different Kinds of Big Data

Big Data Global Market

Hadoop Global Job Trends

What is Hadoop?

Page 3: Big Data Hadoop Tutorial by Easylearning Guru

What is Big Data?

Big data is the term for a collection of datasets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.

Page 4: Big Data Hadoop Tutorial by Easylearning Guru

Types of Big Data

A traditional RDBMS handles only structured data.

What is needed is a technology that handles semi-structured and unstructured data as well as structured data.

Semi-Structured Data
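
To make the three kinds of data concrete, here is a minimal Python sketch. The sample values (names, tags, the comment text) are hypothetical and are not taken from the slides; the point is only how much schema each kind carries.

```python
import csv
import io
import json

# Structured: fixed columns, fits directly into a relational table.
# (Hypothetical sample row.)
structured = "id,name,age\n1,Asha,29\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing, and fields may differ between records.
# (Hypothetical sample record.)
semi_structured = '{"id": 2, "name": "Ravi", "tags": ["hadoop", "hive"]}'
record = json.loads(semi_structured)

# Unstructured: free text with no schema at all.
# (Hypothetical sample comment.)
unstructured = "Loved the Hadoop session, looking forward to the Pig class!"
words = unstructured.split()

print(rows[0]["name"], rows[0]["age"])
print(record["tags"])
print(len(words), "words of free text")
```

An RDBMS can load the first sample as-is; the second and third need a schema to be imposed at read time, which is the gap the slide points to.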

Page 5: Big Data Hadoop Tutorial by Easylearning Guru

The 3V’s of Big Data

Page 6: Big Data Hadoop Tutorial by Easylearning Guru

Sources of Data

Social Media & Networks (all of us are generating data)

Mobile Devices (tracking all the objects all the time)

Sensor Technology & Networks (measuring all kinds of data)

Scientific Instruments (collecting all sorts of data)

Page 7: Big Data Hadoop Tutorial by Easylearning Guru

Where Is Big Data Used?

Page 8: Big Data Hadoop Tutorial by Easylearning Guru

Facebook Scenario

Facebook generates, on average, about 70,000 MB of data per minute.

1 hour = 70,000 MB × 60 = 4.2 million MB

1 day = 4.2 million MB × 24 = 100.8 million MB ≈ 98,438 GB

1 week = 98,438 GB × 7 ≈ 689 thousand GB ≈ 689 TB

4 weeks = 689 TB × 4 ≈ 2,756 TB ≈ 2.76 PB

52 weeks = 2.76 PB × 13 ≈ 35.8 PB

And that's a lot of data!
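
A projection like this is easy to cross-check in a few lines of Python. This is a minimal sketch: the 70,000 MB per minute rate is the slide's figure, while the conversion factors (1,024 MB per GB, then 1,000 per step for TB and PB) are assumptions chosen to match the slide's rounding. The yearly total is obtained by scaling the weekly figure by 52.

```python
# Rate quoted on the slide; everything else is derived from it.
MB_PER_MINUTE = 70_000

# Assumed conversion factors (mixed binary/decimal, to match the slide's rounding).
MB_PER_GB = 1024
GB_PER_TB = 1000
TB_PER_PB = 1000

mb_per_hour = MB_PER_MINUTE * 60
mb_per_day = mb_per_hour * 24
gb_per_day = mb_per_day / MB_PER_GB
tb_per_week = gb_per_day * 7 / GB_PER_TB
pb_per_4_weeks = tb_per_week * 4 / TB_PER_PB
pb_per_52_weeks = tb_per_week * 52 / TB_PER_PB

print(f"1 hour   = {mb_per_hour:,} MB")
print(f"1 day    = {mb_per_day:,} MB = {gb_per_day:,.1f} GB")
print(f"1 week   = {tb_per_week:,.1f} TB")
print(f"4 weeks  = {pb_per_4_weeks:.2f} PB")
print(f"52 weeks = {pb_per_52_weeks:.1f} PB")
```

Changing the three conversion constants to a single convention (all 1,000 or all 1,024) shifts the results by a few percent but not the order of magnitude.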

Page 9: Big Data Hadoop Tutorial by Easylearning Guru

Various Big Data Technologies

Page 10: Big Data Hadoop Tutorial by Easylearning Guru

Big Data Global Market

Sources : Dice, LinkedIn.

Big Data Implementation
[Chart: Implemented Big Data vs. Yet to Implement Big Data]

Big Data Growth (in USD Billions)
[Chart: 2012 to 2017, vertical axis 0 to 60]

Filled/Vacancy (%)

Big Data Analyst: 50 filled, 50 unfilled

Big Data Architect: 43 filled, 57 unfilled

Big Data Engineer: 44 filled, 56 unfilled

Big Data Research Analyst: 31 filled, 69 unfilled

Big Data Visualizer: 23 filled, 77 unfilled

Data Scientist: 18 filled, 82 unfilled

Page 11: Big Data Hadoop Tutorial by Easylearning Guru

Hadoop Global Job Trends

Top Hadoop Technology Companies

Sources : Dice, LinkedIn.

More than 17,000 employees with Hadoop skills across these companies

Page 12: Big Data Hadoop Tutorial by Easylearning Guru

Demand for Big Data in Cities (as of February 2014)
[Chart: share by city; slices of 2%, 2%, 3%, 4%, 8%, 8%, 10%, 11%, 14% and 38%; city labels not recoverable]

Salary (USD p.a. in thousands)
[Chart: vertical axis 0 to 120]

Sources : Dice, LinkedIn.

Hadoop Global Job Trends

Page 13: Big Data Hadoop Tutorial by Easylearning Guru

What is Hadoop?

Hadoop was created by Doug Cutting and Mike Cafarella.

Hadoop provides a reliable shared storage and analysis system.

It is designed to scale from a single server to thousands of machines, with a high degree of fault tolerance.

Page 14: Big Data Hadoop Tutorial by Easylearning Guru

Hadoop History

Page 15: Big Data Hadoop Tutorial by Easylearning Guru

Hadoop Core Components

Core Hadoop has two main systems:

• Hadoop Distributed File System (HDFS): a distributed file system that stores large amounts of data across multiple nodes in a cluster.

• MapReduce: a distributed programming paradigm used to analyze the data stored in HDFS.

Page 16: Big Data Hadoop Tutorial by Easylearning Guru

Hadoop Distributed File System (HDFS)

A file is split into blocks (default 64 MB in Hadoop 1.x, 128 MB from Hadoop 2.x onward), and each block is replicated across the cluster (default replication factor 3).

Optimized for throughput.

HDFS allows you to put/get/delete files.

Follows the philosophy “Write Once, Read Many Times”.

Block Replication for:

- Durability, High Availability and Throughput.
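
The block-and-replica idea can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not how HDFS is implemented: the function `plan_blocks`, the node names, and the round-robin placement are all hypothetical, whereas real HDFS placement is decided by the NameNode and is rack-aware. The 64 MB block size and replication factor of 3 are the defaults quoted on the slide.

```python
import math

def plan_blocks(file_size_mb, block_size_mb=64, replication=3,
                nodes=("node1", "node2", "node3", "node4")):
    """Toy model: split a file into blocks and assign replicas round-robin.

    Hypothetical helper for illustration only; not an HDFS API.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    placement = {}
    for block_id in range(num_blocks):
        # Choose `replication` distinct nodes, rotating the start node per block.
        placement[block_id] = [
            nodes[(block_id + r) % len(nodes)] for r in range(replication)
        ]
    return placement

# A 200 MB file with the slide's defaults.
placement = plan_blocks(200)
for block_id, replicas in placement.items():
    print(f"block {block_id}: {replicas}")
print("raw storage used (MB):", 200 * 3)
```

Because every block lives on three different nodes, losing any single node in this model still leaves two copies of each block, which is where the durability and availability claims come from.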

Page 17: Big Data Hadoop Tutorial by Easylearning Guru

MapReduce Flow

Page 18: Big Data Hadoop Tutorial by Easylearning Guru

MapReduce Framework

MapReduce works by breaking the processing into two phases: the Map phase and the Reduce phase.
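
The two phases can be simulated in plain Python with the classic word-count example. This is a single-process sketch, not Hadoop's API: the function names `map_phase`, `shuffle` and `reduce_phase` and the input lines are hypothetical. In a real cluster the framework runs many map and reduce tasks in parallel and performs the grouping step between them itself.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group values by key, as the framework does between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)

# Hypothetical input; in Hadoop this would be lines read from files in HDFS.
lines = ["big data needs hadoop", "hadoop stores big data"]

mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Each call to `map_phase` looks at one line in isolation and each call to `reduce_phase` looks at one key in isolation, which is what lets the work be spread across many machines.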

Page 19: Big Data Hadoop Tutorial by Easylearning Guru

Page 20: Big Data Hadoop Tutorial by Easylearning Guru

What we offer…

Page 21: Big Data Hadoop Tutorial by Easylearning Guru

Page 22: Big Data Hadoop Tutorial by Easylearning Guru

Syllabus

Introduction

a) Big Data

b) Hadoop

Hadoop

a) HDFS

b) MapReduce

Pig

a) Pig 1

b) Pig 2

Hive

a) Hive 1

b) Hive 2

HBase

ZooKeeper

Sqoop

YARN

Project Class

Page 23: Big Data Hadoop Tutorial by Easylearning Guru

Thank you for watching the live demo for Hadoop. Your queries are always welcome.

You can always contact us at:

Phone : +91 124 4763660 (India)

Email : [email protected]

Skype Id : easylearning.guru

Website : www.easylearning.guru
