Introduction of Big Data and Hadoop


Upload: arohi-khandelwal

Post on 13-Jan-2017


TRANSCRIPT

Page 1: Introduction of Big data and Hadoop

Presentation on Big Data & Hadoop

PRESENTED BY: AROHI KHANDELWAL


Page 2: Introduction of Big data and Hadoop

Contents :

What is BIG DATA ?
Why BIG DATA ?
Hadoop
Hadoop Architecture
Hadoop Distributed File System
HDFS Architecture
Map Reduce
How Map Reduce Works ?
Hadoop Ecosystem
What is Hadoop used for ?
Users of Hadoop
Advantage & Disadvantage of Hadoop
Conclusion

Page 3: Introduction of Big data and Hadoop

What Is BIG DATA ?

Big Data is characterised by three Vs: Volume, Variety, Velocity.

Page 4: Introduction of Big data and Hadoop

Why BIG DATA ?

Mobile phones increased by 70.3% to 918m in the last two years.

Twitter has 328m monthly active users – 55% growth.

Facebook has 765m active users.

Google+ has 495m monthly active users – 45% growth.

LinkedIn has 300m users.

Every minute, 48 hours of video are posted.

Page 5: Introduction of Big data and Hadoop

Hadoop :

Open source distributed computing framework.
Written mainly in Java.
Named by Doug Cutting after his son's toy elephant.

[Diagram: Hadoop = Storage + Process]

Page 6: Introduction of Big data and Hadoop

Hadoop Architecture :

Hadoop is designed and built on two independent frameworks, namely:
Hadoop Distributed File System (HDFS)
Map Reduce

[Diagram: Hadoop consists of HDFS and Map Reduce]

Page 7: Introduction of Big data and Hadoop

Hadoop Distributed File System :

Based on the Google File System.
Data is stored in the form of blocks.
Provides data reliability.
Provides high-throughput access to data for fast processing.
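As a rough illustration of block storage, the sketch below splits a file into fixed-size blocks and assigns each block to several nodes. The 128 MB block size, the replication factor of 3, the node names and the round-robin placement are all assumptions made for the example; real HDFS uses a rack-aware placement policy.

```python
# Assumed values for illustration only (not taken from the slides).
BLOCK_SIZE = 128 * 1024 * 1024   # 128 MB per block
REPLICATION = 3                  # copies kept of every block


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return a list of (offset, length) pairs, one per block."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Hypothetical round-robin placement of each block on `replication` nodes."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


# A 300 MB file becomes two full blocks and one partial block.
blocks = split_into_blocks(300 * 1024 * 1024)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
```

Because every block lives on more than one node, losing a single node does not lose data, which is where the reliability comes from.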


Page 8: Introduction of Big data and Hadoop

HDFS Architecture :

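Hadoop Distributed File System has two kinds of nodes:
Name node (keeps the metadata: which blocks make up each file and where they are stored)
Data nodes (store the actual data blocks)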
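The division of work between the two node types can be sketched with a toy in-memory model. The class names, the 4-byte block size and the replication factor of 2 are hypothetical choices for the example and are not the Hadoop API.

```python
class DataNode:
    """Toy data node: stores block contents keyed by block id."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.alive = True

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def fetch(self, block_id):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.blocks[block_id]


class NameNode:
    """Toy name node: holds metadata only, never the file contents."""

    def __init__(self, datanodes, replication=2):
        self.datanodes = datanodes
        self.replication = replication
        self.files = {}       # path -> list of block ids
        self.locations = {}   # block id -> data nodes holding a copy
        self._next_id = 0

    def allocate(self, path):
        block_id = self._next_id
        self._next_id += 1
        n = len(self.datanodes)
        targets = [self.datanodes[(block_id + r) % n]
                   for r in range(self.replication)]
        self.files.setdefault(path, []).append(block_id)
        self.locations[block_id] = targets
        return block_id, targets


def write_file(namenode, path, data, block_size=4):
    # Ask the name node where each block goes, then send it to the data nodes.
    for i in range(0, len(data), block_size):
        block_id, targets = namenode.allocate(path)
        for node in targets:
            node.store(block_id, data[i:i + block_size])


def read_file(namenode, path):
    # Look up block locations, then read each block from the first live copy.
    out = b""
    for block_id in namenode.files[path]:
        for node in namenode.locations[block_id]:
            try:
                out += node.fetch(block_id)
                break
            except ConnectionError:
                continue
        else:
            raise IOError(f"block {block_id} unavailable")
    return out


nodes = [DataNode("dn1"), DataNode("dn2"), DataNode("dn3")]
nn = NameNode(nodes)
write_file(nn, "/demo.txt", b"hello hadoop")
nodes[0].alive = False              # simulate one data node failing
result = read_file(nn, "/demo.txt")
```

In this sketch the read still succeeds after `dn1` fails, because every block has a second copy on another data node.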


Page 9: Introduction of Big data and Hadoop

Map Reduce :

Map: takes a set of data and breaks individual elements into tuples (key/value pairs).

Reduce: takes the Map's output as input and combines those data tuples into a smaller set of tuples.
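A word count is the usual way to picture the two functions. The sketch below runs in plain Python in a single process; the function names and input lines are made up for the example, and a real Hadoop job would run the same logic spread across many machines.

```python
from collections import defaultdict


def map_words(line):
    """Map: break a line into (word, 1) tuples."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(word, counts):
    """Reduce: combine all tuples for one word into a single tuple."""
    return (word, sum(counts))


lines = ["big data needs hadoop", "hadoop stores big data"]

# Map step: every line produces its own list of tuples.
pairs = [pair for line in lines for pair in map_words(line)]

# Group the tuples by key so each word reaches exactly one reduce call.
grouped = defaultdict(list)
for word, count in pairs:
    grouped[word].append(count)

# Reduce step: eight input tuples shrink to five output tuples.
word_counts = dict(reduce_counts(w, c) for w, c in grouped.items())
```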

Page 10: Introduction of Big data and Hadoop

How Map Reduce works ?
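A minimal sketch of the usual flow of a job: the input is divided into splits, each split is mapped independently, the intermediate tuples are shuffled to reducers by key, and each reducer processes its keys in sorted order. The record format ("year,temperature"), the two reducers and the function names are assumptions chosen for the example.

```python
from collections import defaultdict


def map_record(record):
    """Map: turn a 'year,temperature' record into a (year, temperature) tuple."""
    year, temp = record.split(",")
    return int(year), int(temp)


def run_job(splits, num_reducers=2):
    # Map phase: each input split is processed on its own.
    map_outputs = [[map_record(r) for r in split] for split in splits]

    # Shuffle phase: route every tuple to a reducer chosen from its key.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for output in map_outputs:
        for key, value in output:
            partitions[key % num_reducers][key].append(value)

    # Reduce phase: each reducer visits its keys in sorted order.
    results = {}
    for partition in partitions:
        for key in sorted(partition):
            results[key] = max(partition[key])
    return results


splits = [["2015,31", "2016,35"], ["2015,38", "2016,29", "2017,33"]]
max_temps = run_job(splits)
```

Here the reduce step keeps the highest temperature per year; swapping `max` for `sum` or `len` gives other aggregations without changing the flow.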

Page 11: Introduction of Big data and Hadoop

Hadoop Ecosystem :

HDFS
YARN
Map Reduce V2
HBase
Hive
Apache Pig
Oozie
ZooKeeper
Sqoop

Page 12: Introduction of Big data and Hadoop

What is Hadoop used for ?

Search: Yahoo, Amazon
Log processing: Facebook, Yahoo
Data warehouse: Facebook, AOL
Video & image analysis: New York Times

Page 13: Introduction of Big data and Hadoop

Users of Hadoop :

Page 14: Introduction of Big data and Hadoop

Advantage of Hadoop :

Platform independent.
Block-structured file system.
Can store any kind of data.
Huge storage capacity.
Rapidly processes large amounts of data in parallel.
Fault tolerant.

Page 15: Introduction of Big data and Hadoop

Disadvantage of Hadoop :

Not fit for small data.
Setup issues.
Programming model is very restrictive.

Page 16: Introduction of Big data and Hadoop

Summary

Hadoop excels at big data, analytics, and batch processing.

Not real-time, no random access; not a database.

HDFS makes it all possible: a fault-tolerant file system with fast access speed.

Pig and Hive are easy to use.

Page 17: Introduction of Big data and Hadoop

THANK YOU …