big data hadoop tutorial by easylearning guru
TRANSCRIPT
Welcome to the World of Big Data & Hadoop
www.easylearning.guru
Agenda
What is Big Data?
Different Kinds of Big Data
Big Data Global Market
Hadoop Global Job Trends
What is Hadoop?
What is Big Data?
Big data is the term for a collection of datasets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
Types of Big Data
Traditional RDBMSs deal only with structured data.
A technology is needed that handles semi-structured and unstructured data as well as structured data.
Semi-Structured Data
The 3V’s of Big Data
Sources of Data
Social Media & Networks (all of us are generating data)
Mobile Devices (tracking all the objects all the time)
Sensor Technology & Networks (measuring all kinds of data)
Scientific Instruments (collecting all sorts of data)
Where Is Big Data Used?
Facebook Scenario
Facebook generates, on average, 70 thousand MB in 1 minute.
1 hour = 70,000 MB * 60 = 4.2 million MB
1 day = 4.2 million MB * 24 = 100.8 million MB = 98,438 GB (taking 1 GB = 1,024 MB)
1 week = 98,438 GB * 7 = 689 thousand GB = 673 TB
4 weeks = 673 TB * 4 = 2,692 TB = 2.6 PB
52 weeks = 673 TB * 52 = 34,991 TB = 34.2 PB
And that’s a lot of data!
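The chain of multiplications in the Facebook scenario can be re-derived with a short script. This is only a sketch: it takes the quoted rate of 70,000 MB per minute as given and assumes binary units (1 GB = 1,024 MB, and so on), which is an assumption rather than something the slide states. The names `volume_mb` and `convert` are illustrative.

```python
# Recompute the Facebook scenario from the quoted rate of 70,000 MB per minute.
MB_PER_MINUTE = 70_000  # figure quoted in the scenario
STEP = 1024             # assumption: 1 GB = 1,024 MB, 1 TB = 1,024 GB, 1 PB = 1,024 TB
UNITS = {"MB": 0, "GB": 1, "TB": 2, "PB": 3}

MINUTES_PER_DAY = 60 * 24
# (label, minutes in the period, unit used for reporting)
PERIODS = [
    ("1 hour", 60, "MB"),
    ("1 day", MINUTES_PER_DAY, "GB"),
    ("1 week", MINUTES_PER_DAY * 7, "TB"),
    ("4 weeks", MINUTES_PER_DAY * 7 * 4, "PB"),
    ("52 weeks", MINUTES_PER_DAY * 7 * 52, "PB"),
]


def volume_mb(minutes):
    """Total megabytes generated over the given number of minutes."""
    return MB_PER_MINUTE * minutes


def convert(mb, unit):
    """Convert a megabyte count to MB, GB, TB or PB using binary steps."""
    return mb / STEP ** UNITS[unit]


if __name__ == "__main__":
    for label, minutes, unit in PERIODS:
        print(f"{label:>8}: {convert(volume_mb(minutes), unit):,.1f} {unit}")
```

Deriving every period directly from the per-minute rate, rather than from the previous rounded figure, keeps rounding errors from accumulating down the chain.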
Various Big Data Technologies
Big Data Global Market
Sources : Dice, LinkedIn.
Big Data Implementation
[Chart: Implemented Big Data vs. Yet to Implement Big Data]
[Chart: Big Data Growth (in USD Billions), 2012 to 2017]
Filled/Vacancy (%) by role
Roles: Big Data Analyst, Big Data Architect, Big Data Engineer, Big Data Research Analyst, Big Data Visualizer, Data Scientist
Filled: 50, 43, 44, 31, 23, 18
Unfilled: 50, 57, 56, 69, 77, 82
Hadoop Global Job Trends
Top Hadoop Technology Companies
Sources : Dice, LinkedIn.
More than 17,000 employees with Hadoop skills across these companies
Hadoop Global Job Trends
Demand for Big Data in Cities (as of February 2014)
[Chart: share of demand by city, with segments of 2%, 2%, 3%, 4%, 8%, 8%, 10%, 11%, 14% and 38%]
[Chart: Salary (USD p.a., in thousands)]
Sources : Dice, LinkedIn.
What is Hadoop?
Hadoop was created by Doug Cutting and Mike Cafarella.
Hadoop provides a reliable shared storage and analysis system.
It is designed to scale up from a single server to thousands of machines, with a high degree of fault tolerance.
Hadoop History
Hadoop Core Components
Core Hadoop has two main systems:
• Hadoop Distributed File System (HDFS): a distributed file system that holds large amounts of data across multiple nodes in a cluster.
• MapReduce: a distributed programming paradigm used to analyze the data stored in HDFS.
Hadoop Distributed File System (HDFS)
A given file is broken down into blocks (default = 64 MB in Hadoop 1.x, 128 MB from Hadoop 2.x), and each block is then replicated across the cluster (default replication factor = 3).
Optimized for throughput.
HDFS allows you to put/get/delete files.
Follows the “write once, read many times” philosophy.
Block Replication for:
- Durability, High Availability and Throughput.
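To make the block and replication defaults concrete, here is a minimal sketch that estimates how a file would be split and how much raw storage its replicas take. It is not HDFS code: `hdfs_footprint` and `place_replicas` are hypothetical helpers, and the round-robin placement is a toy stand-in for HDFS's real rack-aware placement policy. It only illustrates that each block's replicas land on different nodes.

```python
import math


def hdfs_footprint(file_size_mb, block_size_mb=64, replication=3):
    """Estimate the layout of one file under the defaults quoted above.

    Returns (number_of_blocks, last_block_size_mb, raw_storage_mb).
    The last block only occupies the space it actually needs.
    """
    if file_size_mb <= 0:
        return 0, 0, 0
    blocks = math.ceil(file_size_mb / block_size_mb)
    last_block = file_size_mb - (blocks - 1) * block_size_mb
    raw_storage = file_size_mb * replication
    return blocks, last_block, raw_storage


def place_replicas(num_blocks, nodes, replication=3):
    """Toy round-robin placement (an assumption, not the HDFS policy).

    Each block gets `replication` distinct nodes, so losing any single
    node still leaves copies of every block elsewhere.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


if __name__ == "__main__":
    blocks, last, raw = hdfs_footprint(200)
    print(f"200 MB file: {blocks} blocks, last block {last} MB, {raw} MB raw")
    for block, holders in place_replicas(blocks, ["n1", "n2", "n3", "n4"]).items():
        print(f"block {block}: {holders}")
```

For a 200 MB file with 64 MB blocks, this gives three full blocks plus one 8 MB block, and 600 MB of raw storage across the cluster.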
MapReduce Flow
MapReduce Framework
MapReduce works by breaking the processing into two phases: the Map phase and the Reduce phase.
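The two phases can be illustrated with the classic word-count example, simulated here in plain Python on a single machine. This is a sketch of the programming model only, not the Hadoop API: the function names are illustrative, and the `shuffle` step stands in for the grouping by key that the framework performs between the two phases.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key, as the framework does between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce over an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

In a real cluster, the map calls run in parallel on the nodes holding the input blocks and the reduce calls run in parallel per key. The sequential version above only shows the data flow.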
What we offer…
Syllabus
Introduction
a) Big Data
b) Hadoop
Hadoop
a) HDFS
b) MapReduce
Pig
a) Pig 1
b) Pig 2
Hive
a) Hive 1
b) Hive 2
HBase
ZooKeeper
Sqoop
YARN
Project Class
Thank you for watching the Live Demo for Hadoop. Your queries are always welcome.
You can always contact us on:
Phone : +91 124 4763660 (India)
Email : [email protected]
Skype Id : easylearning.guru
Website : www.easylearning.guru