Compiled by Rocky Jagtiani, Tech Head for SCTPL, 9892544177
Section 9: Case Study
Objectives of this Session
• The Motivation For Hadoop
– What problems exist with traditional large-scale computing systems
– What requirements an alternative approach should have
– How Hadoop addresses those requirements
• Hadoop: Basic Concepts
– What Is Hadoop?
– The Hadoop Distributed File System (HDFS)
– How the Google MapReduce algorithm works
– Anatomy of a Hadoop Cluster
• Who uses Hadoop?
db.suven.net
Not a part of the 1Z0-061 or 1Z0-144 certification tests, but a very important technology in Big Data analysis
Objectives of this Session (contd.)
• Hadoop Solutions
– The most common problems Hadoop can solve
– The types of analytics often performed with Hadoop
– Where the data comes from
– The benefits of analyzing data with Hadoop
– How some real-world companies use Hadoop
• Hadoop Ecosystem
• Cloudera Software (All Open-Source)
The Motivation For Hadoop
*MPI: Message Passing Interface
*PVM: Parallel Virtual Machine
Major Problem
1 GB = 1000 MB, 1 TB = 1000 GB, 1 PB = 1000 TB, 1 EB = 1000 PB
TB => terabyte, PB => petabyte, EB => exabyte
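The unit ladder above can be checked with a few lines of Python. This is a minimal sketch, assuming the decimal (factor of 1000) convention used on this slide and writing petabyte as PB and exabyte as EB; the function name `convert` is hypothetical and chosen only for illustration.

```python
# Decimal storage units, smallest to largest (each step is a factor of 1000).
UNITS = ["MB", "GB", "TB", "PB", "EB"]


def convert(value, from_unit, to_unit):
    """Convert a size between decimal units, e.g. 1 EB expressed in TB."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    if steps >= 0:
        # Going to a smaller unit: the number gets bigger.
        return value * 1000 ** steps
    # Going to a larger unit: the number gets smaller.
    return value / 1000 ** (-steps)


print(convert(1, "EB", "TB"))    # 1 exabyte expressed in terabytes
print(convert(500, "GB", "TB"))  # 500 gigabytes expressed in terabytes
```

Each step up the list multiplies the size by 1000, so a petabyte is a million gigabytes.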
The Motivation For Hadoop
1.
2.
3.
4.
5.
Hadoop History
Core Hadoop Concepts
Hadoop Components
HDFS
HDFS Concepts
HDFS: How Files Are Stored
How Files Are Stored: Example
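A rough sketch of the idea in Python: HDFS splits a file into fixed-size blocks and keeps several copies of each block on different nodes. The figures used here are assumptions for illustration: a 64 MB block size (the Hadoop 1.x default, which is configurable), a replication factor of 3, a 200 MB file, and four hypothetical node names. The round-robin placement is a simplification; real HDFS uses a rack-aware placement policy.

```python
BLOCK_SIZE_MB = 64  # assumption: Hadoop 1.x default block size (configurable)
REPLICATION = 3     # assumption: default replication factor


def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the size of each block; only the last block may be smaller."""
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    return [block_size_mb] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Assign every block to `replication` distinct nodes, round-robin.

    Illustration only: real HDFS placement is rack-aware.
    """
    if len(nodes) < replication:
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


blocks = split_into_blocks(200)                   # hypothetical 200 MB file
nodes = ["node1", "node2", "node3", "node4"]      # hypothetical DataNodes
placement = place_replicas(len(blocks), nodes)

for block_id, size in enumerate(blocks):
    print(f"block {block_id} ({size} MB) -> {placement[block_id]}")
```

Because every block lives on three different nodes, losing a single node does not lose any part of the file.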
IMP :
How Does MapReduce Work?
MapReduce: The Mapper
Example :
Anatomy of a Hadoop Cluster :
Who uses Hadoop ?
Hadoop Solutions
A
B What is the problem with the incoming data?
C
D The most common problems Hadoop can solve:
We look briefly at how each problem is solved using Hadoop.
E How some real-world companies use Hadoop
Hadoop Ecosystem
Cloudera Software (All Open-Source)
*EDW: enterprise data warehouse
Conclusion :
Questions
1) Input to the mapper is
"Google is one of the richest companies "
"one who works with the Google is technical expert "
What will be the output after reducing?
2) Input to mapper is
"Cat is eating milk"
"Cat is very sweet and she likes milk"
"milk is in bottle"
What will be the output after reducing?
3) Input to mapper is
"Dollar is national currency for USA"
"Rupee is national currency for India"
"Dollar is ahead of Rupee in economy"
"India is developing country"
What will be the output after mapping?
What will be the output after shuffling?
What will be the output after reducing?
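To check answers to questions like these, the three stages can be simulated in plain Python. This is a minimal sketch, not Hadoop code: it assumes the job is a word count, that words are split on whitespace, and that matching is case-sensitive ("Google" and "google" would be different keys). The function names `map_phase`, `shuffle_phase` and `reduce_phase` are hypothetical. The input is taken from question 1.

```python
from collections import defaultdict


def map_phase(lines):
    """Mapper: emit a (word, 1) pair for every word of every input line."""
    pairs = []
    for line in lines:
        for word in line.split():
            pairs.append((word, 1))
    return pairs


def shuffle_phase(pairs):
    """Shuffle and sort: collect all values of the same key, keys sorted."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Reducer: add up the list of values belonging to each key."""
    return {key: sum(values) for key, values in grouped.items()}


lines = [
    "Google is one of the richest companies ",
    "one who works with the Google is technical expert ",
]

mapped = map_phase(lines)          # list of (word, 1) pairs, in input order
shuffled = shuffle_phase(mapped)   # word -> [1, 1, ...]
reduced = reduce_phase(shuffled)   # word -> total count

print("After mapping:  ", mapped)
print("After shuffling:", shuffled)
print("After reducing: ", reduced)
```

Replacing `lines` with the input of question 2 or 3 shows the output of each stage for those questions as well.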