Big Data Analysis Using Map/Reduce
Post on 18-Jul-2015
Big Data Analysis for Page Ranking using Map/Reduce
R.Renuka, R.Vidhya Priya, III B.Sc., IT, The S.F.R.College for Women, Sivakasi.
Overview
Introduction
What is Big Data?
Why Big Data?
4 V’s of Big Data
Big Data Analytics Technologies
Map/Reduce
Applications
Case Study
Conclusion
Introduction
Data have outgrown the storage and processing capabilities of a single host.
Two fundamental challenges:
– how to store and work with voluminous data sizes, and
– how to understand the data and turn it into a competitive advantage.
What is Big Data?
‘Big data’ is similar to ‘small data’, but bigger.
Bigger data requires different approaches: techniques, tools and architectures.
The aim is to solve new problems, and old problems in a better way.
Why Big Data?
Key enablers for the growth of “Big Data” are:
Increase of Processing Power
Increase of Storage Capacities
Availability of Data
Big Data Analytics Technologies
Hadoop
PLATFORA
WibiData
PIG
Hive
MapReduce
NoSQL databases
Column-oriented databases
Hadoop
Hadoop is a distributed file system and data processing engine.
Hadoop has two components:
– The Hadoop Distributed File System (HDFS)
– The MapReduce programming model
Map / Reduce
A high-level abstracted framework for distributed processing of large datasets
Fault tolerance, parallelization
Computation consists of two phases:
– Map
– Reduce
A master-slave architecture
Computation occurs on multiple slave nodes
It tries to keep computation close to the data (data locality) as much as possible.
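The two phases and the master/worker split can be imitated on a single machine. The sketch below is only an illustration under stated assumptions, not Hadoop's API: a thread pool stands in for the worker nodes, word counting stands in for the real job, and the names `map_chunk`, `reduce_counts` and `run_job` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_chunk(lines):
    """Map task: count the words in one input split."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce task: merge the partial counts produced by all map tasks."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(lines, num_workers=2):
    """'Master' role: split the input, hand splits to workers, then reduce."""
    # Round-robin split so that each worker receives one chunk of the input.
    chunks = [lines[i::num_workers] for i in range(num_workers)]
    # The thread pool is an assumed stand-in for a cluster of worker nodes.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_counts(partials)


if __name__ == "__main__":
    data = ["big data is big", "data needs new tools"]
    print(run_job(data))
```

On a real cluster the splits would be blocks already stored on the worker nodes, which is where data locality comes from; here every chunk lives in the same process, so the sketch shows only the control flow.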
MR model
Map
– Process a key/value pair to generate intermediate key/value pairs
Reduce
– Merge all intermediate values associated with the same key
Users implement an interface of two primary methods:
1. Map: (key1, val1) → (key2, val2)
2. Reduce: (key2, [val2]) → [val3]
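Since the talk is about page ranking, the two signatures above can be tried on one round of a simplified PageRank. This is a minimal in-memory sketch, not the authors' case study and not Hadoop code. It assumes a damping factor of 0.85, that every page has at least one outgoing link, and that every link target is itself a key of the graph. All function names are hypothetical.

```python
from collections import defaultdict

DAMPING = 0.85  # conventional damping factor; an assumption, not from the slides


def map_page(page, value):
    """Map: (page, (rank, out_links)) -> [(target, contribution), ...]."""
    rank, out_links = value
    pairs = [(page, 0.0)]  # ensure every page reaches the reducer
    for target in out_links:
        # Each page shares its current rank equally among its out-links.
        pairs.append((target, rank / len(out_links)))
    return pairs


def reduce_page(page, contributions, num_pages):
    """Reduce: (page, [contribution]) -> new rank for that page."""
    return (1 - DAMPING) / num_pages + DAMPING * sum(contributions)


def pagerank_step(graph, ranks):
    """One map/shuffle/reduce round over an in-memory link graph."""
    grouped = defaultdict(list)  # shuffle: group intermediate values by key
    for page, out_links in graph.items():
        for key, val in map_page(page, (ranks[page], out_links)):
            grouped[key].append(val)
    return {page: reduce_page(page, vals, len(graph))
            for page, vals in grouped.items()}


if __name__ == "__main__":
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    ranks = {page: 1 / len(graph) for page in graph}
    print(pagerank_step(graph, ranks))
```

Here the reducer returns a single value per key rather than a list, which is a special case of the [val3] output in the signature. Repeating `pagerank_step` until the ranks stop changing would give the usual iterative computation.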
Applications
Homeland Security
Finance
Smarter Healthcare
Multi-channel sales
Telecom
Manufacturing
Traffic Control
Trading Analytics
Fraud and Risk
Log Analysis
Search Quality
Retail
Conclusion
Real-time big data isn’t just a process for storing
petabytes or exabytes of data in a data warehouse. It’s
about the ability to make better decisions and take
meaningful actions at the right time.