apache hadoop & map reduce


Upload: md-mahedi-mahfuj

Post on 11-May-2015


Category:

Technology




Apache Hadoop, Big Data & MapReduce

WHY BIG DATA:

“More data usually beats better algorithms.”

GOOD NEWS:

“Big data is here.”

BAD NEWS:

We are struggling to store and analyze it.

KEY PROBLEM:

“Storage capacity has increased, but the speed of reading data has not kept up.”

SOLUTION:

Parallelism
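To make the gap between storage and speed concrete, here is a rough back-of-the-envelope sketch. The drive size (1 TB) and transfer rate (100 MB/s) are illustrative assumptions, not figures from these slides, and the helper name `full_scan_seconds` is hypothetical. It also assumes the data is split evenly across the drives and that nothing else limits throughput.

```python
# Illustrative figures only (assumptions, not taken from the slides).
DRIVE_CAPACITY_MB = 1_000_000   # a 1 TB drive
TRANSFER_RATE_MB_S = 100        # sustained read speed of a single drive


def full_scan_seconds(capacity_mb: float, rate_mb_s: float, drives: int = 1) -> float:
    """Time to read the whole dataset when it is split evenly across
    `drives` disks that are all read at the same time."""
    return capacity_mb / (rate_mb_s * drives)


single = full_scan_seconds(DRIVE_CAPACITY_MB, TRANSFER_RATE_MB_S)
parallel = full_scan_seconds(DRIVE_CAPACITY_MB, TRANSFER_RATE_MB_S, drives=100)

print(f"1 drive:    {single / 3600:.1f} hours")
print(f"100 drives: {parallel / 60:.1f} minutes")
```

Under these assumptions, reading from many drives at once turns a scan of hours into a scan of minutes, which is the motivation for parallelism.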

But while implementing parallelism, we may face some noteworthy problems, such as:

Hardware failure

Combining data

Hadoop overcomes these problems by using:

HDFS (Hadoop Distributed File System), which handles hardware failure

MapReduce (a programming model built on keys and values), which handles combining data
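The key/value idea behind MapReduce can be sketched in a few lines of plain Python. This is only a single-machine simulation of the map, shuffle, and reduce steps using a word count, not the Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical, chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: turn one input record into (key, value) pairs."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all values that share the same key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data is here", "big data is big"])
print(counts)
```

In a real cluster the map calls run on many machines in parallel and the framework performs the grouping step. The sketch only shows how keys and values let partial results from different machines be combined.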


In a nutshell,

Hadoop provides:

- A reliable shared storage (by HDFS)

- A reliable analysis system (by MapReduce)

MAPREDUCE:

The entire dataset, or a good portion of it, is processed for each query.

MapReduce is a batch query processor.

Already used by Mailtrust, Rackspace’s mail division, for handling big data.

MAPREDUCE VS RDBMS:

CONCLUSION:

This overview is not a thorough treatment; further research will make it clearer and more complete. More valuable information will be added in the coming days.