TRANSCRIPT

The Limitation of MapReduce: A Probing Case and a Lightweight Solution
Zhiqiang Ma and Lin Gu
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
CLOUD COMPUTING 2010, November 21-26, 2010, Lisbon, Portugal
MapReduce

- MapReduce: a parallel computing framework for large-scale data processing
- Successfully used in datacenters comprising commodity computers
- A fundamental piece of software in the Google architecture for many years
- An open-source variant already exists: Hadoop
- Widely used in solving data-intensive problems

[Figure: many applications running on top of MapReduce, Hadoop, or variants]
Introduction to MapReduce

- Map and Reduce are higher-order functions
- Map: apply an operation to all elements in a list
- Reduce: like "fold"; aggregate the elements of a list
[Figure: computing 1^2 + 2^2 + 3^2 + 4^2 + 5^2 with Map and Reduce.
 m (x^2) maps the input list 1, 2, 3, 4, 5 to 1, 4, 9, 16, 25;
 r (+), starting from the initial value 0, folds over the mapped list,
 producing the partial sums 1, 5, 14, 30 and the final value 55.]
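In ordinary code, this computation is a direct use of the higher-order functions MapReduce takes its names from; a minimal Python sketch:

```python
# The slide's example with Python's built-in higher-order functions:
# map squares each element, reduce folds the results with +.
from functools import reduce

values = [1, 2, 3, 4, 5]
squared = map(lambda x: x * x, values)              # m: x^2 -> 1, 4, 9, 16, 25
total = reduce(lambda acc, x: acc + x, squared, 0)  # r: +, initial value 0
print(total)  # 55
```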
Introduction to MapReduce

Massive parallel processing made simple. Example: word count.
- Map: parse a document and generate <word, 1> pairs
- Reduce: receive all pairs for a specific word, and count
Map:
    // D is a document
    for each word w in D
        output <w, 1>

Reduce for key w:
    count = 0
    for each input item
        count = count + 1
    output <w, count>
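The same logic, sketched as ordinary Python (the shuffle/grouping step that a real MapReduce runtime performs between the two phases is folded into reduce_phase here):

```python
# Word count modeled as plain functions, no cluster assumed.
from collections import defaultdict

def map_phase(document):
    # Emit a <word, 1> pair for each word in the document.
    for word in document.split():
        yield (word, 1)

def reduce_phase(pairs):
    # Group pairs by key and count the items for each word.
    counts = defaultdict(int)
    for word, one in pairs:
        counts[word] += one
    return dict(counts)

print(reduce_phase(map_phase("the quick fox jumps over the lazy dog")))
```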
Thoughts on MapReduce

MapReduce provides an easy-to-use framework for parallel programming.
But is it good for general programs running in datacenters?
Our work

Goal: design a general parallelization framework and programming paradigm for cloud computing.
- Analyze MapReduce's design and use a case study to probe the limitation
- Design a new parallelization framework: MRlite
- Evaluate the new framework's performance
Thoughts on MapReduce

Originally designed for processing large static data sets:
- No significant dependence among data items
- Throughput over latency
- Large-data parallelism over small, maybe ephemeral, parallelization opportunities

[Figure: a stream of inputs flowing through MapReduce to produce output]
The limitation of MapReduce

One-way scalability:
- Allows a program to scale up to process very large data sets
- Constrains the program's ability to process moderate-size data items

This limits the applicability of MapReduce: it is difficult to handle dynamic, interactive, and semantic-rich applications.
A case study on MapReduce

Distributed compiler:
- Very useful in development environments
- Code (data) has dependence
- Abundant parallelization opportunities

A "typical" application, but a hard case for MapReduce.

[Figure: "make -j N" dependency graph for a Linux kernel build -
 object files such as init/version.o, kallsyms.o, driver/built-in.o,
 and mm/built-in.o feed vmlinux-init and vmlinux-main, which link
 into vmlinux]
A case study: mrcc

Develop a distributed compiler using the MapReduce model. How do we extract the parallelizable components in a relatively complex data flow?

mrcc: a distributed compilation system
- The workload is parallelizable but data-dependence constrained
- Explores parallelism using the MapReduce model
mrcc

- Multiple machines are available to MapReduce for parallel compilation
- A master instructs multiple slaves ("map workers") to compile source files
Design of mrcc

[Figure: "make -j N" runs on the master; "make" explores parallelism
 among compiling source files and spawns one MapReduce job per source
 file; the map task of each job, running on a slave, compiles an
 individual file. A sketch of such a map task appears below.]
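The talk does not show mrcc's code, so the following is a hypothetical sketch of what the single map task of one compilation job might do; the paths and invocation details are illustrative assumptions, not mrcc's actual implementation:

```python
# Hypothetical mrcc-style map task: the "map" over a single source
# file is just an invocation of the compiler. Assumes the source has
# already been placed where this node can read it (e.g., via HDFS).
import subprocess
import sys

def map_task(source_path, object_path):
    # Compile one source file into an object file on this slave.
    result = subprocess.run(
        ["gcc", "-c", source_path, "-o", object_path],
        capture_output=True, text=True)
    if result.returncode != 0:
        sys.stderr.write(result.stderr)
    return result.returncode

if __name__ == "__main__":
    sys.exit(map_task(sys.argv[1], sys.argv[2]))
```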
Experiment: mrcc over Hadoop

MapReduce implementation:
- Hadoop 0.20.2

Testbed:
- 10 nodes available to Hadoop for parallel execution
- Nodes are connected by 1 Gbps Ethernet

Workload:
- Compiling the Linux kernel, ImageMagick, and the Xen tools
Result and observation

The compilation using mrcc on 10 nodes is 2 to 11 times slower than sequential compilation on one node.

Project      | Time for gcc (sequential compilation) (min) | Time for mrcc/Hadoop (min)
Linux kernel | 49                                          | 151
ImageMagick  | 5                                           | 11
Xen tools    | 2                                           | 24
Where does the slowdown come from?

For compiling each source file:
- Putting source files into HDFS: >2 s
- Starting a Hadoop job: >20 s
- Retrieving object files: >2 s

Other costs:
- Network communication overhead for data transportation and replication
- Tasking overhead

Is there sufficient parallelism to exploit? Yes: "distcc" serves as the baseline.

The root cause is one-way scalability in the (MapReduce) design and the (Hadoop) implementation. MapReduce is not designed for compiling; we use this case to show some of its limitations.
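A back-of-the-envelope check makes the effect concrete (the per-file compile time and file count are illustrative assumptions, not measurements from the talk): each per-file job pays at least 2 + 20 + 2 = 24 s of fixed overhead, while gcc often compiles a single file in well under a second. For a build of a few thousand files, even with 10 nodes working in parallel, that is on the order of a few thousand × 24 s / 10, i.e. hours of pure overhead, which is consistent with the measured slowdowns above.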
mrcc: Distributed Compilation
Parallelization framework

MapReduce/Hadoop is inefficient for general programming.

Cloud computing needs a general parallelization framework!
- Handle applications with complex logic, data dependence, frequent updates, etc.
- 39% of Facebook's MapReduce workloads have only 1 Map [Zaharia 2010]
- Easy to use and high performance

[Zaharia 2010] M. Zaharia, D. Borthakur, J. S. Sarma, K. Elmeleegy, S. Shenker, and I. Stoica. Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling. EuroSys '10, 2010.
Lightweight solution: MRlite

A lightweight parallelization framework following the MapReduce paradigm:
- Parallelization can be invoked when needed
- Able to scale "up" like MapReduce, and scale "down" to process moderate-size data
- Low latency and massive parallelism
- Small run-time system overhead

A general parallelization framework and programming paradigm for cloud computing.
Architecture of MRlite

[Figure: the application, linked with the MRlite client, talks to the
 MRlite master (scheduler), which commands a set of slaves; solid
 arrows show data flow, dashed arrows show command flow]

- Linked together with the application, the MRlite client library accepts calls from the application and submits jobs to the master
- The MRlite master accepts jobs from clients and schedules them to execute on slaves
- Distributed nodes (slaves) accept tasks from the master and execute them
- High-speed distributed storage holds the intermediate files
Design

Parallelization is invoked when needed:
- An application can request parallel execution an arbitrary number of times
- The program's natural logic flow is integrated with parallelism
- This removes one important limitation

Facility outlives utility:
- Use and reuse threads for master and slaves

Memory is the "first class" medium:
- Avoid touching hard drives
Design

Programming interface:
- Provides a simple API
- The API allows programs to invoke parallel processing during execution (a usage sketch follows below)

Data handling:
- Network file system that stores files in memory
- No replication for intermediate files
- Applications are responsible for retrieving output files

Latency control:
- Jobs and tasks have timeout limits
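The talk describes the API only at this level, so the following is a hypothetical Python sketch of what "invoking parallel processing during execution" could look like from the application's side. The class and method names are illustrative assumptions, and a local thread pool stands in for the real master/slave machinery:

```python
# Hypothetical MRlite-style client usage; not the real MRlite API.
import subprocess
from concurrent.futures import ThreadPoolExecutor

class MRliteClientStub:
    """Stand-in for the MRlite client library linked into the app."""
    def __init__(self, workers=10):
        self.pool = ThreadPoolExecutor(max_workers=workers)

    def submit_job(self, command, args):
        # In real MRlite the job would go to the master, which
        # schedules it onto a slave; here we just run it locally.
        return self.pool.submit(subprocess.run, [command, *args])

client = MRliteClientStub()
jobs = [client.submit_job("gcc", ["-c", src, "-o", src + ".o"])
        for src in ["a.c", "b.c"]]    # parallelism invoked when needed
results = [job.result() for job in jobs]  # block until the jobs finish
```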
Implementation

- Implemented in C as Linux applications
- Distributed file storage: implemented with NFS in memory, mounted from all nodes; stores intermediate files (a setup sketch follows below)
- A specially designed distributed in-memory network file system may further improve performance (future work)
- There is no limitation on the choice of programming languages
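One plausible way to realize "NFS in memory" on Linux is to export a tmpfs over NFS; the commands below are an assumption for illustration, not the configuration used in the talk:

```
# On the storage server: back the export with a RAM-based tmpfs
mount -t tmpfs -o size=4g tmpfs /srv/mrlite

# /etc/exports entry: share it with the cluster (async trades
# durability for speed, acceptable for intermediate files)
/srv/mrlite 10.0.0.0/24(rw,async,no_subtree_check)

# On every node: mount the shared in-memory storage
mount -t nfs server:/srv/mrlite /mnt/mrlite
```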
Evaluation

Re-implement mrcc on MRlite:
- It is not difficult to port mrcc, because MRlite can handle a "superset" of the MapReduce workloads

Testbed and workload:
- Use the same testbed and the same workload to compare MRlite's performance with MapReduce/Hadoop's
Result

The compilation of the three projects using mrcc on MRlite is much faster than compilation on one node. The speedup is at least 2, and the best speedup reaches 6.
MRlite vs. Hadoop

The average speedup of MRlite is more than 12 times better than that of Hadoop. The evaluation shows that MRlite is one order of magnitude faster than Hadoop on problems that MapReduce has difficulty in handling.

Project     | Speedup on Hadoop | Speedup on MRlite | MRlite vs. Hadoop
Linux       | 0.32              | 5.8               | 17.9
ImageMagick | 0.48              | 6.2               | 13.0
Xen tools   | 0.09              | 2.0               | 22.0
Conclusion

Cloud computing needs a general programming framework:
- Cloud computing shall not be a platform for running just simple OLAP applications; it is important to support complex computation and even OLTP on large data sets.

We used the distributed compilation case (mrcc) to probe the one-way scalability limitation of MapReduce.

We designed MRlite, a general parallelization framework for cloud computing:
- Handles applications with complex logic flow and data dependencies
- Mitigates the one-way scalability problem
- Able to handle all MapReduce tasks with comparable (if not better) performance
Conclusion

- Emerging computing platforms increasingly emphasize parallelization capability, such as GPGPU
- MRlite respects applications' natural logic flow and data dependencies
- This modularization of parallelization capability away from application logic enables MRlite to integrate GPGPU processing very easily (future work)
Thank you!