TRANSCRIPT

The Limitation of MapReduce: A Probing Case and a Lightweight Solution
Zhiqiang Ma and Lin Gu
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
CLOUD COMPUTING 2010, November 21-26, 2010, Lisbon, Portugal
MapReduce

- MapReduce: a parallel computing framework for large-scale data processing
- Successfully used in datacenters comprising commodity computers
- A fundamental piece of software in the Google architecture for many years
- An open-source variant already exists: Hadoop
- Widely used in solving data-intensive problems

[Figure: many applications running on top of MapReduce, Hadoop, or variants]
Introduction to MapReduce

- Map and Reduce are higher-order functions
- Map: apply an operation to all elements in a list
- Reduce: like "fold"; aggregate the elements of a list
[Figure: computing 1^2 + 2^2 + 3^2 + 4^2 + 5^2 with Map and Reduce.
 m (x^2) maps the input list 1, 2, 3, 4, 5 to 1, 4, 9, 16, 25;
 r (+), starting from the initial value 0, folds over the mapped list,
 producing the partial sums 1, 5, 14, 30 and the final value 55.]
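In ordinary code, this computation is a direct use of the higher-order functions MapReduce takes its names from; a minimal Python sketch:

```python
# The slide's example with Python's built-in higher-order functions:
# map squares each element, reduce folds the results with +.
from functools import reduce

values = [1, 2, 3, 4, 5]
squared = map(lambda x: x * x, values)              # m: x^2 -> 1, 4, 9, 16, 25
total = reduce(lambda acc, x: acc + x, squared, 0)  # r: +, initial value 0
print(total)  # 55
```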
Introduction to MapReduce

Massive parallel processing made simple. Example: word count.
- Map: parse a document and generate <word, 1> pairs
- Reduce: receive all pairs for a specific word, and count
Map:
    // D is a document
    for each word w in D
        output <w, 1>

Reduce for key w:
    count = 0
    for each input item
        count = count + 1
    output <w, count>
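The same logic, sketched as ordinary Python (the shuffle/grouping step that a real MapReduce runtime performs between the two phases is folded into reduce_phase here):

```python
# Word count modeled as plain functions, no cluster assumed.
from collections import defaultdict

def map_phase(document):
    # Emit a <word, 1> pair for each word in the document.
    for word in document.split():
        yield (word, 1)

def reduce_phase(pairs):
    # Group pairs by key and count the items for each word.
    counts = defaultdict(int)
    for word, one in pairs:
        counts[word] += one
    return dict(counts)

print(reduce_phase(map_phase("the quick fox jumps over the lazy dog")))
```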
Thoughts on MapReduce

MapReduce provides an easy-to-use framework for parallel programming.
But is it good for general programs running in datacenters?
Our work

Goal: design a general parallelization framework and programming paradigm for cloud computing.
- Analyze MapReduce's design and use a case study to probe the limitation
- Design a new parallelization framework: MRlite
- Evaluate the new framework's performance
Thoughts on MapReduce

Originally designed for processing large static data sets:
- No significant dependence among data items
- Throughput over latency
- Large-data parallelism over small, maybe ephemeral, parallelization opportunities

[Figure: a stream of inputs flowing through MapReduce to produce output]
The limitation of MapReduce

One-way scalability:
- Allows a program to scale up to process very large data sets
- Constrains the program's ability to process moderate-size data items

This limits the applicability of MapReduce: it is difficult to handle dynamic, interactive, and semantic-rich applications.
A case study on MapReduce

Distributed compiler:
- Very useful in development environments
- Code (data) has dependence
- Abundant parallelization opportunities

A "typical" application, but a hard case for MapReduce.

[Figure: "make -j N" dependency graph for a Linux kernel build -
 object files such as init/version.o, kallsyms.o, driver/built-in.o,
 and mm/built-in.o feed vmlinux-init and vmlinux-main, which link
 into vmlinux]
A case study: mrcc

Develop a distributed compiler using the MapReduce model. How do we extract the parallelizable components in a relatively complex data flow?

mrcc: a distributed compilation system
- The workload is parallelizable but data-dependence constrained
- Explores parallelism using the MapReduce model
mrcc

- Multiple machines are available to MapReduce for parallel compilation
- A master instructs multiple slaves ("map workers") to compile source files
Design of mrcc

[Figure: "make -j N" runs on the master; "make" explores parallelism
 among compiling source files and spawns one MapReduce job per source
 file; the map task of each job, running on a slave, compiles an
 individual file. A sketch of such a map task appears below.]
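The talk does not show mrcc's code, so the following is a hypothetical sketch of what the single map task of one compilation job might do; the paths and invocation details are illustrative assumptions, not mrcc's actual implementation:

```python
# Hypothetical mrcc-style map task: the "map" over a single source
# file is just an invocation of the compiler. Assumes the source has
# already been placed where this node can read it (e.g., via HDFS).
import subprocess
import sys

def map_task(source_path, object_path):
    # Compile one source file into an object file on this slave.
    result = subprocess.run(
        ["gcc", "-c", source_path, "-o", object_path],
        capture_output=True, text=True)
    if result.returncode != 0:
        sys.stderr.write(result.stderr)
    return result.returncode

if __name__ == "__main__":
    sys.exit(map_task(sys.argv[1], sys.argv[2]))
```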
Experiment: mrcc over Hadoop

MapReduce implementation:
- Hadoop 0.20.2

Testbed:
- 10 nodes available to Hadoop for parallel execution
- Nodes are connected by 1 Gbps Ethernet

Workload:
- Compiling the Linux kernel, ImageMagick, and the Xen tools
Result and observation

The compilation using mrcc on 10 nodes is 2 to 11 times slower than sequential compilation on one node.

Project      | Time for gcc (sequential compilation) (min) | Time for mrcc/Hadoop (min)
Linux kernel | 49                                          | 151
ImageMagick  | 5                                           | 11
Xen tools    | 2                                           | 24
Where does the slowdown come from?

For compiling each source file:
- Putting source files into HDFS: >2 s
- Starting a Hadoop job: >20 s
- Retrieving object files: >2 s

Other costs:
- Network communication overhead for data transportation and replication
- Tasking overhead

Is there sufficient parallelism to exploit? Yes: "distcc" serves as the baseline.

The root cause is one-way scalability in the (MapReduce) design and the (Hadoop) implementation. MapReduce is not designed for compiling; we use this case to show some of its limitations.
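A back-of-the-envelope check makes the effect concrete (the per-file compile time and file count are illustrative assumptions, not measurements from the talk): each per-file job pays at least 2 + 20 + 2 = 24 s of fixed overhead, while gcc often compiles a single file in well under a second. For a build of a few thousand files, even with 10 nodes working in parallel, that is on the order of a few thousand × 24 s / 10, i.e. hours of pure overhead, which is consistent with the measured slowdowns above.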
mrcc: Distributed Compilation
Parallelization framework

MapReduce/Hadoop is inefficient for general programming.

Cloud computing needs a general parallelization framework!
- Handle applications with complex logic, data dependence, frequent updates, etc.
- 39% of Facebook's MapReduce workloads have only 1 Map [Zaharia 2010]
- Easy to use and high performance

[Zaharia 2010] M. Zaharia, D. Borthakur, J. S. Sarma, K. Elmeleegy, S. Shenker, and I. Stoica. Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling. EuroSys '10, 2010.
Lightweight solution: MRlite

A lightweight parallelization framework following the MapReduce paradigm:
- Parallelization can be invoked when needed
- Able to scale "up" like MapReduce, and scale "down" to process moderate-size data
- Low latency and massive parallelism
- Small run-time system overhead

A general parallelization framework and programming paradigm for cloud computing.
Architecture of MRlite

[Figure: the application, linked with the MRlite client, talks to the
 MRlite master (scheduler), which commands a set of slaves; solid
 arrows show data flow, dashed arrows show command flow]

- Linked together with the application, the MRlite client library accepts calls from the application and submits jobs to the master
- The MRlite master accepts jobs from clients and schedules them to execute on slaves
- Distributed nodes (slaves) accept tasks from the master and execute them
- High-speed distributed storage holds the intermediate files
Design

Parallelization is invoked when needed:
- An application can request parallel execution an arbitrary number of times
- The program's natural logic flow is integrated with parallelism
- This removes one important limitation

Facility outlives utility:
- Use and reuse threads for master and slaves

Memory is the "first class" medium:
- Avoid touching hard drives
Design

Programming interface:
- Provides a simple API
- The API allows programs to invoke parallel processing during execution (a usage sketch follows below)

Data handling:
- Network file system that stores files in memory
- No replication for intermediate files
- Applications are responsible for retrieving output files

Latency control:
- Jobs and tasks have timeout limits
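The talk describes the API only at this level, so the following is a hypothetical Python sketch of what "invoking parallel processing during execution" could look like from the application's side. The class and method names are illustrative assumptions, and a local thread pool stands in for the real master/slave machinery:

```python
# Hypothetical MRlite-style client usage; not the real MRlite API.
import subprocess
from concurrent.futures import ThreadPoolExecutor

class MRliteClientStub:
    """Stand-in for the MRlite client library linked into the app."""
    def __init__(self, workers=10):
        self.pool = ThreadPoolExecutor(max_workers=workers)

    def submit_job(self, command, args):
        # In real MRlite the job would go to the master, which
        # schedules it onto a slave; here we just run it locally.
        return self.pool.submit(subprocess.run, [command, *args])

client = MRliteClientStub()
jobs = [client.submit_job("gcc", ["-c", src, "-o", src + ".o"])
        for src in ["a.c", "b.c"]]    # parallelism invoked when needed
results = [job.result() for job in jobs]  # block until the jobs finish
```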
Implementation

- Implemented in C as Linux applications
- Distributed file storage: implemented with NFS in memory, mounted from all nodes; stores intermediate files (a setup sketch follows below)
- A specially designed distributed in-memory network file system may further improve performance (future work)
- There is no limitation on the choice of programming languages
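One plausible way to realize "NFS in memory" on Linux is to export a tmpfs over NFS; the commands below are an assumption for illustration, not the configuration used in the talk:

```
# On the storage server: back the export with a RAM-based tmpfs
mount -t tmpfs -o size=4g tmpfs /srv/mrlite

# /etc/exports entry: share it with the cluster (async trades
# durability for speed, acceptable for intermediate files)
/srv/mrlite 10.0.0.0/24(rw,async,no_subtree_check)

# On every node: mount the shared in-memory storage
mount -t nfs server:/srv/mrlite /mnt/mrlite
```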
Evaluation

Re-implement mrcc on MRlite:
- It is not difficult to port mrcc, because MRlite can handle a "superset" of the MapReduce workloads

Testbed and workload:
- Use the same testbed and the same workload to compare MRlite's performance with MapReduce/Hadoop's
Result

The compilation of the three projects using mrcc on MRlite is much faster than compilation on one node. The speedup is at least 2, and the best speedup reaches 6.
MRlite vs. Hadoop

The average speedup of MRlite is more than 12 times better than that of Hadoop. The evaluation shows that MRlite is one order of magnitude faster than Hadoop on problems that MapReduce has difficulty in handling.

Project     | Speedup on Hadoop | Speedup on MRlite | MRlite vs. Hadoop
Linux       | 0.32              | 5.8               | 17.9
ImageMagick | 0.48              | 6.2               | 13.0
Xen tools   | 0.09              | 2.0               | 22.0
Conclusion

Cloud computing needs a general programming framework:
- Cloud computing shall not be a platform for running just simple OLAP applications; it is important to support complex computation and even OLTP on large data sets.

We used the distributed compilation case (mrcc) to probe the one-way scalability limitation of MapReduce.

We designed MRlite, a general parallelization framework for cloud computing:
- Handles applications with complex logic flow and data dependencies
- Mitigates the one-way scalability problem
- Able to handle all MapReduce tasks with comparable (if not better) performance
Conclusion

- Emerging computing platforms increasingly emphasize parallelization capability, such as GPGPU
- MRlite respects applications' natural logic flow and data dependencies
- This modularization of parallelization capability away from application logic enables MRlite to integrate GPGPU processing very easily (future work)
Thank you!