mapreduce - read-only

35
MapReduce Vitaly Shmatikov CS 5450

Upload: others

Post on 19-Nov-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: mapreduce - Read-Only

MapReduce

Vitaly Shmatikov

CS 5450

Page 2: mapreduce - Read-Only

BackRub (Google), 1997

slide 2

Page 3: mapreduce - Read-Only

NEC Earth Simulator, 2002

slide 3

Page 4: mapreduce - Read-Only

Conventional HPC Machine

uCompute nodes• High-end processors• Lots of RAM

uNetwork• Specialized• Very high performance

uStorage server• RAID disk array

slide 4

Page 5: mapreduce - Read-Only

HPC Programming Model

uPrograms described at a very low level• Detailed control of

processing and communication

uRely on small number of software packages• Written by specialists• Limits classes of

problems and solution methods

slide 5

Page 6: mapreduce - Read-Only

Typical HPC Operation

uCharacteristics• Long-lived processes• Make use of spatial locality• Hold all data in memory (no disk access)• High-bandwidth communication

uStrengths• High utilitization of resources• Effective for many scientific applications

uWeaknesses• Requires careful tuning of applications to

resources• Intolerant of any variability

slide 6

Page 7: mapreduce - Read-Only

HPC Fault Tolerance

uCheckpoint• Periodically store state of all

processes• Significant I/O traffic

uRestore after failure• Reset state to last checkpoint• All intervening computation

wasteduPerformance scaling

• Very sensitive to the number of failing components

slide 7

wasted

Page 8: mapreduce - Read-Only

Google Data Center, 2016

slide 8

Dozens of these!

Page 9: mapreduce - Read-Only

Ideal Cluster Programming Model

uApplications written in terms of high-level operations on the data

uRuntime system controls scheduling, load balancing

slide 9

Page 10: mapreduce - Read-Only

MapReduce, 2004

slide 10

Page 11: mapreduce - Read-Only

MapReduce Programming Model

uMap computation across many data objects• For example, 1010 webpages

uAggregate results in many different waysuSystem deals with resource allocation and availability

slide 11

Page 12: mapreduce - Read-Only

Programming in Lisp

slide 12

Computing the sum of squares

• (map square ‘(1 2 3 4))Processes each record sequentially and

independentlyOutput: (1 4 9 16)

• (reduce + ‘(1 4 9 16))Computes (+ 16 (+ 9 (+ 4 1) ) )Output: 30

Page 13: mapreduce - Read-Only

Application: Word Count

slide 13

SELECT count(word) FROM data GROUP BY word

Page 14: mapreduce - Read-Only

Partial Aggregation

1. In parallel, each worker computes word counts from individual files

2. Collect results, wait until all finished3. Merge intermediate output4. Compute word count on merged intermediates

slide 14

Page 15: mapreduce - Read-Only

Map

slide 15

Process individual records to generate (key, value) pairs

Welcome Everyone

Hello Everyone

Welcome 1Everyone 1 Hello 1Everyone 1

ValueKey

Page 16: mapreduce - Read-Only

Parallel Map

slide 16

Process individual records to generate (key, value) pairs in parallel

Welcome Everyone

Hello Everyone

Welcome 1Everyone 1

Hello 1Everyone 1

Key Value

map task 1

map task 2

Process large number of records in parallel

Page 17: mapreduce - Read-Only

Reduce

slide 17

Merge all intermediate values per key

Welcome 1Everyone 1 Hello 1Everyone 1

ValueKey

Everyone 2Welcome 1 Hello 1

ValueKey

Page 18: mapreduce - Read-Only

Partitioning

slide 18

Merge all intermediate values in parallel:partition keys, assign each key to one Reduce

Welcome 1Everyone 1 Hello 1Everyone 1

ValueKey

Everyone 2Welcome 1 Hello 1

ValueKey

Reduce task 1

Reduce task 2

Page 19: mapreduce - Read-Only

MapReduce API

slide 19

map(key, value) -> list(<k’, v’>)

• Applies function to (key, value) pair and produces set of intermediate pairs

reduce(key, list<value>) -> <k’, v’>

• Applies aggregation function to values• Outputs result

Page 20: mapreduce - Read-Only

MapReduce WordCount

slide 20

map(key, value):for each word w in value:

EmitIntermediate(w, "1");

reduce(key, list(values): int result = 0;

for each v in values: result += ParseInt(v);

Emit(AsString(result));

Page 21: mapreduce - Read-Only

Optimizations

slide 21

combine(list<key, value>) -> list<k,v>

• Perform partial aggregation on each mapper node<the, 1>, <the, 1>, <the, 1> à <the, 3>

• reduce() should be commutative and associative

partition(key, int) -> int• Need to aggregate intermediate values with same key• Given n partitions, map key to partition 0 ≤ i < n

– Typically via hash(key) mod n - but not always!

Page 22: mapreduce - Read-Only

Putting It Together

slide 22

map combine partition reduce

Page 23: mapreduce - Read-Only

Synchronization Barrier

slide 23

Page 24: mapreduce - Read-Only

App: grep

u Input: large set of filesuOutput: lines that match pattern

uMap:Output a line if it matches the supplied patternuReduce: Copy the intermediate data to output

slide 24

Page 25: mapreduce - Read-Only

App: Reverse Web-Link Graph

u Input: web graph, i.e., tuples (A, B) where page A links to page B

uOutput: for each page, list of pages that link to it

uMap:For each (source, target), output (target, source)uReduce: Output (target, list(source))

slide 25

Page 26: mapreduce - Read-Only

App: URL Access Frequencyu Input: log of accessed URLs (e.g., from a proxy)u Output: for each URL, % of total accesses for that URL

u Map:For each URL, output (URL, 1)u Multiple reducers: Output (URL, urlCount)Chain another MapReduce...u Map:For each (URL, urlCount), output (1, (URL, urlCount))u Reduce: Compute TotalCount, output multiple (URL, urlCount/TotalCount)

slide 26

Page 27: mapreduce - Read-Only

App: Sorting

u Input: series of valuesuOutput: sorted values

uMap:For each value, output (value, _)uPartitioning:Partition keys across reducers based on ranges

• Take data distribution into accounts to balance reducer tasksuReduce: Copy the intermediate data to output

slide 27

Page 28: mapreduce - Read-Only

MapReduce Features

uCharacteristics• Computation broken up into

many short-lived tasks (mapping, reducing)

• Disk storage to hold intermediate results

uStrengths• Very flexible placement,

scheduling, load balancing• Can access large datasets

uWeaknesses• Higher overhead• Lower raw performance

slide 28

Page 29: mapreduce - Read-Only

MapReduce Execution

slide 29

Page 30: mapreduce - Read-Only

Fault Tolerance in MapReduce

uMap worker writes intermediate output to local disk, separated by partitioning; once completed, tells master node

uReduce worker told of location of mapper outputs, pulls their partition’s data, executes function across data

uNote:• “All-to-all” shuffle between mappers and

reducers• Written to disk (“materialized”) before each

stage

slide 30

Page 31: mapreduce - Read-Only

Fault Tolerance in MapReduce

uMaster node monitors state of system• If master fails, job aborts

uMap worker failure• In-progress and completed tasks marked as idle• Reduce workers notified when map task is re-

executed on another map workeruReducer worker failure

• In-progress tasks are reset and re-executed• Completed tasks had been written to global file

systemslide 31

Page 32: mapreduce - Read-Only

Stragglers

uStraggler = task that takes long time to execute• Bugs, flaky hardware, poor partitioning

uFor slow map tasks, execute in parallel on second “map” worker as backup, race to complete task

uWhen done with most tasks, reschedule any remaining executing tasks• Keep track of redundant executions• Significantly reduces overall run time

slide 32

Page 33: mapreduce - Read-Only

Straggler Mitigation

slide 33

Page 34: mapreduce - Read-Only

Goodbye MapReduce

slide 34

Page 35: mapreduce - Read-Only

Modern Data Processing

slide 35

Big driver: machine learning