UMass Lowell Computer Science 91.503
Analysis of Algorithms Prof. Karen Daniels
Fall, 2001
Lecture 9
Tuesday, 11/20/01
Parallel Algorithms
Chapters 28, 30
Relevant Sections of Chapters 28-30
Ch 28: Sorting Networks
You’re responsible for the material in this chapter that we discuss in lecture. (Note that this includes all of Sections 28.1 - 28.5.)
Ch 29: Arithmetic Circuits
You’re not responsible for any of the material in this chapter. We will not be discussing it in lecture.
Ch 30: Algorithms for Parallel Computers
You’re responsible for the material in this chapter that we discuss in lecture. (Note that this includes only Sections 30.1 - 30.2.)
Overview
Sorting Networks
- Comparison Networks
- 0-1 Principle
- Bitonic Sorting Network
- Merging Network
- Sorting Network
Algorithms for Parallel Computers
- PRAM Model
- Pointer Jumping
- CRCW Algorithms vs. EREW Algorithms
Sorting Networks (Chapter 28)
- Comparison Networks
- 0-1 Principle
- Bitonic Sorting Network
- Merging Network
- Sorting Network
Comparison Networks: Definition
- A comparison network performs only comparisons.
- Comparisons may occur in parallel.
- 2-input comparator: takes values x and y on its input wires and produces min(x, y) and max(x, y) on its output wires.
- A comparison network contains only comparators and wires.
source: 91.503 textbook Cormen et al.
Comparison Networks: Definition (continued)
- The graph of interconnections must be acyclic.
- Define time using wire depth.
- Running time: a comparator uses Θ(1) time.
- An input wire has depth 0.
- A comparator whose input wires have depths $d_x$ and $d_y$ has output wires of depth $\max(d_x, d_y) + 1$.
- Depth of a comparison network = maximum depth of a comparator.
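The depth rule above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's code: the function name `network_depth`, the representation of a network as a list of wire-index pairs, and the 4-wire example `NET4` are all assumptions made here.

```python
def network_depth(n_wires, comparators):
    """Return the depth of a comparison network.

    comparators is a list of (i, j) wire-index pairs, listed in an order
    consistent with the acyclic interconnection graph (assumed format).
    """
    depth = [0] * n_wires                 # every input wire has depth 0
    for i, j in comparators:
        d = max(depth[i], depth[j]) + 1   # output depth = max(d_x, d_y) + 1
        depth[i] = depth[j] = d
    return max(depth) if depth else 0


# Hypothetical example: a 5-comparator network on 4 wires.
NET4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
print(network_depth(4, NET4))
```

Comparators that touch disjoint wires, such as (0, 1) and (2, 3), get the same depth, which is how the model captures comparisons made in parallel.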
Sorting Network: Definition
Sorting network: a comparison network for which the output sequence is monotonically increasing for every input sequence.
Example: [figure: sample sorting network]
Sorting Network: Structure
Families of comparison networks (recursive structure), each built from the ones before it:
- COMPARATORs
- HALF-CLEANERs
- BITONIC-SORTERs (bitonic sorting networks)
- MERGERs (merging networks)
- SORTERs (sorting networks)
"Parallel MergeSort" strategy: sort n values in $O(\lg^2 n)$ time.
0-1 Principle
If a sorting network works correctly for {0,1} inputs, it works correctly on arbitrary input numbers.
The proof relies on function monotonicity, which allows us to limit attention to {0,1} inputs.
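One practical reading of the 0-1 principle: to test an n-wire network, it is enough to try the 2^n inputs over {0,1} instead of all orderings of arbitrary numbers. The sketch below assumes a network is a list of (i, j) wire pairs with the minimum routed to wire i; the names `apply_network`, `sorts_all_zero_one_inputs`, and `NET4` are hypothetical.

```python
from itertools import product


def apply_network(comparators, values):
    """Run the values through each comparator in order."""
    out = list(values)
    for i, j in comparators:          # min goes to wire i, max to wire j
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def sorts_all_zero_one_inputs(n_wires, comparators):
    """Check the network on every one of the 2^n inputs over {0, 1}."""
    return all(
        apply_network(comparators, bits) == sorted(bits)
        for bits in product((0, 1), repeat=n_wires)
    )


# Hypothetical 4-wire example network.
NET4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
print(sorts_all_zero_one_inputs(4, NET4))
```

Dropping the last three comparators of `NET4` makes the check fail, for instance on the input 0, 1, 0, 0.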
0-1 Principle (continued)
Proof of Lemma 28.1:
- If f is monotonically increasing, then a comparator with inputs f(x), f(y) produces outputs f(min(x,y)), f(max(x,y)).
- Induction on wire depth extends this from a single comparator to the whole network.
0-1 Principle (continued)
Example applying Lemma 28.1: $f(x) = \lceil x/2 \rceil$
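The lemma can be checked numerically: applying a monotonically increasing f to the inputs and then running the network gives the same result as running the network and then applying f to the outputs. The sketch below uses f(x) = ⌈x/2⌉; the two-comparator network and the input values are hypothetical choices made for illustration.

```python
def f(x):
    """Monotonically increasing function: ceil(x / 2) for integers."""
    return -(-x // 2)


def apply_network(comparators, values):
    """Run the values through each comparator in order (min to wire i)."""
    out = list(values)
    for i, j in comparators:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


net = [(0, 1), (2, 3)]      # any comparison network, not necessarily a sorter
a = [9, 5, 2, 6]            # assumed sample input

after_then_f = [f(v) for v in apply_network(net, a)]
f_then_network = apply_network(net, [f(v) for v in a])
print(after_then_f == f_then_network)
```

The network here deliberately does not sort, since the lemma holds for every comparison network.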
Bitonic Sorting Network
Bitonic sequence: monotonically increases then monotonically decreases, or can be circularly shifted to conform to this.
- Example: < 1, 4, 6, 8, 3, 2 >
- A {0,1} bitonic sequence has the structure $0^i 1^j 0^k$ or $1^i 0^j 1^k$, with $i, j, k \ge 0$.
Bitonic sorter: a comparison network that sorts bitonic {0,1} sequences; it will be used to construct the sorting network.
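The definition can be turned into a direct (if slow) test: try every circular shift and check whether it rises and then falls. This is a brute-force sketch for illustration only; the function name `is_bitonic` is an assumption, and ties are treated as allowed in both the rising and falling parts.

```python
def is_bitonic(seq):
    """True if some circular shift of seq is non-decreasing then non-increasing."""
    seq = list(seq)

    def up_then_down(s):
        k = 0
        while k + 1 < len(s) and s[k] <= s[k + 1]:   # rising part
            k += 1
        while k + 1 < len(s) and s[k] >= s[k + 1]:   # falling part
            k += 1
        return k + 1 >= len(s)                        # consumed the whole sequence

    return any(up_then_down(seq[r:] + seq[:r]) for r in range(max(len(seq), 1)))


print(is_bitonic([1, 4, 6, 8, 3, 2]))
print(is_bitonic([1, 1, 0, 0, 1, 1]))   # the 1^i 0^j 1^k shape
print(is_bitonic([0, 1, 0, 1]))
```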
Bitonic Sorting Network (continued)
The bitonic sorter uses HALF-CLEANERs.
HALF-CLEANER:
- comparison network of depth 1
- input line i is compared with line i + n/2 for i = 1, 2, …, n/2
Sample inputs and outputs: [figure]
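A half-cleaner is a single layer of comparators, so it is short to write down. The sketch below uses 0-based indices (line i against line i + n/2 for i = 0, ..., n/2 - 1) and assumes n is even; the name `half_cleaner` and the sample input are illustrative.

```python
def half_cleaner(values):
    """One layer of comparators: line i against line i + n/2 (n assumed even)."""
    n = len(values)
    half = n // 2
    out = list(values)
    for i in range(half):
        lo, hi = sorted((out[i], out[i + half]))
        out[i], out[i + half] = lo, hi     # smaller value stays on the top line
    return out


print(half_cleaner([0, 0, 1, 1, 1, 0, 0, 0]))   # -> [0, 0, 0, 0, 1, 0, 1, 1]
```

On this bitonic 0-1 input the top half comes out all zeros ("clean"), the bottom half is still bitonic, and every top value is at most every bottom value.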
Bitonic Sorting Network (continued)
BITONIC-SORTER[n]: a HALF-CLEANER[n] followed by two BITONIC-SORTER[n/2] networks, one on each half of the lines.
Recurrence for depth of BITONIC-SORTER[n]: one HALF-CLEANER level plus the depth of BITONIC-SORTER[n/2].
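The recursive structure translates directly into a recursive function. The sketch below is a sequential simulation, assuming the input is bitonic and its length is a power of 2; it also counts levels, following the standard recurrence D(n) = D(n/2) + 1 with D(1) = 0, which solves to D(n) = lg n. The name `bitonic_sorter` and the return format are assumptions.

```python
def bitonic_sorter(values):
    """Sort a bitonic sequence (length a power of 2).

    Returns (sorted_list, depth), where depth counts comparator levels.
    """
    n = len(values)
    if n <= 1:
        return list(values), 0
    half = n // 2
    # HALF-CLEANER[n]: line i against line i + n/2.
    top = [min(values[i], values[i + half]) for i in range(half)]
    bottom = [max(values[i], values[i + half]) for i in range(half)]
    # Two BITONIC-SORTER[n/2] networks work side by side on the halves.
    top_sorted, depth = bitonic_sorter(top)
    bottom_sorted, _ = bitonic_sorter(bottom)
    return top_sorted + bottom_sorted, depth + 1


print(bitonic_sorter([1, 4, 6, 8, 7, 5, 3, 2]))
```

For n = 8 the depth returned is 3 = lg 8.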
Merging Network
Merge 2 sorted input sequences into 1 sorted output sequence.
- Use a modification of BITONIC-SORTER.
- Key idea: for sorted input sequences X, Y, the sequence XY^R (X followed by the reversal of Y) is bitonic.
- So X and Y can be merged using BITONIC-SORTER(XY^R).
- Challenge: perform the reversal implicitly.
Merging Network (continued)
MERGER[n]: a modified first stage (which performs the reversal implicitly) followed by two BITONIC-SORTER[n/2] networks.
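One way to perform the reversal implicitly is to rewire the first stage: instead of comparing line i with line i + n/2, compare line i with line n - 1 - i (0-based). The sketch below assumes that rewiring, two sorted inputs of equal power-of-2 length, and hypothetical names `bitonic_sorter` and `merger`.

```python
def bitonic_sorter(values):
    """Sort a bitonic sequence whose length is a power of 2."""
    n = len(values)
    if n <= 1:
        return list(values)
    half = n // 2
    top = [min(values[i], values[i + half]) for i in range(half)]
    bottom = [max(values[i], values[i + half]) for i in range(half)]
    return bitonic_sorter(top) + bitonic_sorter(bottom)


def merger(x, y):
    """Merge two sorted lists of equal power-of-2 length."""
    z = list(x) + list(y)
    n, half = len(z), len(x)
    # Modified first stage: line i against line n-1-i, so y is never reversed explicitly.
    top = [min(z[i], z[n - 1 - i]) for i in range(half)]
    bottom = [max(z[half - 1 - j], z[half + j]) for j in range(half)]
    # Both halves are now bitonic, and every top value <= every bottom value.
    return bitonic_sorter(top) + bitonic_sorter(bottom)


print(merger([1, 4, 6, 8], [2, 3, 5, 7]))   # -> [1, 2, 3, 4, 5, 6, 7, 8]
```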
Sorting Network
SORTER[n]: two SORTER[n/2] networks followed by MERGER[n].
Unrolled for n = 8: four MERGER[2] networks, then two MERGER[4] networks, then one MERGER[8].
Recurrence for depth of SORTER[n]:
$$D(n) = \begin{cases} 0 & \text{if } n = 1, \\ D(n/2) + \lg n & \text{if } n = 2^k \text{ and } k \ge 1. \end{cases}$$
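Putting the pieces together gives the "parallel MergeSort" structure. The sketch below is a sequential simulation under the assumption that n is a power of 2; `sorter_depth` simply evaluates the recurrence above, whose solution is lg n (lg n + 1) / 2, that is, Θ(lg² n). All function names are hypothetical.

```python
def bitonic_sorter(v):
    """Sort a bitonic sequence whose length is a power of 2."""
    if len(v) <= 1:
        return list(v)
    h = len(v) // 2
    top = [min(v[i], v[i + h]) for i in range(h)]
    bottom = [max(v[i], v[i + h]) for i in range(h)]
    return bitonic_sorter(top) + bitonic_sorter(bottom)


def merger(x, y):
    """Merge two sorted lists of equal length (first stage: line i vs. n-1-i)."""
    z = list(x) + list(y)
    n, h = len(z), len(x)
    top = [min(z[i], z[n - 1 - i]) for i in range(h)]
    bottom = [max(z[h - 1 - j], z[h + j]) for j in range(h)]
    return bitonic_sorter(top) + bitonic_sorter(bottom)


def sorter(v):
    """SORTER[n]: two SORTER[n/2] networks followed by MERGER[n]."""
    if len(v) <= 1:
        return list(v)
    h = len(v) // 2
    return merger(sorter(v[:h]), sorter(v[h:]))


def sorter_depth(n):
    """Evaluate D(n) = D(n/2) + lg n with D(1) = 0, for n a power of 2."""
    return 0 if n == 1 else sorter_depth(n // 2) + (n.bit_length() - 1)


print(sorter([5, 3, 8, 1, 9, 2, 7, 4]))
print(sorter_depth(8))   # 1 + 2 + 3 = 6
```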
Algorithms for Parallel Computers (Chapter 30)
- PRAM Model
- Pointer Jumping
- CRCW Algorithms vs. EREW Algorithms
PRAM Model
Need a model for parallel computing:
- The RAM model is serial.
- The sorting network model (Ch 28) is too restrictive.
- Popular model: PRAM (Parallel Random Access Machine).
PRAM Model (continued)
Memory access policies:
- Common-CRCW model: when processors write “simultaneously” to the same memory location, they write the same value.
- Alternatives: [figure]
Section 30.1Section 30.1
Section 30.2Section 30.2
Pointer Jumping: List Ranking
List ranking problem: given a singly-linked list of n objects, compute, for each object, its distance from the end of the list.
- O(lg n) time
- Θ(n lg n) work, where work = time x #processors
- Correctness invariant: at the start of each iteration of the while loop, for each object i, the sum of the d values for the sublist headed by i equals the correct d[i].
- Running-time invariant: each step of pointer jumping transforms each list into 2 interleaved lists (even, odd).
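The pointer-jumping loop can be simulated sequentially by computing every object's new values from a snapshot of the old ones, which mimics the synchronous PRAM step. This is a sketch under assumptions: the list is given as an array `next_` of successor indices with None at the end, and the name `list_rank` is hypothetical.

```python
def list_rank(next_):
    """Return d, where d[i] is the distance from object i to the end of its list."""
    n = len(next_)
    nxt = list(next_)
    d = [0 if nxt[i] is None else 1 for i in range(n)]
    while any(p is not None for p in nxt):
        new_d, new_nxt = list(d), list(nxt)    # snapshot: all reads precede all writes
        for i in range(n):                     # conceptually one processor per object
            if nxt[i] is not None:
                new_d[i] = d[i] + d[nxt[i]]
                new_nxt[i] = nxt[nxt[i]]       # pointer jumping
        d, nxt = new_d, new_nxt
    return d


# Assumed example list: 3 -> 0 -> 2 -> 1 -> end
print(list_rank([2, None, 1, 0]))   # -> [2, 0, 1, 3]
```

Each pass of the while loop halves the remaining list lengths, which is where the O(lg n) time bound comes from.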
Pointer Jumping: Prefix
- O(lg n) time
- Differences from LIST-RANK: start with x[i] = x_k in each object i of the list (the k-th object holds x_k).
- Correctness invariant: at the end of the t-th iteration of the while loop, the k-th processor stores $[\max(1, k - 2^t + 1), k]$, where $[i, j]$ denotes $x_i \otimes x_{i+1} \otimes \cdots \otimes x_j$.
- At each step, if we perform a prefix computation on each existing list, each object obtains its correct value.
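The prefix variant differs from list ranking in that each object updates its successor's value rather than its own. The sketch below simulates the synchronous step with snapshots, as before; the name `list_prefix`, the `next_` array format, and the default operator are assumptions.

```python
import operator


def list_prefix(x, next_, op=operator.add):
    """Return y, where y[i] combines the values from the head of the list through i."""
    n = len(x)
    y, nxt = list(x), list(next_)
    while any(p is not None for p in nxt):
        new_y, new_nxt = list(y), list(nxt)    # snapshot for the synchronous step
        for i in range(n):                     # conceptually one processor per object
            if nxt[i] is not None:
                # Predecessor's value goes on the left, so op need not be commutative.
                new_y[nxt[i]] = op(y[i], y[nxt[i]])
                new_nxt[i] = nxt[nxt[i]]
        y, nxt = new_y, new_nxt
    return y


# Assumed example list order: 3 -> 0 -> 2 -> 1 -> end
print(list_prefix([10, 20, 30, 40], [2, None, 1, 0]))   # -> [50, 100, 80, 40]
```

Because distinct objects have distinct successors, no two objects write to the same location in a step.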
Pointer Jumping: Euler Tour
Problem: compute the depth of each node in an n-node binary tree.
1) Construct an Euler tour of a graph (a cycle traversing each edge exactly once). O(1) time
2) Initialize a value for each processor (3 processors per node).
3) Parallel prefix computation using +. O(lg n) time
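The three steps can be sketched as follows, assuming the textbook's scheme in which each node's three processors (called A, B, C here) hold 1, 0, and -1, and a node's depth is the prefix sum at its C processor. The prefix sum is computed by a plain sequential walk as a stand-in for the parallel prefix step; the tree format (dicts `left` and `right`) and the name `tree_depths` are hypothetical.

```python
def tree_depths(left, right, root):
    """left/right map each node to its child (or None). Returns {node: depth}."""
    parent = {}
    for node in left:
        for child in (left[node], right[node]):
            if child is not None:
                parent[child] = node

    # Step 1: Euler tour as a linked list over the (node, processor) pairs.
    succ = {}
    for v in left:
        succ[(v, "A")] = (left[v], "A") if left[v] is not None else (v, "B")
        succ[(v, "B")] = (right[v], "A") if right[v] is not None else (v, "C")
        if v == root:
            succ[(v, "C")] = None
        elif left[parent[v]] == v:
            succ[(v, "C")] = (parent[v], "B")   # returning from a left child
        else:
            succ[(v, "C")] = (parent[v], "C")   # returning from a right child

    # Step 2: initial values (assumed scheme: A = 1, B = 0, C = -1).
    weight = {"A": 1, "B": 0, "C": -1}

    # Step 3: prefix sums along the tour (sequential stand-in for parallel prefix).
    depth, total, cur = {}, 0, (root, "A")
    while cur is not None:
        total += weight[cur[1]]
        if cur[1] == "C":
            depth[cur[0]] = total
        cur = succ[cur]
    return depth


left = {"r": "a", "a": "c", "b": None, "c": None}
right = {"r": "b", "a": None, "b": None, "c": None}
print(tree_depths(left, right, "r"))
```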
CRCW vs. EREW Algorithms
Problem where concurrent reads help: find the identities of tree roots in a forest.
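A pointer-jumping sketch of the root-finding problem: every node repeatedly copies its parent's root and jumps its parent pointer. Several children may read the same parent in one step, which is where concurrent reads come in. The synchronous step is simulated with snapshots; the `parent` array format and the name `find_roots` are assumptions.

```python
def find_roots(parent):
    """parent[i] is the index of i's parent, or None for a root. Returns root[i] for all i."""
    n = len(parent)
    par = list(parent)
    root = [i if par[i] is None else None for i in range(n)]
    while any(p is not None for p in par):
        new_root, new_par = list(root), list(par)   # snapshot for the synchronous step
        for i in range(n):                          # conceptually one processor per node
            if par[i] is not None:
                new_root[i] = root[par[i]]          # siblings read the same cell concurrently
                new_par[i] = par[par[i]]            # pointer jumping
        root, par = new_root, new_par
    return root


# Assumed forest: tree rooted at 0 with nodes 1, 2, 3; tree rooted at 4 with node 5.
print(find_roots([None, 0, 0, 1, None, 4]))   # -> [0, 0, 0, 0, 4, 4]
```

The number of loop iterations is logarithmic in the maximum depth of a tree in the forest.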
CRCW vs. EREW Algorithms (continued)
Problem where concurrent writes help: find the maximum element in an array of real numbers.
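A sketch of how concurrent writes help, assuming the common-CRCW model and n² processors, one per pair (i, j): every processor that sees A[i] < A[j] writes False into the same flag for i, so all concurrent writers agree on the value. The nested loops below stand in for a single parallel step; the name `fast_max` is hypothetical.

```python
def fast_max(a):
    """Return the maximum of a non-empty list, in the style of a constant-time CRCW algorithm."""
    n = len(a)
    is_max = [True] * n
    for i in range(n):            # conceptually n * n processors, one per (i, j)
        for j in range(n):
            if a[i] < a[j]:
                is_max[i] = False  # all concurrent writers write the same value
    return next(a[i] for i in range(n) if is_max[i])


print(fast_max([3.5, -1.0, 7.25, 7.25, 0.0]))   # -> 7.25
```

With exclusive writes only, the flags could not all be settled in one step.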