![Page 1: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/1.jpg)
Chapter 2
Parallel Architectures
![Page 2: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/2.jpg)
Outline
- Interconnection networks
- Processor arrays
- Multiprocessors
- Multicomputers
- Flynn’s taxonomy
![Page 3: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/3.jpg)
Interconnection Networks
- Uses of interconnection networks
  - Connect processors to shared memory
  - Connect processors to each other
- Interconnection media types
  - Shared medium
  - Switched medium
![Page 4: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/4.jpg)
Shared versus Switched Media
![Page 5: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/5.jpg)
Shared Medium
- Allows only one message at a time
- Messages are broadcast
- Each processor “listens” to every message
- Arbitration is decentralized
- Collisions require resending of messages
- Ethernet is an example
![Page 6: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/6.jpg)
Switched Medium
- Supports point-to-point messages between pairs of processors
- Each processor has its own path to the switch
- Advantages over shared media
  - Allows multiple messages to be sent simultaneously
  - Allows scaling of the network to accommodate an increase in processors
![Page 7: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/7.jpg)
Switch Network Topologies
- View switched network as a graph
  - Vertices = processors or switches
  - Edges = communication paths
- Two kinds of topologies
  - Direct
  - Indirect
![Page 8: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/8.jpg)
Direct Topology
- Ratio of switch nodes to processor nodes is 1:1
- Every switch node is connected to
  - 1 processor node
  - At least 1 other switch node
![Page 9: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/9.jpg)
Indirect Topology
- Ratio of switch nodes to processor nodes is greater than 1:1
- Some switches simply connect other switches
![Page 10: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/10.jpg)
Evaluating Switch Topologies
- Diameter
  - Distance between the two farthest nodes
  - Clique K_n is best: d = O(1), but the number of edges is m = O(n^2)
  - m = O(n) in a path P_n or cycle C_n, but d = O(n) as well
- Bisection width
  - Minimum number of edges in a cut that divides the network into two roughly equal halves; determines the minimum bandwidth of the network
  - K_n’s bisection width is Θ(n^2) (a balanced cut severs n^2/4 edges), but C_n’s is O(1)
- Degree = number of edges per node
  - A constant-degree board can be mass produced
- Constant edge length? (yes/no)
- Planar? (easier to build)
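The trade-off between a clique and a cycle can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper names (`clique`, `cycle`, `diameter`, `edge_count`, `cut_size`) are hypothetical, and `cut_size` measures one particular balanced split rather than searching for the minimum over all splits.

```python
from collections import deque

def clique(n):
    """Adjacency lists for the complete graph K_n."""
    return {u: [v for v in range(n) if v != u] for u in range(n)}

def cycle(n):
    """Adjacency lists for the cycle C_n (assumes n >= 3)."""
    return {u: [(u - 1) % n, (u + 1) % n] for u in range(n)}

def eccentricity(graph, source):
    """Largest shortest-path distance from source, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def diameter(graph):
    """Distance between the two farthest nodes."""
    return max(eccentricity(graph, u) for u in graph)

def edge_count(graph):
    """Each undirected edge appears in two adjacency lists."""
    return sum(len(nbrs) for nbrs in graph.values()) // 2

def cut_size(graph, half):
    """Number of edges with exactly one endpoint in `half`."""
    half = set(half)
    return sum(1 for u in half for v in graph[u] if v not in half)
```

For n = 8, the clique has diameter 1 but 28 edges, while the cycle has only 8 edges but diameter 4. Splitting nodes 0 to 3 from nodes 4 to 7 cuts 2 edges in the cycle and 16 in the clique.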
![Page 11: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/11.jpg)
2-D Mesh Network
- Direct topology
- Switches arranged into a 2-D lattice
- Communication allowed only between neighboring switches
- Variants allow wraparound connections between switches on the edge of the mesh
![Page 12: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/12.jpg)
2-D Meshes
[Figure: 2-D meshes, including a torus]
![Page 13: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/13.jpg)
Evaluating 2-D Meshes
- Diameter: Θ(n^(1/2))
- m = Θ(n)
- Bisection width: Θ(n^(1/2))
- Number of edges per switch: 4
- Constant edge length? Yes
- Planar
![Page 14: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/14.jpg)
Binary Tree Network
- Indirect topology
- n = 2^d processor nodes, n - 1 switches
![Page 15: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/15.jpg)
Evaluating Binary Tree Network
- Diameter: 2 log n
- m = O(n)
- Bisection width: 1
- Edges per node: 3
- Constant edge length? No
- Planar
![Page 16: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/16.jpg)
Hypertree Network
- Indirect topology
- Shares low diameter of binary tree
- Greatly improves bisection width
- From “front” looks like k-ary tree of height d
- From “side” looks like upside-down binary tree of height d
![Page 17: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/17.jpg)
Hypertree Network
![Page 18: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/18.jpg)
Evaluating 4-ary Hypertree
- Diameter: log n
- Bisection width: n / 2
- Edges per node: 6
- Constant edge length? No
![Page 19: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/19.jpg)
Butterfly Network
- Indirect topology
- n = 2^d processor nodes connected by n(log n + 1) switching nodes
[Figure: butterfly network for processors 0 to 7; switching nodes are labeled (rank, column), with ranks 0 to 3 and columns 0 to 7]
![Page 20: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/20.jpg)
Butterfly Network Routing
![Page 21: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/21.jpg)
Evaluating Butterfly Network
- Diameter: log n
- Bisection width: n / 2
- Edges per node: 4
- Constant edge length? No
![Page 22: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/22.jpg)
Hypercube
- Direct topology
- 2 x 2 x … x 2 mesh
- Number of nodes is a power of 2
- Node addresses 0, 1, …, 2^k - 1
- Node i is connected to the k nodes whose addresses differ from i in exactly one bit position
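The addressing rule above maps directly onto bit operations. A minimal sketch, assuming addresses are plain integers in the range 0 to 2^k - 1; the function names are illustrative, not from the slides.

```python
def hypercube_neighbors(i, k):
    """Addresses of the k neighbors of node i in a k-dimensional hypercube.

    Flipping bit `bit` of i (XOR with 1 << bit) yields the neighbor
    across that dimension.
    """
    return [i ^ (1 << bit) for bit in range(k)]

def hop_distance(a, b):
    """Minimum number of hops between nodes a and b.

    Each hop fixes one differing bit, so the distance is the number
    of bit positions in which the two addresses differ.
    """
    return bin(a ^ b).count("1")
```

For example, in a 4-dimensional hypercube node 0101 has neighbors 0100, 0111, 0001, and 1101, and the farthest pair of nodes (such as 0000 and 1111) is 4 hops apart.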
![Page 23: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/23.jpg)
Hypercube Addressing
[Figure: 4-dimensional hypercube with 16 nodes labeled by the 4-bit addresses 0000 to 1111]
![Page 24: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/24.jpg)
Hypercubes Illustrated
![Page 25: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/25.jpg)
Evaluating Hypercube Network
- Diameter: log n
- Bisection width: n / 2
- Edges per node: log n
- Constant edge length? No
![Page 26: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/26.jpg)
Shuffle-exchange
- Direct topology
- Number of nodes is a power of 2
- Nodes have addresses 0, 1, …, 2^k - 1
- Two outgoing links from node i
  - Shuffle link to node LeftCycle(i)
  - Exchange link to node xor(i, 1)
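The two link rules can be sketched with bit operations. This is an illustrative sketch: `left_cycle` is assumed to mean a one-position left rotation of the k-bit address, and the function names are hypothetical.

```python
def left_cycle(i, k):
    """Rotate the k-bit address i left by one position.

    The top bit wraps around to become the lowest bit.
    Assumes 0 <= i < 2**k.
    """
    mask = (1 << k) - 1
    return ((i << 1) & mask) | (i >> (k - 1))

def shuffle_exchange_links(i, k):
    """Destinations of the two outgoing links of node i."""
    return {"shuffle": left_cycle(i, k), "exchange": i ^ 1}
```

With k = 3, node 011 has a shuffle link to 110 and an exchange link to 010; nodes 000 and 111 have shuffle links back to themselves.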
![Page 27: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/27.jpg)
Shuffle-exchange Illustrated
[Figure: shuffle-exchange network with 8 nodes, numbered 0 to 7]
![Page 28: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/28.jpg)
Shuffle-exchange Addressing
[Figure: shuffle-exchange network with 16 nodes labeled by the 4-bit addresses 0000 to 1111]
![Page 29: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/29.jpg)
Evaluating Shuffle-exchange
- Diameter: 2 log n - 1
- Bisection width: n / log n
- Edges per node: 2
- Constant edge length? No
![Page 30: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/30.jpg)
Comparing Networks
- All have logarithmic diameter except 2-D mesh
- Hypertree, butterfly, and hypercube have bisection width n / 2
- All have constant edges per node except hypercube
- Only 2-D mesh keeps edge lengths constant as network size increases
![Page 31: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/31.jpg)
Vector Computers
- Vector computer: instruction set includes operations on vectors as well as scalars
- Two ways to implement vector computers
  - Pipelined vector processor: streams data through pipelined arithmetic units - CRAY-I, II
  - Processor array: many identical, synchronized arithmetic processing elements - MasPar’s MP-I, II
![Page 32: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/32.jpg)
Why Processor Arrays?
- Historically, high cost of a control unit
- Scientific applications have data parallelism
![Page 33: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/33.jpg)
Processor Array
![Page 34: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/34.jpg)
Data/instruction Storage
- Front end computer
  - Program
  - Data manipulated sequentially
- Processor array
  - Data manipulated in parallel
![Page 35: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/35.jpg)
Processor Array Performance
- Performance: work done per time unit
- Performance of processor array
  - Speed of processing elements
  - Utilization of processing elements
![Page 36: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/36.jpg)
Performance Example 1
- 1024 processors
- Each adds a pair of integers in 1 μsec
- What is the performance when adding two 1024-element vectors (one per processor)?

$$\text{Performance} = \frac{1024\ \text{operations}}{1\ \mu\text{sec}} = 1.024 \times 10^{9}\ \text{ops/sec}$$
![Page 37: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/37.jpg)
Performance Example 2
- 512 processors
- Each adds two integers in 1 μsec
- Performance adding two vectors of length 600?
- 600 additions on 512 processors take two steps (512 elements, then the remaining 88), so 2 μsec in total

$$\text{Performance} = \frac{600\ \text{operations}}{2\ \mu\text{sec}} = 3 \times 10^{8}\ \text{ops/sec}$$
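Both examples follow the same arithmetic, which can be sketched as a small function. This is an illustrative model only: it assumes one element per processor per step, that every step takes the same time, and a per-addition time of 1 μsec (consistent with the 1.024 × 10^9 ops/sec result of Example 1). The function name is hypothetical.

```python
import math

def array_performance(num_processors, vector_length, op_time_sec):
    """Operations per second for an element-wise vector addition.

    The array needs ceil(vector_length / num_processors) steps,
    because each processor handles at most one element per step.
    """
    steps = math.ceil(vector_length / num_processors)
    elapsed = steps * op_time_sec
    return vector_length / elapsed

example_1 = array_performance(1024, 1024, 1e-6)  # one step
example_2 = array_performance(512, 600, 1e-6)    # two steps
```

Under these assumptions, Example 1 comes out at about 1.024 × 10^9 ops/sec and Example 2 at about 3 × 10^8 ops/sec. In the second step of Example 2 only 88 of the 512 processors do useful work, which is why performance falls well below the peak.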
![Page 38: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/38.jpg)
2-D Processor Interconnection Network
Each VLSI chip has 16 processing elements
![Page 39: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/39.jpg)
if (COND) then A else B
![Page 40: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/40.jpg)
if (COND) then A else B
![Page 41: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/41.jpg)
if (COND) then A else B
![Page 42: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/42.jpg)
Processor Array Shortcomings
- Not all problems are data-parallel
- Speed drops for conditionally executed code
- Don’t adapt to multiple users well
- Do not scale down well to “starter” systems
- Rely on custom VLSI for processors
- Expense of control units has dropped
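The slowdown for conditional code can be illustrated with a small simulation of `if (COND) then A else B`. This sketch assumes the usual masking model for a processor array, in which a single control unit broadcasts branch A and then branch B, and each processing element sits idle during the branch it did not select. The function name and return values are hypothetical.

```python
def simd_if_else(data, cond, branch_a, branch_b):
    """Simulate `if cond then A else B` on a processor array.

    Each list element stands for the local data of one processing
    element (PE). Returns the results plus the fraction of PEs that
    were active during the A phase and during the B phase.
    """
    # Every PE evaluates the condition on its own data.
    mask = [bool(cond(x)) for x in data]
    results = list(data)

    # Phase A: instructions for A are broadcast; unmasked PEs idle.
    for pe, active in enumerate(mask):
        if active:
            results[pe] = branch_a(data[pe])

    # Phase B: instructions for B are broadcast; the other PEs idle.
    for pe, active in enumerate(mask):
        if not active:
            results[pe] = branch_b(data[pe])

    frac_a = sum(mask) / len(mask)
    return results, frac_a, 1.0 - frac_a

results, frac_a, frac_b = simd_if_else(
    [1, 3, 5, 4],
    cond=lambda x: x % 2 == 0,
    branch_a=lambda x: x * 10,
    branch_b=lambda x: x + 1,
)
```

Here only one of four PEs works during phase A and three of four during phase B, so the array spends two phases doing what amounts to one phase of useful work.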
![Page 43: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/43.jpg)
Multiprocessors
- Multiprocessor: multiple-CPU computer with a shared memory
- Same address on two different CPUs refers to the same memory location
- Avoid three problems of processor arrays
  - Can be built from commodity CPUs
  - Naturally support multiple users
  - Maintain efficiency in conditional code
![Page 44: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/44.jpg)
Centralized Multiprocessor
- Straightforward extension of uniprocessor
- Add CPUs to bus
- All processors share same primary memory
- Memory access time same for all CPUs
  - Uniform memory access (UMA) multiprocessor
  - Symmetrical multiprocessor (SMP) - Sequent Balance Series, SGI Power and Challenge series
![Page 45: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/45.jpg)
Centralized Multiprocessor
![Page 46: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/46.jpg)
Private and Shared Data
- Private data: items used only by a single processor
- Shared data: values used by multiple processors
- In a multiprocessor, processors communicate via shared data values
![Page 47: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/47.jpg)
Problems Associated with Shared Data
- Cache coherence
  - Replicating data across multiple caches reduces contention
  - How to ensure different processors have same value for same address?
- Synchronization
  - Mutual exclusion
  - Barrier
![Page 48: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/48.jpg)
Cache-coherence Problem
[Figure: CPU A and CPU B, each with its own cache, share one memory; memory holds X = 7 and neither cache holds X]
![Page 49: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/49.jpg)
Cache-coherence Problem
[Figure: memory holds X = 7; one CPU has read X, so its cache now holds 7]
![Page 50: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/50.jpg)
Cache-coherence Problem
[Figure: memory holds X = 7; both caches now hold 7]
![Page 51: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/51.jpg)
Cache-coherence Problem
[Figure: after a write, memory holds X = 2; CPU B’s cache holds 2, but CPU A’s cache still holds the stale value 7]
![Page 52: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/52.jpg)
Write Invalidate Protocol
[Figure: memory holds X = 7 and both caches hold 7; a cache control monitor is shown]
![Page 53: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/53.jpg)
Write Invalidate Protocol
[Figure: memory holds X = 7 and both caches hold 7; one CPU broadcasts “Intent to write X”]
![Page 54: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/54.jpg)
Write Invalidate Protocol
[Figure: after the “Intent to write X” broadcast, only the writing CPU’s cache still holds 7; the other copy has been invalidated]
![Page 55: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/55.jpg)
Write Invalidate Protocol
[Figure: after the write, memory holds X = 2 and the writing CPU’s cache holds 2]
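The sequence in the figures can be sketched as a toy simulation. This is a simplified model, not the protocol as implemented in any real machine: the `Bus` and `Cache` classes are hypothetical, invalidation is modeled as a direct loop over the other caches, and the write is assumed to go through to memory immediately.

```python
class Bus:
    """Shared bus connecting all caches to one memory."""
    def __init__(self, memory):
        self.memory = dict(memory)
        self.caches = []

class Cache:
    """Private cache that invalidates on other CPUs' writes."""
    def __init__(self, name, bus):
        self.name = name
        self.lines = {}          # address -> cached value
        self.bus = bus
        bus.caches.append(self)

    def read(self, addr):
        # Miss: fetch the current value from memory.
        if addr not in self.lines:
            self.lines[addr] = self.bus.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        # Broadcast "intent to write": every other cache drops its copy.
        for other in self.bus.caches:
            if other is not self:
                other.lines.pop(addr, None)
        self.lines[addr] = value
        # Assumption: write-through, so memory is updated as well.
        self.bus.memory[addr] = value

bus = Bus({"X": 7})
cpu_a, cpu_b = Cache("A", bus), Cache("B", bus)
cpu_a.read("X")          # A caches 7
cpu_b.read("X")          # B caches 7
cpu_a.write("X", 2)      # B's copy is invalidated before A writes
```

Because CPU B's copy was invalidated, its next read of X misses and fetches the new value 2 instead of returning the stale 7.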
![Page 56: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/56.jpg)
Distributed Multiprocessor
- Distribute primary memory among processors
- Increase aggregate memory bandwidth and lower average memory access time
- Allow greater number of processors
- Also called non-uniform memory access (NUMA) multiprocessor - SGI Origin Series
![Page 57: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/57.jpg)
Distributed Multiprocessor
![Page 58: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/58.jpg)
Cache Coherence
- Some NUMA multiprocessors do not support it in hardware
  - Only instructions, private data in cache
  - Large memory access time variance
- Implementation more difficult
  - No shared memory bus to “snoop”
  - Directory-based protocol needed
![Page 59: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/59.jpg)
Directory-based Protocol
- Distributed directory contains information about cacheable memory blocks
- One directory entry for each cache block
- Each entry has
  - Sharing status
  - Which processors have copies
![Page 60: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/60.jpg)
Sharing Status
- Uncached
  - Block not in any processor’s cache
- Shared
  - Cached by one or more processors
  - Read only
- Exclusive
  - Cached by exactly one processor
  - Processor has written block
  - Copy in memory is obsolete
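A directory entry with these three states can be sketched as a small data structure. This is an illustrative sketch: the class and method names are hypothetical, the set of sharers is kept as a list of 0/1 flags (one per CPU), and the transitions are inferred from the state descriptions above rather than taken from a specific machine. Messages, write-backs, and data transfer are not modeled.

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    UNCACHED = "U"
    SHARED = "S"
    EXCLUSIVE = "E"

@dataclass
class DirectoryEntry:
    """Directory state for one memory block."""
    num_cpus: int
    status: Status = Status.UNCACHED
    sharers: list = field(default_factory=list)  # one 0/1 flag per CPU

    def __post_init__(self):
        if not self.sharers:
            self.sharers = [0] * self.num_cpus

    def read_miss(self, cpu):
        """A CPU asks for a read-only copy of the block."""
        # If the block was EXCLUSIVE, this models the owner writing the
        # block back and keeping a read-only copy (an assumption).
        self.status = Status.SHARED
        self.sharers[cpu] = 1

    def write_miss(self, cpu):
        """A CPU asks for the only writable copy of the block."""
        # All other copies are invalidated, so only `cpu` remains.
        self.sharers = [0] * self.num_cpus
        self.sharers[cpu] = 1
        self.status = Status.EXCLUSIVE

entry = DirectoryEntry(num_cpus=3)   # status U, flags 0 0 0
entry.read_miss(0)                   # status S, flags 1 0 0
entry.read_miss(2)                   # status S, flags 1 0 1
entry.write_miss(1)                  # status E, flags 0 1 0
```

Keeping one flag per CPU makes "which processors have copies" a constant-time lookup, at the cost of directory storage that grows with the number of processors.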
![Page 61: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/61.jpg)
Directory-based Protocol
[Figure: three nodes (CPU 0, CPU 1, CPU 2), each with its own cache, local memory, and directory, joined by an interconnection network]
![Page 62: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/62.jpg)
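Directory-based Protocol
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: U 0 0 0 (sharing status, then bit vector)
Cached copies: none shown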
![Page 63: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/63.jpg)
CPU 0 Reads X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: U 0 0 0
Cached copies: none shown
Message: Read Miss
![Page 64: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/64.jpg)
CPU 0 Reads X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: S 1 0 0
Cached copies: none shown
![Page 65: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/65.jpg)
CPU 0 Reads X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: S 1 0 0
Cached copies: X = 7
![Page 66: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/66.jpg)
CPU 2 Reads X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: S 1 0 0
Cached copies: X = 7
Message: Read Miss
![Page 67: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/67.jpg)
CPU 2 Reads X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: S 1 0 1
Cached copies: X = 7
![Page 68: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/68.jpg)
CPU 2 Reads X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: S 1 0 1
Cached copies: X = 7, X = 7
![Page 69: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/69.jpg)
CPU 0 Writes 6 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: S 1 0 1
Cached copies: X = 7, X = 7
Message: Write Miss
![Page 70: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/70.jpg)
CPU 0 Writes 6 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: S 1 0 1
Cached copies: X = 7, X = 7
Message: Invalidate
![Page 71: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/71.jpg)
CPU 0 Writes 6 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: E 1 0 0
Cached copies: X = 6
![Page 72: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/72.jpg)
CPU 1 Reads X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: E 1 0 0
Cached copies: X = 6
Message: Read Miss
![Page 73: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/73.jpg)
CPU 1 Reads X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 7
Directory entry for X: E 1 0 0
Cached copies: X = 6
Message: Switch to Shared
![Page 74: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/74.jpg)
CPU 1 Reads X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 6
Directory entry for X: E 1 0 0
Cached copies: X = 6
![Page 75: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/75.jpg)
CPU 1 Reads X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 6
Directory entry for X: S 1 1 0
Cached copies: X = 6, X = 6
![Page 76: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/76.jpg)
CPU 2 Writes 5 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 6
Directory entry for X: S 1 1 0
Cached copies: X = 6, X = 6
Message: Write Miss
![Page 77: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/77.jpg)
CPU 2 Writes 5 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 6
Directory entry for X: S 1 1 0
Cached copies: X = 6, X = 6
Message: Invalidate
![Page 78: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/78.jpg)
CPU 2 Writes 5 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 6
Directory entry for X: E 0 0 1
Cached copies: X = 5
![Page 79: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/79.jpg)
CPU 0 Writes 4 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 6
Directory entry for X: E 0 0 1
Cached copies: X = 5
Message: Write Miss
![Page 80: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/80.jpg)
CPU 0 Writes 4 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 6
Directory entry for X: E 1 0 0
Cached copies: X = 5
Message: Take Away
![Page 81: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/81.jpg)
CPU 0 Writes 4 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 5
Directory entry for X: E 0 1 0
Cached copies: X = 5
![Page 82: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/82.jpg)
CPU 0 Writes 4 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 5
Directory entry for X: E 1 0 0
Cached copies: none shown
![Page 83: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/83.jpg)
CPU 0 Writes 4 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 5
Directory entry for X: E 1 0 0
Cached copies: X = 5
![Page 84: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/84.jpg)
CPU 0 Writes 4 to X
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 5
Directory entry for X: E 1 0 0
Cached copies: X = 4
![Page 85: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/85.jpg)
CPU 0 Writes Back X Block
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 5
Directory entry for X: E 1 0 0
Cached copies: X = 4
Message: Data Write Back (X = 4)
![Page 86: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/86.jpg)
CPU 0 Writes Back X Block
[Diagram: CPU 0, CPU 1, CPU 2 with caches, memories, and directories on the interconnection network]
Memory: X = 4
Directory entry for X: U 0 0 0
Cached copies: none shown
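The walkthrough on the preceding slides (CPU 0 reads X, CPU 2 reads X, CPU 0 writes 6, CPU 1 reads X, CPU 2 writes 5, CPU 0 writes 4, CPU 0 writes back) can be replayed with a toy model. This is a minimal sketch for a single block X. The class `DirectorySim` and its method names are assumptions for illustration, and the individual messages (Read Miss, Invalidate, Take Away, and so on) are folded into method calls rather than modeled as network traffic.

```python
class DirectorySim:
    """Toy model of one memory block X under a directory-based protocol."""

    def __init__(self, num_cpus, value):
        self.memory = value              # copy of X in memory
        self.status = "U"                # U = uncached, S = shared, E = exclusive
        self.bits = [0] * num_cpus       # bit i is 1 if CPU i holds a copy
        self.caches = [None] * num_cpus  # cached value of X per CPU, or None

    def _fetch_from_owner(self):
        # The single owner of an exclusive block sends its copy to memory.
        owner = self.bits.index(1)
        self.memory = self.caches[owner]

    def read(self, cpu):
        if self.caches[cpu] is not None:
            return self.caches[cpu]      # cache hit
        if self.status == "E":           # read miss on an exclusive block:
            self._fetch_from_owner()     # owner switches to shared, memory updated
        self.status = "S"
        self.bits[cpu] = 1
        self.caches[cpu] = self.memory
        return self.caches[cpu]

    def write(self, cpu, value):
        if not (self.status == "E" and self.bits[cpu]):
            if self.status == "E":       # take the block away from the old owner
                self._fetch_from_owner()
            for other in range(len(self.bits)):
                if other != cpu:         # invalidate every other copy
                    self.bits[other] = 0
                    self.caches[other] = None
            self.status = "E"
            self.bits[cpu] = 1
        self.caches[cpu] = value         # memory copy is now obsolete

    def write_back(self, cpu):
        if self.status != "E" or not self.bits[cpu]:
            raise ValueError("only the exclusive owner can write back")
        self.memory = self.caches[cpu]
        self.caches[cpu] = None
        self.bits[cpu] = 0
        self.status = "U"

    def state(self):
        # (memory value, directory entry in the slides' layout)
        return self.memory, f"{self.status} " + " ".join(map(str, self.bits))


sim = DirectorySim(num_cpus=3, value=7)
trace = [sim.state()]
for step in (lambda: sim.read(0), lambda: sim.read(2),
             lambda: sim.write(0, 6), lambda: sim.read(1),
             lambda: sim.write(2, 5), lambda: sim.write(0, 4),
             lambda: sim.write_back(0)):
    step()
    trace.append(sim.state())

for memory, entry in trace:
    print(f"memory X = {memory}, directory X: {entry}")
```

Each printed line corresponds to the final frame of one operation in the slides; intermediate animation frames are not reproduced.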
![Page 87: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/87.jpg)
Multicomputer
Distributed memory multiple-CPU computer
Same address on different processors refers to different physical memory locations
Processors interact through message passing
Commercial multicomputers: iPSC I, II, Intel Paragon, Ncube I, II
Commodity clusters – e.g., Cheetah
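The two defining properties above (private address spaces, interaction only through messages) can be illustrated with a small single-process simulation. This is a sketch only: `Node`, `send`, and `receive` are hypothetical names, and a real multicomputer would use a message-passing library over a network rather than in-memory queues.

```python
from collections import deque


class Node:
    """One processor of a multicomputer with its own private memory."""

    def __init__(self, rank):
        self.rank = rank
        self.memory = {}      # local address space, invisible to other nodes
        self.inbox = deque()  # messages delivered by the (simulated) network

    def send(self, dest, payload):
        # The only way to share data: put a message in the destination's inbox.
        dest.inbox.append((self.rank, payload))

    def receive(self):
        # Returns (sender rank, payload) for the oldest pending message.
        return self.inbox.popleft()


node0, node1 = Node(0), Node(1)

# The same address names a different physical location on each node.
node0.memory[0x100] = 42
node1.memory[0x100] = -1

# Node 1 can only learn node 0's value through an explicit message.
node0.send(node1, node0.memory[0x100])
sender, value = node1.receive()
node1.memory[0x200] = value
```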
![Page 88: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/88.jpg)
Asymmetrical Multicomputer
![Page 89: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/89.jpg)
Asymmetrical MC Advantages
Back-end processors dedicated to parallel computations
Easier to understand, model, tune performance
Only a simple back-end operating system needed
Easy for a vendor to create
![Page 90: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/90.jpg)
Asymmetrical MC Disadvantages
Front-end computer is a single point of failure
Single front-end computer limits scalability of system
Primitive operating system in back-end processors makes debugging difficult
Every application requires development of both front-end and back-end program
![Page 91: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/91.jpg)
Symmetrical Multicomputer
![Page 92: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/92.jpg)
Symmetrical MC Advantages
Alleviate performance bottleneck caused by single front-end computer
Better support for debugging
Every processor executes same program
![Page 93: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/93.jpg)
Symmetrical MC Disadvantages
More difficult to maintain illusion of single “parallel computer”
No simple way to balance program development workload among processors
More difficult to achieve high performance when multiple processes on each processor
![Page 94: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/94.jpg)
ParPar Cluster, A Mixed Model
![Page 95: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/95.jpg)
Commodity Cluster
Co-located computers
Dedicated to running parallel jobs
No keyboards or displays
Identical operating system
Identical local disk images
Administered as an entity
![Page 96: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/96.jpg)
Network of Workstations
Dispersed computers
First priority: person at keyboard
Parallel jobs run in background
Different operating systems
Different local images
Checkpointing and restarting important
![Page 97: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/97.jpg)
Flynn’s Taxonomy
Instruction stream
Data stream
Single vs. multiple
Four combinations
  SISD
  SIMD
  MISD
  MIMD
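The four combinations follow mechanically from the two axes. As a small illustration (the function name `flynn_class` and its integer parameters are assumptions for this example, not part of the taxonomy itself):

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Name the Flynn category from the number of instruction and data streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    i = "S" if instruction_streams == 1 else "M"  # single vs. multiple instruction
    d = "S" if data_streams == 1 else "M"         # single vs. multiple data
    return f"{i}I{d}D"


for streams in [(1, 1), (1, 64), (4, 1), (4, 4)]:
    print(streams, flynn_class(*streams))
```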
![Page 98: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/98.jpg)
SISD
Single Instruction, Single Data
Single-CPU systems
Note: co-processors don’t count
  Functional
  I/O
Example: PCs
![Page 99: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/99.jpg)
SIMD
Single Instruction, Multiple Data
Two architectures fit this category
  Pipelined vector processor (e.g., Cray-1)
  Processor array (e.g., Connection Machine CM-1, MASPAR 1000/2000)
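The SIMD idea, one instruction applied in lockstep to many data elements, can be mimicked in plain Python. This is a conceptual sketch only: `simd_step` and its `mask` parameter are hypothetical names, the mask stands in for the way a processor array can leave some elements idle, and ordinary Python executes the elements sequentially rather than in parallel.

```python
def simd_step(op, local_data, mask=None):
    """Apply one operation to every active element; inactive elements are unchanged."""
    if mask is None:
        mask = [True] * len(local_data)  # by default every element participates
    return [op(x) if active else x for x, active in zip(local_data, mask)]


data = [1, 2, 3, 4]
doubled = simd_step(lambda x: x * 2, data)
partially_doubled = simd_step(lambda x: x * 2, data, mask=[True, False, True, False])
```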
![Page 100: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/100.jpg)
MISD
Multiple Instruction, Single Data
Example: systolic array??
![Page 101: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/101.jpg)
MIMD
Multiple Instruction, Multiple Data
Multiple-CPU computers
  Multiprocessors
  Multicomputers
![Page 102: Chapter 2 Parallel Architectures. Outline Interconnection networks Interconnection networks Processor arrays Processor arrays Multiprocessors Multiprocessors](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649eb35503460f94bbaad7/html5/thumbnails/102.jpg)
Summary
Commercial parallel computers appeared in 1980s
Multiple-CPU computers now dominate
Small-scale: Centralized multiprocessors
Large-scale: Distributed memory architectures (multiprocessors or multicomputers)