Page 1: Computer Architecture Distributed Memory MIMD Architectures Ola Flygt Växjö University  Ola.Flygt@msi.vxu.se +46 470 70

Computer Architecture

Distributed Memory MIMD Architectures

Ola Flygt

Växjö University

http://w3.msi.vxu.se/users/ofl/

Ola.Flygt@msi.vxu.se

+46 470 70 86 49


Outline

Definition and design space

Computational model

Granularity

Node organization

Interconnection network: topology, switching, routing


Multicomputers

Distributed Memory MIMD systems are often called Multicomputers

They may or may not have a virtually shared address space

They are typically more loosely coupled than Shared Memory MIMDs


Design space of Multicomputers


Computational Model

In theory any computational model may be used

In practice, a few have been implemented:

Conventional + communication

CSP (Communicating Sequential Processes)

Dataflow or actor-based object-oriented

Today the conventional model is used almost exclusively


Granularity

As before, granularity is a parameter that applies both to node size and to how finely the problem is partitioned

The two are of course interrelated, and as before there are three options: fine-grained, medium-grained and coarse-grained


Generic Node Architecture


Generic Organization Model Of The Message-Passing Multicomputers

Figures: 1st generation; decentralized 2nd generation; centralized 2nd generation; 3rd generation

Classification Of Multicomputers


Main Network Topologies


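Static Network Parameters

| Topology | Node degree | Diameter | Bisection width | Arc connectivity | Cost (number of links) |
|---|---|---|---|---|---|
| Linear array | 1 or 2 | N-1 | 1 | 1 | N-1 |
| Ring | 2 | ⌊N/2⌋ | 2 | 2 | N |
| Star | 1 or N-1 | 2 | 1 | 1 | N-1 |
| Binary tree | 1, 2 or 3 | 2 log2((N+1)/2) | 1 | 1 | N-1 |
| 2-D mesh | 2, 3 or 4 | 2(√N - 1) | √N | 2 | 2(N - √N) |
| 2-D wraparound mesh | 4 | 2⌊√N/2⌋ | 2√N | 4 | 2N |
| 3-D cube | 3, 4, 5 or 6 | 3(N^(1/3) - 1) | N^(2/3) | 3 | 3(N - N^(2/3)) |
| Hypercube | log2 N | log2 N | N/2 | log2 N | (N log2 N)/2 |
| Completely connected | N-1 | 1 | N²/4 | N-1 | N(N-1)/2 |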
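Two of the table rows can be checked by brute force. The sketch below builds a hypercube and a 2-D mesh as explicit graphs and measures diameter and link count; the function names and the choice of N = 16 are illustrative assumptions, not part of the slides.

```python
from collections import deque
from itertools import product

def hypercube_graph(d):
    """Adjacency lists of a d-dimensional hypercube (N = 2**d nodes)."""
    return {v: [v ^ (1 << b) for b in range(d)] for v in range(2 ** d)}

def mesh_graph(k):
    """Adjacency lists of a k x k mesh without wraparound (N = k*k nodes)."""
    adj = {}
    for x, y in product(range(k), repeat=2):
        steps = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
        adj[(x, y)] = [(a, b) for a, b in steps if 0 <= a < k and 0 <= b < k]
    return adj

def diameter(adj):
    """Longest shortest path, found by a breadth-first search from every node."""
    worst = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst

def cost(adj):
    """Number of links: every link appears in two adjacency lists."""
    return sum(len(neighbours) for neighbours in adj.values()) // 2

cube = hypercube_graph(4)   # N = 16, log2 N = 4
mesh = mesh_graph(4)        # N = 16, sqrt(N) = 4
print(diameter(cube), cost(cube))   # table: log2 N = 4, (N log2 N)/2 = 32
print(diameter(mesh), cost(mesh))   # table: 2(sqrt(N) - 1) = 6, 2(N - sqrt(N)) = 24
```

The hypercube reaches the same node count with a shorter diameter but more links, which is the trade-off the table summarizes.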


Design Space of Switching techniques


Packet Switching

Figures: arrangement; latency

Every message is divided into packets, which are sent independently through the communication network from the source node to the destination node

The packets are transmitted in a store-and-forward fashion (every byte of a message has to be stored at each node along the route before it is forwarded to the next hop)


A packet consists of a header and the data. The header contains the routing information on which the switching unit bases its decision about where to forward the packet

When a packet arrives at an intermediate node, the whole packet is stored in a packet buffer

Main drawback: latency is proportional to the length of the message path. (This is why the diameter was the most important parameter in first-generation multicomputers, and why the hypercube was so popular.)
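The proportionality can be sketched with a first-order model. This is an assumption for illustration, not a formula taken from the slides: the parameter names are hypothetical, and queueing and routing-decision delays are ignored.

```python
def store_and_forward_latency(packet_bits, bandwidth_bps, hops):
    """Every hop receives the whole packet before sending it on, so the
    per-link transfer time packet_bits / bandwidth_bps is paid once per hop."""
    return hops * packet_bits / bandwidth_bps

# Latency grows linearly with the number of links on the path.
for hops in (1, 4, 16):
    print(hops, store_and_forward_latency(8_000, 1e9, hops))
```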


Circuit Switching

Figures: arrangement; latency

In the first phase of the communication, a path (a circuit) is built up between the source and the destination by sending a special short message (called a probe)

The probe plays a role similar to that of the packet header in packet switching

The circuit is held until the entire message has been transmitted

During the communication the channels that make up the circuit are reserved exclusively; no other message can be transmitted over them


In the last phase the circuit is torn down, either by the tail of the transmitted message or by an acknowledgement message returned by the destination node

If a required channel is already held by another circuit during the establishment phase, the partially built circuit may be torn down

Circuit switching needs no packetizing, whatever the message size

There is no need for buffering

The most important benefit: if the length of the probe, P, is much smaller than the length of the message, M, the latency becomes practically independent of the communication distance
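A rough model makes the P << M argument concrete. It assumes, as a simplification not stated on the slides, that the probe pays one transfer time per link while the message then streams once over the reserved circuit; the parameter names are hypothetical.

```python
def circuit_switching_latency(probe_bits, message_bits, bandwidth_bps, hops):
    """Set-up time grows with distance, transfer time does not."""
    setup = hops * probe_bits / bandwidth_bps    # probe advances link by link
    transfer = message_bits / bandwidth_bps      # message uses the reserved circuit
    return setup + transfer

near = circuit_switching_latency(32, 1_000_000, 1e9, hops=1)
far = circuit_switching_latency(32, 1_000_000, 1e9, hops=16)
print(far / near)   # stays close to 1 because the probe is tiny compared with the message
```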


Virtual Cut-Through Switching

Figures: arrangement; latency

An attempt to combine the benefits of packet switching and circuit switching

The message is divided into small units called flow control digits, or flits

As long as the required channels are free, the message is forwarded flit by flit from node to node in a pipelined fashion

If the required channel is busy, flits are buffered at intermediate nodes


If the buffers are large enough, the entire message is buffered at the blocked intermediate node, resulting in behavior similar to packet switching

If the buffers are not large enough, the message will be buffered across several nodes, holding the links between them

Main benefit: if HF (the length of the header flit) << M (the length of the message), the latency becomes practically independent of the distance, D
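The pipelining effect can be seen in a small cycle-by-cycle simulation of the contention-free case. The model is an assumption made for illustration: every link moves one flit per cycle and each intermediate node forwards one flit at a time.

```python
def pipelined_delivery_cycles(num_flits, hops):
    """Cycles until the last flit reaches the destination when all channels are free."""
    position = [0] * num_flits          # links crossed so far by each flit
    cycle = 0
    while position[-1] < hops:
        cycle += 1
        for i in range(num_flits):      # head flit first
            if position[i] == hops:
                continue                # already delivered
            ahead = position[i - 1] if i > 0 else hops
            # advance only if the flit ahead has been delivered or has moved on
            if ahead == hops or ahead > position[i] + 1:
                position[i] += 1
    return cycle

print(pipelined_delivery_cycles(8, 3))    # 10 cycles: hops + num_flits - 1
print(pipelined_delivery_cycles(8, 12))   # 19 cycles: distance adds, it does not multiply
```

Storing and forwarding the same 8 flits as one packet over 3 links would take 8 * 3 = 24 cycles under the same assumptions.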


Worm-Hole Switching

Figures: arrangement; latency

Worm-hole switching does not create a circuit between sender and receiver. Instead, control information at the head of the message establishes a path through the network, and all subsequent data for that message are forwarded along that path

The message is broken into very small pieces (flits) and the network is pipelined. This is referred to as worm-hole routing because of the way a message worms its way through the system


A special case of virtual cut-through in which the buffers at the intermediate nodes have the size of a single flit

There is no start-up overhead proportional to distance for the message as a whole: because of pipelining, only the leading flit pays the per-hop delay. If the flit size is small relative to the message length M, the latency T is similar to that of circuit switching in that it depends only weakly on the distance D

The primary advantage of such a network is that links need not be blocked for the entire duration of a message and, once the virtual channel concept is introduced, several messages can be multiplexed over an individual link


Routing protocols

Location of the routing "intelligence":

Source-based routing: routers "eat" the head of a packet; larger packets; no fault tolerance

Distributed (local) routing: more complex routers; smaller packets
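How a router "eats" the head of a source-routed packet can be sketched as follows. The packet layout (a list of output-port names followed by the payload) is a hypothetical one chosen for illustration.

```python
def forward_source_routed(packet):
    """One router step: consume the first header entry and return the
    output port it names together with the shortened packet."""
    header, payload = packet
    port, rest = header[0], header[1:]
    return port, (rest, payload)

# The source has precomputed the whole route and placed it in the header.
packet = (["east", "east", "north", "local"], b"data")
trace = []
while packet[0]:
    port, packet = forward_source_routed(packet)
    trace.append(port)

print(trace)    # ['east', 'east', 'north', 'local']
print(packet)   # ([], b'data')
```

The header grows with the path length, which is why source routing means larger packets, and a fixed precomputed route cannot detour around a failed link.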


Classification of Routing protocols


Classification of Adaptive Routing protocols

Routing protocols: Terminology

Minimal = only paths as short as the shortest path are selected

Profitable = only channels known to move the message closer to its destination are selected

Misrouting = all channels may be used

Progressive = never backtracks, even if blocked

Partially adaptive = not all channels may be selected as the next step
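As a sketch of the "profitable" notion, the function below lists the profitable channels for a message in a 2-D mesh. The coordinate convention (x grows to the east, y to the north) and the function name are assumptions for illustration.

```python
def profitable_channels(current, destination):
    """Directions that bring a message closer to its destination in a 2-D mesh."""
    (x, y), (dx, dy) = current, destination
    channels = []
    if dx > x:
        channels.append("east")
    if dx < x:
        channels.append("west")
    if dy > y:
        channels.append("north")
    if dy < y:
        channels.append("south")
    return channels

print(profitable_channels((1, 1), (3, 0)))   # ['east', 'south']
```

A protocol that only ever picks from this list follows a minimal path; a misrouting protocol may also pick one of the remaining directions.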


Routing protocols

Routing may cause:

Deadlocks: buffer deadlock (store-and-forward switching) or channel deadlock (wormhole routing)

Livelocks: packets are forwarded in a loop in the network


Routing deadlocks


Routing protocols: Deterministic routing

X-Y routing: walk one dimension at a time
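A minimal sketch of X-Y routing on a 2-D mesh, assuming nodes are addressed by (x, y) coordinates: the message first corrects its x coordinate completely and only then its y coordinate.

```python
def xy_route(source, destination):
    """Nodes visited by X-Y routing, from source to destination inclusive."""
    x, y = source
    dx, dy = destination
    path = [(x, y)]
    while x != dx:                  # first walk along the x dimension
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:                  # then walk along the y dimension
        y += 1 if dy > y else -1
        path.append((x, y))
    return path

print(xy_route((0, 0), (2, 1)))   # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

The route depends only on the two addresses, never on network load, which is what makes the protocol deterministic.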

Interval labeling: distributed routing with simple routing tables in the nodes
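The idea can be sketched on the simplest possible topology. The linear array, the port names and the table layout below are assumptions for illustration: each output link of a node is labelled with the interval of destination addresses reached through it, and routing is a lookup of the interval that contains the destination.

```python
def interval_tables(n):
    """Routing tables for an n-node linear array with addresses 0..n-1."""
    tables = {}
    for node in range(n):
        table = []
        if node > 0:
            table.append(((0, node - 1), "left"))
        if node < n - 1:
            table.append(((node + 1, n - 1), "right"))
        tables[node] = table
    return tables

def route(tables, source, destination):
    """Follow the interval labels hop by hop and return the nodes visited."""
    path, node = [source], source
    while node != destination:
        port = next(p for (low, high), p in tables[node] if low <= destination <= high)
        node += 1 if port == "right" else -1
        path.append(node)
    return path

tables = interval_tables(5)
print(tables[2])             # [((0, 1), 'left'), ((3, 4), 'right')]
print(route(tables, 4, 1))   # [4, 3, 2, 1]
```

Each table has one entry per output link rather than one per destination, which keeps the routers simple.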


Routing protocols: Adaptive routing

The next channel is chosen according to the current blocking situation

Can potentially give better network utilization


Deadlock avoidance

Deterministic routing (e.g. X-Y)

Partially adaptive routing: for example, west-first routing for 2-D meshes routes a packet first to the west (if required), then routes the packet adaptively to the north, south or east
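A sketch of the choice a router makes under west-first routing. It assumes the minimal variant (only directions that reduce the distance are offered) and the convention that x grows to the east and y to the north.

```python
def west_first_choices(current, destination):
    """Output directions permitted by minimal west-first routing in a 2-D mesh."""
    (x, y), (dx, dy) = current, destination
    if dx < x:
        return ["west"]             # all westward hops come first, with no alternative
    choices = []                    # afterwards the router may choose adaptively
    if dx > x:
        choices.append("east")
    if dy > y:
        choices.append("north")
    if dy < y:
        choices.append("south")
    return choices

print(west_first_choices((3, 1), (1, 3)))   # ['west']
print(west_first_choices((1, 1), (3, 3)))   # ['east', 'north']
```

Because a packet never turns to the west after moving in another direction, the turns needed to close a cycle of waiting packets are ruled out.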


West-first routing example


Deadlock avoidance, cont.

Virtual channels

Virtual channels are logical links between two nodes, each using its own buffers and multiplexed over a single physical channel

Virtual channels “break” dependency cycles
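One well-known way to see the cycle being broken is a unidirectional ring. The sketch below is an illustrative assumption, not the example from the slides: it builds the channel dependency graph with and without a second virtual channel that packets switch to after crossing the wraparound link, and checks each graph for cycles.

```python
def ring_dependencies(n, use_virtual_channels):
    """Channel dependency graph of a unidirectional n-node ring.
    A channel is named (node it leaves, virtual channel number)."""
    deps = {}
    for src in range(n):
        for length in range(1, n):              # every destination, 1..n-1 hops away
            vc, prev = 0, None
            for hop in range(length):
                node = (src + hop) % n
                channel = (node, vc)
                deps.setdefault(channel, set())
                if prev is not None:
                    deps[prev].add(channel)     # packet holds prev while requesting channel
                prev = channel
                # after crossing the wraparound link, switch to virtual channel 1
                if use_virtual_channels and node == n - 1:
                    vc = 1
    return deps

def has_cycle(deps):
    """Depth-first search for a cycle in the dependency graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {c: WHITE for c in deps}

    def visit(c):
        colour[c] = GREY
        for nxt in deps[c]:
            if colour[nxt] == GREY or (colour[nxt] == WHITE and visit(nxt)):
                return True
        colour[c] = BLACK
        return False

    return any(colour[c] == WHITE and visit(c) for c in deps)

print(has_cycle(ring_dependencies(4, False)))   # True: deadlock is possible
print(has_cycle(ring_dependencies(4, True)))    # False: the cycle is broken
```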


Virtual channels

Advantages: increased network throughput; deadlock avoidance; virtual topologies; dedicated channels (e.g. debugging, monitoring)

Disadvantages: hardware cost; higher latency; incoming packets may be out of order


Complex communication support

Common communication patterns:

Partner communication (unicast)

Multicast (one-to-many)

Broadcast (one-to-all)

Exchange (many-to-many or all-to-all)

Routers may feature hardware support for these communication patterns


An example: multicast support

Software unicasts are often implemented using a tree communication structure

"Replication" and "Routing support" require the list of destinations in the packet header

Replication is often found in wormhole networks
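A tree-structured software multicast can be sketched as rounds of unicasts in which every node that already holds the message forwards it to one node that does not. The scheduling function below is a hypothetical illustration that ignores the network topology.

```python
def tree_multicast_rounds(source, destinations):
    """Return the unicasts (sender, receiver) performed in each round."""
    holders, waiting, rounds = [source], list(destinations), []
    while waiting:
        sends = []
        for sender in list(holders):    # snapshot: new holders start sending next round
            if not waiting:
                break
            receiver = waiting.pop(0)
            sends.append((sender, receiver))
            holders.append(receiver)
        rounds.append(sends)
    return rounds

for step in tree_multicast_rounds(0, [1, 2, 3, 4, 5, 6, 7]):
    print(step)
# [(0, 1)]
# [(0, 2), (1, 3)]
# [(0, 4), (1, 5), (2, 6), (3, 7)]
```

The number of holders doubles every round, so seven destinations are reached in three rounds instead of seven sequential unicasts from the source.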


Multicomputers today

The Distributed Memory MIMD idea is the basis for most parallel (supercomputer) systems today

The idea has evolved into:

Cluster computing, using a LAN as the interconnection network

Grid computing, using more loosely connected nodes (WAN, different owners)
