The Alpha 21364 Network Architecture

By Shubhendu S. Mukherjee, Peter Bannon, Steven Lang, Aaron Spink, and David Webb
Compaq Computer Corporation
Presented by Luis Alfredo Campos

Post on 30-Dec-2015



Alpha 21364 Goals

Support communication-intensive server applications
– High performance technical computing
– Database servers
– Web servers
– Telecommunication applications

Achieve:
– Extremely low latency
– Enormous bandwidth
– Support directory cache coherence

Improve:
– Reliability
– Availability

Overview

Alpha 21264 core with enhancements
Tightly-Coupled multiprocessor network
– Connects up to 128 processors
– Two-Dimensional torus network

Integrated L2 Cache
Integrated memory controller
Router
– Directory-Based CC (cache coherence)
– Separate Virtual Channels
– Packet Classes

Network Packet Classes

Seven Packet Classes
– Request (3 Flits)
– Forward (3 Flits)
– Block Response (18 or 19 Flits)
– Non-Block Response (2 or 3 Flits)
– Write I/O (19 Flits)
– Read I/O (3 Flits)
– Special (1 or 3 Flits)

Flits Are 32 Bits Data Plus 7 Bits ECC
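The flit counts above translate directly into packet sizes on the wire. The following is a minimal sketch of that arithmetic, using only the numbers on this slide; the names `FLITS_PER_CLASS` and `packet_bits` are made up for illustration and are not from the paper.

```python
from typing import Tuple

# Flit counts per packet class, as listed on the slide.
# Each entry holds the (smaller, larger) allowed size; classes with a
# single size repeat it. "1 or 3" means two allowed sizes, not a range.
FLITS_PER_CLASS = {
    "Request": (3, 3),
    "Forward": (3, 3),
    "Block Response": (18, 19),
    "Non-Block Response": (2, 3),
    "Write I/O": (19, 19),
    "Read I/O": (3, 3),
    "Special": (1, 3),
}

DATA_BITS_PER_FLIT = 32
ECC_BITS_PER_FLIT = 7
FLIT_BITS = DATA_BITS_PER_FLIT + ECC_BITS_PER_FLIT  # 39 bits per flit


def packet_bits(packet_class: str) -> Tuple[int, int]:
    """Return (smaller, larger) packet size in bits, ECC included."""
    small, large = FLITS_PER_CLASS[packet_class]
    return small * FLIT_BITS, large * FLIT_BITS


if __name__ == "__main__":
    for name in FLITS_PER_CLASS:
        print(name, packet_bits(name))
```

For example, a 19-flit Block Response occupies 19 × 39 = 741 bits including ECC.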

Network Architecture

Two-dimensional torus
– Limited Support for Imperfect Tori

Allows Fault Remapping

Virtual Cut-Through Routing
– Buffer space for 316 packets
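A two-dimensional torus is a grid whose rows and columns wrap around, so every node has exactly four neighbors. The sketch below illustrates that wraparound with modular arithmetic. The 16 × 8 layout in the usage note is an assumption chosen only because it gives 128 nodes; the slides do not state the grid shape, and the direction names are a labeling choice.

```python
def torus_neighbors(x, y, width, height):
    """Return the four neighbors of node (x, y) on a width x height torus.

    Direction names are illustrative; "north" is taken as decreasing y.
    """
    return {
        "east": ((x + 1) % width, y),
        "west": ((x - 1) % width, y),
        "north": (x, (y - 1) % height),
        "south": (x, (y + 1) % height),
    }


def ring_distance(a, b, size):
    """Shortest hop count between positions a and b on a ring of `size` nodes."""
    d = abs(a - b)
    return min(d, size - d)


def torus_distance(src, dst, width, height):
    """Minimum hop count between two nodes, using wraparound links."""
    return (ring_distance(src[0], dst[0], width)
            + ring_distance(src[1], dst[1], height))
```

On a hypothetical 16 × 8 torus, node (0, 0) has (15, 0) as its west neighbor, and the corner-to-corner trip from (0, 0) to (15, 7) takes 2 hops rather than 22.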

Adaptive Routing

Four Rectangles With Current and Destination At Diagonals
Packets route within the minimum rectangle
Maximize the bandwidth between source and destination
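One way to read the minimum-rectangle rule: at each hop, a packet may only move in a direction that brings it closer to the destination along the shorter way around each ring. The sketch below computes those candidate next hops. It is an illustration under that reading, not the router's actual logic; the function names and the tie-break for equally long paths are assumptions.

```python
def ring_step(cur, dst, size):
    """Return +1, -1, or 0: the direction of the shorter way around a ring.

    When both ways are equally long, +1 is chosen (an arbitrary tie-break).
    """
    if cur == dst:
        return 0
    forward = (dst - cur) % size   # hops when moving in the +1 direction
    backward = (cur - dst) % size  # hops when moving in the -1 direction
    return 1 if forward <= backward else -1


def productive_hops(cur, dst, width, height):
    """List the next-hop nodes that stay inside the minimum rectangle."""
    (cx, cy), (dx, dy) = cur, dst
    hops = []
    sx = ring_step(cx, dx, width)
    sy = ring_step(cy, dy, height)
    if sx:
        hops.append(((cx + sx) % width, cy))
    if sy:
        hops.append((cx, (cy + sy) % height))
    return hops
```

When both dimensions still need progress there are two candidates, which is what lets an adaptive router pick the less congested one. Once the packet is aligned with the destination in one dimension, only one candidate remains.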

Avoiding Deadlocks in Adaptive Routing

“Adaptive routing will not deadlock a network as long as packets can drain via a deadlock-free path”

19 Virtual Channels
– 3 sets of virtual channels per packet class, except for the Special class (only one channel)
  – Adaptive, VC0, and VC1
– Adaptive Is First Choice
– VC0 and VC1 combination creates deadlock-free network
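The count of 19 follows from the slide's own numbers: six classes with three channels each, plus one channel for the Special class. The sketch below enumerates them and shows the "adaptive first, otherwise fall back" preference. How the router chooses between VC0 and VC1 is not described on the slide, so that choice is passed in as a parameter; the label `"single"` and both function names are hypothetical.

```python
PACKET_CLASSES = [
    "Request", "Forward", "Block Response", "Non-Block Response",
    "Write I/O", "Read I/O", "Special",
]
CHANNEL_SETS = ["Adaptive", "VC0", "VC1"]


def virtual_channels():
    """Enumerate (packet class, channel) pairs as described on the slide."""
    channels = []
    for cls in PACKET_CLASSES:
        if cls == "Special":
            # The Special class has only one channel; "single" is a made-up label.
            channels.append((cls, "single"))
        else:
            channels.extend((cls, ch) for ch in CHANNEL_SETS)
    return channels


def pick_channel(adaptive_available, escape_channel):
    """Prefer the adaptive channel; otherwise use the deadlock-free one.

    `escape_channel` is "VC0" or "VC1"; selecting between them is left
    to the caller because the slide does not describe that rule.
    """
    return "Adaptive" if adaptive_available else escape_channel
```

Here `len(virtual_channels())` gives 6 × 3 + 1 = 19.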

Router Architecture

9 pipeline types
– Input and Output: Local, Interprocessor, and I/O

Pin to pin latency of 13 cycles
– Running at 1.2 GHz

Network Links run 33% slower
– Running at 0.8 GHz
– Synchronous with outgoing links
– Asynchronous with incoming links
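The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes the 13 cycles are counted at the 1.2 GHz router clock, and it reads "9 pipeline types" as one type per pairing of input port kind and output port kind; that reading is an interpretation of the slide, and all names are illustrative.

```python
from itertools import product

CORE_CLOCK_GHZ = 1.2    # router clock, from the slide
LINK_CLOCK_GHZ = 0.8    # network link clock, from the slide
PIN_TO_PIN_CYCLES = 13  # assumed to be router-clock cycles

# Cycles divided by GHz gives nanoseconds.
pin_to_pin_ns = PIN_TO_PIN_CYCLES / CORE_CLOCK_GHZ

# Fraction by which the links run slower than the router.
link_slowdown = 1 - LINK_CLOCK_GHZ / CORE_CLOCK_GHZ

# One pipeline type per (input kind, output kind) pairing.
PORT_KINDS = ["Local", "Interprocessor", "I/O"]
pipeline_types = list(product(PORT_KINDS, PORT_KINDS))

if __name__ == "__main__":
    print(f"pin-to-pin latency: {pin_to_pin_ns:.2f} ns")
    print(f"link slowdown: {link_slowdown:.0%}")
    print(f"pipeline types: {len(pipeline_types)}")
```

Under these assumptions the pin-to-pin latency comes to roughly 10.8 ns, and 0.8 GHz is one third slower than 1.2 GHz, matching the 33% on the slide.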

Arbitration

Needs to avoid central bottleneck
– 16 local arbiters
– 7 global arbiters

Least Recently Selected (LRS) Scheme
– Local Arbiters
  – Classes
  – Virtual Channel
– Global Arbiters
  – Input ports

Rotary Rule mode
– Priority to oldest packets

Coherence Dependence Priority (CDP) Rule mode
– Priority depending on class ordering
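A Least Recently Selected policy can be sketched as a queue of requesters ordered by how long ago each was last granted. The class below is a generic software illustration of that idea, not the 21364's hardware arbiter; the class name, method name, and port names in the usage note are all assumptions.

```python
class LRSArbiter:
    """Grant the requester that was selected least recently."""

    def __init__(self, requesters):
        # Ordered from least recently selected (front) to most recent (back).
        self._order = list(requesters)

    def grant(self, requesting):
        """Return the winning requester among `requesting`, or None if empty."""
        for candidate in self._order:
            if candidate in requesting:
                # The winner moves to the back: it is now the most recent.
                self._order.remove(candidate)
                self._order.append(candidate)
                return candidate
        return None
```

With hypothetical input ports `["north", "south", "east", "west"]` and repeated requests from south and east only, the grants alternate between south and east, so neither port can starve the other.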

Questions

How Is the 1.2 GHz Internal/800 MHz External Clock OK?

Why 2-D Torus?
– What Are the Limitations Imposed?