Niagara: A 32-Way Multithreaded Sparc Processor
Poonacha Kongetira, Kathirgamar Aingaran, Kunle Olukotun
Sun Microsystems
Charalampos S. Nikolaou
charnik@di.uoa.gr
Department of Informatics and Telecommunications
25 June 2008
Goal
  Architectural Goal
  Goal’s characteristics
  Sun’s Approach
Niagara
  Niagara Overview
  Sparc Pipeline
  Thread scheduling
  Integer Register File
  Memory Subsystem
Performance
  Power Consumption
Architectural Goal
Provide:
- high performance for commercial server applications
- low levels of power consumption
Goal’s characteristics
Commercial server applications tend to have:
- Low ILP
  - high cache miss rates (large working sets/poor locality)
  - many unpredictable branches
  - frequently undetectable load-load dependencies
  => memory access time limits performance
- High TLP
  - large numbers of parallel client requests
- High power consumption
  - 400-700 W/ft² for racked server clusters at Google
Sun’s Approach
- UltraSPARC T1 processor (Niagara 1)
- Avoids high-latency communication between the processors of a multiprocessor (SMP) system
- Multicore approach (cores aggregated on a single die)
- Fine-grain multithreading within each core
- Small L1 caches per core
- L2 cache shared by all cores
Niagara Overview
- 8 cores
  - 4 threads per core
  - 1 pipeline (Sparc pipeline) per core
  - 2 L1 caches (instruction/data) per core, shared by the 4 threads
  - thread scheduling per core
- 3-Mbyte L2 cache
  - 4-way banked and pipelined for high bandwidth
  - 12-way set-associative to minimize conflict misses
  - shared by all threads
- crossbar interconnect with up to 200 GB/s of bandwidth
  - connects the Sparc pipes with the L2 cache banks and other shared resources
  - provides a port for accessing the I/O subsystem
  - uses an age-based priority scheme
- 4 channels of DDR2 DRAM
  - maximum bandwidth up to 20 GB/s
  - capacity up to 128 GB
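The headline figures above imply a few derived numbers, such as the 32 hardware threads in the title. The short sketch below only does arithmetic on the values listed on this slide; the per-bank and per-channel figures assume the capacity and bandwidth are split evenly, which the slide does not state.

```python
# Figures taken from the overview slide.
CORES = 8
THREADS_PER_CORE = 4
L2_BYTES = 3 * 1024 * 1024   # 3-Mbyte L2 cache
L2_BANKS = 4
DRAM_CHANNELS = 4
DRAM_PEAK_GBPS = 20          # aggregate peak bandwidth in GB/s

# Derived values. The even split per bank/channel is an assumption.
hardware_threads = CORES * THREADS_PER_CORE
l2_kbytes_per_bank = L2_BYTES // L2_BANKS // 1024
dram_gbps_per_channel = DRAM_PEAK_GBPS / DRAM_CHANNELS

print(hardware_threads)        # 32
print(l2_kbytes_per_bank)      # 768
print(dram_gbps_per_channel)   # 5.0
```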
Niagara Processor
Sparc pipe
Single-issue pipeline with six stages: Fetch, Select, Decode, Execute, Memory, Write Back
Unique resources per thread:
- set of registers
- instruction buffer
- store buffer
Shared resources among threads:
- L1 caches
- translation look-aside buffers (TLBs: ITLB, DTLB)
- ALU, divider, multiplier, shifter
Sparc pipe block diagram
Thread scheduling (1/2)
Policy based on:
- LRU status
- instruction type
- cache misses
- traps
- resource conflicts
- speculative loads
Figure: Thread selection: all threads available
Thread scheduling (2/2)
Figure: Thread selection: only two threads available (0, 1)
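The LRU part of the selection policy can be illustrated with a toy model. This is a minimal sketch, not the hardware's select logic: the class name `ThreadSelector` and its interface are hypothetical, and every other factor in the policy (instruction type, cache misses, traps, resource conflicts, speculative loads) is collapsed into a single `available` set passed in by the caller.

```python
from collections import deque


class ThreadSelector:
    """Toy least-recently-used thread selection for one core (hypothetical model)."""

    def __init__(self, num_threads=4):
        # Front of the deque is the least recently selected thread.
        self.lru = deque(range(num_threads))

    def select(self, available):
        """Return the least recently selected thread in `available`, or None."""
        tid = next((t for t in self.lru if t in available), None)
        if tid is not None:
            # The chosen thread becomes the most recently selected one.
            self.lru.remove(tid)
            self.lru.append(tid)
        return tid


all_ready = ThreadSelector()
print([all_ready.select({0, 1, 2, 3}) for _ in range(6)])  # [0, 1, 2, 3, 0, 1]

two_ready = ThreadSelector()
print([two_ready.select({0, 1}) for _ in range(4)])        # [0, 1, 0, 1]
```

With all four threads available the model degenerates to round-robin, and with only threads 0 and 1 available it alternates between them, which is the behaviour the two figures depict.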
Integer Register File
- One register file per thread
- A register file consists of 8 windows, each of which consists of 8 In, 8 Out and 8 Local registers
- A window corresponds to a procedure call
- The windows of two consecutive procedure calls overlap: the caller's Outs are the callee's Ins
- Only one window is active at a time
- Reads/writes take a single cycle
Figure: Integer register file per thread
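The window overlap is easier to see in a small software model. The sketch below is an illustration under simplifying assumptions, not a description of the Niagara register file circuit: the class `WindowedRegisterFile` and its methods are hypothetical, global registers are left out, the window pointer simply advances on a call, and window overflow/underflow traps are not modelled.

```python
NWINDOWS = 8   # windows per thread, as on the slide
GROUP = 8      # registers per group (Ins, Locals, Outs)


class WindowedRegisterFile:
    """Toy model of overlapping register windows (hypothetical, no traps)."""

    def __init__(self):
        # Each window stores its own Ins and Locals; its Outs alias the
        # Ins of the next window, so no separate storage is needed for them.
        self.phys = [0] * (NWINDOWS * 2 * GROUP)
        self.cwp = 0  # index of the single active window

    def _index(self, group, n):
        if not 0 <= n < GROUP:
            raise IndexError(n)
        base = self.cwp * 2 * GROUP
        if group == "in":
            return base + n
        if group == "local":
            return base + GROUP + n
        if group == "out":
            return ((self.cwp + 1) % NWINDOWS) * 2 * GROUP + n
        raise ValueError(group)

    def read(self, group, n):
        return self.phys[self._index(group, n)]

    def write(self, group, n, value):
        self.phys[self._index(group, n)] = value

    def call(self):
        # Entering a procedure activates the next window.
        self.cwp = (self.cwp + 1) % NWINDOWS

    def ret(self):
        # Returning reactivates the caller's window.
        self.cwp = (self.cwp - 1) % NWINDOWS


rf = WindowedRegisterFile()
rf.write("out", 0, 42)    # caller places an argument in Out 0
rf.call()
print(rf.read("in", 0))   # 42: the callee sees it as In 0
rf.write("in", 0, 7)      # callee places a return value in In 0
rf.ret()
print(rf.read("out", 0))  # 7: the caller reads it back from Out 0
```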
Memory Subsystem
- 16 KB L1 instruction cache
  - 4-way set-associative with a line (block) size of 32 bytes
  - random replacement scheme for area savings
- 8 KB L1 data cache
  - 4-way set-associative with a line size of 16 bytes
  - write-through policy (allocate on loads, no-allocate on stores)
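The number of sets in each L1 cache follows from its size, associativity and line size: sets = size / (ways × line size). The helper below is a small sketch of that arithmetic; the function name `cache_geometry` is hypothetical, and it assumes all three parameters are powers of two, as they are here.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return set count and address-bit split for a set-associative cache.

    Assumes size, ways and line size are powers of two.
    """
    sets = size_bytes // (ways * line_bytes)
    return {
        "sets": sets,
        "offset_bits": line_bytes.bit_length() - 1,  # log2(line size)
        "index_bits": sets.bit_length() - 1,         # log2(number of sets)
    }


# L1 instruction cache: 16 KB, 4-way, 32-byte lines
print(cache_geometry(16 * 1024, 4, 32))  # {'sets': 128, 'offset_bits': 5, 'index_bits': 7}

# L1 data cache: 8 KB, 4-way, 16-byte lines
print(cache_geometry(8 * 1024, 4, 16))   # {'sets': 128, 'offset_bits': 4, 'index_bits': 7}
```

Both caches end up with 128 sets: the instruction cache is twice as large, but its lines are also twice as long.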
L2 cache:
- maintains a sharers list at L1-line granularity
- stores do not update the L1 caches until they have updated the L2 cache
- copy-back policy (write back dirty lines, drop clean lines)
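One way to picture the sharers list is as a small directory kept next to the L2 data. The sketch below is a toy model under stated assumptions, not the actual coherence protocol: the class `L2Directory` and its methods are hypothetical, addresses are treated as already aligned to L1 lines, and it assumes that a store invalidates the copies held by other L1 caches once the L2 has been updated.

```python
from collections import defaultdict


class L2Directory:
    """Toy L2 with a per-line sharers list (hypothetical model)."""

    def __init__(self):
        self.data = {}                   # line address -> value held in the L2
        self.sharers = defaultdict(set)  # line address -> L1 caches holding the line

    def load_miss(self, core, line):
        # Allocate on load: the requesting L1 is recorded as a sharer.
        self.sharers[line].add(core)
        return self.data.get(line, 0)

    def store(self, core, line, value):
        # The L2 is updated first; only then are L1 copies touched.
        self.data[line] = value
        # Assumption: every other sharer receives an invalidate.
        invalidated = self.sharers[line] - {core}
        self.sharers[line] -= invalidated
        # No-allocate on stores: the storing core is not added as a sharer.
        return invalidated


l2 = L2Directory()
l2.load_miss(0, 0x40)
l2.load_miss(1, 0x40)
print(l2.store(1, 0x40, 99))   # {0}: core 0's copy must be invalidated
print(l2.load_miss(0, 0x40))   # 99: core 0 refetches the new value
```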
The L1 caches see miss rates of about 10%. The four threads per core hide the latencies of L1 and L2 misses.
Power Consumption
Niagara’s power dissipation ranges from 60 to 72 W, with a peak of 75 W.
Figure: Power consumption of various processors
References
P. Kongetira, K. Aingaran, K. Olukotun, Niagara: A 32-Way Multithreaded SPARC Processor, IEEE Micro, March-April 2005, pp. 21-29.
Wikipedia, Comparison of power consumption of some nearly modern CPUs, http://en.wikipedia.org/wiki/CPU_power_dissipation#Intel_processors, 2006.
The End
Thank you!