Jaguar Microarchitecture
Alex Avery, Cody Smith
Agenda
● AMD Processors
● Jaguar Overview
● Example Hardware
● Core Pipeline
● Instruction Fetch and Cache
● Instruction Decoding
● Scheduling
● Integer & FP Execution
● Memory
● Cache
What is a Microarchitecture?
Microarchitecture is the computer organization
Microarchitecture + Instruction Set Architecture = Computer Architecture
A microarchitecture describes the electrical circuitry of the device; it is how the ISA is implemented.
AMD Processors
● Bobcat (2011)
● Piledriver (2012)
● Jaguar (2013)
● Steamroller (2014)
● Puma (2014)
● Excavator (2015)
Jaguar Overview
● Targets 2-25 W devices
● Low cost
● 28 nm technology
● Up to 4 cores
● Split L1 cache: 32 KiB instruction and 32 KiB data per core
● Unified L2 cache: 1-2 MiB, 16-way
● Out-of-order and speculative execution
● Integrated memory controller
● Two-way integer execution
● Two-way 128-bit floating-point execution
Example Hardware
● Gaming Consoles
  ○ Xbox One
  ○ PS4
● Desktop Processors
  ○ Athlon 5350
  ○ Sempron 3850
● Laptops/Mini PCs
  ○ A6-5200
  ○ E2-3000
● Tablets
  ○ A6-1450
● Embedded Processors
  ○ GX-420CA
Jaguar Core Pipeline
Instruction Fetch and Cache
● 6 stages
● 32 KB, 2-way set-associative L1 cache
● Pseudo least recently used (LRU) replacement algorithm
● 32 B instruction fetch window
● Branch predictors exploit characteristics of both direct and indirect branches as well as branch density
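The cache parameters above determine how an address is split into offset and index bits. A minimal sketch of that arithmetic follows; the 64-byte line size is an assumption (the slides do not state it), and `cache_geometry` is a hypothetical helper name used only for illustration.

```python
def cache_geometry(size_bytes, ways, line_bytes=64):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    line_bytes=64 is an assumed line size, not a figure from the slides.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1  # log2 of the line size
    index_bits = sets.bit_length() - 1         # log2 of the number of sets
    return sets, offset_bits, index_bits


# 32 KB, 2-way L1 instruction cache from the slide.
sets, offset_bits, index_bits = cache_geometry(32 * 1024, ways=2)
print(sets, offset_bits, index_bits)  # 256 6 8
```

Under the same assumption, a 32 B fetch window covers half of one cache line.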
Instruction Decoding
● Can decode two x86 instructions per cycle
● Variable-length x86 instructions are decoded into complex micro-operations (COPs)
● Can handle 128-bit vector instructions as well as x86 Advanced Vector Extensions (AVX)
Scheduling
● Out-of-order execution
● After instructions are decoded into COPs, they are dispatched
● Each COP allocates a Retire Control Unit (RCU) entry
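One way to picture the role of a retire structure like the RCU is a queue that is filled in program order at dispatch and drained in program order at retirement, even though COPs finish out of order. The sketch below is a generic model of that idea, not Jaguar's actual design; the class name `RetireQueue`, its methods, and the default capacity are all hypothetical.

```python
from collections import deque


class RetireQueue:
    """Toy in-order retire queue (generic model, capacity is an assumption)."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        self.entries = deque()  # program order, oldest entry on the left

    def dispatch(self, cop):
        """Allocate an entry for a COP; return False if the queue is full."""
        if len(self.entries) >= self.capacity:
            return False  # dispatch would stall
        self.entries.append({"cop": cop, "done": False})
        return True

    def complete(self, cop):
        """Mark a COP as executed; completion may happen in any order."""
        for entry in self.entries:
            if entry["cop"] == cop:
                entry["done"] = True
                return
        raise KeyError(cop)

    def retire(self):
        """Retire finished COPs from the head only, preserving program order."""
        retired = []
        while self.entries and self.entries[0]["done"]:
            retired.append(self.entries.popleft()["cop"])
        return retired


rcu = RetireQueue(capacity=4)
for cop in ("a", "b", "c"):
    rcu.dispatch(cop)
rcu.complete("c")      # youngest COP finishes first
print(rcu.retire())    # [] because "a" is still pending
rcu.complete("a")
print(rcu.retire())    # ['a']
```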
Integer Execution
● Separate integer and floating-point units
● 2 symmetrical integer pipelines
● Integer addition/subtraction takes 3 cycles
  ○ Read operands
  ○ Execute
  ○ Write back
● 6-cycle multiplication
● Separate hardware divider
Floating Point Execution
● Designed for 128-bit wide execution
● Targets SSE and AVX vector extensions
● 2 asymmetrical FP pipelines
● 4-7 cycles per addition/subtraction
  ○ Read operands (2 cycles)
  ○ Execute (1-4 cycles)
  ○ Write back (1 cycle)
● Co-processor architecture
  ○ Dedicated decode, rename, out-of-order scheduler and retire queue
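The 4-7 cycle range for FP addition/subtraction is simply the sum of the three stage counts listed above. A short sketch of that arithmetic, with `fp_add_latency` as a hypothetical helper name:

```python
FP_READ_CYCLES = 2       # read operands, from the slide
FP_WRITEBACK_CYCLES = 1  # write back, from the slide


def fp_add_latency(execute_cycles):
    """Total cycles for one FP add/subtract given its execute-stage length."""
    if not 1 <= execute_cycles <= 4:
        raise ValueError("the slide gives 1-4 execute cycles")
    return FP_READ_CYCLES + execute_cycles + FP_WRITEBACK_CYCLES


print([fp_add_latency(e) for e in range(1, 5)])  # [4, 5, 6, 7]
```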
Memory
● Separate load and store pipelines
● Aggressive re-ordering
  ○ Loads can occur out-of-order
  ○ Loads can be moved ahead of stores before the target address is resolved
● Memory Ordering Queue and Store Queue handle memory ordering
L1 Data Cache
● 32 KB
● 8-way associative
● Parity-protected writeback cache
● Pseudo-LRU replacement algorithm
● Can handle a 128-bit read and a 128-bit write each cycle
● Average latency of 3 cycles for an L1 hit
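Pseudo-LRU approximates true LRU with far less state per set. The slides do not say which variant Jaguar uses; the sketch below assumes the common binary-tree scheme, which needs only 7 bits for an 8-way set. The class `TreePLRU` and its method names are hypothetical.

```python
class TreePLRU:
    """Tree pseudo-LRU for one cache set (assumed variant, power-of-two ways)."""

    def __init__(self, ways=8):
        assert ways >= 2 and ways & (ways - 1) == 0
        self.ways = ways
        # One bit per internal tree node, stored in heap order.
        # 0 means the victim search goes left at this node, 1 means right.
        self.bits = [0] * (ways - 1)

    def touch(self, way):
        """Record an access by pointing every node on the path away from `way`."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if way < mid:
                self.bits[node] = 1  # accessed the left half, so evict from the right
                node, hi = 2 * node + 1, mid
            else:
                self.bits[node] = 0  # accessed the right half, so evict from the left
                node, lo = 2 * node + 2, mid

    def victim(self):
        """Follow the bits from the root to pick the way to replace."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.bits[node] == 0:
                node, hi = 2 * node + 1, mid
            else:
                node, lo = 2 * node + 2, mid
        return lo


plru = TreePLRU(ways=8)
for way in range(8):
    plru.touch(way)
print(plru.victim())  # 0, the same choice true LRU would make
plru.touch(0)
print(plru.victim())  # 4, whereas true LRU would pick way 1
```

The second result shows why the scheme is only "pseudo" LRU: the tree remembers which half was used most recently at each level, not the full access order.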
L2 Cache
● 1-2 MB (depending on application)
● 16-way set associative
● Unified, shared by 2 to 4 cores
● ECC (Error Correcting Code) protection for tag and data arrays
● Forms an EDC/ECC cache structure
● Minimum of 25 cycles per hit
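The L1 and L2 hit latencies can be combined into a rough average memory access time. The sketch below uses the 3-cycle L1 hit and the 25-cycle minimum L2 hit from the slides; the miss rates and the main-memory latency are purely illustrative assumptions, and the 25 cycles are treated as the cost added on an L1 miss. `average_access_cycles` is a hypothetical helper name.

```python
L1_HIT_CYCLES = 3   # average L1 hit latency, from the slides
L2_HIT_CYCLES = 25  # minimum L2 hit latency, from the slides


def average_access_cycles(l1_miss_rate, l2_miss_rate, memory_cycles):
    """Average access time: L1 hit + P(L1 miss) * (L2 hit + P(L2 miss) * memory)."""
    l2_level = L2_HIT_CYCLES + l2_miss_rate * memory_cycles
    return L1_HIT_CYCLES + l1_miss_rate * l2_level


# Illustrative inputs only: 5% L1 misses, 20% L2 misses, 200-cycle memory.
print(average_access_cycles(0.05, 0.20, 200))
```

With these assumed inputs the formula works out to 3 + 0.05 × (25 + 0.20 × 200) = 6.25 cycles.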
Jaguar Benchmarks
● Athlon 5350
● Athlon 5150
● Sempron 3850
Athlon 5350 vs. Intel Core i3 3220 vs. Celeron J1900
Athlon 5350 vs. Intel Core i7 5930K
The Athlon 5350 has much lower performance; however, it offers:
● Much better efficiency
● Much lower cost
● Better performance per watt
● Better performance per dollar
Zen
● Entirely new core design
● New design family ‘Summit Ridge’
● Simultaneous Multithreading
● New Cache System
● FinFET manufacturing process