
Source: meseec.ce.rit.edu/eecc722-fall2003/722-9-17-2003.pdf

EECC722 - Shaaban, Lec # 4, Fall 2003, 9-17-2003

Operating System Impact on SMT Architecture

• The work published in “An Analysis of Operating System Behavior on a Simultaneous Multithreaded Architecture”, Josh Redstone et al., in Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, November 2000 (SMT-7), represents the first study of OS execution on a simulated SMT processor.

• The SimOS environment adapted for SMT:

– Alpha-based SMT CPU core added.

– Digital Unix 4.0d modified to support SMT.

• Study goals:

– Compare SMT/OS performance results with previous SMT performance results that do not account for OS behavior and impact.

– Contrast OS impact between OS intensive and non OS intensive workloads.

• Two types of workloads selected for the study:

– Non OS intensive workload: Multiprogrammed 8 SPECInt95 benchmarks.

– OS intensive workload: Multi-threaded Apache web server (64 server processes), driven by the SPECWeb benchmark (128 clients).

• No SMT-specific OS optimizations were investigated in this study.

OS Code Vs. User Code

• Operating systems are usually huge programs that can overwhelm the cache and TLB due to code and data size.

• Operating systems may impact branch prediction performance because of frequent branches and infrequent loops.

• OS execution is often brief and intermittent, invoked by interrupts, exceptions, or system calls, and can cause the replacement of useful cache, TLB and branch prediction state for little or no benefit.

• The OS may perform spin-waiting, explicit cache/TLB invalidation, and other operations not common in user-mode code.

SimOS

• SimOS is a complete machine simulation environment developed at Stanford (http://simos.stanford.edu/).

• Designed for the efficient and accurate study of both uniprocessor and multiprocessor computer systems.

• Simulates computer hardware in enough detail to boot and run commercial operating systems.

• SimOS currently provides CPU models of the MIPS R4000 and R10000 and Digital Alpha processor families.

• In addition to the CPU, SimOS also models caches, multiprocessor memory buses, disk drives, Ethernet, consoles, and other system devices.

• SimOS has been ported for IRIX versions 5.3 (32-bit) and 6.4 (64-bit) and Digital UNIX; a port of Linux for the Alpha is being developed.

SimOS System Diagram

A Base SMT Hardware Architecture

Source: “Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor”, Dean Tullsen et al., Proceedings of the 23rd Annual International Symposium on Computer Architecture, May 1996, pages 191-202.

Alpha-based SMT Processor Parameters

• Duplicate the register file, program counter, subroutine stack and internal processor registers of a superscalar CPU to hold the state of multiple threads.

• Add per-context mechanisms for pipeline flushing, instruction retirement, subroutine return prediction, and trapping.

• Fetch unit, functional units, data L1, L2, TLB shared among contexts.

• ~10% chip-area increase over superscalar (compared to ~5% for Intel’s hyper-threaded Xeon).

OS Modifications for SMT

Only the minimal OS modifications required to support SMT are considered (no OS optimizations for SMT are considered here):

• OS task scheduler must support multiple threads in running status:

– A shared-memory multiprocessor (SMP) aware OS (including Digital Unix) has this ability, but each thread runs on a different CPU in SMP systems.

– An SMT processor reports to such an OS as multiple shared-memory CPUs (logical processors).

• TLB-related code must be modified:

– Mutual exclusion support for simultaneous access by multiple threads to the address space number (ASN) tags of the TLB.

– Modified ASN assignment to account for the presence of multiple threads.

– Internal CPU registers used to modify TLB entries replicated per context.

• No OS changes required to account for the shared L1 cache of SMT vs. the non-shared L1 for SMP.
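The TLB-related changes listed above can be pictured with a small model. This is only a sketch and is not taken from Digital Unix: the class `AsnAllocator`, its `assign` method, and the policy of reclaiming an ASN only from a process that is not running on any context are all assumptions made for illustration. It shows the two ideas from the slide: one lock serializes ASN updates from all hardware contexts, and assignment accounts for several threads running at once.

```python
import threading


class AsnAllocator:
    """Hypothetical ASN allocator shared by every hardware context of one SMT core."""

    def __init__(self, num_asns: int) -> None:
        self.num_asns = num_asns
        self._lock = threading.Lock()        # mutual exclusion across contexts
        self._owner: dict[int, int] = {}     # ASN -> process id
        self._asn_of: dict[int, int] = {}    # process id -> ASN
        self._running: dict[int, int] = {}   # context id -> process id

    def assign(self, context_id: int, pid: int) -> int:
        """Return the ASN that `pid` should use while running on `context_id`."""
        with self._lock:
            self._running[context_id] = pid
            if pid in self._asn_of:
                return self._asn_of[pid]
            # First choice: an ASN nobody owns yet.
            for asn in range(self.num_asns):
                if asn not in self._owner:
                    return self._bind(asn, pid)
            # Otherwise reclaim one, but never from a process that is
            # currently running on another context of the same core.
            running_pids = set(self._running.values())
            for asn, owner in list(self._owner.items()):
                if owner not in running_pids:
                    del self._asn_of[owner]
                    # A real kernel would also invalidate TLB entries tagged `asn`.
                    return self._bind(asn, pid)
            raise RuntimeError("more simultaneously running processes than ASNs")

    def _bind(self, asn: int, pid: int) -> int:
        self._owner[asn] = pid
        self._asn_of[pid] = asn
        return asn
```

On a uniprocessor only one process runs at a time, so any ASN other than the current one can be recycled; the check against `running_pids` is the part that changes once several contexts run simultaneously.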

SPECInt Workload Execution Cycle Breakdown

• Percentage of execution cycles for OS kernel instructions:

– During program startup: 18%, mostly due to data TLB misses and, to a lesser extent, system calls.

– Steady state: 5%, still dominated by TLB misses.

SMT-7

Breakdown of Kernel Time for SPECInt95

– Program startup: 18%, mostly due to data TLB misses and system calls.

– Steady state: 5%, dominated by TLB misses.

SMT-7

SPEC System Calls Percentage

System calls as a percentage of total execution cycles.

SMT-7

SPECInt95 Dynamic Instruction Mix

• Percentage of dynamic instructions in the SPECInt workload by instruction type.

• The percentages in parentheses for memory operations represent the proportion of loads and stores that are to physical addresses.

• A percentage breakdown of branch instructions is also included.

• For conditional branches, the number in parentheses represents the percentage of conditional branches that are taken.

SMT-7

SPECInt95 Total Miss Rates & Distribution of Misses

• The miss categories are percentages of all user and kernel misses.

• Bold entries signify kernel-induced interference.

• User-kernel conflicts are misses in which the user thread conflicted with some type of kernel activity (the kernel executing on behalf of this user thread, some other user thread, a kernel thread, or an interrupt).

SMT-7

Metrics for SPECInt95 with and without the Operating System for both SMT and Superscalar

• The maximum issue for integer programs is 6 instructions on the 8-wide SMT, because there are only 6 integer units.

SMT-7

Apache Workload Execution Cycle Breakdown

• Apache experiences little start-up period, since Apache’s ‘start-up’ consists simply of receiving the first incoming requests and waking up the server threads.

• Once requests arrive, Apache spends over 75% of its time in the OS.

SMT-7

Breakdown of Kernel Time for Apache vs. SPECInt95 on SMT

SMT-7

Apache System Calls By Name

SMT-7

Apache System Calls By Function

SMT-7

Apache Dynamic Instruction Mix

• The percentages in parentheses for memory operations represent the proportion of loads and stores that are to physical addresses.

• A percentage breakdown of branch instructions is also included.

• For conditional branches, the number in parentheses represents the percentage of conditional branches that are taken.

SMT-7

Metrics for SMT SPEC, Apache & Superscalar Apache

All applications are executing with the operating system.

SMT-7

Apache+OS Total Miss Rates & Distribution of Misses

• The miss categories are percentages of all user and kernel misses.

• Bold entries signify kernel-induced interference.

• User-kernel conflicts are misses in which the user thread conflicted with some type of kernel activity (the kernel executing on behalf of this user thread, some other user thread, a kernel thread, or an interrupt).

SMT-7

Percentage of Misses Avoided Due to Interthread Cooperation on Apache

• Percentage of misses avoided due to interthread cooperation on Apache, shown by execution mode.

• The number in a table entry shows the percentage of overall misses for the given resource that threads executing in the mode indicated on the leftmost column would have encountered, if not for prefetching by other threads executing in the mode shown at the top of the column.

SMT-7
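The table-entry definition above can be made concrete with a short calculation. This is a sketch only: the function `avoided_miss_table`, the event format, and the sample counts in the usage note are assumptions, not data from the study, and the denominator is assumed to be the number of misses actually observed for the resource.

```python
from collections import Counter


def avoided_miss_table(avoided_events, total_misses):
    """Build the (row mode, column mode) -> percentage table for one resource.

    avoided_events: one (beneficiary_mode, prefetcher_mode) pair per miss that
        a thread in `beneficiary_mode` avoided because a thread in
        `prefetcher_mode` had already brought the entry in.
    total_misses: overall number of misses observed for the resource
        (assumed denominator).
    """
    counts = Counter(avoided_events)
    return {pair: 100.0 * n / total_misses for pair, n in counts.items()}
```

For example, with 10 observed misses, three kernel misses avoided by kernel prefetching and one user miss avoided by kernel prefetching would give 30% in the (kernel, kernel) entry and 10% in the (user, kernel) entry.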

OS Impact on Hardware Structures Performance

SMT-7

OS Impact on SMT Study Summary

• Results show that for SMT, omission of the operating system did not lead to a serious misprediction of performance for SPECInt, although the effects were more significant for a superscalar executing the same workload.

• On the Apache workload, however, the operating system is responsible for the majority of instructions executed:

– Apache spends a significant amount of time responding to system service calls in the file system and kernel networking code.

– The result of the heavy execution of OS code is an increase of pressure on various low-level resources, including the caches and the BTB.

– Kernel threads also cause more conflicts in those resources, both with other kernel threads and with user threads; on the other hand, there is a positive interthread sharing effect as well.

Possible SMT-specific OS Optimizations

• Smart SMT-optimized OS task scheduler for better SMT-core performance:

– Schedule cooperating threads that benefit from SMT’s resource and data sharing to run simultaneously.

– To aid SMT’s latency-hiding, avoid scheduling too many threads that have conflicts over the same specific CPU resource (TLB, cache, FP, etc.).

– For an SMP-SMT system, tightly-coupled threads should be scheduled to logical processors in the same physical SMT CPU (processor affinity).

• Introduce a lightweight dedicated kernel context, cached in the SMT core, to handle process management and speed up system calls.

• Prevent the “idle loop” thread from consuming execution resources:

– Intel Hyper-threading solution: use HALT instruction.

• Allow thread caching in the CPU to further reduce context-switching overheads.
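The scheduler idea above (avoid co-scheduling too many threads that contend for the same resource) might be sketched as a greedy selection. Nothing here comes from the study, which investigated no SMT-specific optimizations: `Thread`, `dominant_resource`, `pick_coschedule`, and the `max_per_resource` limit are hypothetical names, and a real scheduler would derive the resource label from hardware counters rather than a fixed tag.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    name: str
    dominant_resource: str  # hypothetical label such as "fp", "cache" or "tlb"


def pick_coschedule(ready, num_contexts, max_per_resource):
    """Choose up to `num_contexts` threads to run simultaneously.

    Threads are taken in ready-queue order, but a thread is deferred when
    `max_per_resource` already-chosen threads stress the same resource.
    """
    chosen = []
    deferred = []
    load = Counter()
    for thread in ready:
        if len(chosen) == num_contexts:
            break
        if load[thread.dominant_resource] < max_per_resource:
            chosen.append(thread)
            load[thread.dominant_resource] += 1
        else:
            deferred.append(thread)
    # Fill any contexts still empty rather than leaving them idle.
    for thread in deferred:
        if len(chosen) == num_contexts:
            break
        chosen.append(thread)
    return chosen
```

Falling back to deferred threads reflects one possible design choice: a conflicting thread is still assumed to be more useful than an idle context.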

Overview of Intel’s Hyper-Threading Microarchitecture & Performance

• Introduced in 2002 by adding 2-thread SMT (Hyper-Threading, HT) support to the Intel Xeon/P4 Northwood core.

• Major Design Goals:

– Minimize increase of chip area and complexity. The implementation increased relative chip area/power by 5%.

– Prevent a stalling thread from affecting the progress of the other thread in the CPU.

– Maintain a good single-thread performance by switching to single

• To limit chip area increase and complexity:

– Each thread is only allocated a maximum of 50% of:
• Fetch cycles, micro-ops queues, instruction scheduling queues, load/store buffers, re-order buffer entries, instruction retirement cycles.

– No resources (TLB, cache, queues, buffers, etc.) were resized to account for the additional demands SMT poses on shared CPU resources.

• These design constraints prevented this implementation from achieving the full performance potential of a 2-thread full SMT processor.

– Performance gains of this implementation typically range from 5% to 28%.

Intel Xeon/P4 HT Processor Pipeline

• Visible to OS as two logical processors when HT enabled

• Duplicated for each thread:

– Architectural thread state

– Advanced Programmable Interrupt Controller (APIC)

• Pipeline stages: in-order fetch/decode (front end), out-of-order schedule/execute, in-order commit

SMT-8

Front-End Detailed Pipeline

(a) Instruction Trace Cache Hit:

• Decoded instructions (micro-ops) fetched from trace cache.

• Fetch is round robin from each thread.

• Each thread limited to 50% of micro-ops queue ready for renaming/scheduling.

(b) Instruction Trace Cache Miss:

• Instructions fetched from L2 cache in round robin RR1.X.

• Instruction decoding into micro-ops alternates between threads every cycle.

• Instruction micro-ops filled in trace cache and micro-ops queue ready for renaming/scheduling.

SMT-8
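The trace-cache-hit path described above (round-robin fetch with each thread capped at half of the micro-ops queue) can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not Intel's implementation: it fetches one micro-op per cycle, never drains the queue, and hands a cycle to the other thread when the preferred thread has nothing to fetch or has hit its cap. The name `fetch_round_robin` and its parameters are made up for the example.

```python
from collections import deque


def fetch_round_robin(pending, queue_capacity, cycles):
    """Fill a shared micro-ops queue from two threads.

    pending: list of two deques holding each thread's micro-ops in order.
    queue_capacity: total entries in the shared queue; each thread may
        occupy at most half of them.
    Returns the queue contents as (thread_id, micro_op) tuples.
    """
    per_thread_cap = queue_capacity // 2
    occupancy = [0, 0]
    queue = []
    for cycle in range(cycles):
        preferred = cycle % 2  # alternate the preferred thread every cycle
        for tid in (preferred, 1 - preferred):
            if pending[tid] and occupancy[tid] < per_thread_cap:
                queue.append((tid, pending[tid].popleft()))
                occupancy[tid] += 1
                break  # at most one fetch per cycle in this toy model
    return queue
```

With a 4-entry queue, a thread that has plenty of micro-ops still stops at 2 entries even when the other thread is idle, which is the static partitioning effect the slides describe.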

Out-Of-Order Execution Engine Pipeline

[Figure: out-of-order execution engine pipeline. Annotations recoverable from the slide:]

• Stages: in-order rename and resource allocation; out-of-order schedule/execute; in-order retirement.

• Two Register Alias Tables (RATs), one for each thread.

• 128 rename (physical) registers.

• Queue labels: memory micro-ops; other micro-ops.

• Partitioning labels: 50% of entries per thread (shown twice); 50% per thread (63 entries); 50% per thread.

• Schedule/dispatch up to 6 micro-ops/cycle, no thread constraints.

• No thread ALU allocation constraints.

• Retire up to 3 micro-ops/cycle, alternating between threads.

SMT-8

HT Transaction Processing Performance

21% HT performance gain

SMT-8

HT Web Server Benchmark Performance

16-21% performance gain

SMT-8

Summary of HT Performance For Compute-Intensive Workloads

• SMP speedup range: 1.65 to 2.00

• HT (SMT) speedup range: 1.05 to 1.28

SMT-9

Summary of HT Performance For Compute-Intensive Workloads

• SMP speedup range: 1.65 to 2.00

• SMT speedup range: 1.05 to 1.28

• The SMT implementation cannot match or exceed SMP performance due to the thread resource/cycle constraints imposed in the hyper-threading implementation.

Figure labels: ideal CPI of 1/3 cycle/uop; single thread.

SMT-9
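The speedup ranges above can be converted to percentage gains with simple arithmetic, which is how a 1.05 to 1.28 speedup lines up with the 5% to 28% gain quoted earlier. The sketch below also computes the ideal CPI label, assuming (this link is an interpretation, not stated on the slide) that 1/3 cycle per micro-op corresponds to completing 3 micro-ops every cycle. The function names are made up for the example.

```python
from fractions import Fraction


def percent_gain(speedup: float) -> float:
    """Convert a speedup factor (1.0 means no change) into a percentage gain."""
    return round((speedup - 1.0) * 100.0, 1)


def ideal_cpi(uops_per_cycle: int) -> Fraction:
    """Best-case cycles per micro-op when `uops_per_cycle` complete every cycle."""
    return Fraction(1, uops_per_cycle)


# Speedup ranges quoted on the slide.
HT_RANGE = (1.05, 1.28)
SMP_RANGE = (1.65, 2.00)

ht_gain = tuple(percent_gain(s) for s in HT_RANGE)
smp_gain = tuple(percent_gain(s) for s in SMP_RANGE)
```

Using `Fraction` keeps the ideal CPI exact, so it can be compared against the slide's 1/3 figure without rounding.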