Temperature-Aware Design
Presented by Mehul Shah, 4/29/04
The Problem
Power & thermal densities are increasing
  Currently 50 W/cm^2; 100 W/cm^2 at 50 nm technology
  Power density doubles every 3 years
  Operating Vdd is scaling much more slowly (ITRS)
Cost of cooling is rising exponentially
  $1 to $3 per watt of power dissipation
  Packages are designed for worst-case power
Hot spots: heat dissipation is non-uniform across the chip
  Low-power design techniques are not sufficient
  Big hammer: global clock gating limits performance
Impact of Temperature on Design
Increased Delay, Lower Reliability
Slower Transistors
  Carrier mobility is lower at higher temperature
  Inverter is 35% slower at 110 °C vs. 60 °C
Higher Leakage Power
  Grows by orders of magnitude at higher temperature
  Leakage is becoming more significant than switching power
Higher Metal Resistivity
  Copper is 39% more resistive at 120 °C vs. 20 °C
Lower Mean-Time-To-Failure (MTF)
  MTF = MTF_0 * exp(E_a / (k_B * T))
  MTF decreases exponentially with temperature
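To get a feel for the MTF relation above, the following minimal sketch evaluates the ratio of MTF at two temperatures; MTF_0 cancels out of the ratio. The activation energy used here (0.7 eV) is an assumed illustrative value, not a figure from the talk, and the function name is hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def mtf_ratio(t_hot_c, t_cool_c, ea_ev):
    """Return MTF(t_hot) / MTF(t_cool) for MTF = MTF_0 * exp(E_a / (k_B * T)).

    Temperatures are given in degrees Celsius and converted to kelvin.
    """
    t_hot = t_hot_c + 273.15
    t_cool = t_cool_c + 273.15
    return math.exp(ea_ev / K_B_EV * (1.0 / t_hot - 1.0 / t_cool))


EA_ASSUMED_EV = 0.7  # assumption for illustration only, not from the slides

ratio = mtf_ratio(110.0, 60.0, EA_ASSUMED_EV)
print(f"MTF at 110 C is {ratio:.3f}x the MTF at 60 C (assumed E_a = {EA_ASSUMED_EV} eV)")
```

The ratio depends strongly on the assumed E_a, so the printed number should be read as an order-of-magnitude illustration only.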
Moral of the Story
Problem: Temperature adversely affects power, performance & reliability
Solution: “Temperature-Aware” Design
Temperature-Aware Design
Thermal Modeling
  Estimate operating temperature
  Simple: allow architects to easily reason about thermal effects
  Detailed: model runtime temperature at functional-unit granularity
  Computationally efficient
  Flexible: easily extend to novel architectures
Dynamic Thermal Management
  Use runtime behavior and thermal status to adjust/distribute workload among functional units
Talk Outline
Thermal Modeling
  Model Description
  Validation & Case Studies
Dynamic Thermal Management
  Results
Conclusions
References
Kevin Skadron et al., “Temperature-Aware Microarchitecture”
Wei Huang et al., “Compact Thermal Modeling for Temperature-Aware Design”
Thermal Modeling
Thermal model interacts with Power, Performance, Reliability models
Design convergence requires several iterations
Heat Flow vs. Electrical Phenomenon
Both can be described by the same differential equations
  Heat flow = Electrical current
  Temperature = Voltage
  Heat absorption capacity = Capacitance
Describe the design as a thermal RC circuit
  Node = Functional block
Solve the RC equations to obtain node temperatures
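As a minimal illustration of solving the RC equations, the sketch below integrates a single thermal node with forward Euler: C * dT/dt = P - (T - T_amb) / R. A real model has one node per functional block plus package nodes; the single node, the function name, and all numeric values here are assumptions chosen for illustration.

```python
def simulate_node(power_w, r_th, c_th, t_amb, dt, steps):
    """Forward-Euler integration of C * dT/dt = P - (T - T_amb) / R.

    power_w plays the role of current, temperature the role of voltage.
    Returns the list of temperatures, starting at ambient.
    """
    temp = t_amb
    trace = [temp]
    for _ in range(steps):
        d_temp = (power_w - (temp - t_amb) / r_th) / c_th
        temp += d_temp * dt
        trace.append(temp)
    return trace


# Hypothetical values, not taken from the talk.
R_TH = 2.0     # thermal resistance to ambient, K/W
C_TH = 0.5     # thermal capacitance, J/K
T_AMB = 45.0   # ambient temperature, degrees C
POWER = 10.0   # dissipated power, W

trace = simulate_node(POWER, R_TH, C_TH, T_AMB, dt=0.01, steps=2000)
steady_state = T_AMB + POWER * R_TH  # the "Ohm's law" limit: T = T_amb + P * R

print(f"final simulated temperature: {trace[-1]:.3f} C")
print(f"analytic steady state:       {steady_state:.3f} C")
```

The time step is kept well below the time constant R * C so that the explicit integration stays stable.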
HotSpot Package
Equivalent Model
Equivalent Model (Continued)
Die area is divided into micro-architectural blocks
Spreader and sink are each divided into five blocks
  Rsp, Rhs: areas under the die
  Trapezoids: areas not covered by the die
Rconvective represents thermal resistance from package to air
RC Model
  Vertical R's: heat flow between layers
  Lateral R's: heat diffusion within a layer
    R1 = Block1 to spreader, R2 = Block1 to rest of the chip
  R = t / (k * A)
    t: thickness
    k: thermal conductivity of the material
    A: cross-sectional area
  C = c * t * A
    c: thermal capacitance per unit volume
    Requires an empirical scaling factor due to the lumped model
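The two formulas can be evaluated directly for one block, reading the resistance as R = t / (k * A). The sketch below does so for a hypothetical 4 mm x 4 mm block of 0.5 mm thick silicon. The material constants (k = 100 W/(m*K), c = 1.75e6 J/(m^3*K)) and the scaling factor of 1.0 are assumed values for illustration, not numbers from the slides.

```python
def vertical_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """R = t / (k * A), in K/W."""
    return thickness_m / (conductivity_w_per_mk * area_m2)


def lumped_capacitance(vol_heat_capacity, thickness_m, area_m2, scale=1.0):
    """C = scale * c * t * A, in J/K.

    'scale' stands in for the empirical lumped-model scaling factor.
    """
    return scale * vol_heat_capacity * thickness_m * area_m2


# Assumed, illustrative values for a silicon block.
K_SI = 100.0        # thermal conductivity, W/(m*K)
C_SI = 1.75e6       # volumetric heat capacity, J/(m^3*K)
THICKNESS = 0.5e-3  # m
AREA = 4e-3 * 4e-3  # m^2

r_block = vertical_resistance(THICKNESS, K_SI, AREA)
c_block = lumped_capacitance(C_SI, THICKNESS, AREA)

print(f"R = {r_block:.4f} K/W, C = {c_block:.4f} J/K, RC = {r_block * c_block * 1e3:.3f} ms")
```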
HotSpot Validation
Fallacy of Using a Power Metric
Compact Thermal Model
Equivalent Model
Equivalent Model (Cont.)
Compact Model vs. HotSpot
  Arbitrary granularity grid
  Thermal interface material
  Spreader and interface under the die are divided at chip granularity
Primary Heat Flow Path
  Rvertical = t / (k * A)
  C = Alpha * cp * ρ * t * A
    Alpha: accounts for the lumped capacitor model
    cp: specific heat
    ρ: material density
    t: layer thickness
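One way to see what "arbitrary granularity" means for the primary path is to split a die layer into an n x n grid and compute per-cell values. This sketch assumes the layer thickness t enters the capacitance the same way as in C = c * t * A from the HotSpot slide, with c = cp * ρ. The material values are approximate figures for silicon and Alpha = 0.5 is a placeholder; all of them, and the function names, are assumptions for illustration.

```python
def grid_cells(die_area_m2, thickness_m, k, cp, rho, alpha, n):
    """Return an n x n grid of (R_vertical, C) pairs for equal-area cells."""
    cell_area = die_area_m2 / (n * n)
    r_cell = thickness_m / (k * cell_area)                 # R = t / (k * A)
    c_cell = alpha * cp * rho * thickness_m * cell_area    # C = Alpha * cp * rho * t * A
    return [[(r_cell, c_cell) for _ in range(n)] for _ in range(n)]


def parallel_resistance(grid):
    """Combine all vertical cell resistances in parallel."""
    return 1.0 / sum(1.0 / r for row in grid for r, _ in row)


# Assumed, approximate values for a silicon die layer.
DIE_AREA = 1.0e-4    # 1 cm^2 in m^2
THICKNESS = 0.5e-3   # m
K = 100.0            # W/(m*K)
CP = 700.0           # J/(kg*K)
RHO = 2330.0         # kg/m^3
ALPHA = 0.5          # placeholder lumped-model factor

grid = grid_cells(DIE_AREA, THICKNESS, K, CP, RHO, ALPHA, n=8)
r_whole = THICKNESS / (K * DIE_AREA)
c_whole = ALPHA * CP * RHO * THICKNESS * DIE_AREA
c_total = sum(c for row in grid for _, c in row)

print(f"parallel R of cells: {parallel_resistance(grid):.6f} K/W, whole die: {r_whole:.6f} K/W")
print(f"summed C of cells:   {c_total:.6f} J/K, whole die: {c_whole:.6f} J/K")
```

Refining the grid changes the per-cell values but not the totals: the cell resistances in parallel and the summed cell capacitances match the whole-die figures.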
Equivalent Model (Secondary Path)
Interconnect Thermal Model
Self-heating power & wire length prediction
Pself = I^2 * R
R = ρm * L / Am
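The self-heating relations can be checked with a small calculation. The sketch below uses the bulk resistivity of copper at room temperature (about 1.68e-8 Ω·m); the wire dimensions and current are hypothetical, and real on-chip wires typically show higher effective resistivity than bulk copper.

```python
def wire_resistance(resistivity_ohm_m, length_m, cross_section_m2):
    """R = rho_m * L / A_m, in ohms."""
    return resistivity_ohm_m * length_m / cross_section_m2


def self_heating_power(current_a, resistance_ohm):
    """P_self = I^2 * R, in watts."""
    return current_a ** 2 * resistance_ohm


RHO_CU = 1.68e-8                # bulk copper resistivity near 20 C, ohm*m
LENGTH = 1.0e-3                 # hypothetical 1 mm wire
CROSS_SECTION = 0.2e-6 * 0.4e-6 # hypothetical 0.2 um x 0.4 um cross-section, m^2
CURRENT = 1.0e-3                # hypothetical 1 mA

r_wire = wire_resistance(RHO_CU, LENGTH, CROSS_SECTION)
p_self = self_heating_power(CURRENT, r_wire)

print(f"wire resistance: {r_wire:.1f} ohm")
print(f"self-heating power: {p_self * 1e3:.3f} mW")
```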
Equivalent Model (Secondary Path, Cont.)
Equivalent Thermal Resistance
Model Validation & Evaluation (Primary)
Steady State
Transient
Model Validation (Secondary)
Case Study
Thermal Management
Dynamic Thermal Management
Emergency Threshold: temperature above which the chip is in thermal violation
Trigger Threshold: temperature above which DTM is applied
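The two thresholds partition the operating range into three regions. A minimal sketch of that classification follows; the state names, function name, and the threshold values (82 °C and 85 °C) are hypothetical and chosen only for illustration.

```python
def thermal_state(temp_c, trigger_c, emergency_c):
    """Classify a temperature reading against the two DTM thresholds."""
    if trigger_c >= emergency_c:
        raise ValueError("trigger threshold must lie below the emergency threshold")
    if temp_c > emergency_c:
        return "violation"    # chip is in thermal violation
    if temp_c > trigger_c:
        return "dtm_active"   # DTM is applied to avoid reaching the emergency level
    return "normal"


TRIGGER_C = 82.0    # hypothetical trigger threshold
EMERGENCY_C = 85.0  # hypothetical emergency threshold

for reading in (75.0, 83.5, 86.0):
    print(reading, thermal_state(reading, TRIGGER_C, EMERGENCY_C))
```

The gap between the two thresholds is the margin DTM has to react before a violation occurs.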
DTM Techniques
Temperature-Tracking Frequency Scaling
Feedback-Controlled Fetch Toggling
Migrating Computation
Dynamic Voltage Scaling (DVS)
Global Clock Gating
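To make the idea of feedback control concrete, here is a rough sketch of a fetch-toggling loop: the fetch duty cycle is lowered while the sensed temperature is above the trigger threshold and restored as it falls back. This is a simple integral-style update written for illustration, not the controller from the referenced papers; the gain, threshold, function name, and temperature readings are all assumptions.

```python
def next_duty_cycle(duty, temp_c, trigger_c, gain):
    """Update the fetch duty cycle from the temperature error.

    duty is the fraction of cycles in which fetch is enabled (0.0 to 1.0).
    A positive error (too hot) reduces it; a negative error restores it.
    """
    error = temp_c - trigger_c
    duty -= gain * error
    return min(1.0, max(0.0, duty))


TRIGGER_C = 82.0  # hypothetical trigger threshold
GAIN = 0.1        # hypothetical controller gain, duty fraction per degree C

readings = [80.0, 83.0, 84.0, 83.0, 81.0, 80.0]  # made-up sensor readings
duty = 1.0
history = []
for temp in readings:
    duty = next_duty_cycle(duty, temp, TRIGGER_C, GAIN)
    history.append(duty)
    print(f"T = {temp:.1f} C -> fetch duty cycle {duty:.2f}")
```

Compared with the "big hammer" of global clock gating, a graded response like this throttles only as much as the temperature error calls for.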
DTM Results
Conclusions
Accurate Thermal models are essential for early design estimation
  Models are similar to electrical RC networks
  Arbitrary granularity for localized temperature information
  Model all parts of the package
Architectural techniques can reduce demands on the IC package by
  Dynamically adjusting workload to avoid emergencies
  Reducing hot spots