Dustin Peterson and Oliver Bringmann
Eberhard Karls Universität Tübingen, Germany
27 January 2016
SMoSi: A Framework for the Derivation of
Sleep Mode Traces from RTL Simulations
2 | Dustin Peterson SMoSi: A Framework for the Generation of Sleep Mode Traces from RTL Simulations
What is my talk about?
Power Optimization of
Embedded System Designs
Wearables¹, Driver Assistance², Smartphones³
1 http://www.technikblog.ch/2014/10/pulsmessung-am-handgelenk-kann-das-ueberhaupt-genau-sein/
2 http://www.xilinx.com/applications/automotive.html
3 http://marketingland.com/report-us-smartphone-penetration-now-75-percent-117746
Low Power Optimization in SoC Designs
• Lots of well-known power reduction techniques available!
- Clock gating, power gating, DVFS, …
- Each requires partitioning a design into power domains.
• Optimality is highly application-specific!
• Approach for application-specific partitioning:
e.g. [Dobryal et al., Workload driven power domain partitioning]
- Application model is represented by a sleep mode trace.
[Figure: a design with four components (1 to 4) grouped into power domains in two alternative ways (Partition Set 1 and Partition Set 2), with idle and active components marked for several time frames.]
Sleep Mode Traces
• A sleep mode trace assigns each time frame in a simulation run a
set of idle components.
• Example:
How to obtain a sleep mode trace?
• Manually
• Randomly
• High-level models (e.g. Real-Time Calculus¹)
• Can we generate it at RTL?
[Figure: example sleep mode trace with time frames at 400ns, 600ns and 800ns; in each frame, components 1 to 4 are marked as idle or active.]
1 Chakraborty et al. A general framework for analysing system properties in platform-based embedded system designs
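The definition above can be illustrated with a small data structure. This is only a sketch: the names `TimeFrame` and `idle_at` are hypothetical, and the idle sets are made-up values, since the contents of the example figure are not recoverable from the slide text.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class TimeFrame:
    """One entry of a sleep mode trace: a start time and the idle components."""
    start_ns: int
    idle: FrozenSet[str]


def idle_at(trace: List[TimeFrame], t_ns: int) -> FrozenSet[str]:
    """Return the idle set of the last frame starting at or before t_ns.

    Assumes the trace is sorted by start time.
    """
    current: FrozenSet[str] = frozenset()
    for frame in trace:
        if frame.start_ns <= t_ns:
            current = frame.idle
    return current


# Hypothetical trace for a design with components "1" to "4".
trace = [
    TimeFrame(400, frozenset({"1", "2"})),
    TimeFrame(600, frozenset({"3"})),
    TimeFrame(800, frozenset({"2", "4"})),
]

print(sorted(idle_at(trace, 650)))
```

A partitioning approach can then query which components are idle at the same time and group them into a common power domain.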
Obtaining Sleep Mode Traces at RTL
• Naive approach: perform an RTL simulation and check the resulting waveform for activity in each component?
• This does not work! Control structures (e.g. multiplexers) may cut off redundant datapaths, so a component can toggle although its result is never used.
[Figure: ALU with ADD, SUB, MUL and DIV units feeding a MUX. For A=20, B=10 and OP=0 the units compute 30, 10, 200 and 2, and the MUX forwards only the ADD result (30). SUB, MUL and DIV are toggling, but idle.]
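The pitfall can be reproduced with a few lines of Python. This is a hedged sketch, not part of SMoSi: the previous input vector (0, 1) and the helper names are assumptions chosen for illustration, while A=20, B=10 and OP=0 are taken from the slide.

```python
# Functional model of the four ALU units from the example.
UNITS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "MUL": lambda a, b: a * b,
    "DIV": lambda a, b: a // b,
}
OP_SELECT = ["ADD", "SUB", "MUL", "DIV"]  # OP=0 selects ADD, OP=1 selects SUB, ...


def unit_outputs(a, b):
    """Every unit computes its result, whether or not the MUX selects it."""
    return {name: fn(a, b) for name, fn in UNITS.items()}


def naive_active(prev, curr):
    """Naive activity check: a unit counts as active if its output changed."""
    return {name for name in curr if curr[name] != prev[name]}


prev = unit_outputs(0, 1)    # hypothetical previous input vector
curr = unit_outputs(20, 10)  # A=20, B=10 as on the slide
op = 0

toggling = naive_active(prev, curr)
used = {OP_SELECT[op]}
toggling_but_idle = toggling - used
alu_result = curr[OP_SELECT[op]]

print(sorted(toggling_but_idle), alu_result)
```

All four units change their outputs, so the naive check reports all of them as active, although only ADD contributes to the ALU result.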
Basic Idea of SMoSi
• Idea: a priori generation of rules (Idle Rule Set)
- "Under which condition can a component be set idle?"
- These rules are then applied to the waveforms (VCD).
[Figure: example design with ADD, SUB, MUL and DIV units selected by OP through a MUX, shown once per time step with idle and active components marked.]

Waveforms:
Time  OP
10ns  1
20ns  3
30ns  1
40ns  0

Idle Rule Set:
Comp.  When idle?
ADD    OP ≠ 0
SUB    OP ≠ 1
MUL    OP ≠ 2
DIV    OP ≠ 3
MUX    never
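Applying the idle rule set to the waveform can be sketched as follows. The rules and OP values are those of the example; the encoding of rules as Python predicates and the function name `sleep_mode_trace` are assumptions made for illustration, not the SMoSi implementation.

```python
# Idle rule set from the example: component -> predicate on OP.
IDLE_RULES = {
    "ADD": lambda op: op != 0,
    "SUB": lambda op: op != 1,
    "MUL": lambda op: op != 2,
    "DIV": lambda op: op != 3,
    "MUX": lambda op: False,  # never idle
}

# Waveform of the control signal OP: (time in ns, value).
WAVEFORM = [(10, 1), (20, 3), (30, 1), (40, 0)]


def sleep_mode_trace(waveform, rules):
    """Evaluate every idle rule at every time step of the waveform."""
    return [
        (t, sorted(comp for comp, rule in rules.items() if rule(op)))
        for t, op in waveform
    ]


for t, idle in sleep_mode_trace(WAVEFORM, IDLE_RULES):
    print(f"{t}ns idle: {idle}")
```

At 10ns (OP=1) only SUB and the MUX are active, so ADD, MUL and DIV are reported as idle; the other time steps follow in the same way.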
Agenda
(1) A Priori Rule Generation in SMoSi
Sleep Mode Simulation in detail
(2) Experimental Results
SMoSi under test with open-source designs
(3) Conclusion and Future Work
Advancement and applications of SMoSi
1 A PRIORI RULE GENERATION IN SMOSI
Sleep Mode Simulation (SMoSi) in detail
A Priori Rule Generation
Design Graph
• Goal: Formal analysis of a design to get component-level idle rules
• First step: Synthesize the RTL design to a netlist with exact mapping.
- The netlist is already elaborated and mapped to Boolean transfer functions!
- But it has the same components, registers and ports as the RTL design!
• The netlist is parsed and converted into a design graph:
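[Figure: design graph with RT components c1 and c2, a DFF register (ports D, CK, Q), component ports CLK, A[0], A[1], B[0], B[1], Z[0], Z[1], A, B, X and Y, and transfer functions such as Z[1] = A[0] ∧ A[1] and X = A ∨ B. Legend: component ports, register ports, registers, RT components, transfer functions.]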
A Priori Rule Generation
From the Design Graph to the Idle Rule Set
1. Build port-level redundancy rules using Shannon expansion!
- "Port X is not used under condition Y."
- This applies to both component ports and register ports!
2. From the port-level rules we derive component-level idle rules!
- No output should be used: a component is idle if all of its output ports are redundant in each transfer function!
- No register changes its value: furthermore, the component's state must not change!
A Priori Rule Generation
From the Design Graph to the Idle Rule Set
1. Building the idle rule for the output interface:
- The port-level redundancy rules of all
output ports must evaluate to true!
2. Building the idle rules for the registers:
- Analyze the port-level redundancies of the register input ports (data input, clock, enable) of all registers!
Result: a BDD that represents the full idle rule of a component!
[Figure: idle rule for the ALU component in an 8080-compatible processor]
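The two steps amount to a conjunction: every output port must be redundant and no register may change its value. The sketch below shows this combination with plain Python predicates instead of BDDs. The signals `SEL` and `EN` and both example rules are hypothetical and only serve to make the example runnable.

```python
from itertools import product


def component_idle_rule(output_rules, register_rules):
    """Combine port-level rules into one component-level idle rule.

    output_rules:   one predicate per output port, true if the port is unused.
    register_rules: one predicate per register, true if the register keeps
                    its value.
    """
    def idle(env):
        outputs_unused = all(rule(env) for rule in output_rules)
        state_stable = all(rule(env) for rule in register_rules)
        return outputs_unused and state_stable
    return idle


# Hypothetical component: its single output is ignored when SEL == 1,
# and its single register holds its value when EN == 0.
output_rules = [lambda env: env["SEL"] == 1]
register_rules = [lambda env: env["EN"] == 0]

idle = component_idle_rule(output_rules, register_rules)

# Enumerate all control assignments under which the component is idle.
idle_conditions = [
    {"SEL": sel, "EN": en}
    for sel, en in product([0, 1], repeat=2)
    if idle({"SEL": sel, "EN": en})
]
print(idle_conditions)
```

In SMoSi the combined rule is stored as a BDD; the enumeration here is only feasible because the example has two control signals.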
A Priori Rule Generation
How to derive port-level redundancy rules?
• Derive redundancy rules for each port using Shannon Expansion!
• Given: Design Graph
Transfer function of R_ALU:
R_ALU = (¬OP_ALU ∧ R_ADD) ∨ (OP_ALU ∧ R_SUB)
[Figure: ADD and SUB units feeding a MUX. Shannon expansion of the transfer function yields a decision diagram with OP_ALU at the root, R_ADD and R_SUB below it, and the terminals 1 and 0. Heuristic for the variable order: sort by highest fan-out.]
- OP_ALU is never redundant!
- R_SUB is redundant for the result of R_ALU when OP_ALU=0!
- R_ADD is redundant for the result of R_ALU when OP_ALU=1!
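The three rules above can be reproduced by comparing the two Shannon cofactors of the transfer function. The sketch below enumerates truth tables instead of building BDDs. It also assumes that redundancy conditions are expressed over control signals only and must hold for all data values; this is one reading of the "never redundant" result for OP_ALU and is not confirmed by the slides. The function names are hypothetical.

```python
from itertools import product


def r_alu(env):
    """Transfer function R_ALU = (not OP_ALU and R_ADD) or (OP_ALU and R_SUB)."""
    return (not env["OP_ALU"] and env["R_ADD"]) or (env["OP_ALU"] and env["R_SUB"])


def redundancy_condition(f, port, control, data):
    """Control assignments under which `port` cannot influence f.

    For each assignment of the remaining control signals, the two Shannon
    cofactors f[port=0] and f[port=1] are compared for all data values.
    """
    ctrl = [c for c in control if c != port]
    dat = [d for d in data if d != port]
    conditions = []
    for cvals in product([False, True], repeat=len(ctrl)):
        cenv = dict(zip(ctrl, cvals))
        redundant = True
        for dvals in product([False, True], repeat=len(dat)):
            env = {**cenv, **dict(zip(dat, dvals))}
            f0 = bool(f({**env, port: False}))  # negative cofactor
            f1 = bool(f({**env, port: True}))   # positive cofactor
            if f0 != f1:
                redundant = False
                break
        if redundant:
            conditions.append(cenv)
    return conditions


CONTROL = ["OP_ALU"]
DATA = ["R_ADD", "R_SUB"]

for port in ["OP_ALU", "R_SUB", "R_ADD"]:
    print(port, redundancy_condition(r_alu, port, CONTROL, DATA))
```

An empty list means that the port is never redundant. Exhaustive enumeration grows exponentially with the number of signals, which is why a BDD representation is used for real designs.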
A Priori Rule Generation
How to derive port-level redundancy rules?
• For each transfer function, we have derived a port-level redundancy rule.
• Issue: Hierarchy- or dependency-induced redundancy!
• We solve this issue with a dependency graph!
[Figure: processor core containing an ALU with ADD and SUB units and a MUX.]
- ADD or SUB is idle, depending on whether OPCODE is 0 or 1.
- The ALU is idle when, e.g., the branch unit is used!
- ADD and SUB are therefore also off if the branch unit is used!
A Priori Rule Generation
Dependency Graph
• Dependency graph stores interdependencies between ports.
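- "The value of port X depends on the values of ports A, B, …."
- Also stores port-level redundancy rules!
• Given: Design graph
• Given: Redundancy BDDs for R_ADD and R_SUB
[Figure: dependency graph with the nodes IN0_ALU, IN1_ALU, IN0_ADD, IN1_ADD, IN0_SUB, IN1_SUB, R_ADD, R_SUB, OP_ALU and R_ALU. Each edge is labeled with a local redundancy rule: OP_ALU and ¬OP_ALU on the two edges from R_ADD and R_SUB into R_ALU, and 0 (never redundant) on the nine remaining edges.]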
A Priori Rule Generation
Backward Propagation
• Remember: Hierarchy- or
dependency-induced redundancy!
• Therefore: Propagate all local rules from the edges back to each
source node in the dependency graph to get global rules!
[Figure: the dependency graph before and after backward propagation. The local rules on the edges (OP_ALU, ¬OP_ALU and 0) are propagated back towards the source nodes. After propagation, the ports on the ADD path carry the rule OP_ALU, the ports on the SUB path carry ¬OP_ALU, and the ALU inputs IN0_ALU and IN1_ALU carry OP_ALU ∧ ¬OP_ALU = 0, i.e. they are never redundant.]
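Backward propagation on the example can be sketched as a recursive traversal of the dependency graph. Several points are assumptions: the edge list is reconstructed from the node names of the figure, rules are encoded as sets of OP_ALU values instead of BDDs, and a port without successors (here R_ALU) is treated as always used. A port is globally redundant only if every outgoing path is redundant, either through the local rule of the edge or through the global rule of the target port.

```python
from functools import lru_cache

# A rule is the set of OP_ALU values under which a port is redundant.
NEVER = frozenset()        # rule "0": never redundant
OP = frozenset({1})        # redundant when OP_ALU = 1
NOT_OP = frozenset({0})    # redundant when OP_ALU = 0

# Dependency graph: port -> list of (dependent port, local redundancy rule).
# The topology is an assumption reconstructed from the example figure.
EDGES = {
    "IN0_ALU": [("IN0_ADD", NEVER), ("IN0_SUB", NEVER)],
    "IN1_ALU": [("IN1_ADD", NEVER), ("IN1_SUB", NEVER)],
    "IN0_ADD": [("R_ADD", NEVER)],
    "IN1_ADD": [("R_ADD", NEVER)],
    "IN0_SUB": [("R_SUB", NEVER)],
    "IN1_SUB": [("R_SUB", NEVER)],
    "R_ADD": [("R_ALU", OP)],
    "R_SUB": [("R_ALU", NOT_OP)],
    "OP_ALU": [("R_ALU", NEVER)],
    "R_ALU": [],
}


@lru_cache(maxsize=None)
def global_rule(port):
    """Global redundancy rule of a port (assumes an acyclic graph)."""
    successors = EDGES[port]
    if not successors:
        return NEVER  # assumption: a primary output is always used
    rule = None
    for target, local in successors:
        path_rule = local | global_rule(target)  # OR along one path
        rule = path_rule if rule is None else rule & path_rule  # AND over paths
    return rule


for port in EDGES:
    print(port, sorted(global_rule(port)))
```

For IN0_ALU the two paths contribute OP_ALU and ¬OP_ALU, whose conjunction is empty, which matches the result OP_ALU ∧ ¬OP_ALU = 0 on the slide.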
SMoSi: Sleep Mode Simulation
Idle Rule Derivation (once per design):
- Inputs: RTL design, SMoSi cell library, clock lines, reset lines
- RTL Design → Synthesis Tool (e.g. Design Compiler) → Netlist → Idle Rule Deriver → Idle Rule Set
Sleep Mode Simulation (once per stimulus):
- Inputs: RTL design, testbench, Idle Rule Set
- RTL Simulator (e.g. Modelsim) → Waveforms (VCD) → Sleep Mode Simulator → Sleep Mode Trace
t State[4] S[5] …
10ns 1 1 …
20ns 0 1 …
30ns 0 0 …
2 EXPERIMENTAL RESULTS
SMoSi under test with open-source designs
Implementation
• Prototypical implementation in Scala (and partially Java)
• Pre-built rules and dependency information for DesignWare
components (e.g. DW03_mux_any, DW01_mult, …)
• Successful validation using several test cases:
- Several ALUs with different characteristics ✓
- Detection of pipeline stalls in a MIPS16 processor ✓
- Component idleness inside an 8080-compatible CPU core ✓
Experimental Results
• Five open-source designs¹ and a 32-bit ALU have been tested
• 64-core AMD Opteron 6376 (2.3GHz) platform with 256GB RAM,
Scientific Linux 6.5, JDK 1.8.0_25 and Scala 2.11.5
[Figure: bar chart of the runtime in ms (logarithmic axis from 1 to 10,000,000) for cpu8080, MC6809, MIPS_16, Sayeh, Zet and the 32-bit ALU, broken down by phase: Parsing², Design Graph Generation, Dependency Graph Generation, Dependency Graph Solver, Backward Propagation, BDD Rule Generation.]
1 Projects from opencores.org
2 We use the Verilog Parser from EDAUtils (http://www.edautils.com)
Experimental Results
• Performance results are promising, but not yet perfect!
- The ALU is processed in reasonable time.
- All other designs could be processed in reasonable time, but only with limiting parameters!
- Reasons: self-made BDD implementation, Shannon expansion
Design    obdd_depth  max_tf  idle_bdd_depth
cpu8080¹  16          -       16
mc6809¹   14          5000    16
mips_16¹  18          -       18
sayeh¹    18          5000    18
zet¹      12          5000    12
1 Projects from opencores.org
3 CONCLUSION AND FUTURE WORK
Applications and advancement of SMoSi
Conclusion
• State of the art: There is no existing approach that automatically generates sleep mode traces for a design from RTL simulation traces!
• We developed a methodology for the derivation of sleep mode traces at RTL and implemented this methodology in Scala.
• The implementation has been tested successfully on small and medium-sized designs!
• Future Work:
- Integration of an existing BDD library for performance gain
- Evaluation of alternative diagram types (KFDDs, ZDDs, …) and
heuristics for identifying control signals
- Coupling with existing design partitioning approaches
Any questions?
Dustin Peterson
Eberhard Karls Universität Tübingen
Lehrstuhl für Eingebettete Systeme
Phone: +49 7071 - 29 – 75458
Fax: +49 7071 - 29 – 5062
E-Mail: [email protected]