Performance-oriented Peephole Optimisation of Balsa
Dual-Rail Circuits
Luis Tarazona and Doug Edwards
Advanced Processor Technologies Group
School of Computer Science
Syntax-directed compilation
• Used in Tangram and Balsa
• One-to-one mapping of each language construct into a network of handshake components (HCs)
• Benefits:
– Transparency and flexibility to the designer
• Drawback: medium-low performance
• Solutions to this have been proposed using:
– Control resynthesis
– Peephole optimisation
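The one-to-one mapping can be illustrated with a minimal Python sketch. It is not the Balsa compiler: the AST classes (`Assign`, `Seq`, `Par`), the `Netlist` container and the channel names are hypothetical, and an assignment is reduced to a single `Fetch` for brevity. Only the component names Fetch, Sequence and Concur are borrowed from handshake-circuit terminology.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Assign:          # hypothetical AST node: target := source
    target: str
    source: str


@dataclass
class Seq:             # hypothetical AST node: first ; second
    first: object
    second: object


@dataclass
class Par:             # hypothetical AST node: first || second
    first: object
    second: object


@dataclass
class Netlist:
    components: list = field(default_factory=list)
    _ids: count = field(default_factory=count)

    def new_channel(self):
        # Fresh internal channel name: c0, c1, ...
        return f"c{next(self._ids)}"

    def add(self, kind, *ports):
        self.components.append((kind, ports))


def compile_node(node, activate, net):
    """Emit exactly one handshake component per construct, then recurse."""
    if isinstance(node, Assign):
        net.add("Fetch", activate, node.source, node.target)
    elif isinstance(node, (Seq, Par)):
        kind = "Sequence" if isinstance(node, Seq) else "Concur"
        left, right = net.new_channel(), net.new_channel()
        net.add(kind, activate, left, right)
        compile_node(node.first, left, net)
        compile_node(node.second, right, net)
    else:
        raise TypeError(f"unsupported construct: {node!r}")
    return net


program = Seq(Assign("x", "a"), Par(Assign("y", "x"), Assign("z", "x")))
net = compile_node(program, "activate", Netlist())
kinds = [kind for kind, _ in net.components]
```

Because every construct produces its component locally, the designer can predict the circuit from the source text. The same locality means no component is specialised for its context, which is the gap the peephole optimisations aim to close.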
Related work
• The Tangram and Balsa compilers perform some peephole optimisations as a post-processing step
• T. Chelcea has proposed resynthesis and peephole optimisations for Balsa, targeting a burst-mode back-end
• Plana et al. have proposed some optimised HCs for Balsa, targeting single-rail and dual-rail back-ends
The main interest of this work is in dual-rail back-ends, because of their potential immunity to process variability.
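As background for the dual-rail back-end, the sketch below shows the usual dual-rail code with a return-to-zero (RTZ) spacer. Each bit travels on two wires, exactly one of which is raised for valid data, and both are low between data items. The function names and the (rail0, rail1) ordering are assumptions made for illustration, not taken from the Balsa cell libraries.

```python
def encode(value, width):
    """Encode an integer as dual-rail pairs (rail0, rail1), least significant bit first."""
    pairs = []
    for i in range(width):
        bit = (value >> i) & 1
        pairs.append((0, 1) if bit else (1, 0))
    return pairs


def spacer(width):
    """The RTZ spacer: every rail low."""
    return [(0, 0)] * width


def is_valid(word):
    """Completion detection: every bit has exactly one rail raised."""
    return all(r0 ^ r1 for r0, r1 in word)


def is_null(word):
    """True once every rail has returned to zero."""
    return all(r0 == 0 and r1 == 0 for r0, r1 in word)


def decode(word):
    if not is_valid(word):
        raise ValueError("word is not a complete dual-rail codeword")
    return sum(r1 << i for i, (_, r1) in enumerate(word))


word = encode(5, 3)
```

The receiver detects data arrival from the wires themselves (`is_valid`) rather than from a matched delay, so correct operation does not depend on a delay element tracking the datapath.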
The optimisations
1. Eliminating redundant FalseVariable components
2. New Concurrent RTZ Fetch component
3. Conditional parallel/sequencer component: ParSeq
Eliminating redundant FVs
• Targets active input control
• Single access, single read-port FalseVariable or eagerFalseVariable HCs
i -> then
CMD
end
Eliminating redundant FVs - Example
• Latency and area reduction
• Preserves external behaviour
a, b -> then
o <- a + b
end
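The general shape of such a rewrite can be sketched as a pass over a toy netlist. This is an illustrative model, not the authors' implementation. The `Component` record and its port convention (an FV's ports are its write channel followed by its read channels) are hypothetical. The single-access condition is assumed to have been checked by the caller, and the control side of the FalseVariable is left out entirely.

```python
from dataclasses import dataclass

FV_KINDS = {"FalseVariable", "eagerFalseVariable"}


@dataclass(frozen=True)
class Component:
    kind: str
    ports: tuple  # channel names; for an FV: (write, read0, read1, ...)


def eliminate_redundant_fvs(netlist):
    """Remove single-read-port FVs and reconnect their reader to the input channel."""
    rename = {}
    kept = []
    for comp in netlist:
        if comp.kind in FV_KINDS and len(comp.ports) == 2:
            write, read = comp.ports
            rename[read] = write      # the reader will now see the input channel directly
        else:
            kept.append(comp)         # multi-read FVs and other components are untouched
    return [
        Component(c.kind, tuple(rename.get(p, p) for p in c.ports))
        for c in kept
    ]


# Toy netlist loosely modelled on: a, b -> then o <- a + b end
netlist = [
    Component("FalseVariable", ("a", "a_r")),
    Component("FalseVariable", ("b", "b_r")),
    Component("BinaryFunc", ("a_r", "b_r", "sum")),
    Component("Fetch", ("sum", "o")),
]
optimised = eliminate_redundant_fvs(netlist)
```

In this toy example both FVs disappear and the adder reads `a` and `b` directly. The channels visible from outside (`a`, `b`, `o`) are unchanged, which mirrors the slide's claim that external behaviour is preserved.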
Concurrent RTZ Fetch
Wires-only dual-rail Fetch and its STG
Concurrent RTZ Fetch
New concurrent RTZ Fetch and its STG
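The STGs are not recoverable from this transcript, so the sketch below is only a first-order timing model of why overlapping return-to-zero phases can shorten a four-phase cycle. It assumes that the set phase is unchanged and that the input-side and output-side RTZ phases either follow one another or run in parallel. The function name, parameters and delay values are hypothetical.

```python
def fetch_cycle_time(t_set, t_rtz_in, t_rtz_out, concurrent_rtz):
    """Cycle time of one transfer under a simple additive delay model.

    t_set:      time to move valid data from input to output (set phase)
    t_rtz_in:   time for the input channel to return to zero
    t_rtz_out:  time for the output channel to return to zero
    """
    if concurrent_rtz:
        # Both channels reset at the same time: the slower one dominates.
        rtz = max(t_rtz_in, t_rtz_out)
    else:
        # One channel resets only after the other has finished.
        rtz = t_rtz_in + t_rtz_out
    return t_set + rtz


# Arbitrary illustrative delays, in unspecified time units.
sequential = fetch_cycle_time(10, 4, 6, concurrent_rtz=False)
concurrent = fetch_cycle_time(10, 4, 6, concurrent_rtz=True)
```

Under this model the saving equals the shorter of the two RTZ delays, so the benefit is largest when the two reset phases take similar time.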
The ParSeq
• Acts conditionally as a Concur (parallel) or as a Sequencer HC
• Few opportunities to apply it in the design examples available
– Perhaps because the component did not exist when those designs were written?
• Interesting increase in performance, though.
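The conditional behaviour can be sketched as a small event model: one flag selects whether two activated commands overlap (Concur) or run back to back (Sequencer). How the real component receives its mode selection is not stated in the slides. The boolean argument, the function name `par_seq` and the durations are assumptions for illustration.

```python
def par_seq(parallel, dur_a, dur_b):
    """Return (time-ordered events, completion time) for commands A and B."""
    # In sequencer mode B is activated only once A has completed.
    start_b = 0 if parallel else dur_a
    events = sorted([
        (0, "A start"),
        (dur_a, "A done"),
        (start_b, "B start"),
        (start_b + dur_b, "B done"),
    ])
    completion = max(dur_a, start_b + dur_b)
    return events, completion


par_events, par_time = par_seq(True, 3, 5)
seq_events, seq_time = par_seq(False, 3, 5)
```

With these durations the parallel mode completes after 5 time units and the sequential mode after 8. A component that can choose per activation gets the shorter time whenever the two commands are independent.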
Handshake circuit implementation
Optimised ParSeq Schematics
Simulation Results
Pre-layout, transistor-level simulations, 180 nm technology
Conclusions and Future Work
Future work:
• To incorporate the optimisations into the Balsa design flow
• ParSeq as a construct or as a peephole optimisation?
• To evaluate other peephole and HC optimisations currently under study
Thank you very much!
Questions?
Acknowledgement
• Thanks to Luis Plana, Andrew, Charlie and Will for their suggestions and comments.
• This work and PhD are supported by EPSRC and UoM School of Computer Science scholarships.