RISC-V 4-way Superscalar
![Page 1: Risc V 4-way Superscalar](https://reader031.vdocuments.us/reader031/viewer/2022012509/61868596c8d57d445f2c3546/html5/thumbnails/1.jpg)
RISC-V 4-way Superscalar R10K OoO Processor
Group 8 OoO
Xiuneng Lu, Siyu Niu, Yunjie Pan, Runyu Zheng
EECS 470 Computer Architecture Project Presentation
Dec. 11th, 2019
![Page 2: Risc V 4-way Superscalar](https://reader031.vdocuments.us/reader031/viewer/2022012509/61868596c8d57d445f2c3546/html5/thumbnails/2.jpg)
High Level Design
![Page 3: Risc V 4-way Superscalar](https://reader031.vdocuments.us/reader031/viewer/2022012509/61868596c8d57d445f2c3546/html5/thumbnails/3.jpg)
Features
| Feature | Included | Comments |
| --- | --- | --- |
| RISC-V R10K OoO processor | Yes | |
| Graphical debugging tool | Yes | Prints the contents of each buffer every cycle |
| Automated regression testing infrastructure | Yes | Runs all tests and compares results with P3 |
| Superscalar | Yes | Superscalar width = 4 |
| Store-to-load forwarding in LSQ | Yes | Implemented, but only for SW |
| Loads issue out-of-order past pending stores (non-speculative) | Yes | As described |
| Post-retirement store buffer | Yes | 16 entries |
| Multiple outstanding load misses | Yes | Hides miss latency |
| Next-line or stride prefetching for instructions and/or data | Yes | Prefetches the next 16 instructions |
| Write-back data cache | Yes | Write-back and write-allocate |
| Associativity > 1 | Yes | 2-way |
| Victim cache | Yes | 64 bit |
| Return address stack | Yes | 16 entries |
| Loads speculatively issue past pending stores | No | Implemented, but still buggy |
| Load dependence predictor | No | Implemented and working, but inefficient and lengthens the critical path too much |
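To illustrate the return address stack row, here is a minimal behavioral sketch in Python of a 16-entry stack. It is not the group's hardware: the class name `ReturnAddressStack`, the circular-buffer layout, and the overwrite-oldest overflow policy are all assumptions made for illustration; only the 16-entry size comes from the slide.

```python
class ReturnAddressStack:
    """Behavioral model of a circular return address stack (illustrative only).

    Assumption: when the stack is full, a push overwrites the oldest entry.
    """

    def __init__(self, entries=16):
        self.entries = entries
        self.slots = [0] * entries
        self.top = 0     # index of the next free slot
        self.count = 0   # number of valid entries, capped at `entries`

    def push(self, return_addr):
        # Called when a call is predicted (e.g. a jump that writes the link register).
        self.slots[self.top] = return_addr
        self.top = (self.top + 1) % self.entries
        self.count = min(self.count + 1, self.entries)

    def pop(self):
        # Called when a return is predicted; None means no prediction is available.
        if self.count == 0:
            return None
        self.top = (self.top - 1) % self.entries
        self.count -= 1
        return self.slots[self.top]


ras = ReturnAddressStack()
ras.push(0x1004)     # outer call
ras.push(0x2008)     # nested call
first = ras.pop()    # return address of the nested call
second = ras.pop()   # return address of the outer call
```

With this policy, a call chain deeper than 16 loses only the oldest return addresses, so the innermost returns are still predicted correctly.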
![Page 4: Risc V 4-way Superscalar](https://reader031.vdocuments.us/reader031/viewer/2022012509/61868596c8d57d445f2c3546/html5/thumbnails/4.jpg)
Graphical Debugger
![Page 5: Risc V 4-way Superscalar](https://reader031.vdocuments.us/reader031/viewer/2022012509/61868596c8d57d445f2c3546/html5/thumbnails/5.jpg)
High Level Design
![Page 6: Risc V 4-way Superscalar](https://reader031.vdocuments.us/reader031/viewer/2022012509/61868596c8d57d445f2c3546/html5/thumbnails/6.jpg)
Results
For public test cases (.s files):
Baseline average CPI = 5.04
Our average CPI = 1.40 (27.8% of baseline)
Clock period at synthesis: 17 ns
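The 27.8% figure is the ratio of the two average CPIs. The short Python check below reproduces it and also converts the CPI to IPC for comparison against the 4-wide issue width listed on the Features slide. Variable names are chosen here for illustration.

```python
baseline_cpi = 5.04   # baseline average CPI on the .s test cases
our_cpi = 1.40        # this design's average CPI on the same tests
issue_width = 4       # superscalar width from the Features slide

cpi_ratio = our_cpi / baseline_cpi   # fraction of the baseline CPI
ipc = 1 / our_cpi                    # instructions per cycle

print(f"CPI ratio: {cpi_ratio:.1%}")                        # CPI ratio: 27.8%
print(f"IPC: {ipc:.2f} (peak would be {issue_width})")      # IPC: 0.71 (peak would be 4)
```

A lower CPI is better, so 27.8% of baseline means the design needs a little over a quarter of the baseline's cycles per instruction on these tests.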
![Page 7: Risc V 4-way Superscalar](https://reader031.vdocuments.us/reader031/viewer/2022012509/61868596c8d57d445f2c3546/html5/thumbnails/7.jpg)
Results
For public test cases (.c files) compiled with -O0:
Baseline average CPI = 5.01
Our average CPI = 1.42 (28.4% of baseline)
Clock period at synthesis: 17 ns
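CPI alone does not give execution time; it has to be multiplied by the clock period. As a rough worked example using only the numbers on this slide, the average time per instruction is CPI × clock period. The helper name below is hypothetical, and because the baseline's clock period is not given here, no wall-clock speedup is computed.

```python
def avg_time_per_instruction_ns(cpi, clock_period_ns):
    """Average time per instruction: cycles per instruction times ns per cycle."""
    return cpi * clock_period_ns


# Numbers from the slide: average CPI on the -O0 .c tests and the synthesized period.
ours_ns = avg_time_per_instruction_ns(cpi=1.42, clock_period_ns=17.0)
print(f"{ours_ns:.2f} ns per instruction")   # 24.14 ns per instruction
```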
![Page 8: Risc V 4-way Superscalar](https://reader031.vdocuments.us/reader031/viewer/2022012509/61868596c8d57d445f2c3546/html5/thumbnails/8.jpg)
Results
-O1 average CPI = 1.32
-O2 average CPI = 1.34
![Page 9: Risc V 4-way Superscalar](https://reader031.vdocuments.us/reader031/viewer/2022012509/61868596c8d57d445f2c3546/html5/thumbnails/9.jpg)
Starting it all over again...
● Leave more time for optimization
  ○ We spent too much time implementing features
● Always consider the hardware cost before implementing large structures
● Always assign an initial value to every variable
  ○ Simulation may treat an x value as 0, but synthesis may not
● Finish debugging speculative loads and the load dependence predictor
![Page 10: Risc V 4-way Superscalar](https://reader031.vdocuments.us/reader031/viewer/2022012509/61868596c8d57d445f2c3546/html5/thumbnails/10.jpg)
Q&A
EECS 470 makes me a good painter!