Insertion Policy Selection Using Decision Tree Analysis
![Page 1: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/1.jpg)
Insertion Policy Selection Using Decision Tree Analysis
Samira Khan, Daniel A. Jiménez University of Texas at San Antonio
![Page 2: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/2.jpg)
Motivation
- The L1 and L2 caches filter the access stream, so the Last Level Cache (LLC) does not see much temporal locality.
- A large fraction of the blocks brought into the cache are never accessed again (zero-reuse lines).
- For the SPEC CPU 2006 benchmarks, on average 60.18% of lines are never accessed again while they are in the LLC.
![Page 3: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/3.jpg)
Motivation
- No cache bursts in the LLC.
- Only a small portion of hits occur near the MRU position.
![Page 4: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/4.jpg)
Goal
- Get rid of zero-reuse lines as early as possible.
- Keep lines in the cache long enough to get their first hit.
- Make minimal changes to the LRU policy.
- Use as little space as possible.
![Page 5: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/5.jpg)
Insertion Position Selection
- Find the optimal insertion position:
  - Zero-reuse lines get evicted earlier.
  - Most of the non-zero-reuse lines should still be in the cache when their first hit arrives.
  - This gets rid of zero-reuse lines and makes space for useful lines.
- Use decision tree analysis via set dueling to find the position:
  - This allows choosing which insertion positions to set duel.
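Inserting at a chosen position of the LRU recency stack can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: the class `CacheSet`, its method names, and the numeric stack indices in `POSITIONS` are hypothetical. The slides name the five positions but do not say which stack index each one maps to.

```python
# Hypothetical mapping of the five named positions onto a 16-way recency
# stack (index 0 = MRU, index 15 = LRU). The exact indices are an assumption.
POSITIONS = {"MRU": 0, "nearMRU": 4, "middle": 8, "nearLRU": 12, "LRU": 15}


class CacheSet:
    """One set of a set-associative cache, kept as a recency stack."""

    def __init__(self, ways=16):
        self.ways = ways
        self.stack = []  # block tags, MRU first, LRU last

    def access(self, tag, insert_pos):
        """Return True on a hit; on a miss, insert `tag` at `insert_pos`."""
        if tag in self.stack:
            # Hit: promote the block to MRU, exactly as plain LRU would.
            self.stack.remove(tag)
            self.stack.insert(0, tag)
            return True
        if len(self.stack) == self.ways:
            self.stack.pop()  # evict the block in the LRU position
        # Miss: place the new block at the chosen stack position
        # (clamped while the set is still filling up).
        self.stack.insert(min(insert_pos, len(self.stack)), tag)
        return False
```

With an insertion position close to LRU, a block that is never reused is the next one evicted, while a hit still promotes a block to MRU, so the only change relative to plain LRU is where a new block enters the stack.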
![Page 6: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/6.jpg)
[Figure: decision tree over the insertion positions LRU, nearLRU, middle, nearMRU and MRU. Internal nodes are set duels (between middle and MRU; between LRU and middle; between nearMRU and MRU; between nearLRU and middle). Each branch is labelled with the winning position, and the leaves give the final insertion position: LRU, nearLRU, middle, nearMRU or MRU.]
For 400.perlbench, 66.67% of the lines brought into the cache are never accessed again, and 73.03% of hits occur between the MRU and middle positions.
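A single two-way set duel, the building block at each node of the decision tree, can be sketched as below. This is a hedged illustration of the general set dueling idea, not the authors' code: the class `SetDuel`, its methods, and the update convention (a miss in a leader set counts against that leader's policy) are assumptions. The 10-bit counter width is taken from the space overhead slide.

```python
class SetDuel:
    """Two-policy set duel driven by one saturating counter (PSEL)."""

    def __init__(self, policy_a, policy_b, bits=10):
        self.policy_a = policy_a
        self.policy_b = policy_b
        self.max = (1 << bits) - 1   # 1023 for a 10-bit counter
        self.psel = self.max // 2    # start at the midpoint

    def record_miss(self, leader_policy):
        """Update the counter after a miss in a leader set."""
        if leader_policy == self.policy_a:
            # policy_a missed, which counts in favour of policy_b.
            self.psel = min(self.psel + 1, self.max)
        elif leader_policy == self.policy_b:
            # policy_b missed, which counts in favour of policy_a.
            self.psel = max(self.psel - 1, 0)

    def winner(self):
        """Policy that follower sets should use right now."""
        return self.policy_b if self.psel > self.max // 2 else self.policy_a


duel = SetDuel("middle", "MRU")
for _ in range(3):
    duel.record_miss("MRU")      # MRU-insertion leader sets miss
print(duel.winner())             # middle
```

The winner of one duel then decides which pair of positions is compared next, which is how the tree narrows down to a single insertion position.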
![Page 7: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/7.jpg)
Adaptive Multi Set Dueling
- Current multi set dueling
  - Has one leader set for each insertion policy.
  - Partial follower sets duplicate the winner set's policy.
  - The policies set duel in a tournament manner.
  - Not scalable.
  - Leader sets running the losing policies hurt performance.
- Adaptive multi set dueling
  - Leader sets adaptively choose the policy.
  - No need for partial follower sets.
  - Scalable.
![Page 8: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/8.jpg)
Result
![Page 9: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/9.jpg)
Space Overhead

| Parameter | Storage | Total storage |
|---|---|---|
| LRU overhead per line | 4 bits | 1024 * 16 * 4 bits = 8 KB |
| Set type per set | 2 bits | 1024 * 2 = 2048 bits |
| Two counters (psel1 & psel2) | 10 bits each | 20 bits |
| One counter (switched) | 1 bit | 1 bit |
| Total | | 8 KB + 2069 bits |

Space overhead for a 1 MB, 16-way set-associative LLC.
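The totals in the space overhead table can be checked with a few lines of arithmetic. The variable names below are illustrative only; the numbers come from the table. The 64-byte line size is not stated on the slide but follows from a 1 MB cache with 1024 sets of 16 ways.

```python
SETS, WAYS = 1024, 16
CACHE_BYTES = 1 << 20                        # 1 MB LLC

line_bytes = CACHE_BYTES // (SETS * WAYS)    # implied line size: 64 bytes

lru_bits = SETS * WAYS * 4                   # 4 LRU bits per line
set_type_bits = SETS * 2                     # 2 set-type bits per set
psel_bits = 2 * 10                           # psel1 and psel2, 10 bits each
switched_bits = 1                            # the 1-bit "switched" counter

extra_bits = set_type_bits + psel_bits + switched_bits

print(lru_bits // 8 // 1024, "KB of LRU state")   # 8 KB of LRU state
print(extra_bits, "extra bits")                   # 2069 extra bits
```

The 2069 bits are the part that the slides count as extra space, which suggests the 8 KB of per-line LRU state is treated as the baseline cost of LRU itself.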
![Page 10: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/10.jpg)
Conclusion
- Insertion position selection using decision tree analysis:
  - Requires minimal change to LRU.
  - Needs only 2069 bits of extra space.
  - Chooses the best insertion position adaptively.
  - Gets rid of zero-reuse lines without any storage-hungry predictor.
  - Makes multi set dueling scalable.
![Page 11: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/11.jpg)
Questions
![Page 12: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/12.jpg)
Zero Reuse Lines in SPEC CPU 2006
![Page 13: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/13.jpg)
Adaptive Multi Set Dueling
[Figure: across all sets in the LLC, the diagram contrasts the leader sets in the current multi set dueling scheme with the leader sets in the adaptive multi set dueling scheme. Labels in the diagram: policies pa, pb, pc, pd, pe, pf, pg, ph and pα; sets φab, φcd, φef, φgh; counters pselab, pselcd, pselef, pselgh, psel1 and psel2. Counter updates are marked +1 or -1, with one counter annotated "+1, if pb wins; -1, if pa wins".]
![Page 14: Insertion Policy Selection Using Decision Tree Analysis](https://reader036.vdocuments.us/reader036/viewer/2022062410/5681572e550346895dc4cbc3/html5/thumbnails/14.jpg)
Result
[Figure: results chart. Legend: MRU, nearMRU, middle, nearLRU, LRU, psel1, psel2.]