![Page 1: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/1.jpg)
ECE7995 Caching and Prefetching Techniques in Computer Systems
Lecture 8: Buffer Cache in Main Memory (IV)
![Page 2: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/2.jpg)
Quantifying Locality with LRU Stack
• Blocks are ordered by their recencies;
• Blocks enter from the stack top, and leave from its bottom;
[Figure: LRU stack with its blocks ordered by recency; the top positions are labeled Recency = 1 and Recency = 2.]
![Page 3: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/3.jpg)
LRU Stack
• Blocks are ordered by recency in the LRU stack;
• Blocks enter from the stack top, and leave from its bottom;
• Inter-Reference Recency (IRR): the number of other distinct blocks accessed between two consecutive references to the block.
[Figure: LRU stack example with the labels Recency = 2, IRR = 2, and Recency = 0.]
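The two quantities defined above can be computed directly from a reference stream. The following is a minimal sketch, not code from the lecture: the function name `recency_and_irr` and the sample trace are assumptions made for illustration, and recency is counted from 0 for the most recently referenced block.

```python
def recency_and_irr(trace):
    """Return {block: (recency, irr)} as seen at the end of the trace.

    recency: number of other distinct blocks referenced since the block's last reference.
    irr:     number of other distinct blocks referenced between the block's last two
             references (infinity if the block was referenced only once).
    """
    last_seen = {}  # block -> index of its most recent reference
    irr = {}
    for t, block in enumerate(trace):
        if block in last_seen:
            # Distinct blocks strictly between the previous and the current reference.
            irr[block] = len(set(trace[last_seen[block] + 1:t]))
        else:
            irr[block] = float("inf")
        last_seen[block] = t
    return {b: (len(set(trace[t + 1:])), irr[b]) for b, t in last_seen.items()}


# Hypothetical trace, chosen only to exercise the definitions.
stats = recency_and_irr(["a", "b", "c", "b", "a"])
for block, (r, i) in sorted(stats.items()):
    print(block, "recency =", r, "IRR =", i)
```

This version rescans the trace for clarity; a real buffer cache would derive the same values from stack positions instead.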
![Page 4: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/4.jpg)
Locality Strength
[Figure: MULTI2 trace. IRR (re-use distance in blocks) plotted against virtual time (reference stream), annotated with "Locality Strength" and "Cache Size".]
LRU:
• Good for “absolutely” strong locality
• Bad for relatively weak locality
![Page 5: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/5.jpg)
LRU’s Inability with Weak Locality
• Memory scanning (one-time access)
  - Infinite IRR, i.e., weak locality; such blocks should not be cached at all.
  - LRU does not replace them in time: they stay cached until their recency becomes larger than the cache size.
![Page 6: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/6.jpg)
LRU’s Inability with Weak Locality
• Loop-like accesses (repeated accesses with a fixed interval)
  - The IRR of each block is the same as the interval.
  - If the interval is larger than the cache size, there are no hits.
  - The blocks to be accessed soonest are, unfortunately, the ones that get replaced.
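The looping case is easy to reproduce. The sketch below is an illustrative LRU simulator (the name `lru_hits` and the 4-block loop are assumptions, not part of the lecture) that counts hits for a loop slightly larger than the cache and for one that fits.

```python
from collections import OrderedDict


def lru_hits(trace, cache_size):
    """Count hits of an LRU cache holding `cache_size` blocks on `trace`."""
    cache = OrderedDict()  # first entry = least recently used block
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # block becomes the most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits


loop = list(range(4)) * 10   # blocks 0..3 referenced in a loop, 10 passes
print(lru_hits(loop, 3))     # loop interval larger than the cache
print(lru_hits(loop, 4))     # loop fits in the cache
```

With a cache one block smaller than the loop, each block is evicted just before it is needed again, which is the behavior the slide describes.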
![Page 7: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/7.jpg)
LRU’s Inability with Weak Locality
• Accesses with distinct frequencies
  - The recencies of frequently accessed blocks become large because of references to infrequently accessed blocks.
  - As a result, frequently accessed blocks can be replaced.
![Page 8: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/8.jpg)
Looking for Blocks with Strong Locality
[Figure: MULTI2 trace. IRR (re-use distance in blocks) plotted against virtual time (reference stream), annotated with "Locality Strength" and "Cache Size". Caption: cover the 1000 blocks with the strongest locality.]
![Page 9: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/9.jpg)
Challenges
• Address the limitations of LRU fundamentally.
• Retain the low overhead and adaptability merits of LRU.
  - Simplicity: affordable implementation
  - Adaptability: responsive to access pattern changes
![Page 10: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/10.jpg)
Principle of the LIRS Replacement
LIRS: Low IRR Set replacement algorithm
• If a block’s IRR is high, its next IRR is likely to be high again.
• We select the blocks with high IRRs for replacement.
• We keep the set of blocks with low IRRs in cache.
![Page 11: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/11.jpg)
Requirements on Low IRR Block Set (LIRS)
• The set size should be the cache size.
• The set consists of the blocks with the strongest locality (the lowest IRRs).
• The set is dynamically kept up to date.
![Page 12: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/12.jpg)
Low IRR Block Set
Low IRR (LIR) blocks and High IRR (HIR) blocks
• LIR block set (size is Llirs)
• HIR block set
• Cache size: L = Llirs + Lhirs
[Figure: the block sets mapped onto the physical cache, which is divided into an Llirs part and an Lhirs part.]
![Page 13: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/13.jpg)
An Example for LIRS
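Llirs = 2, Lhirs = 1

| Block \ V time | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | R | IRR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | X |   |   |   |   | X |   | X |   |   | 1 | 1 |
| B |   |   | X |   | X |   |   |   |   |   | 3 | 1 |
| C |   |   |   | X |   |   |   |   |   |   | 4 | inf |
| D |   | X |   |   |   |   | X |   |   |   | 2 | 3 |
| E |   |   |   |   |   |   |   |   | X |   | 0 | inf |

LIR block set = {A, B}, HIR block set = {C, D, E}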
![Page 14: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/14.jpg)
Mapping to Cache
[Figure: block sets mapped onto the physical cache (Llirs = 2, Lhirs = 1). LIR block set: A, B. HIR block set: C, D, E. Resident blocks: A, B, E.]
![Page 15: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/15.jpg)
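Which Block is Replaced? Replace HIR Blocks
D is referenced at time 10.

| Block \ V time | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | R | IRR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | X |   |   |   |   | X |   | X |   |   | 1 | 1 |
| B |   |   | X |   | X |   |   |   |   |   | 3 | 1 |
| C |   |   |   | X |   |   |   |   |   |   | 4 | inf |
| D |   | X |   |   |   |   | X |   |   | X | 0 | 3 |
| E |   |   |   |   |   |   |   |   | X |   | 1 | inf |

The resident HIR block (E) is replaced!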
![Page 16: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/16.jpg)
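How is the LIR Set Updated? The Recency of the LIR Blocks is Used

| Block \ V time | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | R | IRR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | X |   |   |   |   | X |   | X |   |   | 2 | 1 |
| B |   |   | X |   | X |   |   |   |   |   | 3 | 1 |
| C |   |   |   | X |   |   |   |   |   |   | 4 | inf |
| D |   | X |   |   |   |   | X |   |   | X | 0 | 2 |
| E |   |   |   |   |   |   |   |   | X |   | 1 | inf |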
![Page 17: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/17.jpg)
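After D is Referenced at Time 10

| Block \ V time | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | R | IRR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | X |   |   |   |   | X |   | X |   |   | 2 | 1 |
| B |   |   | X |   | X |   |   |   |   |   | 3 | 1 |
| C |   |   |   | X |   |   |   |   |   |   | 4 | inf |
| D |   | X |   |   |   |   | X |   |   | X | 0 | 2 |
| E |   |   |   |   |   |   |   |   | X |   | 1 | inf |

E is replaced, D enters the LIR set (blocks B and D are marked in the slide).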
![Page 18: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/18.jpg)
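If the Reference is to C at Time 10

| Block \ V time | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | R | IRR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | X |   |   |   |   | X |   | X |   |   | 2 | 1 |
| B |   |   | X |   | X |   |   |   |   |   | 4 | 1 |
| C |   |   |   | X |   |   |   |   |   | X | 0 | 4 |
| D |   | X |   |   |   |   | X |   |   |   | 3 | 3 |
| E |   |   |   |   |   |   |   |   | X |   | 1 | inf |

E is replaced, C cannot enter the LIR set.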
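The two outcomes in this example (D enters the LIR set, C does not) follow from comparing the referenced HIR block's new IRR, which equals its recency just before the reference, with Rmax, the largest recency among the LIR blocks. The sketch below encodes that comparison using the R column at time 9 from the example; the function name `on_hir_reference` and its return convention are assumptions made for illustration.

```python
def on_hir_reference(block, recency, lir_set):
    """Decide whether a referenced HIR block joins the LIR set.

    recency: {block: recency just before the reference}
    Returns (new LIR set, demoted LIR block or None).
    """
    r_max_block = max(lir_set, key=lambda b: recency[b])  # LIR block with recency Rmax
    r_max = recency[r_max_block]
    new_irr = recency[block]  # new IRR = recency just before this reference
    if new_irr < r_max:
        # The referenced block has the smaller IRR: it swaps status with the Rmax block.
        return (lir_set - {r_max_block}) | {block}, r_max_block
    return set(lir_set), None


# R column of the example at virtual time 9; LIR block set = {A, B}.
recency_at_9 = {"A": 1, "B": 3, "C": 4, "D": 2, "E": 0}
lir = {"A", "B"}
print(on_hir_reference("D", recency_at_9, lir))  # D's new IRR is 2, Rmax is 3
print(on_hir_reference("C", recency_at_9, lir))  # C's new IRR is 4, Rmax is 3
```

In both cases the resident HIR block E is the one evicted to make room; the function only models the status switch, not the eviction.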
![Page 19: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/19.jpg)
The LIRS References with Weak Locality
• Memory scanning (one-time access)
  - Infinite IRR; the blocks are not included in the LIR block set and are replaced in time.
![Page 20: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/20.jpg)
The LIRS References with Weak Locality
• Loop-like accesses
  - The IRRs of all blocks are the same.
  - Once a block becomes an LIR block, it keeps its status.
  - Any cached block can contribute a hit in one loop of accesses.
![Page 21: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/21.jpg)
The LIRS References with Weak Locality
• Accesses with distinct frequencies
  - Frequently accessed blocks have smaller IRRs than infrequently accessed blocks.
  - Frequently accessed blocks are LIR blocks; they are always cached and get hits.
![Page 22: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/22.jpg)
Making LIRS O(1) Efficient
• Rmax: maximum recency of the LIR blocks.
• IRR_HIR: new IRR of the HIR block.
• This efficiency is achieved by our LIRS stack.
• LRU stack + the LIR block with recency Rmax at its bottom ==> LIRS stack.
![Page 23: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/23.jpg)
Differences between LRU and LIRS Stacks
[Figure: an LRU stack (blocks 1, 2, 3, 5, 6) next to a LIRS stack (5 3 2 1 6 9 4 8) for cache size L = 5, Llir = 3, Lhir = 2; the legend marks resident blocks, LIR blocks, and HIR blocks.]
• The LRU stack size is decided by the cache size and is fixed; the LIRS stack size is decided by Rmax and varies.
• The LRU stack holds only resident blocks; the LIRS stack holds any block whose recency is no more than Rmax.
• The LRU stack does not distinguish “hot” and “cold” blocks; the LIRS stack distinguishes LIR and HIR blocks and dynamically maintains their statuses.
![Page 24: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/24.jpg)
How does the LIRS Stack Help?
• Rmax: maximum recency of the LIR blocks.
• IRR_HIR: new IRR of the HIR block.
• Blocks in the LIRS stack ==> IRR < Rmax
• Other blocks ==> IRR > Rmax
![Page 25: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/25.jpg)
LIRS Operations
• Cache size L = 5, with Llir = 3 and Lhir = 2.
• Initialization: all the referenced blocks are given LIR status until the LIR block set is full.
• We place resident HIR blocks in stack Q.
[Figure: LIRS stack S (5 3 2 1 6 9 4 8) and resident HIR stack Q (5 3); the legend marks blocks resident in cache, LIR blocks, and HIR blocks.]
![Page 26: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/26.jpg)
Access an LIR Block (a Hit)
[Figure: LIRS stack S (5 3 2 1 6 9 4 8) and resident HIR stack Q (5 3); L = 5, Llir = 3, Lhir = 2; reference stream: . . . 4 8 3 5 7 9 5.]
![Page 27: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/27.jpg)
Access an LIR Block (a Hit)
[Figure: stacks S and Q for the access to block 4; L = 5, Llir = 3, Lhir = 2; reference stream: . . . 8 3 5 7 9 5.]
![Page 28: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/28.jpg)
Access an LIR block (a Hit)
[Figure: stacks S and Q for the access to block 8; L = 5, Llir = 3, Lhir = 2; reference stream: . . . 3 5 7 9 5.]
![Page 29: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/29.jpg)
Access a Resident HIR Block (a Hit)
[Figure: stacks S and Q for the access to block 3; L = 5, Llir = 3, Lhir = 2; reference stream: . . . 3 5 7 9 5.]
![Page 30: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/30.jpg)
Access a Resident HIR Block (a Hit)
[Figure: next animation step of stacks S and Q; L = 5, Llir = 3, Lhir = 2; reference stream: . . . 3 5 7 9 5.]
![Page 31: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/31.jpg)
Access a Resident HIR Block (a Hit)
[Figure: next animation step of stacks S and Q, with block 1 marked; L = 5, Llir = 3, Lhir = 2; reference stream: . . . 3 5 7 9 5.]
![Page 32: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/32.jpg)
Access a Resident HIR Block (a Hit)
[Figure: next animation step of stacks S and Q, with blocks 1 and 5 marked; L = 5, Llir = 3, Lhir = 2; reference stream: . . . 5 7 9 5.]
![Page 33: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/33.jpg)
Access a Non-Resident HIR block (a Miss)
[Figure: stacks S and Q for the access to block 7; L = 5, Llir = 3, Lhir = 2; reference stream: . . . 7 9 5.]
![Page 34: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/34.jpg)
Access a Non-Resident HIR Block (a Miss)
[Figure: next animation step of stacks S and Q, involving blocks 5, 7, and 9; L = 5, Llir = 3, Lhir = 2; reference stream: . . . 9 5.]
![Page 35: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/35.jpg)
Access a Non-Resident HIR Block (a Miss)
[Figure: final animation step of stacks S and Q, involving blocks 4, 5, 7, and 9; L = 5, Llir = 3, Lhir = 2; reference stream: . . . 5.]
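The three cases walked through above (a hit on an LIR block, a hit on a resident HIR block, and a miss on a non-resident block) can be put together in a compact simulator. What follows is a minimal sketch under stated assumptions, not the lecture's implementation: the class name `LIRSCache`, its method names, and the looping test trace are illustrative, `hir_size` is assumed to be at least 1, and the number of non-resident HIR entries kept in S is not bounded here.

```python
from collections import OrderedDict


class LIRSCache:
    """Minimal LIRS sketch: stack S plus resident-HIR queue Q (illustrative names)."""

    def __init__(self, lir_size, hir_size):
        self.lir_size = lir_size
        self.hir_size = hir_size
        self.S = OrderedDict()  # block -> "LIR" or "HIR"; last entry is the stack top
        self.Q = OrderedDict()  # resident HIR blocks; first entry is the next victim
        self.lir_count = 0

    def _prune(self):
        # Drop HIR entries from the bottom of S so that an LIR block sits at the bottom.
        while self.S and next(iter(self.S.values())) == "HIR":
            self.S.popitem(last=False)

    def _promote(self, block):
        # The HIR block found in S becomes LIR; the bottom LIR block (recency Rmax)
        # becomes a resident HIR block at the tail of Q.
        self.Q.pop(block, None)
        self.S[block] = "LIR"
        self.S.move_to_end(block)
        demoted, _ = self.S.popitem(last=False)
        self.Q[demoted] = True
        self._prune()

    def access(self, block):
        """Reference one block; return True on a hit, False on a miss."""
        if self.S.get(block) == "LIR":          # hit on an LIR block
            self.S.move_to_end(block)
            self._prune()
            return True
        hit = block in self.Q                   # hit on a resident HIR block?
        if not hit:
            if self.lir_count < self.lir_size:  # warm-up: fill the LIR set first
                self.S[block] = "LIR"
                self.lir_count += 1
                return False
            if len(self.Q) >= self.hir_size:    # cache full: evict the front of Q
                self.Q.popitem(last=False)      # its entry may stay in S (non-resident)
        if block in self.S:                     # new IRR below Rmax: switch statuses
            self._promote(block)
        else:                                   # stays HIR: top of S, tail of Q
            self.S[block] = "HIR"
            self.Q[block] = True
            self.Q.move_to_end(block)
        return hit


cache = LIRSCache(lir_size=2, hir_size=1)
loop = list(range(4)) * 10   # 4-block loop; the cache holds only 3 blocks
hits = sum(cache.access(b) for b in loop)
print(hits, "hits out of", len(loop))
```

On this looping trace an LRU cache of the same total size should get no hits at all, while here the two LIR blocks keep their status and hit on every pass after warm-up. Both structures are updated with a constant number of dictionary operations per reference, apart from pruning, whose cost is amortized over the insertions into S.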
![Page 36: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/36.jpg)
Workload Traces
• postgres is a trace of join queries among four relations in a relational database system;
• sprite is from the Sprite network file system;
• multi2 is obtained by executing three workloads, cs, cpp, and postgres, together.
![Page 37: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/37.jpg)
Cache Partition
• 1% of the cache size is for HIR blocks
• 99% of the cache size is for LIR blocks
• Performance is not sensitive to the partition.
![Page 38: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/38.jpg)
Looping Pattern: postgres (Access Map)
[Figure: access map; logical block number vs. virtual time (reference stream).]
![Page 39: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/39.jpg)
Looping Pattern: postgres (IRR Map)
[Figure: IRR (re-use distance in blocks) vs. virtual time (reference stream), with LRU and LIRS marked.]
![Page 40: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/40.jpg)
Looping Pattern: postgres (Hit Rates)
[Figure: postgres. Hit ratio (%), axis 0 to 80, vs. cache size (# of blocks), axis 0 to 3000, for OPT, LIRS, LRU-2, 2Q, LRFU, EELRU, ARC, and LRU.]
![Page 41: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/41.jpg)
Temporally-Clustered Pattern: sprite (Access Map)
[Figure: access map; logical block number vs. virtual time (reference stream).]
![Page 42: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/42.jpg)
Temporally-Clustered Pattern: sprite (IRR Map)
[Figure: IRR (re-use distance in blocks) vs. virtual time (reference stream), with LRU and LIRS marked.]
![Page 43: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/43.jpg)
Temporally-Clustered Pattern: sprite (Hit Ratio)
[Figure: SPRITE. Hit ratio (%), axis 0 to 100, vs. cache size (# of blocks), axis 0 to 1200, for OPT, LIRS, LRU-2, 2Q, LRFU, EELRU, ARC, and LRU.]
![Page 44: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/44.jpg)
Mixed Pattern: multi2 (Access Map)
[Figure: access map; logical block number vs. virtual time (reference stream).]
![Page 45: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/45.jpg)
Mixed Pattern: multi2 (IRR Map)
[Figure: IRR (re-use distance in blocks) vs. virtual time (reference stream), with LIRS and LRU marked.]
![Page 46: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/46.jpg)
Mixed Pattern: multi2 (Hit Ratio)
[Figure: MULTI-2. Hit ratio (%), axis 0 to 90, vs. cache size (# of blocks), axis 0 to 4000, for OPT, LIRS, LRU-2, 2Q, LRFU, EELRU, ARC, and LRU.]
![Page 47: ECE7995 Caching and Prefetching Techniques in Computer Systems](https://reader035.vdocuments.us/reader035/viewer/2022062310/568160f2550346895dd0296e/html5/thumbnails/47.jpg)
Summary
• LIRS uses both IRR (or reuse distance) and recency for its replacement decision. 2Q uses only reuse distance.
• LIRS adapts to the locality changes when deciding which blocks have small IRRs. 2Q uses a fixed threshold in looking for blocks of small reuse distances.
• Both LIRS and 2Q have low time overhead (as low as LRU). Their space overheads are larger, but acceptably so.