Cache Management
Presented By:
Babar Shahzaad
14F-MS-CP-12
Department of Computer Engineering,
Faculty of Telecommunication and Information Engineering,
University of Engineering and Technology (UET), Taxila
Outline
What is the Problem
What is a Cache
What is Locality
Cache Hit/Miss
Types of Cache Miss
Memory Mapping Techniques
  Direct Mapping
  Fully-Associative Mapping
  Set-Associative Mapping
Summary of Memory Mapping Techniques
Methods to Overcome Cache Miss
  Miss Cache
  Victim Caches
  Stream Buffers
Summary of Cache Miss Techniques
What is the Problem
What is a Cache
Small, fast storage used to improve the average access time to slow memory
Exploits spatial and temporal locality
In computer architecture, almost everything is a cache!
  Registers: "a cache" on variables (software managed)
  First-level cache: "a cache" on the second-level cache
  Second-level cache: "a cache" on memory
  Memory: "a cache" on disk (virtual memory)
  Translation Lookaside Buffer (TLB): "a cache" on the page table
  Branch prediction: "a cache" on prediction information
![Page 6: Cache management](https://reader031.vdocuments.us/reader031/viewer/2022032117/55c73fd4bb61eb5f6e8b472c/html5/thumbnails/6.jpg)
What is a Cache (Cont…)
(Figure: the memory hierarchy, from Processor/Registers through L1 Cache, L2 Cache, and Memory down to Disk, Tape, etc. Each level gets bigger moving down the hierarchy and faster moving up.)
![Page 7: Cache management](https://reader031.vdocuments.us/reader031/viewer/2022032117/55c73fd4bb61eb5f6e8b472c/html5/thumbnails/7.jpg)
What is Locality
Temporal Locality: locality in time. If an item is referenced, it will tend to be referenced again soon
Spatial Locality: locality in space/distance. If an item is referenced, its neighboring addresses will tend to be referenced soon
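Both kinds of locality can be seen in an ordinary array-summing loop; this small sketch (the array and loop are illustrative assumptions, not from the slides) marks where each kind appears:

```python
# Summing an array exhibits both kinds of locality:
# - spatial: data[0], data[1], ... touch consecutive addresses
# - temporal: the accumulator `total` is re-referenced on every iteration
data = list(range(8))

total = 0
for i in range(len(data)):   # sequential indices -> spatial locality
    total += data[i]         # `total` reused each step -> temporal locality

print(total)  # 0 + 1 + ... + 7 = 28
```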
Cache Hit/Miss
Cache Hit: data is found in the cache. Results in a data transfer at maximum speed
Cache Miss: data is not found in the cache. The processor loads the data from memory and copies it into the cache. This results in an extra delay, called the miss penalty
Types of Cache Miss
There are 4 C's: Compulsory (the first reference to a block can never hit), Capacity (the cache cannot hold all the blocks the program needs), Conflict (blocks compete for the same line or set), and Coherence (blocks invalidated to keep multiprocessor caches consistent)
Memory Mapping Techniques
Direct Mapping
Each block of main memory maps to only one cache line, i.e. if a block is in the cache, it must be in one specific place
The address is divided into two parts:
  The least significant w bits identify a unique word/byte within a block
  The most significant s bits specify one memory block
The MSBs are further split into a cache line field of r bits and a tag of s - r bits (most significant)
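The field extraction described above can be sketched with bit masks and shifts; the specific field widths in the example are assumptions for illustration, not values from the slides:

```python
def split_address(addr, s, r, w):
    """Split an address into (tag, line, word) for a direct-mapped cache:
    w word bits, r cache-line bits, and a tag of s - r bits above them.
    The widths used below are illustrative assumptions."""
    word = addr & ((1 << w) - 1)            # least significant w bits
    line = (addr >> w) & ((1 << r) - 1)     # next r bits select the cache line
    tag = (addr >> (w + r)) & ((1 << (s - r)) - 1)  # remaining s - r tag bits
    return tag, line, word

# Example: 16-bit address with s = 12, r = 8, w = 4
tag, line, word = split_address(0xABCD, s=12, r=8, w=4)
print(hex(tag), hex(line), hex(word))  # 0xa 0xbc 0xd
```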
Direct Mapping (Cont…)
Pros and Cons
  Simple
  Inexpensive
  No need for an expensive associative search
  Fixed location for a given block
  Miss rate may go up due to a possible increase in mapping conflicts
  If a program repeatedly accesses 2 blocks that map to the same line, cache misses (conflict misses) are very high
Cache Line Table
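The conflict-miss pathology described above can be sketched with a toy simulation; the 4-line cache size and block-modulo mapping are illustrative assumptions:

```python
# Two blocks whose numbers differ by a multiple of the line count map to the
# same line of a direct-mapped cache and evict each other on every access.
NUM_LINES = 4
cache = [None] * NUM_LINES   # cache[line] holds the block currently stored there

def access(block):
    line = block % NUM_LINES     # direct mapping: one fixed line per block
    hit = cache[line] == block
    cache[line] = block          # on a miss, the new block replaces the old one
    return hit

# Blocks 0 and 4 both map to line 0, so alternating accesses always miss.
pattern = [0, 4, 0, 4, 0, 4]
hits = sum(access(b) for b in pattern)
print(hits)  # 0 -- every access is a conflict miss
```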
Fully-Associative Mapping
A main memory block can be loaded into any line of the cache
The memory address is interpreted as a tag and a word
The tag uniquely identifies a block of memory
Every line's tag is examined for a match
Cache searching is expensive, in both cost and power consumption, due to the parallel tag comparison
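A minimal lookup sketch of the behavior above, where a dict probe stands in for the hardware's parallel tag comparators (the class name, line count, and FIFO-style eviction are assumptions for illustration):

```python
class FullyAssociativeCache:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = {}   # tag -> data; any block can occupy any line

    def lookup(self, tag):
        # Hardware compares every line's tag in parallel; a dict probe
        # models the same "search all tags" result.
        return self.lines.get(tag)

    def fill(self, tag, data):
        if len(self.lines) >= self.num_lines and tag not in self.lines:
            self.lines.pop(next(iter(self.lines)))  # evict the oldest entry
        self.lines[tag] = data

cache = FullyAssociativeCache(num_lines=2)
cache.fill(0x1A, "block A")
print(cache.lookup(0x1A))  # block A
print(cache.lookup(0x2B))  # None -> miss
```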
Fully-Associative Mapping (Cont…)
Pros and Cons
  Flexibility as to which block to replace when a new block is read into the cache
  No restriction on mapping from memory to cache
  Associative search of tags is expensive
  Feasible only for very small caches
  The complex circuitry required for parallel tag comparison is a major disadvantage
Set-Associative Mapping
Cache is divided into a number of sets
Each set contains a number of lines
A given block maps to any line within one given set, e.g. block B can be in any line of set i
With 2 lines per set, this is 2-way set-associative mapping
A given block can be in one of the 2 lines of only one set
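A toy 2-way set-associative simulation of the mapping above (the set count and FIFO replacement within a set are illustrative assumptions):

```python
NUM_SETS = 4
WAYS = 2
cache = [[] for _ in range(NUM_SETS)]   # each set holds up to WAYS block entries

def access(block):
    s = block % NUM_SETS        # a block maps to exactly one set...
    ways = cache[s]
    if block in ways:           # ...but may sit in any of that set's WAYS lines
        return True
    if len(ways) >= WAYS:
        ways.pop(0)             # evict the oldest line in the set (FIFO)
    ways.append(block)
    return False

# Blocks 0 and 4 map to the same set but can now coexist,
# so the second round of accesses hits where a direct-mapped cache would miss.
results = [access(b) for b in [0, 4, 0, 4]]
print(results)  # [False, False, True, True]
```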
Set-Associative Mapping (Cont…)
Pros and Cons
  Most commercial caches are 2-, 4-, or 8-way set-associative
  Cheaper than a fully-associative cache
  Lower miss ratio than a direct-mapped cache, though a direct-mapped cache is the fastest to access
  Simulating the hit ratio of direct-mapped versus (2-, 4-, 8-way) set-associative caches shows a significant difference in performance at least up to a cache size of 64 KB, with set-associative being the better one
  Beyond that, the complexity of the cache grows in proportion to the associativity, and both mappings give approximately similar hit ratios
Summary of Memory Mapping Techniques
Number of misses: Direct Mapping > Set-Associative Mapping > Fully-Associative Mapping
Access latency: Direct Mapping < Set-Associative Mapping < Fully-Associative Mapping
Methods to Overcome Cache Miss
Miss Cache
Fully-associative cache inserted between L1 and L2
Contains between 2 and 5 cache lines of data
Aims to reduce conflict misses
Miss Cache Operation
On a miss in L1, we check the Miss Cache
If the block is there, then we bring it into L1, so the penalty of a miss in L1 is just a few cycles, possibly as few as one
Otherwise, fetch the block from the lower levels, but also store the retrieved value in the Miss Cache
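The steps above can be sketched in Python; the structure sizes, names (`L1`, `miss_cache`), and FIFO replacement are illustrative assumptions, not details from the slides:

```python
from collections import OrderedDict

L1 = {}                      # line index -> block (tiny direct-mapped stand-in)
L1_LINES = 4
miss_cache = OrderedDict()   # small fully-associative buffer, FIFO replacement
MISS_CACHE_LINES = 2

def access(block):
    line = block % L1_LINES
    if L1.get(line) == block:
        return "L1 hit"
    if block in miss_cache:        # short penalty: refill L1 from the Miss Cache
        L1[line] = block
        return "miss-cache hit"
    # Full miss: fetch from the lower level, copying into BOTH L1 and Miss Cache
    L1[line] = block
    if len(miss_cache) >= MISS_CACHE_LINES:
        miss_cache.popitem(last=False)
    miss_cache[block] = True
    return "full miss"

# Blocks 0 and 4 conflict in L1, but the Miss Cache catches the repeat access.
results = [access(b) for b in [0, 4, 0]]
print(results)  # ['full miss', 'full miss', 'miss-cache hit']
```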
Miss Cache Performance
What if we doubled the size of the L1 cache instead? That gives a 32% decrease in cache misses, i.e. only a .13% decrease per added line
However, note that we are storing every block brought into the Miss Cache twice: once in L1 and once in the Miss Cache...
Victim Cache
Motivation: Can we improve on our miss rates with miss caches by modifying the replacement policy (i.e. can we do something about the wasted space in the pure miss caching system)?
Fully-associative cache inserted between L1 and L2
Victim Cache Operation
On a miss in L1, we check the Victim Cache
If the block is there, then bring it into L1 and swap the ejected value into the Victim Cache; misses that are caught by the Victim Cache are still cheap, but better utilization of space is made
Otherwise, fetch the block from the lower levels
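The swap described above can be sketched as follows; the sizes, names, and FIFO replacement are assumptions for illustration:

```python
from collections import OrderedDict

L1 = {}                  # line index -> block (tiny direct-mapped stand-in)
L1_LINES = 4
victim = OrderedDict()   # holds blocks recently evicted from L1
VICTIM_LINES = 2

def access(block):
    line = block % L1_LINES
    if L1.get(line) == block:
        return "L1 hit"
    evicted = L1.get(line)          # the block this access will displace
    if block in victim:             # swap: promote the block from victim to L1...
        del victim[block]
        L1[line] = block
        if evicted is not None:     # ...and demote the ejected L1 block
            victim[evicted] = True
        return "victim hit"
    L1[line] = block                # full miss: fetch from the lower level only;
    if evicted is not None:         # the Victim Cache stores evictions, not fills
        if len(victim) >= VICTIM_LINES:
            victim.popitem(last=False)
        victim[evicted] = True
    return "full miss"

# After the first round of conflicts, the evicted blocks keep ping-ponging
# between L1 and the Victim Cache instead of going back to L2.
results = [access(b) for b in [0, 4, 0, 4]]
print(results)  # ['full miss', 'full miss', 'victim hit', 'victim hit']
```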
Victim Cache Performance
Even better than Miss Cache!
Smaller L1 caches benefit more from victim caches
Wait A Minute!
What about compulsory and capacity misses?
What about instruction misses?
Victim Caches and Miss Caches are most helpful when temporal locality can be exploited
Pre-fetching techniques can help:
  Pre-fetch Always
  Pre-fetch on Miss
  Tagged Pre-fetch
Can we improve on these techniques?
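Of the pre-fetch policies listed above, tagged pre-fetch can be sketched with a tag bit per cached block; the dict-based structure and the "pre-fetch the next block" heuristic are illustrative assumptions:

```python
cache = {}   # block -> tag bit (True = pre-fetched but not yet referenced)

def prefetch(block):
    cache[block] = True           # brought in speculatively, tag set

def access(block):
    if block in cache:
        if cache[block]:          # first demand reference to a pre-fetched block:
            cache[block] = False  # clear the tag and pre-fetch the successor
            prefetch(block + 1)
        return "hit"
    cache[block] = False          # demand miss: fetch block, pre-fetch the next
    prefetch(block + 1)
    return "miss"

# On a sequential walk only the first access misses;
# each first touch of a pre-fetched block pulls in the block after it.
results = [access(b) for b in [0, 1, 2, 3]]
print(results)  # ['miss', 'hit', 'hit', 'hit']
```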
Stream Buffers
A Stream Buffer is a FIFO queue placed between L1 and L2
Stream Buffers Operation
When a miss occurs in L1, say at address A, the Stream Buffer immediately starts to pre-fetch successive lines beginning at A+1
Subsequent accesses check the head of the Stream Buffer before going to L2
Note that non-sequential misses will cause the buffer to restart pre-fetching (e.g. misses at A+2 and then A+4 will each restart the pre-fetching process, even if A+4 was already in the stream)
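The behavior above, including the head-only check and the restart on a non-sequential miss, can be sketched as follows (the buffer depth is an assumption for illustration):

```python
from collections import deque

DEPTH = 4
stream = deque()      # FIFO of pre-fetched block addresses

def refill(start):
    """Restart pre-fetching from `start` (models a non-sequential miss)."""
    stream.clear()
    stream.extend(range(start, start + DEPTH))

def access(block):
    if stream and stream[0] == block:   # only the HEAD of the FIFO is checked
        stream.popleft()
        stream.append(stream[-1] + 1 if stream else block + DEPTH)
        return "stream hit"
    refill(block + 1)                   # miss: go to L2 and restart at block+1
    return "miss"

# Sequential accesses 11, 12 hit the stream started by the miss at 10;
# the jump to 20 is non-sequential, so pre-fetching restarts.
results = [access(b) for b in [10, 11, 12, 20]]
print(results)  # ['miss', 'stream hit', 'stream hit', 'miss']
```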
Stream Buffer Performance
72% of instruction misses removed
25% of data misses removed
Summary of Cache Miss Techniques
Miss Caches: small caches between L1 and L2. Whenever a miss occurs in L1, they receive a copy of the data from the lower-level cache
Victim Caches: an enhancement of Miss Caches. Instead of copying data from lower levels on a miss, they take whatever block has been evicted from L1
Stream Buffers: simple FIFO buffers that are immediately pre-fetched into on a miss