TRANSCRIPT
Yannis Smaragdakis / Apr 10, 2023
General Adaptive Replacement Policies
Yannis Smaragdakis
Georgia Tech
My Agenda
• Present a cool idea in replacement algorithms
• Argue that replacement algorithms (and especially VM) are as much a part of ISMM as allocation/GC
  – same locality principles
Overview
• Background: theory of replacement algorithms
• How to make adaptive algorithms with good performance and theoretical guarantees
• Experimental methodologies in replacement algorithms and evaluation of the idea
Storage Hierarchies
• Storage hierarchies are common in systems
memory hierarchy
registers
CPU cache
main memory (VM cache + file cache)
disk (VM + files)
Management of Storage Hierarchies
• One level of the hierarchy acts as a fast cache for elements of the next
• A replacement algorithm determines how the cache is updated when it is full
  – the most recently used page must always be in the fast cache for easy access
  – hence, when the cache is full, a reference to a page not in the cache must cause a replacement: a page in the cache needs to be removed
    • called a “fault” or a “miss”
Replacement Schematically
[diagram: a stream of references over all blocks (a, b, c, …) is served by a small buffer holding only a subset of them]
LRU Replacement
• The Least Recently Used (LRU) algorithm has been predominant for decades
  – simple and effective
  – no general-purpose algorithm has consistently outperformed LRU
  – supported by numerous results (both theoretical and experimental)
• Under LRU
  – a cache of size M always holds the M most recently used elements
  – at replacement time, the least recently used element in the cache is removed (“evicted”)
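The LRU policy described above can be sketched in a few lines. This is a minimal illustrative Python sketch (the class and method names are mine, not from the talk):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache sketch: holds at most `size` blocks; on a miss
    with a full cache, the least recently used block is evicted."""

    def __init__(self, size):
        self.size = size
        self.blocks = OrderedDict()  # keys ordered oldest -> most recent

    def access(self, block):
        """Reference `block`; return True on a fault (miss), False on a hit."""
        if block in self.blocks:
            self.blocks.move_to_end(block)  # now the most recently used
            return False
        if len(self.blocks) >= self.size:
            self.blocks.popitem(last=False)  # evict the least recently used
        self.blocks[block] = True
        return True
```

The `OrderedDict` keeps blocks in recency order, so both the “M most recently used elements” invariant and eviction of the least recently used element fall out directly.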
Main Theoretical Results
• Much of the major theory is based on competitive analysis
  – how many faults an algorithm incurs relative to the faults of another algorithm
  – beautiful results with potential functions
Example Theorems (slightly simplified)
• LRU will not suffer more than M times as many faults as OPT
  – M: memory size in blocks (a large number)
  – OPT: the optimal (clairvoyant) algorithm
• No other algorithm can do better
• LRU with twice as much memory as OPT will suffer at most twice the faults of OPT
Example Proof Technique
• Theorem: LRU with twice as much memory as OPT will suffer at most twice the faults of OPT (Sleator and Tarjan, ’85)
• Proof idea: 2M pages need to be touched between successive LRU faults on the same page. OPT will suffer at least M faults.
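For reference, the standard form of the Sleator–Tarjan bound can be stated precisely (my rendering and notation, not from the slides): with LRU cache size $k$, OPT cache size $h \le k$, and $f_R(\sigma)$ the faults algorithm $R$ incurs on reference sequence $\sigma$,

```latex
f_{\mathrm{LRU}}(\sigma) \;\le\; \frac{k}{k-h+1}\, f_{\mathrm{OPT}}(\sigma) \;+\; h
\qquad \text{for every reference sequence } \sigma .
% With twice as much memory, k = 2h, the multiplicative factor becomes
\frac{2h}{2h-h+1} \;=\; \frac{2h}{h+1} \;<\; 2 .
```

Setting $k = 2h$ recovers the “twice the memory, at most twice the faults” statement of the theorem above.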
Overview
• Background: theory of replacement algorithms
• How to make adaptive algorithms with good performance and theoretical guarantees
• Experimental methodologies in replacement algorithms and evaluation of the idea
This Paper
• Note that all previous theoretical results are negative results for practical applications
  – but we don’t care about OPT! We care about how close we can get to algorithms that work well in well-known cases
  – use the force (competitive analysis) to do good!
This Paper
• Main result: take any two replacement algorithms A and B, and produce an adaptive replacement algorithm AB that will never incur more than twice (or three times) the faults of either A or B
  – for any input! We get the best of both worlds
  – the result applies to any caching domain
  – thrashing avoidance is important; the 2x guarantee is strong
  – we can’t avoid the negative theoretical results, but we can improve good practical algorithms indefinitely
Robustness
• Definition: we say R1 is c-robust w.r.t. R2, iff R1 always incurs at most c times as many faults as R2
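In symbols (my notation; $f_R(\sigma)$ denotes the number of faults replacement algorithm $R$ incurs on reference sequence $\sigma$):

```latex
R_1 \text{ is } c\text{-robust w.r.t.\ } R_2
\quad\Longleftrightarrow\quad
\forall \sigma :\; f_{R_1}(\sigma) \;\le\; c \cdot f_{R_2}(\sigma) .
```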
A Flavor of the Results
• Given A and B, create AB such that it simulates what A and B do on the input. Then at fault time, AB does the following:
  – if A also faults but B doesn’t, imitate B
    • i.e., evict a block not in B’s memory (one must exist)
  – otherwise imitate A
    • i.e., evict a block not in A’s memory if one exists; otherwise evict the block that A evicts
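The eviction rule above can be written out directly. This is an illustrative Python sketch of the decision only (the function signature and the representation of caches as sets are my own, not from the talk):

```python
def ab_choose_victim(ab_cache, a_cache, b_cache, a_faulted, b_faulted, a_victim):
    """AB's eviction rule, per the slide. `ab_cache`, `a_cache`, `b_cache`
    are the sets of blocks currently held by AB and by the simulated runs
    of A and B; `a_faulted`/`b_faulted` say whether A/B also fault on this
    reference; `a_victim` is the block A would evict."""
    if a_faulted and not b_faulted:
        # Imitate B: evict some block AB holds that B does not. One must
        # exist, since B holds the referenced block while AB does not,
        # and both caches have the same size.
        return next(iter(ab_cache - b_cache))
    # Otherwise imitate A: prefer a block A does not hold,
    # falling back to A's own victim.
    outside_a = ab_cache - a_cache
    return next(iter(outside_a)) if outside_a else a_victim
```

Note that AB never needs to run A or B for real; it only tracks what their cache contents *would* be on the same reference stream.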
Surprising Result
• This very simple policy AB is 2-robust w.r.t. A and B!
• Proof idea: to “fool” AB into bad behavior, say w.r.t. A, A needs to suffer a fault. For every “wrong” decision, AB takes two faults to correct it.
  – formalized with two potential functions that count the difference in cache contents between AB and A or B
More Sophisticated Adaptation
• We can create a more sophisticated AB
  – remember the last k faults of either A or B
  – imitate the algorithm that incurs fewer
• This is 3-robust relative to A and B (but will do better in practice)
  – the proof is quite complex; it requires modeling the memory of past faults in the potential function
  – the result can probably be made tighter, but I’m not a theoretician
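The k-fault-history adaptation can be sketched as follows (illustrative Python; the closure-based interface and the `'A'`/`'B'` labels are mine, not from the talk):

```python
from collections import deque

def make_adaptive_chooser(k):
    """Sketch of the k-fault-history rule: record which of A and B
    suffered each of the last k faults, and imitate whichever faulted
    less often over that window."""
    history = deque(maxlen=k)  # entries: 'A' or 'B', one per recorded fault

    def choose(a_faulted, b_faulted):
        if a_faulted:
            history.append('A')
        if b_faulted:
            history.append('B')
        # Imitate the algorithm with fewer recent faults (ties favor A).
        return 'B' if history.count('B') < history.count('A') else 'A'

    return choose
```

The bounded `deque` is what makes the adaptation forgetful: an algorithm that behaved badly long ago is not penalized forever, which is why this variant tends to track the better component algorithm in practice.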
Implementation
• AB needs to maintain three times the data structures in the worst case
  – one to reflect current memory contents, two for the memory contents of A and B
  – but in practice these three structures have high content overlap
• AB only performs work at fault time of A, B, or AB
  – if A and B are realizable, AB is realizable
Overview
• Background: theory of replacement algorithms
• How to make adaptive algorithms with good performance and theoretical guarantees
• Experimental methodologies in replacement algorithms and evaluation of the idea
Experimental Evaluation
• We show results in virtual memory (VM) management
  – a “boring” area: old and well-researched, with little fundamental progress in the past 20 years
  – strong programmatic regularities in behavior (e.g., regular loops)
    • unlike, e.g., web caching
  – we have the luxury of implementing smart memory management algorithms
Trace-Driven Memory Simulation
• Trace-driven simulation is a common technique for evaluating systems policies
• How does it work?
  – the sequence of all memory references of a running program is captured to a file
  – this sequence is then used to simulate the system behavior under the proposed policy
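The core of such a simulator is a short loop. This is a minimal illustrative sketch (the `evict` callback is my own interface, not something described in the talk):

```python
def simulate(trace, cache_size, evict):
    """Replay a captured reference trace through a cache of `cache_size`
    blocks, using the supplied `evict(cache, block)` function to pick a
    victim on a full-cache miss, and count the resulting faults."""
    cache, faults = set(), 0
    for block in trace:
        if block not in cache:
            faults += 1
            if len(cache) >= cache_size:
                cache.remove(evict(cache, block))
            cache.add(block)
    return faults
```

Because the trace is fixed, any number of policies and memory sizes can be replayed against the same reference sequence, which is what makes trace-driven simulation attractive for comparing replacement algorithms.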
Simulation Experiments in VM
• Standard experimental evaluation practices:
  – one program at a time
    • otherwise the scheduler adds too much noise; in practice the VM dictates scheduling, not the other way around
    • common cases include one large application that pages anyway
  – a large range of memory sizes
    • good algorithms are versatile and behave well in unpredictable situations
Simulation Experiments in VM
• Standard experimental evaluation practices:
  – simulate idealized policies
    • the whole point is to see what policy captures locality
    • simulating policies with realizable algorithms is generally possible (although often non-trivial)
Results of Evaluating Adaptivity
• Adaptive replacement is very successful
• Almost always imitates the best algorithm it adapts over
  – apply the adaptivity scheme repeatedly
• Never tricked by much
• Occasionally better than all component algorithms
Example Results
[several slides of result charts; figures not captured in the transcript]
Conclusions
• Adaptivity is cool
  – it is very simple
  – it works
  – it offers good theoretical guarantees