CS 5460: Operating Systems
Lecture 16: Page Replacement (Ch. 9)
Last Time: Demand Paging

Key idea: RAM is used as a cache for disk
– Don't give a process a page of RAM until it is needed
– When running short on RAM, take pages away from processes
– This only works if accesses to memory pages have high temporal locality
  » Why don't we care about spatial locality?

Three basic kinds of page table entries
– Valid mapping – the OS is not involved; translation performed entirely by the CPU
– Invalid mapping – trap, then the kernel does something special, such as killing the process
– Valid but not present – trap and do demand paging

Demand paging makes the exec() system call fast
Timeline of a Page Fault

1. Trap to operating system
2. Save state in PCB
3. Vector to page fault handler
4. If invalid, send SIGSEGV
5. If valid, find or create a free page
   a. Possibly involves a disk write
6. Issue disk read for page
   a. Wait until request is queued at disk controller
   b. Wait for seek/rotational latency
   c. Wait for data transfer (DMA)
   d. Wait for completion interrupt
7. (Optional) Schedule another process while waiting
8. Take disk interrupt
9. Update page table
10. Add process to run queue
11. Wait for process to be scheduled next
12. Restore state from PCB
13. Return from OS
14. Re-execute faulting instruction
Effective Access Times

What is the average access latency?
– L1 cache: 2 cycles
– L2 cache: 10 cycles
– Main memory: 150 cycles
– Disk: 10 ms → 30,000,000 cycles on a 3.0 GHz processor
– Assume accesses have the following characteristics:
  » 98% handled by L1 cache
  » 1% handled by L2 cache
  » 0.99% handled by DRAM
  » 0.01% cause a page fault
– Average access latency:
  » (0.98 × 2) + (0.01 × 10) + (0.0099 × 150) + (0.0001 × 30,000,000)
    = 1.96 + 0.1 + 1.485 + 3,000 ≈ 3,000 cycles per access

Moral: need LOW fault rates to sustain performance!
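The slide's weighted-average calculation can be checked directly. This sketch just reproduces the arithmetic above; the 30,000,000-cycle disk cost comes from 0.010 s × 3×10⁹ cycles/s as assumed on the slide.

```python
# Effective access time from the slide's numbers: 10 ms on a 3.0 GHz
# processor costs 0.010 * 3e9 = 30,000,000 cycles.
levels = [  # (fraction of accesses, cost in cycles)
    (0.98,   2),           # L1 cache hit
    (0.01,   10),          # L2 cache hit
    (0.0099, 150),         # DRAM access
    (0.0001, 30_000_000),  # page fault serviced from disk
]
eat = sum(frac * cost for frac, cost in levels)
print(f"effective access time: {eat:.3f} cycles")  # ≈ 3003.545 cycles
```

Note how the 0.01% of accesses that fault contribute 3,000 of the ~3,003 cycles: the disk term completely dominates.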
Issues in Demand Paging

Page selection policy
– When do we load a page?

Page replacement policy
– What page(s) do we swap to disk to make room for new pages?
– When do we swap pages out to disk?

How do we handle thrashing?
Page Selection Policy

Demand paging:
– Load page in response to access (page fault)
– Predominant selection policy

Pre-paging (prefetching):
– Predict what pages will be accessed in the near future
– Prefetch pages in advance of access
– Problems:
  » Hard to predict accurately (trace cache)
  » Mispredictions can cause useful pages to be replaced

Overlays:
– Application controls when pages are loaded/replaced
– Only really relevant now for embedded/real-time systems
Page Replacement Policies

Optimal
– Throw out the page used farthest in the future

Random
– Works surprisingly well

FIFO (first in, first out)
– Throw out the oldest page

LRU (least recently used)
– Throw out the page not used in the longest time

NRU (not recently used)
– Approximation to LRU → do not throw out recently used pages

How should we evaluate page replacement policies?
FIFO Page Replacement

FIFO: replace the oldest page (first loaded)

Example:
– Memory system with three page frames → all initially free
– Reference string: A B C A B D A D B C B

Ref:     A  B  C  A  B  D  A  D  B  C  B
Frame1:  A  A  A  A  A  D  D  D  D  C  C
Frame2:  -  B  B  B  B  B  A  A  A  A  A
Frame3:  -  -  C  C  C  C  C  C  B  B  B
Fault?   A  B  C  √  √  D  A  √  B  C  √    (√ = hit)

Result: 7 page faults
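The FIFO trace above is easy to check with a few lines of simulation (a sketch; function names are illustrative):

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Simulate FIFO replacement; return the number of page faults."""
    frames = deque()          # front = oldest page
    faults = 0
    for page in refs:
        if page in frames:
            continue          # hit
        faults += 1
        if len(frames) == nframes:
            frames.popleft()  # evict the oldest page
        frames.append(page)
    return faults

print(fifo_faults("ABCABDADBCB", 3))  # 7, matching the slide
```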
Optimal Page Replacement

Optimal: replace the page used farthest in the future

Example:
– Memory system with three page frames → all initially free
– Reference string: A B C A B D A D B C B

Ref:     A  B  C  A  B  D  A  D  B  C  B
Frame1:  A  A  A  A  A  A  A  A  A  C  C
Frame2:  -  B  B  B  B  B  B  B  B  B  B
Frame3:  -  -  C  C  C  D  D  D  D  D  D
Fault?   A  B  C  √  √  D  √  √  √  C  √    (√ = hit)

Result: 5 page faults
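Optimal replacement needs to look at the rest of the reference string, which only a simulator (or an oracle) can do. A minimal sketch:

```python
def opt_faults(refs, nframes):
    """Belady's optimal (clairvoyant) replacement; return the fault count."""
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            # Evict the resident page whose next use is farthest away
            # (or that is never used again).
            def next_use(p):
                rest = refs[i + 1:]
                return rest.index(p) if p in rest else len(refs)
            frames.remove(max(frames, key=next_use))
        frames.append(page)
    return faults

print(opt_faults("ABCABDADBCB", 3))  # 5, matching the slide
```

Optimal is unimplementable in a real kernel (it requires knowing the future), but it gives the lower bound that real policies are measured against.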
LRU Page Replacement

LRU: replace the least recently used page

Example:
– Memory system with three page frames → all initially free
– Reference string: A B C A B D A D B C B

Ref:     A  B  C  A  B  D  A  D  B  C  B
Frame1:  A  A  A  A  A  A  A  A  A  C  C
Frame2:  -  B  B  B  B  B  B  B  B  B  B
Frame3:  -  -  C  C  C  D  D  D  D  D  D
Fault?   A  B  C  √  √  D  √  √  √  C  √    (√ = hit)

Result: 5 page faults

How would you implement…
– Random
– FIFO
– Optimal
– LRU
– NRU

Which ones are efficient?
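One way to implement LRU in software is an ordered map used as a recency list (a sketch; real kernels cannot afford this per-access bookkeeping, which is exactly the point of the NRU slides that follow):

```python
from collections import OrderedDict

def lru_faults(refs, nframes):
    """LRU replacement: the ordered dict keeps the least recently used
    page at the front and the most recently used page at the end."""
    frames = OrderedDict()
    faults = 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)      # hit: mark as most recent
            continue
        faults += 1
        if len(frames) == nframes:
            frames.popitem(last=False)    # evict least recently used
        frames[page] = None
    return faults

print(lru_faults("ABCABDADBCB", 3))  # 5, matching the slide
```

Random and FIFO need no per-access work at all; LRU as written needs an update on every reference, which is why hardware-assisted approximations win in practice.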
NRU Page Replacement

Observations:
– LRU is a pretty good approximation of OPT
  » Past behavior is often a reasonable predictor of future behavior
  » Captures "phase" behavior in many (but not all) applications
– Implementing true LRU requires far too much overhead
  » Logically, we would need to update the "sort order" on every memory access

How can we approximate LRU efficiently?
– Exploit the "referenced" bit in modern page tables
– Only replace pages that have not been recently referenced (NRU)
– Periodically clear referenced bits → enforces "recently"
  » Optionally: maintain a recent history of referenced bits per page
  » Example: 10010101 → records whether the page was referenced in each of the last 8 sweeps
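The per-page history byte can be maintained by shifting in the referenced bit on every sweep (often called "aging"). This is a sketch assuming an 8-bit history per page and a hypothetical read-and-clear of the hardware referenced bit:

```python
def sweep(history, referenced):
    """Shift each page's history right and put this sweep's referenced
    bit in the most significant position; a larger value means the page
    was used more recently."""
    for page in history:
        bit = 0x80 if referenced.get(page) else 0
        history[page] = bit | (history[page] >> 1)
        referenced[page] = False   # clear the bit for the next sweep

history = {"A": 0, "B": 0}
sweep(history, {"A": True, "B": False})   # A referenced this sweep
sweep(history, {"A": False, "B": True})   # B referenced this sweep
print(f"{history['A']:08b} {history['B']:08b}")  # 01000000 10000000
```

Comparing history bytes as integers then picks a victim: B (10000000) was used more recently than A (01000000), so A would be replaced first.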
NRU Page Replacement (cont.)

This is a modified version of FIFO: check whether the page at the head of the FIFO queue has its referenced bit set
– Yes? Clear the bit, put the page at the back of the queue, and look at the next page
– No? Select this page for replacement

Is this fast? What is the worst case?

This is called the "second chance" algorithm
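The second chance loop described above can be sketched directly (names are illustrative):

```python
from collections import deque

def second_chance_evict(queue, referenced):
    """Second chance: examine the FIFO head; a page with its referenced
    bit set gets the bit cleared and a second chance at the tail.
    Worst case: every page is referenced, so we cycle through the whole
    queue once and the choice degenerates to plain FIFO."""
    while True:
        page = queue.popleft()
        if referenced.get(page):
            referenced[page] = False   # clear bit, give a second chance
            queue.append(page)
        else:
            return page                # victim

queue = deque(["A", "B", "C"])         # A is oldest
referenced = {"A": True, "B": False, "C": True}
victim = second_chance_evict(queue, referenced)
print(victim)  # B: A was referenced and moved to the back, B was not
```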
Clock Algorithm

This is basically an optimized version of second chance

Maintains a "next" pointer
– Sweep starts there and continues until done
– Pointer persists across invocations

While (need more pages):
– Check the referenced bit
– If 0 → add page to free pool
– If 1 → reset the bit

Between sweeps:
– If a process accesses a page, its referenced bit gets set
– The TLB helps here!
[Figure: page frames arranged in a clock; the "next" hand sweeps around, freeing pages whose referenced bit is 0 and clearing bits that are 1]
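The sweep can be sketched over a circular array of frames, with the hand position persisting across calls (a sketch; a real kernel also has to unmap the freed pages):

```python
def clock_sweep(frames, referenced, hand, needed):
    """One sweep of the clock algorithm over a circular list of frames.
    Returns (freed_pages, new_hand); the caller keeps 'hand' so the
    next invocation resumes where this one stopped."""
    freed = []
    while len(freed) < needed:
        page = frames[hand]
        if referenced[page]:
            referenced[page] = False   # give the page another pass
        else:
            freed.append(page)         # bit is 0: reclaim this frame
        hand = (hand + 1) % len(frames)
    return freed, hand

frames = ["A", "B", "C", "D"]
referenced = {"A": True, "B": False, "C": True, "D": False}
freed, hand = clock_sweep(frames, referenced, hand=0, needed=2)
print(freed, hand)  # ['B', 'D'] 0 -- A and C got their bits cleared instead
```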
BSD Page Replacement (NRU)

Goal: maintain a pool of free pages at all times
– Avoid waiting for the replacement algorithm/disk write during a page fault
– Typical goal: ~5% of main memory in the free page pool

Sweeper process
– Privileged (kernel) process
– Scheduled whenever the free page pool drops below a threshold
  » Low watermark (start sweeping) vs. high watermark (goal)
– Sweeps through the list of allocated pages doing second chance

Nth chance: like second chance, but…
– If the page is referenced, clear its counter and move on
– If the page is not referenced, increment its counter
  » If the new counter == N, select this page
  » Otherwise move on
– If N is big, we have a really good LRU approximation
  » But we spend a lot of time looking for pages
– If N == 1 we have second chance
– If N == 0 we have FIFO

Lots more work exists on page replacement…
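The Nth chance counter logic above can be sketched as follows (illustrative names; a real sweeper interleaves this with other work rather than spinning):

```python
def nth_chance_evict(pages, referenced, counters, n):
    """Nth chance sketch: cycle over pages; a referenced page has its
    counter cleared, an unreferenced page's counter is incremented, and
    a page is selected once its counter reaches n."""
    while True:
        for page in pages:
            if referenced.get(page):
                referenced[page] = False
                counters[page] = 0          # referenced: reset, move on
            else:
                counters[page] += 1
                if counters[page] >= n:     # unreferenced n times: victim
                    return page

pages = ["A", "B"]
victim = nth_chance_evict(pages, {"A": True, "B": True},
                          {"A": 0, "B": 0}, n=2)
print(victim)  # A: both bits cleared on pass 1, A's counter reaches 2 first
```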
Belady's Anomaly

For some replacement algorithms…
– MORE page frames in main memory can lead to…
– MORE page faults!

This phenomenon is known as "Belady's Anomaly"

Example:
– FIFO replacement policy
– Reference string: A B C D A B E A B C D E
– Three frames → 9 faults
– Four frames → 10 faults!

Interesting, since we would expect that adding more memory always helps
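The anomaly is easy to reproduce with a FIFO simulator (same sketch as on the FIFO slide, repeated here so it runs standalone):

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Simulate FIFO replacement; return the number of page faults."""
    frames, faults = deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:
                frames.popleft()   # evict the oldest page
            frames.append(page)
    return faults

refs = "ABCDABEABCDE"
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 9 10
```

LRU and optimal are "stack algorithms" (the pages resident with k frames are always a subset of those resident with k+1), so they never exhibit the anomaly; FIFO is not.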
Thrashing

Working set: the collection of memory currently being used by a process

If all working sets do not fit in memory → thrashing
– One "hot" page replaces another
– The percentage of accesses that generate page faults skyrockets

Typical solution: "swap out" entire processes
– The scheduler needs to get involved
– Two-level scheduling policy → runnable vs. memory-available
– Need to be fair
– Invoked when the page fault rate exceeds some bound

When swap devices are full, Linux invokes the "OOM killer"
Who Should We Compete Against for Memory?

Global replacement:
– All pages for all processes come from a single shared pool
– Advantage: very flexible → can globally "optimize" memory usage
– Disadvantages: thrashing is more likely, and it can often do just the wrong thing (e.g., replace the pages of a process about to be scheduled)
– Many OSes, including Linux, do this

Per-process replacement:
– Each process has a private pool of pages → competes with itself
– Alleviates inter-process problems, but not every process is equal
– Need to know the working set size for each process
– The Windows kernel does this
  » There are Win32 API calls to set a process's minimum and maximum working set sizes
Important From Today

Demand paging
– What is it? What is the "effective access time"?

Page replacement policies
– Random, FIFO, Optimal, LRU, NRU, …
– Belady's anomaly

Thrashing

Global vs. local allocation
– Concept of a process's "working set"