TRANSCRIPT
S2004 CS 325 1
Memory Management
Chapters 7 - 8
Real memory – Logical storage – Address spaces – Virtual memory
S2004 CS 325 2
Chapter 7 - Physical Memory
7.1 Preparing a Program for Execution
– Program Transformations
– Logical-to-Physical Address Binding
7.2 Memory Partitioning Schemes
– Fixed Partitions
– Variable Partitions
7.3 Allocation Strategies for Variable Partitions
7.4 Dealing with Insufficient Memory
S2004 CS 325 3
8. Virtual Memory
8.1 Principles of Virtual Memory
8.2 Implementations of Virtual Memory
– Paging
– Segmentation
– Paging With Segmentation
– Paging of System Tables
– Translation Look-aside Buffers
8.3 Memory Allocation in Paged Systems
– Global Page Replacement Algorithms
– Local Page Replacement Algorithms
– Load Control and Thrashing
– Evaluation of Paging
S2004 CS 325 4
Address Binding Time
assign physical addresses = relocation
• static binding
– programming time
– compilation time
– linking time
– loading time
• dynamic binding
– execution time
S2004 CS 325 5
Relocation
• Assignment of real memory
• Programs make use of logical address space
– Code/Data spaces
– Segments
• Real memory is linear
– divided into segments
– subdivided into page frames
• Relocation – Load
– Static (link – bind early)
– Dynamic (link – bind late)
– Hardware supported – MMU and HW caches
S2004 CS 325 6
Fig 7-3a: storage reference with dynamic relocation (general principle) – the CPU issues a logical address; the MMU applies NL_map to produce the physical address used for the data transfer (read/write) with main storage and the I/O subsystem.

NL_map: {relocatable (virtual) addresses} → {real (physical) storage addresses}
S2004 CS 325 7
Address binding
• how to implement dynamic binding
– perform for each address at run time:
  pa = address_map(la)
– la == va ⇒ pa = F(va)
– simplest form of address_map function: relocation register
  pa = la + RR
– more general form: page/segment table (Chapter 8)
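As a small sketch (not part of the original slides), the relocation-register form of address_map can be modeled in a few lines of Python; the limit check and all concrete values are illustrative assumptions:

```python
# Hypothetical sketch of dynamic binding with a relocation register (RR):
# every logical address is mapped at run time as pa = la + RR.
# The limit check and the specific numbers are illustrative assumptions.
UNDEFINED = None

def address_map(la, rr, limit):
    """Translate logical address la using relocation register rr.
    A limit check (common in real MMUs) rejects out-of-range addresses."""
    if la < 0 or la >= limit:
        return UNDEFINED          # addressing error -> trap in a real system
    return la + rr                # pa = la + RR

# A process loaded at physical address 4000 with a 1000-word partition:
print(address_map(250, 4000, 1000))   # -> 4250
print(address_map(1500, 4000, 1000))  # -> None (outside the partition)
```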
S2004 CS 325 8
Partitions - Segments
• Multiprogramming requires that more than one program is kept in main memory
• Memory is divided into logical “partitions”
• Partitions can be of fixed or variable size and be located at specific or selectable locations – segments
• Each partition should provide a private address space for the program/process/thread/object
• Efficient partition management and address binding requires hardware support
• Segments (base and limit registers) and page frames with appropriate relocation support alleviate this task for the operating system.
S2004 CS 325 9
segmented memory – flexible logical address management
Figure: NL_map maps the segments of logical (virtual) memories VM_1 and VM_2 into physical (real) memory.
S2004 CS 325 10
Drawbacks
• Address translation at run time consumes time (TINSTAAFL)
• Programs and data areas grow and shrink dynamically in size
• Partitions are eventually released (freed by a program)
• Newly arriving processes require memory space, i.e. allocation of a partition
• Need to manage free space and allocation of partitions
• May need to rearrange the allocation dynamically
there is no such thing as a free lunch
S2004 CS 325 11
Figure: memory fragmentation – segments are allocated and released, leaving holes H1 and H2; an arriving segment D cannot be placed because |D| > |H1| and |D| > |H2|, even though |D| < |H1| + |H2|.
S2004 CS 325 12
Figure: memory compaction – starting from an initial state with allocated blocks p1, p2, p3 and several free holes, the free space can be coalesced by total compaction (one large hole), partial compaction, or a minimal-move rearrangement.
S2004 CS 325 13
Address Space Management
• Programs have a dynamic calling structure – subroutines, procedures, functions etc.
• Programs also frequently have varying data areas
• A programmer may not know how his/her program behaves
• Measurements have shown that statistically programs tend to exhibit high “locality”
• Using this characteristic feature one can delegate the management of the mapping from logical (“virtual”) addresses to real, physical addresses to a combination of hardware and software (operating system)
• Hence, the user is presented with the illusion of an almost unlimited, private address space!
S2004 CS 325 14
paged memory – flexible physical address management
Figure: the MMU’s NL_map maps the pages (page 0, 1, 2, … x) of logical (virtual) memories VM_1 and VM_2 onto the page frames (frame 0 … frame 7) of physical (real) memory.
S2004 CS 325 15
paging
• The physical memory is subdivided into page frames
• The logical memory is subdivided into pages
• This means that memory addresses are now composed of two parts:
– A page number
– An offset (distance) within the page
• Hardware assistance is required for the binding:
– Page numbers are translated into frame numbers
– The offset value is used as is
• Attention: this is not identical to “virtual” memory!!!
S2004 CS 325 16
Fig 8-2: form of virtual and physical address – va consists of a page number p (|p| bits) followed by an offset w (|w| bits); pa consists of a frame number f (|f| bits) followed by the same offset w (|w| bits).
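The split of va into (p,w) and the construction of pa from (f,w) can be illustrated with bit operations; the 32-bit address and 12-bit offset below are assumptions for illustration, not values from the slides:

```python
# Sketch of the (p,w) / (f,w) address format, assuming a 12-bit offset
# (i.e. 4 KB pages); the sizes are illustrative assumptions.
OFFSET_BITS = 12
OFFSET_MASK = (1 << OFFSET_BITS) - 1

def split_va(va):
    """Return (page number p, offset w) of a virtual address."""
    return va >> OFFSET_BITS, va & OFFSET_MASK

def make_pa(f, w):
    """Concatenate frame number f with the unchanged offset w."""
    return (f << OFFSET_BITS) | w

p, w = split_va(0x3ABC)     # page 3, offset 0xABC
print(p, hex(w))            # -> 3 0xabc
print(hex(make_pa(7, w)))   # -> 0x7abc
```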
S2004 CS 325 17
Fig 8-3: paged virtual memory – NL_map maps each page (page numbers 0 … P-1) of the virtual memory NS to a frame (frame numbers 0 … F-1) of the physical memory LS; virtual address (p,w) becomes physical address (f,w).
S2004 CS 325 18
Principles of virtual memory
• system creates illusion of large contiguous memory space(s) for each process
• relevant portions of VM are loaded automatically and transparently
• address map translates virtual addresses to physical addresses
• Figure 8-1
S2004 CS 325 19
Fig 8-1: principles of virtual memory – NL_map maps the logical (virtual) memories VM_1 and VM_2 into physical (real) memory.
S2004 CS 325 20
Principles of virtual memory
• single-segment VM:
– one area of 0..n-1 words
– divided into fixed-size pages
• multiple-segment VM:
– multiple areas of up to 0..n-1 (words)
– each holds a logical segment (function, data structure)
– each is contiguous or divided into pages
S2004 CS 325 21
Main issues in VM design
• address mapping– how to translate virtual addresses to physical
• placement– where to place a portion of VM needed by process
• replacement– which portion of VM to remove when space is needed
• load control– how much of VM to load at any one time
• sharing– how can processes share portions of their VMs
S2004 CS 325 22
Implementation using paging
• VM is divided into fixed-size pages
• PM is divided into page frames (page size)
• system loads pages into frames and translates addresses
• virtual address: va = (p,w)
• physical address: pa = (f,w)
• |p| determines number of pages in VM
• |f| determines number of frames in PM
• |w| determines page/frame size
Figure 8-2
S2004 CS 325 23
Address translation
• given (p,w):
– determine f from p -- how to keep track?
– concatenate f and w to form pa
• one solution: frame table
– one entry FT[i] for each frame
– FT[i].pid records process ID
– FT[i].page records page number p
– given (id,p,w): search for a match on (id,p); the index f is the frame number
S2004 CS 325 24
Figure: address translation with a frame table (= inverted page table) – for pid = 1 and pid = 2, a virtual address (p,w) is translated by searching the frame table for a match on (pid, p); the matching entry’s index f gives the physical address (f,w) in memory (addresses 00…0 through FFF…F).
S2004 CS 325 25
Address translation
address_map(id, p, w) {
  pa = UNDEFINED;
  for (f = 0; f < F; f++)
    if (FT[f].pid == id && FT[f].page == p)
      pa = f + w;    /* concatenate f and w */
  return pa;
}
S2004 CS 325 26
Fig 8-4: address translation with frame table – the pair (Id, p) is compared (in parallel) against the pid and page columns of frame table FT (entries 0 … F-1); the index f of the matching entry, concatenated with the distance w, forms the frame address for the memory access to page p in frame f.
S2004 CS 325 27
Address translation
Figure 8-4
address_map(id, p, w) {
  pa = UNDEFINED;
  for (f = 0; f < F; f++)
    if (FT[f].pid == id && FT[f].page == p)
      pa = f + w;
  return pa;
}
• drawbacks of frame table implementation
– costly: search must be done in parallel in hardware
– sharing of pages difficult or not possible
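The frame-table (inverted page table) lookup can be sketched as a linear search; the page size and the sample table contents below are assumptions for illustration:

```python
# Minimal model of the frame-table (inverted page table) lookup: FT has
# one entry per physical frame recording (pid, page); translation searches
# for a match on (id, p). PAGE_SIZE and the contents are assumptions.
PAGE_SIZE = 1024
UNDEFINED = None

# FT[f] == (pid, page) currently resident in frame f
FT = [(1, 0), (2, 0), (1, 3), (2, 1)]

def address_map(pid, p, w):
    for f, entry in enumerate(FT):
        if entry == (pid, p):
            return f * PAGE_SIZE + w   # "concatenate" f and w
    return UNDEFINED                   # page fault in a real system

print(address_map(1, 3, 100))   # frame 2 -> 2148
print(address_map(2, 3, 100))   # not resident -> None
```

In hardware the search over all entries runs in parallel (associative memory); the sequential loop here only models the logical behavior.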
S2004 CS 325 28
Page tables
• PT associated with each VM (not PM)
• PTR points at PT at run time
• p’th entry holds frame number of page p:
– *(PTR+p) points to frame f
• address translation:
address_map(p, w) {
  pa = *(PTR+p) + w;
  return pa;
}
Figure 8-5
• drawback: extra memory access
S2004 CS 325 29
Figure: address translation with a page table – each process (pid = 1, pid = 2) has its own page table; for a virtual address (p,w), entry p of the page table yields the frame number f, and (f,w) addresses physical memory (00…0 through FFF…F).
S2004 CS 325 30
Fig 8-5: address translation with page table – PTR locates the current page table; entry p holds the frame address, which is combined with the distance w for the memory access to the page in main memory.
S2004 CS 325 31
Fig 8-6: page table contents
PTR = 1024
address:  1024   1025   1026   1027
contents: 21504  40960  3072   15360
Example: va = (p=2, w=100): *(PTR+2) = *(1026) = 3072, so pa = 3072 + 100 = 3172
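The Fig 8-6 lookup can be checked mechanically; the dict below models only the four page-table words shown on the slide, under the assumption that each entry stores a frame address (frame number × page size 1024):

```python
# Reproducing the Fig 8-6 example: memory word *(PTR+p) holds the frame
# address (frame number x 1024); pa = *(PTR+p) + w. The dict models just
# the four page-table words shown on the slide.
mem = {1024: 21504, 1025: 40960, 1026: 3072, 1027: 15360}
PTR = 1024

def address_map(p, w):
    return mem[PTR + p] + w

# va = (p=2, w=100): *(1026) = 3072, so pa = 3072 + 100 = 3172
print(address_map(2, 100))   # -> 3172
```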
S2004 CS 325 32
Figure: address translation with associative memory – the page number p of virtual address (p,w) is matched in parallel against all entries of the associative memory; the matching entry i yields the frame address, which is combined with w for the memory access to the page in Frame_i.
S2004 CS 325 33
Address Mapping
• Two possible approaches:
– Frame table – 1 entry per real page frame
– Page table – 1 entry per logical/virtual page
• Both solutions depend on searching the table
– Requires hardware support to speed up the process (associative memory – Translation Lookaside Buffer)
• Both solutions need 2 memory accesses per memory reference unless associative memory is used for the translation table.
S2004 CS 325 34
Demand paging
• all pages of VM can be loaded initially
– simple, but maximum size of VM = size of PM
• pages are loaded as needed -- on demand
– additional bit in PT indicates presence/absence
– page fault occurs when page is absent
address_map(p, w) {
  if (resident(*(PTR+p))) {
    pa = *(PTR+p) + w;
    return pa;
  }
  else page_fault;
}
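A demand-paging sketch of the logic above: a page-table entry carries a resident bit, a miss raises a page fault, and a trivial handler loads the page and retries. The PageFault class, the frame numbers, and the load policy are invented for illustration:

```python
# Demand-paging sketch: resident bit per page-table entry; a miss raises
# a page fault, the handler loads the page, and the access is retried.
PAGE_SIZE = 1024

class PageFault(Exception):
    pass

PT = {0: {"resident": True, "frame": 5},
      1: {"resident": False, "frame": None}}

def address_map(p, w):
    entry = PT[p]
    if not entry["resident"]:
        raise PageFault(p)
    return entry["frame"] * PAGE_SIZE + w

def access(p, w):
    try:
        return address_map(p, w)
    except PageFault as fault:
        missing = fault.args[0]
        PT[missing] = {"resident": True, "frame": 7}  # load into a free frame (assumed)
        return address_map(p, w)

print(access(0, 10))   # resident: 5*1024+10 = 5130
print(access(1, 10))   # fault, page loaded into frame 7 -> 7178
```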
S2004 CS 325 35
VM using segmentation
• multiple contiguous spaces– more natural match to program/data structure– easier sharing (Chapter 9)
• va = (s,w) mapped to pa (but no frames)
• where/how are segments placed in PM?
– contiguous versus paged allocation
S2004 CS 325 36
Contiguous allocation
• each segment is contiguous in PM
• ST keeps track of starting locations
• STR points to ST
• address translation:
address_map(s, w) {
  if (resident(*(STR+s))) {
    pa = *(STR+s) + w;
    return pa;
  }
  else segment_fault;
}
• drawback: external fragmentation
S2004 CS 325 37
Paging with segmentation
• each segment is divided into fixed-size pages
• va = (s,p,w)
|s| determines # of segments (size of ST)
|p| determines # of pages per seg. (size of PT)
|w| determines page size
• pa = *(*(STR+s)+p)+w
Figure 8-7
• drawback: 2 extra memory references
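The nested indirection pa = *(*(STR+s)+p)+w can be modeled with nested lists: the segment table selects a page table, which selects a frame address. All table contents below are invented for illustration:

```python
# Two-level translation pa = *(*(STR+s)+p)+w modeled with nested lists:
# ST[s] selects a page table, PT[p] a frame address. Contents are invented.
PAGE_SIZE = 1024

PT0 = [3 * PAGE_SIZE, 7 * PAGE_SIZE]   # page table of segment 0
PT1 = [5 * PAGE_SIZE]                  # page table of segment 1
ST = [PT0, PT1]                        # segment table

def address_map(s, p, w):
    return ST[s][p] + w                # two extra memory references

print(address_map(0, 1, 50))   # 7*1024+50 = 7218
print(address_map(1, 0, 0))    # -> 5120
```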
S2004 CS 325 38
Fig 8-7: address translation with page and segment table – STR locates the segment table; entry s points to the page table of segment s; entry p of that page table locates the memory page, and w selects the word within it.
S2004 CS 325 39
Paging of system tables
• ST or PT may be too large to keep in PM
– divide ST or PT into pages
– keep track by additional page table
• paging of ST
– ST divided into pages
– segment directory keeps track of ST pages
– va = (s1,s2,p,w)
– pa = *(*(*(STR+s1)+s2)+p)+w
Figure 8-8
S2004 CS 325 40
Fig 8-8: address translation with paged segment table – STR locates the segment directory; s1 selects a page of the (paged) segment table, s2 the segment entry within it, which points to the page table; p locates the memory page and w the word within it.
S2004 CS 325 41
Fig 8-9a: address translation in the Pentium processor – segmentation only: a 14-bit segment register selects entry s of the segment table, whose 32-bit base is added to the 32-bit offset.
S2004 CS 325 42
Fig 8-9b: address translation in the Pentium processor – segmentation with paging: the segment base plus offset yields a linear address divided into p1 (10 bits), p2 (10 bits), and w (12 bits); p1 indexes the page directory, p2 the selected page of the page table, and w the word within the program/data page.
S2004 CS 325 43
Translation look-aside buffers
• to avoid additional memory accesses
– keep most recently translated page numbers in associative memory: for any (s,p,*) keep (s,p) and frame# f
– bypass translation if match on (s,p) found
Figure 8-10
• TLB is different from cache
– TLB only keeps frame #s
– cache keeps data values
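A TLB can be sketched as a small cache of (s,p) → f translations with LRU replacement; the two-entry capacity and the table contents below are assumptions for illustration:

```python
# TLB sketch: recently translated (s,p) pairs are cached; a hit bypasses
# the full table walk. Tiny capacity and contents are assumptions.
from collections import OrderedDict

PAGE_SIZE = 1024
tables = {(0, 0): 3, (0, 1): 7, (1, 0): 5}   # (s,p) -> frame: the "slow path"
tlb = OrderedDict()                          # LRU-ordered, capacity 2
hits = misses = 0

def translate(s, p, w):
    global hits, misses
    if (s, p) in tlb:
        hits += 1
        tlb.move_to_end((s, p))              # update LRU replacement info
    else:
        misses += 1
        if len(tlb) == 2:
            tlb.popitem(last=False)          # evict least recently used
        tlb[(s, p)] = tables[(s, p)]         # full table walk
    return tlb[(s, p)] * PAGE_SIZE + w

translate(0, 0, 0); translate(0, 0, 4)       # miss, then hit
print(hits, misses)                          # -> 1 1
```

Note the difference stated above: the TLB entry stores only the frame number, not the data word itself.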
S2004 CS 325 44
Fig 8-10: translation lookaside buffer – each entry holds a segment/page number (s,p), the frame number f, and replacement info (LRU); a parallel match on the (s,p) part of the virtual address yields f directly, which is combined with w for the access to the pages in main memory.
S2004 CS 325 45
Memory allocation with paging
• placement policy: any free frame is ok
• replacement: must minimize data movement
• global replacement:
– consider all resident pages (regardless of owner)
• local replacement
– consider only pages of faulting process
• how to compare different algorithms:
– use reference string: r0 r1 ... rt …
  rt is the number of the page referenced at time t
– count # of page faults
S2004 CS 325 46
Global page replacement
• optimal (MIN): replace page that will not be referenced for the longest time in the future
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     |  | c  a  d  b  e  b  a  b  c  d
Frame 0| a| a  a  a  a  a  a  a  a  a  d
Frame 1| b| b  b  b  b  b  b  b  b  b  b
Frame 2| c| c  c  c  c  c  c  c  c  c  c
Frame 3| d| d  d  d  d  e  e  e  e  e  e
IN     |  |             e              d
OUT    |  |             d              a
• problem: RS not known in advance
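Although MIN is unrealizable online, it can be simulated once the whole reference string is given; this sketch reproduces the example above (the tie-breaking rule for pages never referenced again is an assumption):

```python
# Simulating the optimal (MIN) policy: on a fault, evict the resident
# page whose next reference lies farthest in the future.
def opt(rs, frames):
    frames = list(frames)
    faults, victims = 0, []
    for t, page in enumerate(rs):
        if page in frames:
            continue
        faults += 1
        def next_use(q):
            # index of q's next reference; never used again -> infinity
            later = [i for i in range(t + 1, len(rs)) if rs[i] == q]
            return later[0] if later else float("inf")
        victim = max(frames, key=next_use)   # ties: first frame wins (assumed)
        victims.append(victim)
        frames[frames.index(victim)] = page
    return faults, victims

rs = list("cadbebabcd")
print(opt(rs, list("abcd")))   # -> (2, ['d', 'a'])  as in the table
```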
S2004 CS 325 47
Global page replacement
• random replacement
– simple but
– does not exploit the locality of reference:
  most instructions are sequential (branch instructions typically < ~10%)
  most loops are short (spanning at most a few pages)
  many data structures are accessed sequentially
  (but overall arrangements can be critical – “wrap around” for column/row access, phased bank structure)
S2004 CS 325 48
Global page replacement
• FIFO: replace oldest page
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     |  | c  a  d  b  e  b  a  b  c  d
Frame 0|>a|>a >a >a >a  e  e  e  e >e  d
Frame 1| b| b  b  b  b >b >b  a  a  a >a
Frame 2| c| c  c  c  c  c  c >c  b  b  b
Frame 3| d| d  d  d  d  d  d  d >d  c  c
IN     |  |             e     a  b  c  d
OUT    |  |             a     b  c  d  e
• problem:
– favors recently accessed pages but
– ignores when program returns to old pages
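The FIFO example can be replayed in a few lines (a sketch, not from the slides); it reproduces the 5 faults and the IN/OUT rows of the table:

```python
# FIFO simulation: evict the page loaded longest ago.
from collections import deque

def fifo(rs, frames):
    resident = set(frames)
    queue = deque(frames)          # load order: left end = oldest
    faults, victims = 0, []
    for page in rs:
        if page in resident:
            continue               # hit: FIFO ignores re-references
        faults += 1
        old = queue.popleft()
        victims.append(old)
        resident.discard(old)
        resident.add(page)
        queue.append(page)
    return faults, victims

print(fifo(list("cadbebabcd"), list("abcd")))
# -> (5, ['a', 'b', 'c', 'd', 'e'])
```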
S2004 CS 325 49
Global page replacement
• LRU: replace least recently used page
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     |  | c  a  d  b  e  b  a  b  c  d
Frame 0| a| a  a  a  a  a  a  a  a  a  a
Frame 1| b| b  b  b  b  b  b  b  b  b  b
Frame 2| c| c  c  c  c  e  e  e  e  e  d
Frame 3| d| d  d  d  d  d  d  d  d  c  c
IN     |  |             e           c  d
OUT    |  |             c           d  e
Q.end  | d| c  a  d  b  e  b  a  b  c  d
       | c| d  c  a  d  b  e  b  a  b  c
       | b| b  d  c  a  d  d  e  e  a  b
Q.head | a| a  b  b  c  a  a  d  d  e  a
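A small simulation (a sketch, not from the slides) replays the LRU example, maintaining the recency queue explicitly; it reproduces the 3 faults and the IN/OUT rows:

```python
# LRU simulation: evict the least recently used page.
def lru(rs, frames):
    stack = list(frames)           # front = least recently used (Q.head)
    faults, victims = 0, []
    for page in rs:
        if page in stack:
            stack.remove(page)     # hit: will move to most-recent end
        else:
            faults += 1
            victims.append(stack.pop(0))   # evict Q.head
        stack.append(page)         # referenced page becomes Q.end
    return faults, victims

print(lru(list("cadbebabcd"), list("abcd")))
# -> (3, ['c', 'd', 'e'])
```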
S2004 CS 325 50
Global page replacement
• LRU implementation
– software queue: too expensive
– time-stamping
  stamp each referenced page with current time
  replace page with oldest stamp
– hardware capacitor with each frame
  charge at reference
  replace page with smallest charge
– n-bit aging register with each frame
  shift all registers to right at every reference
  set left-most bit of referenced page to 1
  replace page with smallest value
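The n-bit aging-register variant can be sketched as follows; the 4-bit width and the short reference trace are illustrative assumptions:

```python
# Aging-register sketch: at every reference all registers shift right and
# the referenced page's leftmost bit is set; the page with the smallest
# register value is the replacement candidate.
BITS = 4

def age(trace, pages):
    reg = {q: 0 for q in pages}
    for ref in trace:
        for q in reg:
            reg[q] >>= 1                    # every page ages
        reg[ref] |= 1 << (BITS - 1)         # referenced page rejuvenated
    return reg

reg = age(["a", "b", "a", "c"], ["a", "b", "c"])
print(reg)                                  # a=5 (0101), b=2 (0010), c=8 (1000)
print(min(reg, key=reg.get))                # -> b  (replacement victim)
```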
S2004 CS 325 51
Global page replacement
• second-chance algorithm
– approximates LRU
– implement use-bit u with each frame
– u=1 when page referenced
– to select a page:
  if u==0, select page
  else, set u=0 and consider next frame
– used page gets a second chance to stay in PM
• algorithm is called “clock” algorithm:– search cycles through page frames
S2004 CS 325 52
Global page replacement
• second-chance algorithm
Time t  …  4    5    6    7    8    9    10
RS      …  b    e    b    a    b    c    d
Frame 0 … >a/1  e/1  e/1  e/1  e/1 >e/1  d/1
Frame 1 …  b/1 >b/0 >b/1  b/0  b/1  b/1 >b/0
Frame 2 …  c/1  c/0  c/0  a/1  a/1  a/1  a/0
Frame 3 …  d/1  d/0  d/0 >d/0 >d/0  c/1  c/0
IN      …       e         a         c    d
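The clock search can be simulated directly (a sketch, not from the slides); starting from the state at t=4 above (all use-bits set, hand at frame 0), it reproduces the 4 faults of the trace:

```python
# Clock (second-chance) simulation: one use-bit per frame; the hand
# clears u-bits until it finds u == 0, which selects the victim.
def clock(rs, frames):
    pages = list(frames)
    u = [1] * len(pages)           # all referenced initially (as at t=4)
    hand = 0
    faults, victims = 0, []
    for page in rs:
        if page in pages:
            u[pages.index(page)] = 1   # hit: mark as used
            continue
        faults += 1
        while u[hand] == 1:        # used pages get a second chance
            u[hand] = 0
            hand = (hand + 1) % len(pages)
        victims.append(pages[hand])
        pages[hand], u[hand] = page, 1
        hand = (hand + 1) % len(pages)
    return faults, victims

print(clock(list("ebabcd"), list("abcd")))
# -> (4, ['a', 'c', 'd', 'e'])
```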
S2004 CS 325 53
Global page replacement
• third-chance algorithm
– second chance makes no difference between read and write access
– write access more expensive
– give modified pages a third chance:
  u-bit set at every reference (read and write)
  w-bit set at write reference
  to select a page, cycle through frames, resetting bits, until uw==00:
  uw → uw
  1 1 → 0 1
  1 0 → 0 0   "cheapest" candidate (cached?)
  0 1 → 0 0 * (remember modification)
  0 0 → select
S2004 CS 325 54
Global page replacement
• third-chance algorithm
Time t  … 0     | 1     2     3     4     5
RS      …       | c     aw    d     bw    e
Frame 0 … >a/10 |>a/10 >a/11 >a/11 >a/11  a/00*
Frame 1 …  b/10 | b/10  b/10  b/10  b/11  b/00*
Frame 2 …  c/10 | c/10  c/10  c/10  c/10  e/10
Frame 3 …  d/10 | d/10  d/10  d/10  d/10 >d/00
IN      …       |                         e
S2004 CS 325 55
Local page replacement
• measurements indicate that every program needs a “minimum” set of pages
– if too few, thrashing occurs
– if too many, page frames are wasted
• the “minimum” varies over time
• how to determine and implement this “minimum”?
S2004 CS 325 56
Local page replacement
• optimal (VMIN)
– define a sliding window (t, t+τ)
– τ is a parameter (constant)
– at any time t, maintain as resident all pages visible in the window
• guaranteed to generate the smallest number of page faults
S2004 CS 325 57
Local page replacement
• optimal (VMIN) with τ=3
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     | d| c  c  d  b  c  e  c  e  a  d
Page a | -| -  -  -  -  -  -  -  -  √  -
Page b | -| -  -  -  √  -  -  -  -  -  -
Page c | -| √  √  √  √  √  √  √  -  -  -
Page d | √| √  √  √  -  -  -  -  -  -  √
Page e | -| -  -  -  -  -  √  √  √  -  -
IN     |  | c        b     e        a  d
OUT    |  |          d  b        c  e  a
• guaranteed optimal but
• unrealizable without RS
S2004 CS 325 58
Local page replacement
• working set model: use principle of locality
– use trailing window (instead of future window)
– working set W(t,τ): all pages referenced during (t-τ,t)
– at time t:
  remove all pages not in W(t,τ)
  process may run only if entire W(t,τ) is resident
S2004 CS 325 59
Local page replacement
• working set model
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     | a| c  c  d  b  c  e  c  e  a  d
Page a | √| √  √  √  -  -  -  -  -  √  √
Page b | -| -  -  -  √  √  √  √  -  -  -
Page c | -| √  √  √  √  √  √  √  √  √  √
Page d | √| √  √  √  √  √  √  -  -  -  √
Page e | √| √  -  -  -  -  √  √  √  √  √
IN     |  | c        b     e        a  d
OUT    |  |    e     a        d  b
• drawback: costly to implement
• approximate (aging registers, time stamps)
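Computing a working set over a trailing window can be sketched in one line; the 1-indexed time convention below is an assumption about how to read W(t,τ):

```python
# Working-set sketch: W(t, tau) = the set of pages referenced during the
# last tau references (a trailing window). Indexing is an assumption.
def working_set(rs, t, tau):
    """Pages referenced at times t-tau+1 .. t (rs is 1-indexed by time)."""
    return set(rs[max(0, t - tau):t])

rs = list("ccdbce")
print(sorted(working_set(rs, 6, 3)))   # last 3 refs b, c, e -> ['b', 'c', 'e']
```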
S2004 CS 325 60
Local page replacement
• page fault frequency
– main objective: keep page fault rate low
– basic principle of pff:
  if time between page faults ≤ τ, grow resident set: add new page to resident set
  if time between page faults > τ, shrink resident set: add new page but remove all pages not referenced since last page fault
S2004 CS 325 61
Local page replacement
• page fault frequency
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     |  | c  c  d  b  c  e  c  e  a  d
Page a | √| √  √  √  -  -  -  -  -  √  √
Page b | -| -  -  -  √  √  √  √  √  -  -
Page c | -| √  √  √  √  √  √  √  √  √  √
Page d | √| √  √  √  √  √  √  √  √  -  √
Page e | √| √  √  √  -  -  √  √  √  √  √
IN     |  | c        b     e        a  d
OUT    |  |          ae             bd
S2004 CS 325 62
Load control and thrashing
• main issues:
– how to choose degree of multiprogramming
– when level decreased, which process should be deactivated
– when new process reactivated, which of its pages should be loaded
S2004 CS 325 63
Load control and thrashing
• choosing degree of multiprogramming
• local replacement:
– working set of any process must be resident
– this automatically imposes a limit
• global replacement
– no working set concept
– use CPU utilization as a criterion
– with too many processes, thrashing occurs
Figure 8-11
S2004 CS 325 64
Fig 8-11: program behavior – system behavior as a function of the degree of multiprogramming for the cases L >> S, L ~ S, and L < S (L: mean time between page faults, S: page fault service time).
S2004 CS 325 65
Load control and thrashing
• how to find Nmax?
– L=S criterion: page fault service time S needs to keep up with mean time between faults L
– 50% criterion: CPU utilization is highest when the paging disk is ~50% busy (found experimentally)
S2004 CS 325 66
Load control and thrashing
• which process to deactivate– lowest priority process– faulting process– last process activated– smallest process– largest process
• which pages to load when a process is activated (Figure 8-12)
– prepage the last resident set
S2004 CS 325 67
Fig 8-12: lifetime curve of a program – at first, page faults are frequent; as the resident set grows, the mean time between page faults (MTBPF) increases; eventually the page fault rate decreases only slowly and the MTBPF grows slowly, so adding further pages has small benefit (→ prepage size).
S2004 CS 325 68
Evaluation of paging
• experimental measurements:
Figure 8-13
(a): number of pages referenced over time
– initial set can be loaded more efficiently than by individual page faults
(b): number of instructions executed within one page
(c): influence of page size for fixed total on page fault rate
(d): influence of available number of pages on page fault rate
S2004 CS 325 69
Fig 8-13: qualitative behavior
S2004 CS 325 70
Conclusions
(a): prepaging is important
– initial set can be loaded more efficiently than by individual page faults
(b,c): suggest page size should be small; however, small pages require
– larger page tables
– more hardware
– greater I/O overhead
(d): load control algorithm is important