TRANSCRIPT
S2004 CS 325 1
Memory Management
Chapters 7 - 8
Real memory – Logical storage – Address spaces – Virtual memory
S2004 CS 325 2
Chapter 7 - Physical Memory
7.1 Preparing a Program for Execution
– Program Transformations
– Logical-to-Physical Address Binding
7.2 Memory Partitioning Schemes
– Fixed Partitions
– Variable Partitions
7.3 Allocation Strategies for Variable Partitions
7.4 Dealing with Insufficient Memory
S2004 CS 325 3
8. Virtual Memory
8.1 Principles of Virtual Memory
8.2 Implementations of Virtual Memory
– Paging
– Segmentation
– Paging With Segmentation
– Paging of System Tables
– Translation Look-aside Buffers
8.3 Memory Allocation in Paged Systems
– Global Page Replacement Algorithms
– Local Page Replacement Algorithms
– Load Control and Thrashing
– Evaluation of Paging
S2004 CS 325 4
Address Binding Time
assign physical addresses = relocation
• static binding
– programming time
– compilation time
– linking time
– loading time
• dynamic binding
– execution time
S2004 CS 325 5
Relocation
• Assignment of real memory
• Programs make use of logical address space
– Code/Data spaces
– Segments
• Real memory is linear
– divided into segments
– subdivided into page frames
• Relocation – Load
– Static (link – bind early)
– Dynamic (link – bind late)
– Hardware supported – MMU and HW caches
S2004 CS 325 6
Fig 7-3a: storage reference with dynamic relocation (general principle) – the CPU issues a logical address; the MMU applies NL_map to produce the physical address used for the data transfer (read/write) with main storage and the I/O subsystem.

NL_map: {relocatable (virtual) addresses} → {real (physical) storage addresses}
S2004 CS 325 7
Address binding
• how to implement dynamic binding
– perform for each address at run time:
  pa = address_map(la)
– la == va ⇒ pa = F(va)
– simplest form of address_map function: relocation register
  pa = la + RR
– more general form: page/segment table (Chapter 8)
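As a small sketch (not part of the original slides), the relocation-register form of address_map can be modeled in a few lines of Python; the limit check and all concrete values are illustrative assumptions:

```python
# Hypothetical sketch of dynamic binding with a relocation register (RR):
# every logical address is mapped at run time as pa = la + RR.
# The limit check and the specific numbers are illustrative assumptions.
UNDEFINED = None

def address_map(la, rr, limit):
    """Translate logical address la using relocation register rr.
    A limit check (common in real MMUs) rejects out-of-range addresses."""
    if la < 0 or la >= limit:
        return UNDEFINED          # addressing error -> trap in a real system
    return la + rr                # pa = la + RR

# A process loaded at physical address 4000 with a 1000-word partition:
print(address_map(250, 4000, 1000))   # -> 4250
print(address_map(1500, 4000, 1000))  # -> None (outside the partition)
```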
S2004 CS 325 8
Partitions - Segments
• Multiprogramming requires that more than one program is kept in main memory
• Memory is divided into logical “partitions”
• Partitions can be of fixed or variable size and be located at specific or selectable locations – segments
• Each partition should provide a private address space for the program/process/thread/object
• Efficient partition management and address binding requires hardware support
• Segments (base and limit registers) and page frames with appropriate relocation support alleviate this task for the operating system.
S2004 CS 325 9
segmented memory – flexible logical address management
Figure: NL_map maps the segments of logical (virtual) memories VM_1 and VM_2 into physical (real) memory.
S2004 CS 325 10
Drawbacks
• Address translation at run time consumes time (TINSTAAFL)
• Programs and data areas grow and shrink dynamically in size
• Partitions are eventually released (freed by a program)
• Newly arriving processes require memory space, i.e. allocation of a partition
• Need to manage free space and allocation of partitions
• May need to rearrange the allocation dynamically
there is no such thing as a free lunch
S2004 CS 325 11
Figure: memory fragmentation – segments are allocated and released, leaving holes H1 and H2; an arriving segment D cannot be placed because |D| > |H1| and |D| > |H2|, even though |D| < |H1| + |H2|.
S2004 CS 325 12
Figure: memory compaction – starting from an initial state with allocated blocks p1, p2, p3 and several free holes, the free space can be coalesced by total compaction (one large hole), partial compaction, or a minimal-move rearrangement.
S2004 CS 325 13
Address Space Management
• Programs have a dynamic calling structure – subroutines, procedures, functions etc.
• Programs also frequently have varying data areas
• A programmer may not know how his/her program behaves
• Measurements have shown that statistically programs tend to exhibit high “locality”
• Using this characteristic feature one can delegate the management of the mapping from logical (“virtual”) addresses to real, physical addresses to a combination of hardware and software (operating system)
• Hence, the user is presented with the illusion of an almost unlimited, private address space!
S2004 CS 325 14
paged memory – flexible physical address management
Figure: the MMU’s NL_map maps the pages (page 0, 1, 2, … x) of logical (virtual) memories VM_1 and VM_2 onto the page frames (frame 0 … frame 7) of physical (real) memory.
S2004 CS 325 15
paging
• The physical memory is subdivided into page frames
• The logical memory is subdivided into pages
• This means that memory addresses are now composed of two parts:
– A page number
– An offset (distance) within the page
• Hardware assistance is required for the binding:
– Page numbers are translated into frame numbers
– The offset value is used as is
• Attention: this is not identical to “virtual” memory!!!
S2004 CS 325 16
Fig 8-2: form of virtual and physical address – va consists of a page number p (|p| bits) followed by an offset w (|w| bits); pa consists of a frame number f (|f| bits) followed by the same offset w (|w| bits).
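The split of va into (p,w) and the construction of pa from (f,w) can be illustrated with bit operations; the 32-bit address and 12-bit offset below are assumptions for illustration, not values from the slides:

```python
# Sketch of the (p,w) / (f,w) address format, assuming a 12-bit offset
# (i.e. 4 KB pages); the sizes are illustrative assumptions.
OFFSET_BITS = 12
OFFSET_MASK = (1 << OFFSET_BITS) - 1

def split_va(va):
    """Return (page number p, offset w) of a virtual address."""
    return va >> OFFSET_BITS, va & OFFSET_MASK

def make_pa(f, w):
    """Concatenate frame number f with the unchanged offset w."""
    return (f << OFFSET_BITS) | w

p, w = split_va(0x3ABC)     # page 3, offset 0xABC
print(p, hex(w))            # -> 3 0xabc
print(hex(make_pa(7, w)))   # -> 0x7abc
```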
S2004 CS 325 17
Fig 8-3: paged virtual memory – NL_map maps each page (page numbers 0 … P-1) of the virtual memory NS to a frame (frame numbers 0 … F-1) of the physical memory LS; virtual address (p,w) becomes physical address (f,w).
S2004 CS 325 18
Principles of virtual memory
• system creates illusion of large contiguous memory space(s) for each process
• relevant portions of VM are loaded automatically and transparently
• address map translates virtual addresses to physical addresses
• Figure 8-1
S2004 CS 325 19
Fig 8-1: principles of virtual memory – NL_map maps the logical (virtual) memories VM_1 and VM_2 into physical (real) memory.
S2004 CS 325 20
Principles of virtual memory
• single-segment VM:
– one area of 0..n-1 words
– divided into fixed-size pages
• multiple-segment VM:
– multiple areas of up to 0..n-1 (words)
– each holds a logical segment (function, data structure)
– each is contiguous or divided into pages
S2004 CS 325 21
Main issues in VM design
• address mapping– how to translate virtual addresses to physical
• placement– where to place a portion of VM needed by process
• replacement– which portion of VM to remove when space is needed
• load control– how much of VM to load at any one time
• sharing– how can processes share portions of their VMs
S2004 CS 325 22
Implementation using paging
• VM is divided into fixed-size pages
• PM is divided into page frames (page size)
• system loads pages into frames and translates addresses
• virtual address: va = (p,w)
• physical address: pa = (f,w)
• |p| determines number of pages in VM
• |f| determines number of frames in PM
• |w| determines page/frame size
Figure 8-2
S2004 CS 325 23
Address translation
• given (p,w):
– determine f from p -- how to keep track?
– concatenate f and w to form pa
• one solution: frame table
– one entry FT[i] for each frame
– FT[i].pid records process ID
– FT[i].page records page number p
– given (id,p,w): search for a match on (id,p); the index f is the frame number
S2004 CS 325 24
Figure: address translation with a frame table (= inverted page table) – for pid = 1 and pid = 2, a virtual address (p,w) is translated by searching the frame table for a match on (pid, p); the matching entry’s index f gives the physical address (f,w) in memory (addresses 00…0 through FFF…F).
S2004 CS 325 25
Address translation
address_map(id, p, w) {
  pa = UNDEFINED;
  for (f = 0; f < F; f++)
    if (FT[f].pid == id && FT[f].page == p)
      pa = f + w;    /* concatenate f and w */
  return pa;
}
S2004 CS 325 26
Fig 8-4: address translation with frame table – the pair (Id, p) is compared (in parallel) against the pid and page columns of frame table FT (entries 0 … F-1); the index f of the matching entry, concatenated with the distance w, forms the frame address for the memory access to page p in frame f.
S2004 CS 325 27
Address translation
Figure 8-4
address_map(id, p, w) {
  pa = UNDEFINED;
  for (f = 0; f < F; f++)
    if (FT[f].pid == id && FT[f].page == p)
      pa = f + w;
  return pa;
}
• drawbacks of frame table implementation
– costly: search must be done in parallel in hardware
– sharing of pages difficult or not possible
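The frame-table (inverted page table) lookup can be sketched as a linear search; the page size and the sample table contents below are assumptions for illustration:

```python
# Minimal model of the frame-table (inverted page table) lookup: FT has
# one entry per physical frame recording (pid, page); translation searches
# for a match on (id, p). PAGE_SIZE and the contents are assumptions.
PAGE_SIZE = 1024
UNDEFINED = None

# FT[f] == (pid, page) currently resident in frame f
FT = [(1, 0), (2, 0), (1, 3), (2, 1)]

def address_map(pid, p, w):
    for f, entry in enumerate(FT):
        if entry == (pid, p):
            return f * PAGE_SIZE + w   # "concatenate" f and w
    return UNDEFINED                   # page fault in a real system

print(address_map(1, 3, 100))   # frame 2 -> 2148
print(address_map(2, 3, 100))   # not resident -> None
```

In hardware the search over all entries runs in parallel (associative memory); the sequential loop here only models the logical behavior.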
S2004 CS 325 28
Page tables
• PT associated with each VM (not PM)
• PTR points at PT at run time
• p’th entry holds frame number of page p:
– *(PTR+p) points to frame f
• address translation:
address_map(p, w) {
  pa = *(PTR+p) + w;
  return pa;
}
Figure 8-5
• drawback: extra memory access
S2004 CS 325 29
Figure: address translation with a page table – each process (pid = 1, pid = 2) has its own page table; for a virtual address (p,w), entry p of the page table yields the frame number f, and (f,w) addresses physical memory (00…0 through FFF…F).
S2004 CS 325 30
Fig 8-5: address translation with page table – PTR locates the current page table; entry p holds the frame address, which is combined with the distance w for the memory access to the page in main memory.
S2004 CS 325 31
Fig 8-6: page table contents
PTR = 1024
address:  1024   1025   1026   1027
contents: 21504  40960  3072   15360
Example: va = (p=2, w=100): *(PTR+2) = *(1026) = 3072, so pa = 3072 + 100 = 3172
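The Fig 8-6 lookup can be checked mechanically; the dict below models only the four page-table words shown on the slide, under the assumption that each entry stores a frame address (frame number × page size 1024):

```python
# Reproducing the Fig 8-6 example: memory word *(PTR+p) holds the frame
# address (frame number x 1024); pa = *(PTR+p) + w. The dict models just
# the four page-table words shown on the slide.
mem = {1024: 21504, 1025: 40960, 1026: 3072, 1027: 15360}
PTR = 1024

def address_map(p, w):
    return mem[PTR + p] + w

# va = (p=2, w=100): *(1026) = 3072, so pa = 3072 + 100 = 3172
print(address_map(2, 100))   # -> 3172
```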
S2004 CS 325 32
Figure: address translation with associative memory – the page number p of virtual address (p,w) is matched in parallel against all entries of the associative memory; the matching entry i yields the frame address, which is combined with w for the memory access to the page in Frame_i.
S2004 CS 325 33
Address Mapping
• Two possible approaches:
– Frame table – 1 entry per real page frame
– Page table – 1 entry per logical/virtual page
• Both solutions depend on searching the table
– Requires hardware support to speed up the process (associative memory – Translation Lookaside Buffer)
• Both solutions need 2 memory accesses per memory reference unless associative memory is used for the translation table.
S2004 CS 325 34
Demand paging
• all pages of VM can be loaded initially
– simple, but maximum size of VM = size of PM
• pages are loaded as needed -- on demand
– additional bit in PT indicates presence/absence
– page fault occurs when page is absent
address_map(p, w) {
  if (resident(*(PTR+p))) {
    pa = *(PTR+p) + w;
    return pa;
  }
  else page_fault;
}
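A demand-paging sketch of the logic above: a page-table entry carries a resident bit, a miss raises a page fault, and a trivial handler loads the page and retries. The PageFault class, the frame numbers, and the load policy are invented for illustration:

```python
# Demand-paging sketch: resident bit per page-table entry; a miss raises
# a page fault, the handler loads the page, and the access is retried.
PAGE_SIZE = 1024

class PageFault(Exception):
    pass

PT = {0: {"resident": True, "frame": 5},
      1: {"resident": False, "frame": None}}

def address_map(p, w):
    entry = PT[p]
    if not entry["resident"]:
        raise PageFault(p)
    return entry["frame"] * PAGE_SIZE + w

def access(p, w):
    try:
        return address_map(p, w)
    except PageFault as fault:
        missing = fault.args[0]
        PT[missing] = {"resident": True, "frame": 7}  # load into a free frame (assumed)
        return address_map(p, w)

print(access(0, 10))   # resident: 5*1024+10 = 5130
print(access(1, 10))   # fault, page loaded into frame 7 -> 7178
```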
S2004 CS 325 35
VM using segmentation
• multiple contiguous spaces– more natural match to program/data structure– easier sharing (Chapter 9)
• va = (s,w) mapped to pa (but no frames)
• where/how are segments placed in PM?
– contiguous versus paged allocation
S2004 CS 325 36
Contiguous allocation
• each segment is contiguous in PM
• ST keeps track of starting locations
• STR points to ST
• address translation:
address_map(s, w) {
  if (resident(*(STR+s))) {
    pa = *(STR+s) + w;
    return pa;
  }
  else segment_fault;
}
• drawback: external fragmentation
S2004 CS 325 37
Paging with segmentation
• each segment is divided into fixed-size pages
• va = (s,p,w)
|s| determines # of segments (size of ST)
|p| determines # of pages per seg. (size of PT)
|w| determines page size
• pa = *(*(STR+s)+p)+w
Figure 8-7
• drawback: 2 extra memory references
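The nested indirection pa = *(*(STR+s)+p)+w can be modeled with nested lists: the segment table selects a page table, which selects a frame address. All table contents below are invented for illustration:

```python
# Two-level translation pa = *(*(STR+s)+p)+w modeled with nested lists:
# ST[s] selects a page table, PT[p] a frame address. Contents are invented.
PAGE_SIZE = 1024

PT0 = [3 * PAGE_SIZE, 7 * PAGE_SIZE]   # page table of segment 0
PT1 = [5 * PAGE_SIZE]                  # page table of segment 1
ST = [PT0, PT1]                        # segment table

def address_map(s, p, w):
    return ST[s][p] + w                # two extra memory references

print(address_map(0, 1, 50))   # 7*1024+50 = 7218
print(address_map(1, 0, 0))    # -> 5120
```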
S2004 CS 325 38
Fig 8-7: address translation with page and segment table – STR locates the segment table; entry s points to the page table of segment s; entry p of that page table locates the memory page, and w selects the word within it.
S2004 CS 325 39
Paging of system tables
• ST or PT may be too large to keep in PM
– divide ST or PT into pages
– keep track by additional page table
• paging of ST
– ST divided into pages
– segment directory keeps track of ST pages
– va = (s1,s2,p,w)
– pa = *(*(*(STR+s1)+s2)+p)+w
Figure 8-8
S2004 CS 325 40
Fig 8-8: address translation with paged segment table – STR locates the segment directory; s1 selects a page of the (paged) segment table, s2 the segment entry within it, which points to the page table; p locates the memory page and w the word within it.
S2004 CS 325 41
Fig 8-9a: address translation in the Pentium processor – segmentation only: a 14-bit segment register selects entry s of the segment table, whose 32-bit base is added to the 32-bit offset.
S2004 CS 325 42
Fig 8-9b: address translation in the Pentium processor – segmentation with paging: the segment base plus offset yields a linear address divided into p1 (10 bits), p2 (10 bits), and w (12 bits); p1 indexes the page directory, p2 the selected page of the page table, and w the word within the program/data page.
S2004 CS 325 43
Translation look-aside buffers
• to avoid additional memory accesses
– keep most recently translated page numbers in associative memory: for any (s,p,*) keep (s,p) and frame# f
– bypass translation if match on (s,p) found
Figure 8-10
• TLB is different from cache
– TLB only keeps frame #s
– cache keeps data values
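A TLB can be sketched as a small cache of (s,p) → f translations with LRU replacement; the two-entry capacity and the table contents below are assumptions for illustration:

```python
# TLB sketch: recently translated (s,p) pairs are cached; a hit bypasses
# the full table walk. Tiny capacity and contents are assumptions.
from collections import OrderedDict

PAGE_SIZE = 1024
tables = {(0, 0): 3, (0, 1): 7, (1, 0): 5}   # (s,p) -> frame: the "slow path"
tlb = OrderedDict()                          # LRU-ordered, capacity 2
hits = misses = 0

def translate(s, p, w):
    global hits, misses
    if (s, p) in tlb:
        hits += 1
        tlb.move_to_end((s, p))              # update LRU replacement info
    else:
        misses += 1
        if len(tlb) == 2:
            tlb.popitem(last=False)          # evict least recently used
        tlb[(s, p)] = tables[(s, p)]         # full table walk
    return tlb[(s, p)] * PAGE_SIZE + w

translate(0, 0, 0); translate(0, 0, 4)       # miss, then hit
print(hits, misses)                          # -> 1 1
```

Note the difference stated above: the TLB entry stores only the frame number, not the data word itself.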
S2004 CS 325 44
Fig 8-10: translation lookaside buffer – each entry holds a segment/page number (s,p), the frame number f, and replacement info (LRU); a parallel match on the (s,p) part of the virtual address yields f directly, which is combined with w for the access to the pages in main memory.
S2004 CS 325 45
Memory allocation with paging
• placement policy: any free frame is ok
• replacement: must minimize data movement
• global replacement:
– consider all resident pages (regardless of owner)
• local replacement
– consider only pages of faulting process
• how to compare different algorithms:
– use reference string: r0 r1 ... rt …
  rt is the number of the page referenced at time t
– count # of page faults
S2004 CS 325 46
Global page replacement
• optimal (MIN): replace page that will not be referenced for the longest time in the future
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     |  | c  a  d  b  e  b  a  b  c  d
Frame 0| a| a  a  a  a  a  a  a  a  a  d
Frame 1| b| b  b  b  b  b  b  b  b  b  b
Frame 2| c| c  c  c  c  c  c  c  c  c  c
Frame 3| d| d  d  d  d  e  e  e  e  e  e
IN     |  |             e              d
OUT    |  |             d              a
• problem: RS not known in advance
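Although MIN is unrealizable online, it can be simulated once the whole reference string is given; this sketch reproduces the example above (the tie-breaking rule for pages never referenced again is an assumption):

```python
# Simulating the optimal (MIN) policy: on a fault, evict the resident
# page whose next reference lies farthest in the future.
def opt(rs, frames):
    frames = list(frames)
    faults, victims = 0, []
    for t, page in enumerate(rs):
        if page in frames:
            continue
        faults += 1
        def next_use(q):
            # index of q's next reference; never used again -> infinity
            later = [i for i in range(t + 1, len(rs)) if rs[i] == q]
            return later[0] if later else float("inf")
        victim = max(frames, key=next_use)   # ties: first frame wins (assumed)
        victims.append(victim)
        frames[frames.index(victim)] = page
    return faults, victims

rs = list("cadbebabcd")
print(opt(rs, list("abcd")))   # -> (2, ['d', 'a'])  as in the table
```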
S2004 CS 325 47
Global page replacement
• random replacement
– simple but
– does not exploit the locality of reference:
  most instructions are sequential (branch instructions typically < ~10%)
  most loops are short (spanning at most a few pages)
  many data structures are accessed sequentially
  (but overall arrangements can be critical – “wrap around” for column/row access, phased bank structure)
S2004 CS 325 48
Global page replacement
• FIFO: replace oldest page
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     |  | c  a  d  b  e  b  a  b  c  d
Frame 0|>a|>a >a >a >a  e  e  e  e >e  d
Frame 1| b| b  b  b  b >b >b  a  a  a >a
Frame 2| c| c  c  c  c  c  c >c  b  b  b
Frame 3| d| d  d  d  d  d  d  d >d  c  c
IN     |  |             e     a  b  c  d
OUT    |  |             a     b  c  d  e
• problem:
– favors recently accessed pages but
– ignores when program returns to old pages
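The FIFO example can be replayed in a few lines (a sketch, not from the slides); it reproduces the 5 faults and the IN/OUT rows of the table:

```python
# FIFO simulation: evict the page loaded longest ago.
from collections import deque

def fifo(rs, frames):
    resident = set(frames)
    queue = deque(frames)          # load order: left end = oldest
    faults, victims = 0, []
    for page in rs:
        if page in resident:
            continue               # hit: FIFO ignores re-references
        faults += 1
        old = queue.popleft()
        victims.append(old)
        resident.discard(old)
        resident.add(page)
        queue.append(page)
    return faults, victims

print(fifo(list("cadbebabcd"), list("abcd")))
# -> (5, ['a', 'b', 'c', 'd', 'e'])
```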
S2004 CS 325 49
Global page replacement
• LRU: replace least recently used page
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     |  | c  a  d  b  e  b  a  b  c  d
Frame 0| a| a  a  a  a  a  a  a  a  a  a
Frame 1| b| b  b  b  b  b  b  b  b  b  b
Frame 2| c| c  c  c  c  e  e  e  e  e  d
Frame 3| d| d  d  d  d  d  d  d  d  c  c
IN     |  |             e           c  d
OUT    |  |             c           d  e
Q.end  | d| c  a  d  b  e  b  a  b  c  d
       | c| d  c  a  d  b  e  b  a  b  c
       | b| b  d  c  a  d  d  e  e  a  b
Q.head | a| a  b  b  c  a  a  d  d  e  a
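A small simulation (a sketch, not from the slides) replays the LRU example, maintaining the recency queue explicitly; it reproduces the 3 faults and the IN/OUT rows:

```python
# LRU simulation: evict the least recently used page.
def lru(rs, frames):
    stack = list(frames)           # front = least recently used (Q.head)
    faults, victims = 0, []
    for page in rs:
        if page in stack:
            stack.remove(page)     # hit: will move to most-recent end
        else:
            faults += 1
            victims.append(stack.pop(0))   # evict Q.head
        stack.append(page)         # referenced page becomes Q.end
    return faults, victims

print(lru(list("cadbebabcd"), list("abcd")))
# -> (3, ['c', 'd', 'e'])
```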
S2004 CS 325 50
Global page replacement
• LRU implementation
– software queue: too expensive
– time-stamping
  stamp each referenced page with current time
  replace page with oldest stamp
– hardware capacitor with each frame
  charge at reference
  replace page with smallest charge
– n-bit aging register with each frame
  shift all registers to right at every reference
  set left-most bit of referenced page to 1
  replace page with smallest value
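The n-bit aging-register variant can be sketched as follows; the 4-bit width and the short reference trace are illustrative assumptions:

```python
# Aging-register sketch: at every reference all registers shift right and
# the referenced page's leftmost bit is set; the page with the smallest
# register value is the replacement candidate.
BITS = 4

def age(trace, pages):
    reg = {q: 0 for q in pages}
    for ref in trace:
        for q in reg:
            reg[q] >>= 1                    # every page ages
        reg[ref] |= 1 << (BITS - 1)         # referenced page rejuvenated
    return reg

reg = age(["a", "b", "a", "c"], ["a", "b", "c"])
print(reg)                                  # a=5 (0101), b=2 (0010), c=8 (1000)
print(min(reg, key=reg.get))                # -> b  (replacement victim)
```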
S2004 CS 325 51
Global page replacement
• second-chance algorithm
– approximates LRU
– implement use-bit u with each frame
– u=1 when page referenced
– to select a page:
  if u==0, select page
  else, set u=0 and consider next frame
– used page gets a second chance to stay in PM
• algorithm is called “clock” algorithm:– search cycles through page frames
S2004 CS 325 52
Global page replacement
• second-chance algorithm
Time t  …  4    5    6    7    8    9    10
RS      …  b    e    b    a    b    c    d
Frame 0 … >a/1  e/1  e/1  e/1  e/1 >e/1  d/1
Frame 1 …  b/1 >b/0 >b/1  b/0  b/1  b/1 >b/0
Frame 2 …  c/1  c/0  c/0  a/1  a/1  a/1  a/0
Frame 3 …  d/1  d/0  d/0 >d/0 >d/0  c/1  c/0
IN      …       e         a         c    d
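The clock search can be simulated directly (a sketch, not from the slides); starting from the state at t=4 above (all use-bits set, hand at frame 0), it reproduces the 4 faults of the trace:

```python
# Clock (second-chance) simulation: one use-bit per frame; the hand
# clears u-bits until it finds u == 0, which selects the victim.
def clock(rs, frames):
    pages = list(frames)
    u = [1] * len(pages)           # all referenced initially (as at t=4)
    hand = 0
    faults, victims = 0, []
    for page in rs:
        if page in pages:
            u[pages.index(page)] = 1   # hit: mark as used
            continue
        faults += 1
        while u[hand] == 1:        # used pages get a second chance
            u[hand] = 0
            hand = (hand + 1) % len(pages)
        victims.append(pages[hand])
        pages[hand], u[hand] = page, 1
        hand = (hand + 1) % len(pages)
    return faults, victims

print(clock(list("ebabcd"), list("abcd")))
# -> (4, ['a', 'c', 'd', 'e'])
```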
S2004 CS 325 53
Global page replacement
• third-chance algorithm
– second chance makes no difference between read and write access
– write access more expensive
– give modified pages a third chance:
  u-bit set at every reference (read and write)
  w-bit set at write reference
  to select a page, cycle through frames, resetting bits, until uw==00:
  uw → uw
  1 1 → 0 1
  1 0 → 0 0   "cheapest" candidate (cached?)
  0 1 → 0 0 * (remember modification)
  0 0 → select
S2004 CS 325 54
Global page replacement
• third-chance algorithm
Time t  … 0     | 1     2     3     4     5
RS      …       | c     aw    d     bw    e
Frame 0 … >a/10 |>a/10 >a/11 >a/11 >a/11  a/00*
Frame 1 …  b/10 | b/10  b/10  b/10  b/11  b/00*
Frame 2 …  c/10 | c/10  c/10  c/10  c/10  e/10
Frame 3 …  d/10 | d/10  d/10  d/10  d/10 >d/00
IN      …       |                         e
S2004 CS 325 55
Local page replacement
• measurements indicate that every program needs a “minimum” set of pages
– if too few, thrashing occurs
– if too many, page frames are wasted
• the “minimum” varies over time
• how to determine and implement this “minimum”?
S2004 CS 325 56
Local page replacement
• optimal (VMIN)
– define a sliding window (t, t+τ)
– τ is a parameter (constant)
– at any time t, maintain as resident all pages visible in the window
• guaranteed to generate the smallest number of page faults
S2004 CS 325 57
Local page replacement
• optimal (VMIN) with τ=3
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     | d| c  c  d  b  c  e  c  e  a  d
Page a | -| -  -  -  -  -  -  -  -  √  -
Page b | -| -  -  -  √  -  -  -  -  -  -
Page c | -| √  √  √  √  √  √  √  -  -  -
Page d | √| √  √  √  -  -  -  -  -  -  √
Page e | -| -  -  -  -  -  √  √  √  -  -
IN     |  | c        b     e        a  d
OUT    |  |          d  b        c  e  a
• guaranteed optimal but
• unrealizable without RS
S2004 CS 325 58
Local page replacement
• working set model: use principle of locality
– use trailing window (instead of future window)
– working set W(t,τ): all pages referenced during (t-τ,t)
– at time t:
  remove all pages not in W(t,τ)
  process may run only if entire W(t,τ) is resident
S2004 CS 325 59
Local page replacement
• working set model
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     | a| c  c  d  b  c  e  c  e  a  d
Page a | √| √  √  √  -  -  -  -  -  √  √
Page b | -| -  -  -  √  √  √  √  -  -  -
Page c | -| √  √  √  √  √  √  √  √  √  √
Page d | √| √  √  √  √  √  √  -  -  -  √
Page e | √| √  -  -  -  -  √  √  √  √  √
IN     |  | c        b     e        a  d
OUT    |  |    e     a        d  b
• drawback: costly to implement
• approximate (aging registers, time stamps)
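Computing a working set over a trailing window can be sketched in one line; the 1-indexed time convention below is an assumption about how to read W(t,τ):

```python
# Working-set sketch: W(t, tau) = the set of pages referenced during the
# last tau references (a trailing window). Indexing is an assumption.
def working_set(rs, t, tau):
    """Pages referenced at times t-tau+1 .. t (rs is 1-indexed by time)."""
    return set(rs[max(0, t - tau):t])

rs = list("ccdbce")
print(sorted(working_set(rs, 6, 3)))   # last 3 refs b, c, e -> ['b', 'c', 'e']
```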
S2004 CS 325 60
Local page replacement
• page fault frequency
– main objective: keep page fault rate low
– basic principle of pff:
  if time between page faults ≤ τ, grow resident set: add new page to resident set
  if time between page faults > τ, shrink resident set: add new page but remove all pages not referenced since last page fault
S2004 CS 325 61
Local page replacement
• page fault frequency
Time t | 0| 1  2  3  4  5  6  7  8  9  10
RS     |  | c  c  d  b  c  e  c  e  a  d
Page a | √| √  √  √  -  -  -  -  -  √  √
Page b | -| -  -  -  √  √  √  √  √  -  -
Page c | -| √  √  √  √  √  √  √  √  √  √
Page d | √| √  √  √  √  √  √  √  √  -  √
Page e | √| √  √  √  -  -  √  √  √  √  √
IN     |  | c        b     e        a  d
OUT    |  |          ae             bd
S2004 CS 325 62
Load control and thrashing
• main issues:
– how to choose degree of multiprogramming
– when level decreased, which process should be deactivated
– when new process reactivated, which of its pages should be loaded
S2004 CS 325 63
Load control and thrashing
• choosing degree of multiprogramming
• local replacement:
– working set of any process must be resident
– this automatically imposes a limit
• global replacement
– no working set concept
– use CPU utilization as a criterion
– with too many processes, thrashing occurs
Figure 8-11
S2004 CS 325 64
Fig 8-11: program behavior – system behavior as a function of the degree of multiprogramming for the cases L >> S, L ~ S, and L < S (L: mean time between page faults, S: page fault service time).
S2004 CS 325 65
Load control and thrashing
• how to find Nmax?
– L=S criterion: page fault service time S needs to keep up with mean time between faults L
– 50% criterion: CPU utilization is highest when the paging disk is ~50% busy (found experimentally)
S2004 CS 325 66
Load control and thrashing
• which process to deactivate– lowest priority process– faulting process– last process activated– smallest process– largest process
• which pages to load when a process is activated (Figure 8-12)
– prepage the last resident set
S2004 CS 325 67
Fig 8-12: lifetime curve of a program – at first, page faults are frequent; as the resident set grows, the mean time between page faults (MTBPF) increases; eventually the page fault rate decreases only slowly and the MTBPF grows slowly, so adding further pages has small benefit (→ prepage size).
S2004 CS 325 68
Evaluation of paging
• experimental measurements:
Figure 8-13
(a): number of pages referenced over time
– initial set can be loaded more efficiently than by individual page faults
(b): number of instructions executed within one page
(c): influence of page size for fixed total on page fault rate
(d): influence of available number of pages on page fault rate
S2004 CS 325 69
Fig 8-13: qualitative behavior
S2004 CS 325 70
Conclusions
(a): prepaging is important
– initial set can be loaded more efficiently than by individual page faults
(b,c): suggest page size should be small; however, small pages require
– larger page tables
– more hardware
– greater I/O overhead
(d): load control algorithm is important