10 Virtual Memory

Uploaded by sammy17 on 11-Nov-2014.

TRANSCRIPT

Page 1: 10 Virtual Memory 2 Contents

10 Virtual Memory

Page 2: Contents

– Background
– Demand Paging
– Process Creation
– Page Replacement
– Allocation of Frames
– Thrashing
– Operating System Examples
– Other Considerations

Page 3: 10.1 Background

All memory management algorithms outlined so far require that all of the processes are kept in physical memory:
– contiguous allocation
– paging
– segmentation

Overlay is an exception, but it requires special effort from the programmer: explicit memory allocation and free operations in the application program.

Page 4: Background

Real programs, in many cases, do not need the entire program kept in memory:
– code to handle unusual error conditions
– arrays, lists, and tables are often sparse
– even when the entire program is needed, it may not all be needed at the same time (such is the case with overlays)

Page 5: Background

Conclusion: if we can keep the program only partially in memory,
– we can run a program larger than the physical memory
– more programs can run at the same time

Page 6: Virtual Memory

Virtual memory: separation of user logical memory from physical memory; allows an extremely large virtual memory to be provided to programmers when only a smaller physical memory is available.
– only part of the program needs to be in memory for execution
– logical address space can therefore be much larger than physical address space
– allows address spaces to be shared by several processes
– allows for more efficient process creation

Page 7: Virtual Memory Is Larger Than Physical Memory

Page 8: Common Implementation

Virtual memory can be implemented via:
– demand paging
– demand segmentation

Several systems provide a paged segmentation scheme, i.e., the user view is segmentation, but the OS can implement this view with demand paging.

Page 9: 10.2 Demand Paging

A demand-paging system is similar to a paging system with swapping.

A lazy swapper never swaps a page into memory unless that page will be needed.

Compare:
– a swapper manipulates entire processes
– a pager is concerned with the individual pages of a process

Page 10: 10.2.1 Basic Concepts

Use disk space to emulate memory:
– a special partition/volume, e.g. Unix, Linux
– a special file, e.g. the Windows family

Page 11: Transfer of a Paged Memory to Contiguous Disk Space

Page 12: Valid-Invalid Bit in the Page Table

Hardware support indicates whether a specific page is in memory:
– if it is in memory, the valid bit is set
– otherwise, the valid bit is reset (i.e. invalid)

Page 13: Page Table When Some Pages Are Not in Main Memory

Six logical pages, with three pages in memory and three not.

Page 14: Page Fault

If all references access pages already in memory, the process runs exactly as though we had brought in all pages.

If there is ever a reference to a page not in memory, the first reference traps to the OS: a page fault.

Page 15: Page Fault Handling

The OS looks at another table to decide:
– invalid reference: abort
– just not in memory: handle the fault:
  1. find a free frame
  2. swap the page into the frame
  3. reset tables, set the validation bit = 1
  4. restart the instruction

Page 16: Steps in Handling a Page Fault

Page 17: Two Types of Demand Paging

Pure demand paging:
– never bring a page into memory until it is required

Demand paging with anticipation:
– pre-load some pages in anticipation of their use

Page 18: 10.2.2 Performance of Demand Paging

Page-fault rate p: 0 ≤ p ≤ 1.0
– if p = 0, no page faults
– if p = 1, every reference is a fault

Effective Access Time (EAT):
EAT = (1 – p) × memory access time
    + p × (page-fault service time + [swap the page out] + swap the page in + restart overhead)

– (see p. 326 for detail)

Page 19: Performance Example

memory access time = 100 nanoseconds
page swap time = 25 milliseconds

EAT = (1 – p) × 100 + p × 25,000,000 = 100 + 24,999,900 × p
– performance depends heavily on the page-fault rate (probability) p
– e.g. for less than 10% performance loss we need p < 0.0000004 (see p. 327 for detail)
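The slide's arithmetic can be checked with a small sketch (the helper name `effective_access_time` is mine, not from the slides); times are in nanoseconds.

```python
# EAT calculation from the slide, with times in nanoseconds.
MEMORY_ACCESS_NS = 100
PAGE_FAULT_SERVICE_NS = 25_000_000  # 25 milliseconds

def effective_access_time(p: float) -> float:
    """Effective access time for page-fault probability p."""
    return (1 - p) * MEMORY_ACCESS_NS + p * PAGE_FAULT_SERVICE_NS

# With no faults, EAT equals the plain memory access time.
print(effective_access_time(0.0))   # 100.0

# To keep the slowdown under 10% (EAT < 110 ns), solve 100 + 24_999_900 * p < 110:
p_bound = 10 / (PAGE_FAULT_SERVICE_NS - MEMORY_ACCESS_NS)
print(f"{p_bound:.1e}")             # 4.0e-07
```

This makes the slide's point concrete: even one fault in roughly 2.5 million accesses costs 10% of performance.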

Page 20: 10.3 Process Creation

In the extreme case, a process may be started with no pages in memory:
– fast start

Virtual memory allows other benefits during process creation:
– copy-on-write
– memory-mapped files

Page 21: 10.3.1 Copy-on-Write

Copy-on-write allows both parent and child processes to initially share the same pages in memory.

If either process modifies a shared page, only then is the page copied.
– Unix, Linux: fork( ) followed by exec( )

Copy-on-write allows more efficient process creation, as only modified pages are copied.

Currently used by Windows 2000, Linux, and Solaris 2.

Page 22: Copy-on-Write

Free pages are allocated from a pool of zeroed-out pages, as for the stack or heap.

vfork( ) system call in some versions of UNIX (e.g. Solaris 2):
– study the manual page of vfork( )

Page 23: 10.3.2 Memory-Mapped Files

Memory-mapped file I/O allows file I/O to be treated as routine memory access by mapping a disk block to a page in memory.

A file is initially read using demand paging: a page-sized portion of the file is read from the file system into a physical page. Subsequent reads/writes to/from the file are treated as ordinary memory accesses.

Simplifies file access by treating file I/O through memory rather than read( ) and write( ) system calls.

Also allows several processes to map the same file, allowing the pages in memory to be shared.
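A short demonstration of the idea using Python's standard-library `mmap` module; the scratch file and its contents are hypothetical, chosen only for illustration.

```python
# Memory-mapped file I/O: reads and writes become ordinary slice accesses.
import mmap
import os
import tempfile

# Create a scratch file to map.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"hello, paging world")

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mm:   # map the whole file
        print(mm[:5])                      # read via memory access: b'hello'
        mm[0:5] = b"HELLO"                 # write via memory access
        mm.flush()                         # push dirty pages back to the file

with open(path, "rb") as f:
    content = f.read()
print(content)                             # b'HELLO, paging world'
os.unlink(path)
```

The write through `mm[0:5]` never calls write( ); the VM system carries the change back to the file, exactly the mechanism the slide describes.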

Page 24: Memory-Mapped Files

Page 25: 10.4 Page Replacement

One process may be bigger than the physical memory.

The total memory of all processes may be bigger than the physical memory:
– over-allocating memory

Solution:
– swapping
– page replacement: prevent over-allocation of memory by modifying the page-fault service routine to include page replacement

Page 26: Need for Page Replacement

Page 27: 10.4.1 Basic Scheme

1. find the location of the desired page on disk
2. find a free frame:
   - if there is a free frame, use it
   - if there is no free frame, use a page-replacement algorithm to select a victim frame
   - write the victim page to the disk
3. read the desired page into the (newly) free frame; update the page and frame tables
4. restart the process

Page 28: Page Replacement

Page 29: Page Replacement

If no frames are free, two page transfers are required.

Use the modify (dirty) bit to reduce the overhead of page transfers:
– only modified pages are written to disk
– the modify bit for a page is set by the hardware whenever any word or byte in the page is written

Page 30: Page Replacement Algorithms

Two problems to solve to implement demand paging:
– frame-allocation algorithm: how many frames to allocate to each process
– page-replacement algorithm: how to select the frames that are to be replaced

Expensive disk I/O makes good design of these algorithms an important task: we want the lowest page-fault rate.

Page 31: Reference String

There are many algorithms; how do we evaluate them?
– lowest page-fault rate

We must evaluate an algorithm by running it on a particular string of memory references:
– (page) reference string: page numbers only, with adjacent duplicates eliminated

Page 32: Reference String Example

Original memory access sequence:
0100, 0432, 0101, 0612, 0102, 0103, 0104, 0101, 0611, 0102, 0103, 0104, 0101, 0610, 0102, 0103, 0104, 0101, 0609, 0102, 0105

Page reference sequence (page size 100):
1, 4, 1, 6, 1, 1, 1, 1, 6, 1, 1, 1, 1, 6, 1, 1, 1, 1, 6, 1, 1

Reference string (adjacent duplicates eliminated):
1, 4, 1, 6, 1, 6, 1, 6, 1, 6, 1
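The reduction above is mechanical and can be sketched in a few lines (page size 100, as implied by the addresses; variable names are mine):

```python
# Raw addresses -> page numbers -> reference string without adjacent duplicates.
PAGE_SIZE = 100

accesses = [100, 432, 101, 612, 102, 103, 104, 101, 611, 102, 103, 104,
            101, 610, 102, 103, 104, 101, 609, 102, 105]

pages = [addr // PAGE_SIZE for addr in accesses]

# Drop adjacent duplicates to obtain the reference string.
ref_string = [p for i, p in enumerate(pages) if i == 0 or p != pages[i - 1]]

print(pages)       # [1, 4, 1, 6, 1, 1, 1, 1, 6, 1, 1, 1, 1, 6, 1, 1, 1, 1, 6, 1, 1]
print(ref_string)  # [1, 4, 1, 6, 1, 6, 1, 6, 1, 6, 1]
```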

Page 33: Page Faults Versus Number of Frames

For the given reference string: 1, 4, 1, 6, 1, 6, 1, 6, 1, 6, 1
– if we have three (or more) frames: 3 page faults
– if we have only one frame: 11 page faults

Page 34: 10.4.2 FIFO Page Replacement

First-In-First-Out algorithm:
– use a FIFO queue to hold all pages in memory
– when a page is brought into memory, insert it at the tail
– when a free frame is needed, replace the page at the head of the queue

Total: 15 page faults.

Page 35: Example 2: Belady's Anomaly

Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5

3 frames (3 pages can be in memory at a time per process): 9 page faults

4 frames: 10 page faults

FIFO replacement exhibits Belady's anomaly:
– we expect more frames to give fewer page faults, but here adding a frame increases them
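A compact FIFO simulator (the helper `fifo_faults` is my name, not from the slides) reproduces both fault counts and the anomaly on this reference string:

```python
# FIFO page replacement: evict the page that has been resident longest.
from collections import deque

def fifo_faults(refs, n_frames):
    """Count page faults for FIFO replacement with n_frames frames."""
    frames = deque()            # head = oldest page
    faults = 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:
                frames.popleft()    # evict the page that came in first
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9
print(fifo_faults(refs, 4))  # 10 -- more frames, yet more faults
```

The same function gives 15 faults with 3 frames on the textbook's 20-reference string, matching the count on the previous slide.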

Page 36: 10.4.3 Optimal Page Replacement

Replace the page that will not be used for the longest period of time.

Difficult to implement, because it requires future knowledge of the reference string.

Total: 9 page faults.

Page 37: Example 2

4-frame example, reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5: 6 page faults

The optimal algorithm is used as a yardstick for measuring how well other algorithms perform.
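The clairvoyant rule is easy to simulate after the fact, since a trace gives us the "future" (helper name `optimal_faults` is mine):

```python
# Optimal replacement: on a fault with no free frame, evict the resident
# page whose next use lies farthest in the future (or never occurs).
def optimal_faults(refs, n_frames):
    frames = set()
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == n_frames:
            def next_use(p):
                # Index of p's next reference; infinity if never used again.
                try:
                    return refs.index(p, i + 1)
                except ValueError:
                    return float("inf")
            frames.remove(max(frames, key=next_use))
        frames.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(optimal_faults(refs, 4))  # 6
```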

Page 38: 10.4.4 Least Recently Used (LRU)

LRU replacement associates with each page the time of that page's last use.

When a page must be replaced, LRU chooses the page that has not been used for the longest period of time.

Total: 12 page faults.

Page 39: Example 2

Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5

With 4 frames: 8 page faults.
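An LRU simulator can keep the recency order in an `OrderedDict`, front = least recently used (helper name `lru_faults` is mine):

```python
# LRU page replacement: evict the page unused for the longest time.
from collections import OrderedDict

def lru_faults(refs, n_frames):
    frames = OrderedDict()   # insertion order tracks recency of use
    faults = 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)          # mark as most recently used
        else:
            faults += 1
            if len(frames) == n_frames:
                frames.popitem(last=False)    # evict least recently used
            frames[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(refs, 4))  # 8
```

Note how LRU sits between FIFO (10 faults) and optimal (6 faults) on this string with 4 frames.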

Page 40: LRU Implementation

Counter implementation:
– every page-table entry has a time-of-use field
– a logical clock is incremented on every memory reference
– every time a page is referenced, copy the clock into the time-of-use field
– when a page needs to be replaced, look for the page with the smallest time-of-use field

Difficulties:
– requires a search of the page table
– a write to memory (the time-of-use field in the page table) for each memory access
– the page table must be maintained across context switches
– overflow of the clock

Page 41: LRU Implementation

Stack implementation:
– keep a stack of page numbers in a doubly linked list
– when a page is referenced, move it to the top (requires 6 pointers to be changed)
– always replace the page at the bottom: no search for replacement

Page 42: Stack Implementation Example

Page 43: Hardware Support Needed!

Neither optimal replacement nor LRU replacement suffers from Belady's anomaly.

Both implementations of LRU need special hardware support, or performance will suffer by a factor of at least ten (e.g. if done via interrupts):
– the updating of the clock fields or stack must be done for every memory reference

Page 44: 10.4.5 LRU Approximation

Reference bit:
– with each page, associate a bit, initially = 0
– when the page is referenced, the bit is set to 1
– replace a page whose bit is 0 (if one exists)
– we do not know the order of use, however

Additional reference bits:
– record the reference bits at regular intervals, say, every 100 ms
– keep a right-shifting history byte for each page; the current reference bit shifts into the left-most bit
– choose the page with the lowest number

(see Section 10.4.5.1 on page 341 for detail)
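The history-byte bookkeeping can be sketched as follows (the `tick` helper and the page names are mine, for illustration only):

```python
# Additional-reference-bits LRU approximation: at each timer interval the
# current reference bit is shifted into the left of an 8-bit history byte;
# the page with the lowest byte value is the replacement candidate.
def tick(history, referenced):
    """Shift each page's current reference bit into its 8-bit history byte."""
    return {page: ((history[page] >> 1) | (0x80 if referenced[page] else 0)) & 0xFF
            for page in history}

history = {"A": 0b0000_0000, "B": 0b0000_0000}
history = tick(history, {"A": True,  "B": False})   # A: 1000_0000, B: 0000_0000
history = tick(history, {"A": False, "B": True})    # A: 0100_0000, B: 1000_0000

# A's most recent use is older, so its byte is smaller: A is the victim.
print(min(history, key=history.get))  # A
```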

Page 45: LRU Approximation

Second-chance algorithm:
– basically a FIFO algorithm plus a reference bit
– all frames form a circular queue
– inspect the current frame:
  - if the reference bit is set, reset it and skip to the next frame
  - otherwise, replace it
– if a frame is referenced frequently enough, it will never get replaced
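A minimal sketch of the victim search (the function name and frame layout are mine; a real kernel would interleave this with fault handling):

```python
# Second-chance (clock) victim selection: advance the hand around the
# circular queue, clearing set reference bits, until a clear bit is found.
def second_chance_victim(ref_bits, hand):
    """Return (victim_index, new_hand_position, updated_bits)."""
    bits = list(ref_bits)
    n = len(bits)
    while True:
        if bits[hand]:
            bits[hand] = 0          # give this page a second chance
            hand = (hand + 1) % n
        else:
            return hand, (hand + 1) % n, bits

# Frames 0 and 1 were recently referenced; frame 2 was not.
victim, hand, bits = second_chance_victim([1, 1, 0], hand=0)
print(victim)  # 2
print(bits)    # [0, 0, 0]
```

Note the slide's caveat in action: if frame 2's bit had also been set, the hand would have cleared all three bits and wrapped around to evict frame 0.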

Page 46: Second-Chance Algorithm

Page 47: LRU Approximation

Enhanced second-chance algorithm:
– a FIFO circular queue
– reference bit
– modify bit (dirty bit)

Four classes (reference, modify):
– (0, 0) neither recently used nor modified
– (0, 1) not recently used, but modified (dirty)
– (1, 0) recently used, but clean
– (1, 1) recently used and modified

Examine the class to which a page belongs; replace the first page encountered in the lowest nonempty class.
– drawback: may have to scan the circular queue several times
– used in the Macintosh

Page 48: 10.4.6 Counting-Based Algorithms

Keep a counter of the number of references to each page.

Least-frequently-used (LFU) algorithm:
– the page with the smallest count is replaced
– the counter is shifted right at regular intervals: an exponentially decaying average

Most-frequently-used (MFU) algorithm:
– based on the argument that the page with the smallest count was probably just brought in and has yet to be used

Page 49: 10.5 Allocation of Frames

Each process needs a minimum number of pages:
– performance: fewer pages => more page faults
– architecture limit: machine instructions' requirements

Example: the IBM 370 needs at least 6 pages to handle the MVC instruction:
– the instruction is 6 bytes and might span 2 pages
– 2 pages to handle the from operand
– 2 pages to handle the to operand

Two major allocation schemes:
– fixed allocation
– priority allocation

Page 50: 10.5.2 Allocation Algorithms

Equal allocation: each process gets an equal share of the total frames.

Proportional allocation: allocate available memory to each process according to its size, where

  s_i = size of process p_i
  S = Σ s_i
  m = total number of frames
  a_i = allocation for p_i = (s_i / S) × m

– priority-based allocation: a variant that weights by priority rather than size

In all schemes, the allocation to each process may vary according to the multiprogramming level.
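A one-function sketch of proportional allocation (helper name mine; the process sizes 10 and 127 with 62 frames are a common textbook example):

```python
# Proportional frame allocation: process of size s_i gets a_i = (s_i / S) * m
# frames, truncated to an integer; any remainder frames stay unassigned here.
def proportional_allocation(sizes, m):
    S = sum(sizes)
    return [s * m // S for s in sizes]

# Two processes of 10 and 127 pages sharing 62 frames.
print(proportional_allocation([10, 127], 62))  # [4, 57]
```

The small process still gets a few frames, unlike a priority scheme that could starve it; handling the leftover frame (62 - 4 - 57 = 1) is a policy choice left open here.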

Page 51: 10.5.3 Global vs. Local Allocation

Global replacement: a process selects a replacement frame from the set of all frames.
– one process can take a frame from another, e.g. a high-priority process can take a frame from a lower-priority process
– problem: a process cannot control its own page-fault rate
– results in greater system throughput

Local replacement: each process selects from only its own set of allocated frames.
– the number of frames allocated to a process does not change

Page 52: 10.6 Thrashing

If a process does not have "enough" pages, the page-fault rate is very high. This leads to:
– low CPU utilization
– the operating system thinks that it needs to increase the degree of multiprogramming
– another process is added to the system, worsening the condition

Thrashing: a process is busy swapping pages in and out.

Page 53: 10.6.1 Cause of Thrashing

Why does paging work? The locality model:
– a process migrates from one locality to another
– localities may overlap

Page 54: Locality in a Memory-Reference Pattern

As a process executes, it moves from locality to locality.

A locality is a set of pages that are actively used together; e.g. a subroutine defines a new locality.

Why does thrashing occur? When the size of the locality > total memory size.

Page 55: 10.6.2 Working-Set Model

Δ = working-set window: a fixed number of page references. Example: 10,000 instructions (example below: 10 pages).

Working-set size WSS_i (working set of process P_i) = total number of pages referenced in the most recent Δ (varies in time):
– if Δ is too small, it will not encompass the entire locality
– if Δ is too large, it will encompass several localities
– if Δ = ∞, it will encompass the entire program

Page 56: Working-Set Model

D = Σ WSS_i = total demand for frames.

If D > m, thrashing occurs.

Policy: if D > m, then suspend one of the processes (and swap it out completely).

The working-set strategy prevents thrashing while keeping the degree of multiprogramming as high as possible:
– it optimizes CPU utilization
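The D > m test can be sketched directly (the two reference streams, the window Δ = 6, and m = 8 are hypothetical numbers chosen for illustration):

```python
# Working-set model: WSS_i = number of distinct pages touched in the most
# recent `delta` references; D = sum of WSS_i, compared against frame count m.
def working_set_size(refs, delta):
    """Distinct pages in the last `delta` references."""
    return len(set(refs[-delta:]))

p1 = [1, 2, 5, 2, 1, 2, 5, 2]      # tight locality: pages {1, 2, 5}
p2 = [3, 4, 3, 4, 7, 8, 9, 10]     # shifting locality

delta = 6
D = working_set_size(p1, delta) + working_set_size(p2, delta)  # 3 + 6 = 9
m = 8                              # frames available
print(D, "thrashing" if D > m else "ok")  # 9 thrashing
```

With D = 9 > m = 8 the policy above would suspend one process; swapping p2 out drops demand to 3 and the remaining process fits comfortably.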

Page 57: Keeping Track of the Working Set

Approximate with an interval timer plus a reference bit.

Example: Δ = 10,000 references
– timer interrupts about every 5,000 references
– keep 2 bits in memory for each page
– whenever the timer interrupts, copy the reference bits and then set them all to 0
– if one of the bits in memory = 1, the page is in the working set

Why is this not completely accurate?

Improvement: 10 reference bits, and an interrupt every 1,000 time units.

Page 58: 10.6.3 Page-Fault Frequency Scheme

Establish an "acceptable" page-fault rate:
– if the actual rate is too low, the process loses a frame
– if the actual rate is too high, the process gains a frame

Page 59: 10.7 Case Study

– Windows NT
– Solaris 2
– Linux

Page 60: 10.7.1 Windows NT

Uses demand paging with clustering; clustering brings in pages surrounding the faulting page.

Processes are assigned a working-set minimum and a working-set maximum:
– the working-set minimum is the minimum number of pages the process is guaranteed to have in memory
– a process may be assigned as many pages as its working-set maximum allows

When the amount of free memory in the system falls below a threshold, automatic working-set trimming is performed to restore the amount of free memory:
– working-set trimming removes pages from processes that have pages in excess of their working-set minimum

Page 61: Windows NT

Local page-replacement policy:
– on a single x86 processor: a variation of the clock algorithm
– on multiprocessors and the Alpha: a variation of FIFO

How are the working-set minimum and maximum determined?
– a mystery not stated by the textbook

Page 62: 10.7.2 Solaris 2

Maintains a list of free pages to assign to faulting processes.

lotsfree: a threshold parameter at which to begin paging.

Paging is performed by the pageout process:
– pageout scans pages using a modified clock algorithm
– scanrate is the rate at which pages are scanned; it ranges from slowscan to fastscan
– pageout is called more frequently depending upon the amount of free memory available

Page 63: Solaris Page Scanner

Page 64: Supplementary: Linux Memory Management (20.6)

Linux's physical memory-management system deals with allocating and freeing pages, groups of pages, and small blocks of memory.

It has additional mechanisms for handling virtual memory, memory mapped into the address space of running processes.

Page 65: Splitting of Memory in a Buddy Heap

Page 66: Managing Physical Memory

The page allocator allocates and frees all physical pages; it can allocate ranges of physically contiguous pages on request.

The allocator uses a buddy-heap algorithm to keep track of available physical pages:
– each allocatable memory region is paired with an adjacent partner
– whenever two allocated partner regions are both freed up, they are combined to form a larger region
– if a small memory request cannot be satisfied by allocating an existing small free region, then a larger free region will be subdivided into two partners to satisfy the request

Memory allocations in the Linux kernel occur either statically (drivers reserve a contiguous area of memory during system boot time) or dynamically (via the page allocator).
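The splitting half of the buddy scheme can be sketched as a toy (the free-list representation, sizes in KB, and helper name are all mine, simplified from the real kernel allocator):

```python
# Toy buddy-heap split: take the smallest free block that can satisfy a
# power-of-two request, then halve it repeatedly, leaving one free buddy
# at each level, until a block of the requested size remains.
def buddy_alloc(free_lists, size):
    """free_lists: {block_size_kb: count}. Returns the dict after one allocation."""
    free = dict(free_lists)
    candidates = [s for s, n in free.items() if s >= size and n > 0]
    block = min(candidates)       # smallest sufficient free block
    free[block] -= 1
    while block > size:           # split, keeping one buddy on the free list
        block //= 2
        free[block] = free.get(block, 0) + 1
    return free

# One free 256 KB region; allocate 64 KB from it.
free_lists = buddy_alloc({256: 1}, 64)
print(free_lists)  # {256: 0, 128: 1, 64: 1} -- one 64 KB buddy stays free,
                   # the other 64 KB block goes to the caller
```

Coalescing on free is the mirror image: when both 64 KB buddies are free again they merge back into a 128 KB block, and so on upward.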

Page 67: Virtual Memory

The VM system maintains the address space visible to each process: it creates pages of virtual memory on demand, and manages the loading of those pages from disk and their swapping back out to disk as required.

The VM manager maintains two separate views of a process's address space:
– a logical view describing the layout of the address space: a set of non-overlapping regions, each representing a continuous, page-aligned subset of the address space
– a physical view of each address space, which is stored in the hardware page tables for the process

Page 68: Virtual Memory (Cont.)

On executing a new program, the process is given a new, completely empty virtual address space; the program-loading routines populate the address space with virtual-memory regions.

Creating a new process with fork involves creating a complete copy of the existing process's virtual address space:
– the kernel copies the parent process's VMA descriptors, then creates a new set of page tables for the child
– the parent's page tables are copied directly into the child's, with the reference count of each covered page incremented
– after the fork, the parent and child share the same physical pages of memory in their address spaces

Page 69: Virtual Memory (Cont.)

The VM paging system relocates pages of memory from physical memory out to disk when the memory is needed for something else.

The VM paging system can be divided into two sections:
– the pageout-policy algorithm decides which pages to write out to disk, and when
– the paging mechanism actually carries out the transfer, and pages data back into physical memory as needed

Page 70: 10.9 Summary

Virtual memory makes it possible to execute a process whose logical address space is larger than the available physical address space.

Virtual memory increases the multiprogramming level, and thus CPU utilization and throughput.

Page 71: Summary

Pure demand paging:
– backing store
– page fault
– page table
– OS internal frame table

Low page-fault rate => acceptable performance.

Page 72: Summary

Page replacement algorithms:
– FIFO (Belady's anomaly)
– optimal
– Least Recently Used (LRU):
  - reference bit
  - additional reference bits
  - second-chance algorithm (clock algorithm)
  - enhanced second-chance algorithm
– counting-based: Least Frequently Used (LFU)

Page 73: Summary

Frame allocation policy:
– fixed (i.e. equal share)
– proportional (to program size)
– priority-based

Page replacement:
– static, or local, page replacement supported by the working-set model
– dynamic, or global, page replacement

Page 74: Summary

Working-set model:
– locality
– the working set is the set of pages in the current locality

Thrashing:
– if a process does not have enough memory for its working set, it will thrash

Page 75: Homework

Paper: 2, 4, 5, 8, 11, 17, 20
Oral: 1, 6, 7, 9, 16, 18
Lab: 21

Supplementary materials:
– Intel 80x86 protected mode operation