virtual memory and paging

Virtual Memory and Paging J. Nelson Amaral


Post on 11-Jan-2016


TRANSCRIPT

Page 1: Virtual Memory and Paging

Virtual Memory and Paging

J. Nelson Amaral

Page 2: Virtual Memory and Paging

Large Data Sets

• Size of address space:
  – 32-bit machines: 2^32 = 4 GB
  – 64-bit machines: 2^64 = a huge number
• Size of main memory:
  – approaching 4 GB
• How to handle:
  – Applications whose data set is larger than the main memory size?
  – Sets of applications that together need more space than the memory size?

Baer, p. 60

Page 3: Virtual Memory and Paging

Multiprogramming

• More than one program resides in memory at the same time
• I/O is slow:
  – If the running program needs I/O, it relinquishes the CPU

Baer, p. 60

Page 4: Virtual Memory and Paging

Multiprogramming Challenges

• How and where to load a program into memory?
• How does a program ask for more memory?
• How to protect one program from another?

Baer, p. 60

Page 5: Virtual Memory and Paging

Virtual Memory

• Solution:
  – Give each program the illusion that it can address the whole address space
• CPU works with virtual addresses
• Memory works with real or physical addresses

Baer, p. 60

Page 6: Virtual Memory and Paging

Virtual -> Physical Address Translation

• Paging System
  – Divide both the virtual and the physical address spaces into fixed-size blocks of the same size.
  – Virtual space: page
  – Physical space: frame
• Fully associative mapping between pages and frames.
  – any page can be stored in any frame

Baer, p. 60
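The paging scheme above can be illustrated with a minimal Python sketch, assuming 4 KB pages and a page table modeled as a plain dictionary. The names `split` and `translate` and the page-to-frame mapping are hypothetical, chosen only for this example.

```python
PAGE_SIZE = 4096                              # assumed 4 KB pages
OFFSET_BITS = PAGE_SIZE.bit_length() - 1      # 12 bits of page offset


def split(vaddr):
    """Split a virtual address into (virtual page number, page offset)."""
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)


def translate(vaddr, page_table):
    """Map a virtual address to a physical one; page_table maps page -> frame."""
    vpn, offset = split(vaddr)
    frame = page_table[vpn]                   # any page may live in any frame
    return (frame << OFFSET_BITS) | offset    # the offset is unchanged


page_table = {0: 7, 1: 3, 2: 9}               # hypothetical page -> frame mapping
paddr = translate(0x1ABC, page_table)         # page 1, offset 0xABC, frame 3
```

Only the page number is translated; the offset within the page is copied through, which is why pages and frames must have the same size.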

Page 7: Virtual Memory and Paging

Paging System

• Virtual space is much larger than physical memory
• Memory does not need to store the whole program and its data at the same time
• Memory can be shared with little fragmentation
• Pages can be shared among programs

Baer, p. 61

Page 8: Virtual Memory and Paging

Address Translation

valid bit = 0 implies a page fault (there is no frame in memory for this page)

Baer, p. 62
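As a small sketch of the valid-bit check, the Python below models a page table entry with only a valid bit and a frame number and raises an exception when the bit is 0. The `PTE` class, the `PageFault` exception, and the `lookup` helper are assumptions made for illustration; real PTE layouts are architecture specific.

```python
from dataclasses import dataclass


class PageFault(Exception):
    """Raised when the referenced page has no frame in memory."""


@dataclass
class PTE:
    valid: bool = False    # valid bit: True (1) means the page is in a frame
    frame: int = 0         # frame number, meaningful only when valid


def lookup(page_table, vpn):
    """Return the frame holding page `vpn`, or raise PageFault."""
    pte = page_table.get(vpn, PTE())
    if not pte.valid:
        raise PageFault(f"page {vpn} is not in memory")
    return pte.frame


page_table = {0: PTE(valid=True, frame=5), 1: PTE(valid=False)}
```

Calling `lookup(page_table, 0)` returns frame 5, while `lookup(page_table, 1)` raises `PageFault`, which is the point at which the operating system would take over.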

Page 9: Virtual Memory and Paging

Page Fault

• Exception generated in program P1 because valid bit = 0 in Page Table Entry (PTE)
  – Page fault handler initiates I/O read for P1
    • I/O read takes several milliseconds to complete
  – Context switch occurs
    • O.S. saves processor state and starts I/O operation
    • Hands CPU control to another program P2
      – Restores P2’s state into CPU

Baer, p. 62

Page 10: Virtual Memory and Paging

Address Translation

Virtual and physical addresses can be of different sizes. Example:

• Virtual address: 64 bits
• Physical address: 40 or 48 bits

Baer, p. 62

Page 11: Virtual Memory and Paging

Translation Look-Aside Buffer (TLB)

• Problem:
  – Storing page table entries (PTEs) in memory would require a load for each address translation.
  – Caching PTEs interferes with the flow of instructions or data into the cache
• Solution: TLB, a small, high-associativity cache dedicated to caching PTEs

Baer, p. 62
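To make the idea of a TLB as a small cache of PTEs concrete, here is a rough Python model of a fully associative TLB with LRU replacement sitting in front of a page table. The `TLB` class, its 2-entry size, and the page-to-frame mapping are assumptions for the example; a real TLB is a hardware structure and often uses a cheaper replacement policy.

```python
from collections import OrderedDict


class TLB:
    """Tiny fully associative TLB with LRU replacement (illustrative only)."""

    def __init__(self, entries=4):
        self.entries = entries
        self.map = OrderedDict()          # tag (virtual page number) -> frame
        self.hits = self.misses = 0

    def translate(self, vpn, page_table):
        if vpn in self.map:               # tag match on a cached entry
            self.hits += 1
            self.map.move_to_end(vpn)     # record recency of access
            return self.map[vpn]
        self.misses += 1                  # TLB miss: go to the page table
        frame = page_table[vpn]
        if len(self.map) >= self.entries:
            self.map.popitem(last=False)  # evict the least recently used entry
        self.map[vpn] = frame
        return frame


page_table = {vpn: vpn + 100 for vpn in range(8)}   # hypothetical mapping
tlb = TLB(entries=2)
for vpn in [0, 1, 0, 2, 1]:
    tlb.translate(vpn, page_table)
```

With only two entries, this reference string produces one hit and four misses; only the misses pay for a page table access.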

Page 12: Virtual Memory and Paging

TLB organization

• Each TLB entry consists of:
  – tag
  – data (a PTE)
  – valid bit
  – dirty bit
  – bits to encode memory protection
  – bits to encode recency of access
• A set of TLB entries may be reserved for the Operating System

Baer, p. 62

Page 13: Virtual Memory and Paging

TLB Characteristics

Architecture   Page Size (KB)   I-TLB Entries   D-TLB Entries
Alpha 21064    8                8 (FA)          32 (FA)
Alpha 21164    8                48 (FA)         64 (FA)
Alpha 21264    8                64 (FA)         128 (FA)
Pentium        4                32 (4-way)      64 (4-way)
Pentium II     4                32 (4-way)      64 (4-way)
Pentium III    4                32 (4-way)      64 (4-way)
Pentium 4      4                64 (4-way)      128 (4-way)
Core Duo       4                64 (FA)         64 (FA)

Baer, p. 63

Page 14: Virtual Memory and Paging

Large Pages

• Recent processors implement large page sizes (typically 4 MB pages)
  – reduces page faults in applications with lots of data (scientific and graphics)
  – requires that TLB entries be reserved for large pages.

Baer, p. 63

Page 15: Virtual Memory and Paging

Referencing Memory

Baer, p. 63

Page 16: Virtual Memory and Paging

Memory Reference Process

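• TLB hit?
  – No: handle TLB miss
  – Yes: check the valid bit
• Valid bit?
  – 0: page fault
  – 1: check protection
• Protection violation?
  – Yes: access violation exception
  – No: check whether the reference is a store
• Store?
  – Yes: turn PTE dirty bit on, then update recency
  – No: update recency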

Baer, p. 63
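The memory reference process on this slide can be sketched in Python as a chain of checks. This is a simplified model under stated assumptions: the TLB is a dictionary, a single `writable` flag stands in for the protection bits, an integer timestamp stands in for the recency bits, and all class and function names are hypothetical.

```python
from dataclasses import dataclass


class TLBMiss(Exception):
    pass


class PageFault(Exception):
    pass


class AccessViolation(Exception):
    pass


@dataclass
class Entry:
    frame: int
    valid: bool = True
    writable: bool = True      # stand-in for the protection bits
    dirty: bool = False
    last_used: int = 0         # stand-in for the recency bits


def reference(tlb, vpn, is_store, now):
    """Process one memory reference; tlb maps page number -> Entry."""
    entry = tlb.get(vpn)
    if entry is None:
        raise TLBMiss(vpn)                 # TLB hit? No
    if not entry.valid:
        raise PageFault(vpn)               # valid bit = 0
    if is_store and not entry.writable:
        raise AccessViolation(vpn)         # protection violation
    if is_store:
        entry.dirty = True                 # turn PTE dirty bit on
    entry.last_used = now                  # update recency
    return entry.frame


tlb = {3: Entry(frame=8), 4: Entry(frame=2, writable=False)}
frame = reference(tlb, 3, is_store=True, now=1)
```

After the call, `tlb[3].dirty` is set and `frame` holds 8; a store to page 4 would raise `AccessViolation` instead.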

Page 17: Virtual Memory and Paging

Handling TLB Misses

• Must access page table in memory
  – entirely in hardware
  – entirely in software
  – combination of both
• Replacement Algorithms
  – LRU for 4-way associativity (Intel)
  – Not Most Recently Used for full associativity (Alpha)

Baer, p. 64

Page 18: Virtual Memory and Paging

Handling TLB Miss (cont.)

• Serving a TLB miss takes 100-1000 cycles.
  – Too short to justify a context switch
  – Long enough to have significant impact on performance
    • even a small TLB miss rate affects CPI

Baer, p. 64
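A quick back-of-the-envelope calculation shows why even a small TLB miss rate matters. The formula below is the usual additive CPI model; the base CPI, the references per instruction, the miss rate, and the 500-cycle penalty are assumed numbers picked for illustration, not measurements from the source.

```python
def cpi_with_tlb_misses(base_cpi, refs_per_instr, miss_rate, penalty_cycles):
    """Effective CPI = base CPI + TLB miss cycles per instruction."""
    return base_cpi + refs_per_instr * miss_rate * penalty_cycles


# Assumed numbers: base CPI of 1.0, 1.3 memory references per instruction
# (instruction fetch plus data), a 0.1% TLB miss rate, a 500-cycle penalty.
cpi = cpi_with_tlb_misses(1.0, 1.3, 0.001, 500)
```

Under these assumptions the CPI rises from 1.0 to about 1.65, so one miss per thousand references already costs roughly 65% in execution time.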

Page 19: Virtual Memory and Paging

OS handling of page fault

Reserve frame from a free list

Find if the faulting page is on disk

Invalidate portions of the TLB (maybe Cache)

Initiate read for faulting page

Find page to replace if there is no free frame

Invalidate cache lines mapping to replaced page

Write dirty replaced pages to the disk

Baer, p. 64

Page 20: Virtual Memory and Paging

When page arrives in memory

I/O interrupt is raised

OS updates the PTE of the page

OS schedules the requesting process for execution

Baer, p. 64

Page 21: Virtual Memory and Paging

Invalidating TLB Entries on Context Switch

• Page Fault → Exception → Context Switch
• Let:
  – PR: Relinquishing process
  – PI: Incoming process
• Problem: TLB entries are for PR, not PI
  – Invalidating entire TLB on context switch leads to many TLB misses when PI is restored
• Solution: Use a process ID number (PID)

Baer, p. 64

Page 22: Virtual Memory and Paging

Process ID (PID) Number

• O.S. sets a PID for each program
• The PID is added to the tag in the TLB entries
• A PID Register stores the PID of the active process
• Match PID Register with PID in TLB entry
• No need to invalidate TLB entries on context switch
• PIDs are recycled by the OS

Baer, p. 64
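The effect of adding the PID to the tag can be sketched as follows. The `TaggedTLB` class and its methods are hypothetical and ignore capacity and replacement; the only point is that a lookup matches on the pair (PID register, page number), so a context switch changes the register and nothing else.

```python
class TaggedTLB:
    """TLB whose tags include the PID, so no flush is needed on a switch."""

    def __init__(self):
        self.entries = {}          # (pid, vpn) -> frame
        self.pid_register = None   # PID of the active process

    def context_switch(self, pid):
        self.pid_register = pid    # entries of other processes stay in place

    def insert(self, vpn, frame):
        self.entries[(self.pid_register, vpn)] = frame

    def lookup(self, vpn):
        """Return the frame on a hit, or None on a miss."""
        return self.entries.get((self.pid_register, vpn))


tlb = TaggedTLB()
tlb.context_switch(1)          # relinquishing process PR runs first
tlb.insert(vpn=5, frame=40)
tlb.context_switch(2)          # PR faults; incoming process PI takes over
miss = tlb.lookup(5)           # same page number, different PID: a miss
tlb.insert(vpn=5, frame=77)
tlb.context_switch(1)          # PR is restored
hit = tlb.lookup(5)            # PR's entry survived the switch
```

Here `miss` is `None` and `hit` is 40: both processes use virtual page 5, but their translations coexist in the TLB.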

Page 23: Virtual Memory and Paging

Page Size X Read/Write Time

Page of size x: seek time (0 to 10 ms), rotation time (~3 ms), transfer time for x

Page of size 2x: seek time (0 to 10 ms), rotation time (~3 ms), transfer time for 2x

• Amortizing I/O Time:
  – Large page size
  – Read/write consecutive pages

Baer, p. 65
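The amortization argument can be checked with a small calculation. The sketch below assumes a 5 ms seek (the midpoint of the 0 to 10 ms range), a 3 ms rotation, and a transfer rate of 100 KB per millisecond; the transfer rate and the function name `page_io_ms` are assumptions, not figures from the slide.

```python
def page_io_ms(page_kb, seek_ms=5.0, rotation_ms=3.0, kb_per_ms=100.0):
    """Time to read or write one page: seek + rotation + transfer."""
    return seek_ms + rotation_ms + page_kb / kb_per_ms


small = page_io_ms(4)      # 4 KB page
large = page_io_ms(8)      # page twice the size
ratio = large / small      # close to 1: doubling the page barely adds time
```

Because seek and rotation dominate, the larger page moves twice the data for well under 1% more time in this model.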

Page 24: Virtual Memory and Paging

Large Pages

• Amortize I/O time to transfer pages
• Smaller Page Tables
  – More PTEs are in main memory
    • lower probability of double page fault for a single memory reference
• Fewer TLB misses
  – Single TLB entry translates more locations
• Pages cannot be too large
  – Transfer time and fragmentation

Baer, p. 65
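Two of these effects are easy to quantify. The sketch below assumes a flat, single-level page table over a 32-bit virtual address space and a 64-entry TLB; both helper functions are hypothetical names used only here.

```python
def num_ptes(virtual_bits, page_bytes):
    """Number of entries in a flat (single-level) page table."""
    return 2 ** virtual_bits // page_bytes


def tlb_reach(entries, page_bytes):
    """Bytes of memory that the TLB can map at once."""
    return entries * page_bytes


KB, MB = 2 ** 10, 2 ** 20
small_pages = num_ptes(32, 4 * KB)      # 1,048,576 PTEs
large_pages = num_ptes(32, 4 * MB)      # 1,024 PTEs
reach_small = tlb_reach(64, 4 * KB)     # 256 KB
reach_large = tlb_reach(64, 4 * MB)     # 256 MB
```

Going from 4 KB to 4 MB pages shrinks the page table by a factor of 1024 and extends what the same TLB covers by the same factor.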

Page 25: Virtual Memory and Paging

Performance of Memory Hierarchy

Baer, p. 66

Page 26: Virtual Memory and Paging

When to bring a missing item (to cache, TLB, or memory)?

• On demand

Level        Miss Frequency                         Miss Resolution
Cache        few times per 100 references           5-100 cycles, entirely in hardware
TLB          few times per 10,000 references        100-1000 cycles, in hardware or software
Page Fault   few times per 10,000,000 references    millions of cycles, requires context switch

Baer, p. 66

Page 27: Virtual Memory and Paging

Where to put the missing item?

• Cache: restrictive mapping (direct or low associativity)

• TLB: fully associative or high set associativity

• Paging System: general mapping

Baer, p. 66

Page 28: Virtual Memory and Paging

How do we know it is there?

• Cache: Compare tags and check valid bits

• TLB: Compare tags, PID, check valid bits

• Memory: Check Page Tables

Baer, p. 67

Page 29: Virtual Memory and Paging

What happens on a replacement?

• Caches and TLBs: (approximation to) LRU
• Paging Systems:
  – Sophisticated algorithms to keep page fault rate very low
  – O.S. policies allocate a number of pages to each program according to its working set

Baer, p. 67

Page 30: Virtual Memory and Paging

Simulating Memory Hierarchy

• Memory hierarchy simulation is faster than simulation to assess IPC or execution time
• Stack property of some replacement algorithms:
  – for a given sequence of memory references at a given level of the hierarchy, the number of misses is monotonically non-increasing with the size of the memory
  – can simulate a range of sizes in a single simulation pass.

Baer, p. 67
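LRU is one replacement algorithm with the stack property, and the single-pass idea can be sketched by recording, for each reference, how deep the page sits in the LRU stack. A memory of n entries hits exactly when that depth is at most n. The function names and the reference string are assumptions for illustration, and the list-based stack is chosen for clarity rather than speed.

```python
def lru_stack_distances(refs):
    """For each reference, return its depth in the LRU stack (None if new)."""
    stack, distances = [], []
    for page in refs:
        if page in stack:
            depth = stack.index(page) + 1   # 1 = most recently used
            stack.remove(page)
        else:
            depth = None                    # first touch: a miss at every size
        distances.append(depth)
        stack.insert(0, page)               # page becomes most recently used
    return distances


def misses_for_sizes(refs, sizes):
    """Miss counts of LRU memories of several sizes from one pass."""
    distances = lru_stack_distances(refs)
    return {n: sum(d is None or d > n for d in distances) for n in sizes}


refs = [1, 2, 3, 1, 2, 4, 1, 2, 3, 4]
misses = misses_for_sizes(refs, [1, 2, 3, 4])
```

For this reference string the miss counts are 10, 10, 6, and 4 for sizes 1 through 4, which never increase as the memory grows.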

Page 31: Virtual Memory and Paging

Belady’s Algorithm

• Belady’s algorithm: replace the entry that will be accessed the furthest in the future.
  – It is the optimal algorithm
  – It needs to know the future
    • not realizable in practice
    • useful in simulation to compare with practical algorithms

Baer, p. 67
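Because a simulator has the whole trace available, Belady's algorithm is straightforward to run offline. The following is a minimal sketch that counts misses for a fully associative memory; the function name `belady_misses` and the reference string are assumptions, and the linear scan of the future is kept simple rather than efficient.

```python
def belady_misses(refs, frames):
    """Count misses when evicting the entry used furthest in the future."""
    resident, misses = set(), 0
    for i, page in enumerate(refs):
        if page in resident:
            continue                        # hit: nothing to do
        misses += 1
        if len(resident) >= frames:
            future = refs[i + 1:]

            def next_use(p):
                # Position of the next reference to p; infinity if never reused.
                return future.index(p) if p in future else float("inf")

            resident.remove(max(resident, key=next_use))
        resident.add(page)
    return misses


refs = [1, 2, 3, 1, 2, 4, 1, 2, 3, 4]
optimal = belady_misses(refs, frames=3)
```

With three frames this trace incurs 5 misses under Belady's algorithm, compared with 6 under LRU, which gives a lower bound against which practical policies can be judged.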