CSCI 350 Ch. 8: Address Translation - USC Viterbi
TRANSCRIPT
1
CSCI 350 Ch. 8 – Address Translation
Mark Redekopp
Michael Shindler & Ramesh Govindan
2
Abstracting Memory
• Thread = Abstraction of the processor
• What about abstracting memory?
– "All problems in computer science can be solved by another level of indirection"
• Address translation => Abstraction of memory
[Diagram: Processor connected to memory and input/output devices; software (program + data) resides in memory]
3
Benefits of Address Translation
• What is enabled through address translation?
– Illusion of more or less memory than physically present
– Isolation
– Controlled sharing of code or data
– Efficient I/O (memory-mapped files)
– Dynamic allocation (Heap / Stack growth)
– Process migration
4
Virtual vs. Physical Addresses
• Translation between:
– Virtual address: Address used by the process (programmer)
– Physical address: Physical memory location of the desired data
[Diagram: The processor core emits a virtual address; the Translation Unit / MMU (Memory Management Unit) converts it to a physical address used to access memory.]
[Diagram: Physical memory (0x00000000–0xffffffff) is divided into frames, with an I/O and unused area at the top. Each process sees a contiguous, fictitious virtual address space (Pg. 0, Pg. 1, Pg. 2, Pg. 3, then unused); at any moment some of those pages occupy physical frames while others reside on secondary storage.]
5
Translation Goals & Roadmap
• Functional Goals
– Isolation and transparency to programmer
– Controlled sharing
– Support for sparse address spaces and dynamic resizing
• Performance Goals
– Flexible physical placement
– Fast translation
– Low memory overhead (compact tables)
• We'll first focus on function and then discuss efficiency
6
Evolution of Translation
• Base + bounds check
• Segmentation
• Paging
7
SEGMENTATION
8
Base and Bounds
• Each running process has a:
– Virtual address space from VA 0 to N-1
– Physical address space from PA BASE to BASE+N-1
• The program is written (compiled) using VAs, which determines the necessary BOUND (i.e., N)
• When the program is loaded, the OS assigns BASE (the start of its physical address space) dynamically
• Hardware converts VAs to PAs and performs the bounds check
[Diagram: The CPU issues VA 0x02000. The Translation Unit / MMU compares it against the bound register (0x05000); if in range, it adds the base register (0x14000) to produce PA 0x16000, otherwise it raises an exception. P1's physical address space runs from Base (0x14000) to Base + Bound (0x19000), with P2's space and unused regions elsewhere in memory.]
• The "BASE" and "BOUND" registers and checking hardware are termed the MMU (Memory Management Unit) or Translation Unit
• Base and Bound are loaded/restored on process switches.
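The base-and-bounds check described above can be sketched in plain C. This is an illustrative simulation (the struct and function names are invented), not real MMU code; the base/bound values match the slide's example.

```c
#include <assert.h>
#include <stdint.h>

/* Per-process relocation values, loaded into the MMU on a context switch. */
typedef struct { uint32_t base, bound; } mmu_t;

/* Translate a virtual address: bounds-check first, then add the base.
   Returns -1 (a simulated exception) if the VA is out of bounds. */
int bb_translate(const mmu_t *mmu, uint32_t va, uint32_t *pa) {
    if (va >= mmu->bound) return -1;   /* out of bounds -> exception */
    *pa = mmu->base + va;
    return 0;
}
```

With base 0x14000 and bound 0x05000, VA 0x02000 maps to PA 0x16000, exactly as in the figure.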
9
Base and Bounds Pros & Cons
• Pros:
– Simple
– Provides isolation amongst processes
• Cons:
– No easy way to share data
– Can't enforce access rights within the process (e.g., code = read-only)
• Processes can "rewrite" their own address space
10
Segmentation
• Allow a process to be broken into multiple base + bounds "segments"
– Code, data, stack, heap
• Multiple base + bounds registers are stored in a table in the MMU
• Updating the table is a "privileged" (kernel-only) operation
[Diagram: The CPU issues VA 0x102000, split into a segment number (1) and offset (0x02000). The MMU's descriptor table holds a base, bound, and access rights for each segment: entry 0 (code) base 0x08000, bound 0x0400, R; entry 1 (stack) base 0x14000, bound 0x05000, R/W; entry 2 (data) base 0x2a000, bound 0x03200, R/W. The offset is bounds-checked and added to the selected base, yielding PA 0x16000 (or an exception). Segment register format (e.g., ss = 1:1:3): 13-bit segment index, 1 bit selecting LDT (1) vs. GDT (0), and a 2-bit requested privilege level (RPL, 0-3).]
http://ece-research.unm.edu/jimp/310/slides/micro_arch2.html
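The table-driven translation above can be sketched in C. This is an illustrative simulation: the split of the VA into segment bits and a 20-bit offset, and all identifiers, are assumptions (real x86 uses segment registers and selectors rather than VA bits), but the table values match the slide.

```c
#include <assert.h>
#include <stdint.h>

/* One base+bounds entry per segment, as in the slide's descriptor table. */
typedef struct { uint32_t base, bound; int writable; } seg_t;

/* Top VA bits select the segment; the low 20 bits are the offset. */
int seg_translate(const seg_t *tab, int nsegs,
                  uint32_t va, int is_write, uint32_t *pa) {
    uint32_t seg = va >> 20, off = va & 0xFFFFF;
    if (seg >= (uint32_t)nsegs) return -1;         /* no such segment     */
    if (off >= tab[seg].bound)  return -1;         /* segmentation fault  */
    if (is_write && !tab[seg].writable) return -1; /* protection fault    */
    *pa = tab[seg].base + off;
    return 0;
}
```

With the slide's table, VA 0x102000 selects segment 1 (stack) with offset 0x02000 and yields PA 0x16000, while any write into the read-only code segment faults.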
11
Segmentation Pros
• Enforce access rights to various segments using the access-bit entries in the segment descriptor
• Detect out-of-bounds accesses (segmentation fault)
• Memory-mapped files (segment = file on disk)
• Sharing code + data (see example in a few slides)
– Shared code / libraries and/or data
– One code segment for multiple instances of a running program
• Key idea: What is behind a segment can be "anything"
– Translation gives us a chance to intervene
12
Segmentation Cons
• External fragmentation
– As a system runs and segments are created and deleted, the physical address space may accumulate many small, unusable gaps, and we effectively lose that memory
• Growing a segment may be hard since segments must be contiguous (other segments may be bracketing our growth)
13
x86 Segmentation
• Segment descriptors are stored in tables in the kernel's memory region
• Local descriptor table (LDT) [in memory]: up to 8192 segment descriptors for each process
• Global descriptor table (GDT) [in memory]: kernel, LDT, and a few other types of descriptors
– Pointed to by GDTR; a separate interrupt descriptor table is pointed to by IDTR
• The MMU (in the CPU) stores a cache of descriptors from the process' LDT, one for each segment register (CS, DS, SS, ES, FS, GS)
[Diagram: The GDT (at 0xc4000, pointed to by GDTR) holds descriptors for the kernel and for each process' LDT; IDTR points to the interrupt descriptor table. Process 1's LDT (at 0xc0000, selected via LDT/TR) holds up to 8192 descriptors, e.g., code base 0x08000 bound 0x0400 R, stack base 0x14000 bound 0x05000 R/W, data base 0x2a000 bound 0x03200 R/W. The segment registers hold selectors (cs = 0:1:3, ss = 1:1:3, ds = 2:1:3) whose descriptors are cached in the MMU's descriptor cache; VA 0x102000 again translates to PA 0x16000.]
http://ece-research.unm.edu/jimp/310/slides/micro_arch2.html
14
Multiple Processes + Sharing
• A physical segment can be shared by two processes by setting up a descriptor for it in each process' table
[Diagram: Process 1's LDT (LDT1): code base 0x08000 bound 0x0400 R, stack base 0x14000 bound 0x05000 R/W, data base 0x7a000 bound 0x03200 R/W. Process 2's LDT (LDT2): code base 0x08000 bound 0x0400 R, stack base 0x6e000 bound 0x05000 R/W, data base 0x7e000 bound 0x01000 R/W. Both code descriptors point at the same physical segment (base 0x08000), so P1 and P2 share one copy of the code while keeping private stacks and data.]
15
Forking a Process & Copy-On-Write
• Recall that forking a process makes a "copy" of the address space
– The child is generally going to "exec" soon afterward
• In reality we can just copy the parent's translation table (here, the segment table) but mark all entries read-only
• On any write, an exception is generated and we can then make a copy of the segment for the writing process (aka Copy-On-Write)
• If neither process writes, the two can keep sharing and we save the time of the copy
• This is an example of a general concept called Lazy Evaluation
– Start by doing the minimal required work; do more work only when required
[Diagram: After the fork, the parent's and the forked child's segment tables are identical (code base 0x08000 bound 0x0400, stack base 0x14000 bound 0x05000, data base 0x2a000 bound 0x03200) and every entry in both tables is marked read-only, so the first write by either process traps and triggers the copy.]
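The copy-on-write idea can be simulated in plain C. This sketch (all names invented) uses a reference count in place of real protection bits and traps: the "write-fault handler" copies the shared segment only if someone else still references it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A shared physical segment with a reference count. */
typedef struct { char *data; int len; int refs; } phys_seg;
/* A per-process mapping: pointer to the segment + a write-allowed bit. */
typedef struct { phys_seg *seg; int writable; } mapping;

/* "fork": the child shares the parent's segment; both lose write
   permission so the first write from either side traps. */
mapping cow_fork(mapping *parent) {
    parent->writable = 0;
    parent->seg->refs++;
    mapping child = { parent->seg, 0 };
    return child;
}

/* Write-fault handler: copy only if the segment is still shared. */
void cow_write(mapping *m, int i, char c) {
    if (!m->writable) {
        if (m->seg->refs > 1) {              /* still shared: copy now */
            phys_seg *copy = malloc(sizeof *copy);
            copy->len = m->seg->len;
            copy->data = malloc((size_t)copy->len);
            memcpy(copy->data, m->seg->data, (size_t)copy->len);
            copy->refs = 1;
            m->seg->refs--;
            m->seg = copy;
        }
        m->writable = 1;                     /* sole owner: just re-enable */
    }
    m->seg->data[i] = c;
}
```

Note the lazy-evaluation payoff: if neither side ever writes, no copy is ever made.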
16
Shared Library Tangent
• Individual programs shouldn't each need to link in their own copy of common library functions (think printf, strcpy, memset, etc.)
• Place the library functions in a shared library
– Problem: Compiled programs won't know where the library code will be at compile time
– Solution: The compiler generates code that looks up each call to a shared function in a table it emits; the loader fills the table in with the actual library addresses
– Map a segment to describe that shared code area
[Diagram: Process 1 and Process 2 each map the shared-library code segment at the same physical base (0x08000, bound 0x0400, R) alongside their private code, stacks, and data (e.g., P2's code at 0x6e000, data at 0x7e000). Each process has a lookup table (printf -> 0x400, strcpy -> 0x640, ...) that the loader fills in so calls reach the shared code.]
17
PAGING
18
Paging
• Helps to avoid external fragmentation
– No unused "gaps" in memory
• Virtual memory (the process' viewpoint) is divided into equal-size "pages" (e.g., 4KB)
• Physical memory is broken into page frames, each of which can hold any page of virtual memory (and then be swapped for another page)
• Virtual address spaces are contiguous while the physical layout is not
[Diagram: A physical frame can hold data from any virtual page; since all pages are the same size, any page can go in any frame (and be swapped at our desire). Proc. 1's and Proc. 2's contiguous virtual address spaces (Pg. 0-3, then unused) map to scattered frames in the physical address space, which also contains an I/O and unused area.]
19
Page Size and Address Translation
• Since pages are usually retrieved from disk, we size them fairly large (several KB) to amortize the long access time
• Virtual page number to physical page frame translation is performed by a HW unit = MMU (Memory Management Unit)
• The page table is an in-memory data structure that the HW MMU uses to look up translations from VPN to PPFN
[Diagram: Virtual address = virtual page number (bits 31-12, 20 bits) + offset within page (bits 11-0, 12 bits). The translation process (MMU + page table) maps the VPN to a physical page frame number (~18 bits, zero-extended); the 12-bit offset is copied unchanged into the physical address.]
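The VPN/offset split for 4 KB pages is just bit manipulation, sketched here with invented helper names:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                  /* 4 KB pages */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

static inline uint32_t vpn(uint32_t va)    { return va >> PAGE_SHIFT; }
static inline uint32_t offset(uint32_t va) { return va & PAGE_MASK; }

/* The offset is copied into the PA unchanged; only the VPN is translated. */
static inline uint32_t make_pa(uint32_t pfn, uint32_t off) {
    return (pfn << PAGE_SHIFT) | off;
}
```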
20
Analogy for Page Tables
• Suppose we want to build a caller-ID mechanism for the contacts on your cell phone
– Assume 1000 contacts, each represented by a 3-digit integer ID (0-999) that the phone can use to look up their names
– We want a simple Look-Up Table (LUT) to translate phone numbers to contact IDs; how shall we organize/index our LUT?
[Diagram: Three candidate LUT organizations:
1. LUT indexed by contact ID (000-999), holding phone numbers - O(n): we are given a phone number and must scan all 1000 entries to find its ID. Doesn't work.
2. Sorted LUT indexed by the used phone numbers - O(log n): since it's in sorted order we can binary search (log2 1000 ≈ 10 accesses). Could work.
3. LUT indexed by all possible phone numbers (000-000-0000 to 999-999-9999, mostly null) - O(1): easy to index and find (1 access) but LARGE. Could work.]
21
Page Tables
• The VA is broken into:
– VPN (upper bits)
– Page offset: based on the page size (i.e., 0 to 4K-1 for a 4KB page)
• The VPN is the index into the "page table"
• The MMU uses the VPN & PTBR (Page Table Base Register) to access the page table in memory and look up the physical frame
• The physical frame number is combined with the offset to form the physical address
• For a 20-bit VPN, how big is the page table?
[Diagram: VA = VPN (bits 31-12) + offset (bits 11-0). The PTBR points to the page table; the entry at index VPN holds a page frame number plus a valid/present bit. Page table size = 2^20 entries x ~19 bits/entry ≈ 2^20 x 4 bytes = 4MB.]
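A single-level lookup like the one above can be sketched as a flat array of PTEs indexed by VPN. The identifiers are illustrative, and a real MMU does this in hardware; note the table really does need an entry for every possible VPN, which is where the 4 MB figure comes from.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define NUM_VPNS   (1u << 20)          /* 20-bit VPN for 32-bit VAs */

/* One PTE per possible VPN: frame number + valid bit packed together. */
typedef struct { uint32_t pfn : 20; uint32_t valid : 1; } pte_t;

/* The MMU's lookup: index the table (at PTBR) with the VPN, check valid,
   then splice the frame number onto the unchanged page offset. */
int pt_lookup(const pte_t *ptbr, uint32_t va, uint32_t *pa) {
    pte_t pte = ptbr[va >> PAGE_SHIFT];
    if (!pte.valid) return -1;         /* page fault */
    *pa = ((uint32_t)pte.pfn << PAGE_SHIFT) | (va & 0xFFF);
    return 0;
}
```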
22
Paging
• Each process has its own virtual address space and thus needs its own page table
• On a context switch to a new process, reload the PTBR
[Diagram: Process 1's page table is at 0xc4000 and Process 2's at 0xd0000; the PTBR/CR3 is reloaded to point at the current process' table on a context switch. Each table maps VPN -> physical frame # with access bits (e.g., for Process 1: VPN 0 -> 0x7e000 R/W, VPN 1 -> 0x6e000 R/W, VPN 2 -> 0x08000 R). Process 1's VA 0x001040 (VPN 0x001, offset 0x040) translates via PPFN 0x6e000 to PA 0x6e040; its VA 0x002eac (VPN 0x002, offset 0xeac) translates via PPFN 0x08000 to PA 0x08eac.]
23
Reducing the Size of Page Tables
• Can we use the table indexed by all possible phone numbers (because it only requires 1 access) but somehow reduce its size, especially since much of it is unused?
• Do you have friends from every area code? Likely your contacts cluster in only a few area codes.
• Use a 2-level organization
– The 1st-level LUT is indexed by area code and contains pointers to 2nd-level tables
– The 2nd-level LUTs are indexed by local phone number and contain contact ID entries
[Diagram: The 1st-level table, indexed by area code (000-999), holds mostly null pointers; only the used area codes (e.g., 213 and 323) point to 2nd-level tables indexed by the 7-digit local number. If only 2 area codes are used, this takes only 1000 + 2(10^7) entries rather than 10^10 entries.]
24
Analogy for Multi-level Page Tables
• Could extend to 3 levels if desired
– 1st level = area code, with pointers to 2nd-level tables
– 2nd level = first 3 digits of the local number, with pointers to 3rd-level tables
– 3rd level = contact IDs
[Diagram: 1st level indexed by area code (213 and 323 used); 2nd level indexed by the local prefix (745 under 213, 823 under 323); 3rd level indexed by the last 4 digits, holding the contact IDs (e.g., entries for 213-745-9823 and 323-823-7104). All other entries at every level are null.]
25
Analogy for Multi-level Page Tables
• If we add a friend from area code 408, we would have to add a 2nd- and a 3rd-level table for just this one entry
26
Multi-level Page Tables
[Diagram: The VA's VPN is split into a page-directory index (PDIdx) and a page-table index (PTIdx), each indexing a 1024-entry table. PDBR/CR3 (0xd0000) points to the page directory; each valid directory entry points to a 2nd-level page table (e.g., at 0xc4000, 0xe8000), and each page-table entry holds a physical frame # plus access bits (e.g., 0x0006e R/W, 0x00008 R). VA 0x001040 (PDIdx 0x000, PTIdx 0x001, offset 0x040) walks the directory to Page Table x, reads PPFN 0x6e000, and forms PA 0x6e040.]
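The two-level walk can be sketched in C. The structures are simplified for illustration (real x86 PDEs/PTEs are packed 32-bit words with flag bits, not structs with pointers), but the index arithmetic is the same: 10-bit directory index, 10-bit table index, 12-bit offset.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t pfn; int valid; } pte_t;
typedef struct { pte_t *pt; } pde_t;     /* points at a 1024-entry PT */

/* Two-level walk. A null directory entry means the whole 4 MB range is
   unmapped and no second-level table even exists for it. */
int walk2(const pde_t *pd, uint32_t va, uint32_t *pa) {
    uint32_t pdi = va >> 22;             /* top 10 bits   */
    uint32_t pti = (va >> 12) & 0x3FF;   /* next 10 bits  */
    uint32_t off = va & 0xFFF;           /* low 12 bits   */
    if (!pd[pdi].pt) return -1;          /* no 2nd-level table */
    if (!pd[pdi].pt[pti].valid) return -1; /* page not present */
    *pa = (pd[pdi].pt[pti].pfn << 12) | off;
    return 0;
}
```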
27
Another View
[Diagram: VA = level-1 index (bits 31-22, 10 bits) + level-2 index (bits 21-12, 10 bits) + offset (bits 11-0). Each 1024-entry 1st-level table entry holds a pointer to the start of a 2nd-level table, whose entries hold PPFNs pointing at frames in physical memory (which also contains an I/O and unused area).]
Unused entries can store a NULL pointer (the dark shaded entries in the original figure).
"Swapping" can be performed if no physical frame is available when a new page must be allocated: we simply swap a page of data out to disk to free up a physical frame, and retrieve it when necessary.
A secondary data structure (not shown) can be maintained to record where each swapped-out page's data resides on disk.
28
Page Table Entries (PTEs)
• Valid bit (1 = desired page is in memory / 0 = page not present -> page fault)
• Modified/Dirty = set on a write, so we know the page must be written back on eviction
• Referenced = used to implement replacement algorithms (e.g., pseudo-LRU)
• Protection: Read/Write/eXecute
[Diagram: PTE fields: Page Frame Number | Valid/Present | Modified/Dirty | Referenced | Protection | Cacheable]
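One plausible packing of these fields into a 32-bit PTE is sketched below. The exact bit positions are an assumption for illustration; real architectures assign these bits differently.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits in the low 12 bits (free because frames are 4 KB aligned). */
#define PTE_VALID  (1u << 0)   /* present in memory        */
#define PTE_DIRTY  (1u << 1)   /* modified since load      */
#define PTE_REF    (1u << 2)   /* referenced (for LRU)     */
#define PTE_WRITE  (1u << 3)   /* write permission         */
#define PTE_EXEC   (1u << 4)   /* execute permission       */
#define PTE_CACHE  (1u << 5)   /* cacheable                */
#define PTE_PFN_SHIFT 12       /* frame number in the top 20 bits */

static inline uint32_t mk_pte(uint32_t pfn, uint32_t flags) {
    return (pfn << PTE_PFN_SHIFT) | flags;
}
static inline uint32_t pte_pfn(uint32_t pte) { return pte >> PTE_PFN_SHIFT; }
```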
29
Page Fault Steps
• What happens when you reference a page that is not present?
• HW will...
– Record the offending address and generate a page fault exception
• SW (the handler) will...
– Pick an empty frame or select a page to evict
– Write back the evicted page if it has been modified
• May block the process while waiting and yield the processor
– Bring in the desired page and update the page table
• May block the process while waiting and yield the processor
– Restart the offending instruction
• Key idea: The handler can bring in the page or do anything else appropriate to handle the fault
– Allocate a new page, zero it out, perform copy-on-write, etc.
30
Multi-level Page Tables
• Think of a multi-level page table as a tree
– Internal nodes contain pointers to other page tables
– Leaves hold the actual translations
[Diagram: In this example the VPN is split into three indices: Idx1 (7 bits), Idx2 (7 bits), Idx3 (6 bits), plus a 12-bit offset. PDBR/CR3 (0xd0000) points to the level-1 table; the Idx1 entry points to a level-2 table, whose Idx2 entry points to a level-3 table, whose Idx3 entry holds the physical frame address (e.g., indices 0x40 / 0x35 / 0x3f with offset 0x040). Translations live only in the leaf (level-3) tables.]
31
Sparse Address Spaces
[Diagram: The same 3-level walk as the previous slide, shown against a sparse virtual address space: used regions are scattered among large unused gaps, and the corresponding pages can land anywhere in the physical address space.]
The virtual address space may be "sparse". In that case any page-table entry can be null, indicating no translations (and no lower-level page tables are needed for those address ranges).
Notes: 1. No 2nd-level table for a range means... 2. ...no 3rd-level tables for that range either. 3. Likewise, there are no 3rd-level tables for NULL 2nd-level entries.
32
x86 Segments + Paging
• x86 allows use of segments + paging together
• Most modern OSs do NOT use x86 segmentation for newer (protected-mode) apps
• All segments use Base = 0, Bound = 0xffffffff, and thus there is little use for LDTs
[Diagram: The processor core's virtual address first passes through segment translation (bounds checks), then through paging, before the physical address reaches memory.]
33
32-bit x86 (Pintos) Translation
• 2-level page table
[Diagram: The 20-bit VPN is split into a 10-bit page-directory index (PDIdx) and a 10-bit page-table index (PTIdx), plus a 12-bit offset. PDBR/CR3 (0xd0000) selects the page directory (level 1); the PDIdx entry points to a page table (level 2), whose PTIdx entry holds the PPFN. Example: PDIdx 0x001, PTIdx 0x001, offset 0x040 -> PPFN 0x6e000 -> PA 0x6e040.]
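The index extraction can be written as macros named after Pintos' own helpers (pd_no(), pt_no(), pg_ofs()); the definitions here are a sketch, not Pintos source.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit x86 2-level split: 10-bit directory index, 10-bit table index,
   12-bit page offset. */
#define PDSHIFT 22
#define PTSHIFT 12
#define pd_no(va)  ((uint32_t)(va) >> PDSHIFT)            /* directory idx */
#define pt_no(va)  (((uint32_t)(va) >> PTSHIFT) & 0x3FF)  /* table idx     */
#define pg_ofs(va) ((uint32_t)(va) & 0xFFF)               /* page offset   */
```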
34
SPARC VM Implementation
[Diagram: Virtual address = 8-bit index 1, 6-bit index 2, 6-bit index 3, and a 12-bit offset within a 4K page. The MMU holds a 4096-entry context table (one entry per context/process - essentially a PTBR for each process). The walk goes context table -> first-level table (2^8 x 4 bytes) -> second-level table (2^6 x 4 bytes) -> third-level table (2^6 x 4 bytes) -> PPFN -> desired word in the 4K page.]
35
Shared Memory
• In the current system, all memory is private to each process
• To share memory between two processes, the OS can point an entry in each process' page table at the same physical page
• Each page table entry can use different protection bits (e.g., P1 can be R/W while P2 is read-only)
[Diagram: P1's and P2's page tables each contain an entry mapping one of their virtual pages to the same frame of physical memory.]
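A sketch of the idea in C (the types and names are invented): both page tables name the same frame, but only one grants write permission.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t pfn; int valid, writable; } pte_t;

/* Check the per-process protection bits, then return the shared frame. */
int pt_access(const pte_t *pt, uint32_t vpn, int is_write, uint32_t *pfn) {
    if (!pt[vpn].valid) return -1;
    if (is_write && !pt[vpn].writable) return -1;  /* protection fault */
    *pfn = pt[vpn].pfn;
    return 0;
}
```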
36
Paged Segmentation (Not covered)
• Paging & segmentation can be combined
• A segment could point to a page table
– One PT each for code, data, stack, etc.
• The VA is broken into segment #, VPN, and offset
• Used when the virtual address space is smaller than the physical (VA < PA)
[Diagram: VA 0x102040 is split into a segment # (0x1), VPN (0x02), and offset (0x040). A descriptor table maps each segment to a page-table base and a bound in pages (e.g., PT bases 0xc4000, 0xc8000, 0xcc000 with bounds 0x03, 0x08, 0x18 pages); the VPN is bounds-checked (exception if out of range), then indexes the selected segment's page table (e.g., the SS page table maps VPN 0x02 -> PPFN 0x6e000), giving PA 0x6e040.]
37
Inverted Page Tables
• Page tables may seem expensive in terms of memory overhead
– Though they really aren't that big
• One option to consider is an "inverted" page table
– One entry per physical frame
– Hash the virtual address; the result is where that page must reside
• What about collisions?
– These become hard to handle in hardware, but inverted tables can be used for secondary software structures
[Diagram: Phone-book analogy: instead of indexing by all possible phone numbers, keep a 1000-entry LUT indexed by contact ID and apply a hash function to the phone number (e.g., 626-454-9985) to find its slot.]
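A toy inverted page table can be sketched with one entry per frame and linear probing to resolve the collisions mentioned above. All names, sizes, and the hash function are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NFRAMES 8          /* tiny physical memory for illustration */

/* One entry per physical frame, recording which (process, VPN) owns it. */
typedef struct { int pid; uint32_t vpn; int used; } iframe;

static uint32_t ipt_hash(int pid, uint32_t vpn) {
    return (vpn ^ ((uint32_t)pid * 31u)) % NFRAMES;
}

/* Look up with linear probing; stop at the first empty slot. */
int ipt_lookup(const iframe *ipt, int pid, uint32_t vpn) {
    uint32_t h = ipt_hash(pid, vpn);
    for (uint32_t i = 0; i < NFRAMES; i++) {
        uint32_t f = (h + i) % NFRAMES;
        if (!ipt[f].used) return -1;                      /* not resident */
        if (ipt[f].pid == pid && ipt[f].vpn == vpn) return (int)f;
    }
    return -1;
}

int ipt_insert(iframe *ipt, int pid, uint32_t vpn) {
    uint32_t h = ipt_hash(pid, vpn);
    for (uint32_t i = 0; i < NFRAMES; i++) {
        uint32_t f = (h + i) % NFRAMES;
        if (!ipt[f].used) { ipt[f] = (iframe){ pid, vpn, 1 }; return (int)f; }
    }
    return -1;                                            /* memory full */
}
```

The table's size is proportional to physical memory, not to the virtual address space, which is the whole appeal.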
38
A Note on Portability
• You can see that VM implementations leverage specific HW registers and capabilities
• OSs may provide some VM abstraction for easier portability
• They may also maintain secondary data structures about virtual and physical pages to help this process
39
EFFICIENCY
40
VM Design Implications
• Secondary storage access on a page fault is SLOW (~10 ms)
– Implies the page size should be fairly large (i.e., once we've taken the time to find data on disk, make it worthwhile by accessing a reasonably large amount of data)
– Implies a "page fault" takes so long anyway that we can handle it in software (via an exception) rather than in HW like typical cache misses
[Other implications you might not yet appreciate w/o caching knowledge]
– Implies eviction algorithms like LRU can be used, since reducing the page miss rate pays off greatly
– Implies the placement of pages in main memory should be fully associative, to reduce conflicts and maximize the page hit rate
– Implies write-back (write-through would be too expensive)
41
Page Table Performance
• How many accesses to memory does it take to get the desired word for a given virtual address?
• So for each needed memory access, we need 3 additional accesses just for translation (one per level of the SPARC-style 3-level table)?
– That sounds BAD!
• Would that change for a 1- or 2-level table?
• Walking the page table can take a long time
42
Processor Chip
Translation Unit / MMU
Translation Lookaside Buffer (TLB)
• Solution: Let's create a cache for translations = Translation Lookaside Buffer (TLB)
• It needs to be small (64-128 entries) so it can be fast, with a high degree of associativity (at least 4-way and often fully associative) to avoid conflicts
– On a hit, the PPFN is produced and concatenated with the offset
– On a miss, a page table walk is needed
[Diagram: The CPU sends the VPN to the TLB and, on a hit (~10 ns), concatenates the returned PPFN with the page offset to form the PA for the cache/memory access; on a miss, the page table in memory must be walked.]
Cost of translation = Cost(TLB lookup) + (1 - P(TLB hit)) x Cost(page table walk)
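The cost formula translates directly to code. This is a sketch; the 1 ns / 99% / 300 ns numbers in the test are made-up illustrative values, not measurements.

```c
#include <assert.h>

/* Average translation cost from the slide's formula:
   cost(TLB lookup) + (1 - P(hit)) * cost(page table walk). */
double avg_translation_ns(double tlb_ns, double hit_rate, double walk_ns) {
    return tlb_ns + (1.0 - hit_rate) * walk_ns;
}
```

E.g., a 1 ns TLB with a 99% hit rate in front of a 300 ns walk averages about 4 ns per translation, which is why even a small TLB pays off.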
43
Translation Lookaside Buffer (TLB)
[Diagram: Fully associative TLB: the 20-bit VPN (e.g., 0x7ffe4) is compared against the tag of every entry simultaneously (an entry can be anywhere, so all locations must be checked for a hit). Each entry holds a tag (= VPN), valid and dirty bits, and the page frame # (e.g., 0x308ac); on a hit the frame # is concatenated with the 12-bit offset to form the PA.]
44
A 4-Way Set Associative TLB
• 64-entry, 4-way set-associative TLB (the set field indexes each "way")
– On a hit, the page frame # is supplied quickly w/o a page table access
[Diagram: The 20-bit VPN is split into a 16-bit tag and a 4-bit set index; the set index selects one set and the tag is compared against all 4 ways (Way 0-3) in parallel. The matching way's PF# joins the page offset to form the PA.]
45
TLB Miss Process
• On a TLB miss, there is some division of work between the hardware (MMU) and the OS
• Option 1
– The MMU performs the page table walk if needed
– If a page fault occurs, the OS takes over to bring in the page
• Option 2
– On a TLB miss, the OS performs both the page table walk and, if necessary, brings in the page
• When we want to remove a page from memory
– First flush out blocks belonging to that page from the data cache (writing back if necessary)
– Invalidate the tags of those blocks
– Invalidate the TLB entry (if any) corresponding to that page
• If D=1, set the dirty bit in the page table
– If the page is dirty, copy it back to disk
– Simple way to remember this...
• If the parents (page) leave a party then the children (cache blocks & TLB entries) must leave too
46
Multiple Processes
• On a process context switch, can the TLB maintain its mappings?
– Can the TLB hold mappings from multiple processes?
• Recall each process has its own virtual address space, page table, and translations
– Virtual addresses are not unique between processes
• How does the TLB handle a context switch?
– It can choose to only hold translations for the current process and thus invalidate all entries on a context switch
– Or it can hold translations for multiple processes concurrently by concatenating a process or address-space ID (PID or ASID) to the VPN tag
[Diagram: Each TLB tag is extended with an ASID (a unique ID for each process) alongside the VPN; on lookup, both the ASID and the VPN must match for a hit. Each entry holds the tag, page frame #, valid, and modified bits.]
47
Page Sizes & Superpages
• For a large contiguous chunk of virtual address space, we can attempt to allocate a large contiguous physical chunk
– All the lower (2nd) level entries of the page table would point at a contiguous range
– Instead of the level-1 entry pointing at a level-2 page table, it can hold the physical translation itself
– A bit in the level-1 entry indicates whether it contains this physical translation or a pointer to the 2nd-level table
[Diagram: A superpage occupies one large contiguous region of both the virtual and physical address spaces; the level-1 (page directory) entry supplies the translation directly, and the remaining 22 bits of the VA serve as the offset (e.g., VA 0x01001040 -> PA 0x3a001040). x86 allows 2MB or 1GB superpages by skipping the last 1 or 2 levels of its 4-level page table scheme.]
48
Processor Chip
Translation Unit / MMU
Modern Processor Organizations
• Oftentimes there are separate TLBs for instruction and data address translation (and separate instruction/data caches as well)
[Diagram: The CPU's instruction fetches go through an ITLB and IL1 cache while data accesses go through a DTLB and DL1 cache; both TLBs fall back to the in-memory page table on a miss.]
49
Multiprocessor TLB Issues & Shootdown
• Assume 2 threads from each of 2 processes executing on 4 cores
• TLBs store copies of PTEs but there is no HW support for TLB "coherency"
• What if one thread needs to reduce the access privileges of a page?
– We must conservatively invalidate the TLB entries in the other processors
• What if accesses by P2 cause eviction of pages whose translations are cached in P1's TLBs?
– SHOOTDOWN: one processor interrupts another and indicates a TLB entry that should be invalidated
[Diagram: Four cores (Core1/Core2 running P1, Core3/Core4 running P2), each with its own Translation Unit / MMU, TLB, and cache, all walking shared 2-level page tables (page directory + page tables) in memory.]
50
IMPLICATIONS FOR DATA CACHES
51
Cache Addressing with VM
• Review of caches
– Caches store copies of data, indexed by the main-memory address the data came from
– Simplified view: 2 steps to determine a hit
• Index: hash a portion of the address to find the "set" to look in
• Tag match: compare the remaining address bits to all entries in the set to determine a hit
– These two steps (index + tag match) are sequential
• Rather than waiting for the full address translation and then performing this two-step hit process, can we overlap the translation with portions of the hit sequence?
– Yes, if we choose the page size, block size, and set/direct mapping carefully
[Diagram: The address is split into tag, index/hash, and offset fields; the index selects a set of (addr, data) entries in the cache.]
Virtual vs. Physical Addressed Cache
• Physically indexed, physically tagged (PIPT)
– Wait for the full address translation
– Then use the physical address for both indexing and tag comparison
• Virtually indexed, physically tagged (VIPT)
– Use a portion of the virtual address for indexing, then wait for the address translation and use the physical address for tag comparison
– Easiest when the index portion of the virtual address lies within the offset (page size) bits; otherwise aliasing may occur
• Virtually indexed, virtually tagged (VIVT)
– Use the virtual address for both indexing and tagging... no TLB access unless there is a cache miss
– Requires invalidating cache lines on a context switch, or using a process ID as part of the tags
[Diagram: For PIPT, both the set/block index and the tag come from the PA; for VIPT, the index comes from the VA while the tag comes from the PA; for VIVT, both come from the VA.]
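The VIPT aliasing condition above reduces to a one-line check: the index (+ block offset) bits fit inside the page offset exactly when each cache way is no larger than a page. The function name is invented for illustration.

```c
#include <assert.h>

/* VIPT is alias-free when cache_size / associativity <= page_size,
   i.e., the virtual index bits are guaranteed to equal the physical ones. */
int vipt_alias_free(unsigned cache_bytes, unsigned ways, unsigned page_bytes) {
    return cache_bytes / ways <= page_bytes;
}
```

This is one reason L1 caches are often kept small but highly associative (e.g., 32 KB 8-way with 4 KB pages), while larger L2/L3 caches are physically indexed.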
53
Virtual vs. Physical Addressed Cache
• Another view:
[Diagram: side-by-side comparison of a virtually addressed cache and a physically addressed cache.]
In a modern system the L1 caches may be virtually addressed while L2 may be physically addressed.
54
SOFTWARE PROTECTION
55
Protection Without (With Less) HW
• Interpreted languages can perform checking before dereferencing
– Most such languages don't provide raw pointer features
• Intermediate code
– MS .NET - Many languages (C#, VB, etc.) are compiled to intermediate byte-code and then run by an interpreter
– Java works similarly
• Program analysis
– Insert code to perform checks that prove the program cannot violate certain properties
– If a property is proven, we can remove the corresponding checks