part three: memory management · 2013-03-08 · implementation of page table the page table is...
TRANSCRIPT
Part three: Memory Management
• Programs, together with the data they access, must be in main memory (at least partially) during execution.
• The computer keeps several processes in memory.
• Many memory-management schemes exist, reflecting various approaches, and the effectiveness of each algorithm depends on the situation.
• Selection of a memory-management scheme for a system depends on many factors, especially on the hardware design of the system.
• Each algorithm requires its own hardware support.
P271
Chapter 8: Main Memory
• Background
• Contiguous Memory Allocation
• Paging
• Structure of the Page Table
• Segmentation
Objectives
• To provide a detailed description of various ways of organizing memory hardware
• To discuss various memory-management techniques, including paging and segmentation
Background
• A program must be brought (from disk) into memory and placed within a process for it to be run.
• Main memory and registers are the only storage the CPU can access directly.
• Register access takes one CPU clock cycle (or less).
• Main memory can take many cycles.
• Cache sits between main memory and CPU registers.
• Protection of memory is required to ensure correct operation: the operating system has to be protected from access by user processes and, in addition, user processes must be protected from one another.
P276
Base and Limit Registers
• A pair of base and limit registers define the logical address space.
Binding of Instructions and Data to Memory
• Address binding of instructions and data to memory addresses can happen at three different stages:
• Compile time: If the memory location is known a priori, absolute code can be generated; the code must be recompiled if the starting location changes.
• Load time: Relocatable code must be generated if the memory location is not known at compile time.
• Execution time: Binding is delayed until run time if the process can be moved during its execution from one memory segment to another. This requires hardware support for address maps (e.g., base and limit registers).
P278
Multistep Processing of a User Program
Logical vs. Physical Address Space
• The concept of a logical address space that is bound to a separate physical address space is central to proper memory management.
• Logical address – generated by the CPU; also referred to as a virtual address.
• Physical address – the address seen by the memory unit.
• We now have two different types of addresses: logical addresses (in the range 0 to max) and physical addresses (for example, in the range R + 0 to R + max). The user generates only logical addresses and thinks that the process runs in locations 0 to max. These logical addresses must be mapped to physical addresses before they are used.
P279
Logical vs. Physical Address Space
• Logical and physical addresses are the same in compile-time and load-time address-binding schemes; logical (virtual) and physical addresses differ in the execution-time address-binding scheme.
Memory-Management Unit (MMU)
• A hardware device that maps virtual to physical addresses.
• In the MMU scheme, for example, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory.
• The user program deals with logical addresses; it never sees the real physical addresses.
P280
Dynamic relocation using a relocation register
Dynamic Loading
• A routine is not loaded until it is called.
• Better memory-space utilization; an unused routine is never loaded.
• Useful when large amounts of code are needed to handle infrequently occurring cases.
• No special support from the operating system is required; dynamic loading is implemented through program design.
P280
Dynamic Linking
• Linking is postponed until execution time.
• A small piece of code, the stub, is used to locate the appropriate memory-resident library routine.
• The stub replaces itself with the address of the routine and executes the routine.
• The operating system is needed to check whether the routine is in the process's memory address space.
• All processes that use a language library execute only one copy of the library code.
• This system is also known as shared libraries.
P281
Swapping
• A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution.
• Backing store – a fast disk large enough to accommodate copies of all memory images for all users; it must provide direct access to these memory images.
• Roll out, roll in – a swapping variant used for priority-based scheduling algorithms; a lower-priority process is swapped out so a higher-priority process can be loaded and executed.
P282
Schematic View of Swapping
Contiguous Allocation
• Main memory is usually divided into two partitions:
• The resident operating system, usually held in low memory with the interrupt vector.
• User processes, held in high memory.
• Each process is contained in a single contiguous section of memory.
P284
Contiguous Allocation
• Relocation registers are used to protect user processes from each other, and from changing operating-system code and data.
• The relocation register contains the value of the smallest physical address.
• The limit register contains the range of logical addresses – each logical address must be less than the limit register.
• The MMU maps logical addresses dynamically.
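The relocation/limit check described above can be sketched as a tiny simulation. The register values below are illustrative, not from the slides:

```python
def translate(logical, relocation, limit):
    """Hardware translation with relocation and limit registers.

    The limit register bounds legal logical addresses; the relocation
    register holds the smallest physical address of the partition.
    """
    if logical >= limit:
        # Failing the limit check causes a trap to the operating system
        raise MemoryError("addressing error: trap to OS")
    return relocation + logical

# A process loaded at physical address 14000 with a 300-byte partition:
print(translate(100, 14000, 300))  # logical 100 -> physical 14100
```

Note that the check happens before the add: an out-of-range logical address never reaches the memory unit.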
Hardware Support for Relocation and Limit Registers
P285
Memory Allocation
• Fixed-sized partitions: memory is divided into a number of fixed-size regions, and each region can hold one process.
• The degree of multiprogramming is bound by the number of partitions.
• Internal fragmentation results.
• Multiple-partition method.
P286
Contiguous Allocation (Cont.)
• Multiple-partition allocation:
• Hole – a block of available memory; holes of various sizes are scattered throughout memory.
• When a process arrives, it is allocated memory from a hole large enough to accommodate it.
• The operating system maintains information about: a) allocated partitions, b) free partitions (holes).
[Figure: successive memory snapshots – the OS, process 5, and process 2 stay resident while process 8 is swapped out leaving a hole, process 9 is loaded into it, and process 10 then fills part of the remaining space.]
Dynamic Storage-Allocation Problem
How to satisfy a request of size n from a list of free holes?
• First-fit: Allocate the first hole that is big enough.
• Best-fit: Allocate the smallest hole that is big enough; must search the entire list, unless it is ordered by size.
• Produces the smallest leftover hole.
• Worst-fit: Allocate the largest hole; must also search the entire list.
• Produces the largest leftover hole.
P287
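The three fit strategies can be compared side by side with a small sketch. The hole sizes and request size below are illustrative:

```python
def pick_hole(holes, n, strategy):
    """Return the index of the hole chosen for a request of size n, or None."""
    # Candidate holes that are big enough, in list order
    fits = [(size, i) for i, size in enumerate(holes) if size >= n]
    if not fits:
        return None
    if strategy == "first":
        return fits[0][1]        # first adequate hole in memory order
    if strategy == "best":
        return min(fits)[1]      # smallest adequate hole -> smallest leftover
    if strategy == "worst":
        return max(fits)[1]      # largest hole -> largest leftover
    raise ValueError(strategy)

holes = [100, 500, 200, 300, 600]
print(pick_hole(holes, 212, "first"))  # 1 (500 is the first that fits)
print(pick_hole(holes, 212, "best"))   # 3 (300 leaves the smallest leftover)
print(pick_hole(holes, 212, "worst"))  # 4 (600 leaves the largest leftover)
```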
Fragmentation
• External fragmentation – total memory space exists to satisfy a request, but it is not contiguous.
• Internal fragmentation – allocated memory may be slightly larger than the requested memory; this size difference is memory internal to a partition that is not being used.
• Reduce external fragmentation by compaction:
• Shuffle the memory contents to place all free memory together in one large block.
• Compaction is possible only if relocation is dynamic and is done at execution time.
• This scheme can be expensive.
P287
Paging
• Contiguous allocation causes fragmentation, and compaction is very costly to system performance.
• The physical address space of a process can be noncontiguous; the process is allocated physical memory whenever the latter is available.
• Divide physical memory into fixed-sized blocks called frames (size is a power of 2, between 512 bytes and 8,192 bytes).
• Divide logical memory into blocks of the same size called pages.
P288
Paging
• Keep track of all free frames.
• To run a program of size n pages, we need to find n free frames and load the program.
• Set up a page table to translate logical to physical addresses.
• Internal fragmentation may occur.
Paging Model of Logical and Physical Memory
Address Translation Scheme
• The address generated by the CPU is divided into:
• Page number (p) – used as an index into a page table which contains the base address of each page in physical memory.
• Page offset (d) – combined with the base address to define the physical memory address that is sent to the memory unit.
P289
Address Translation Scheme
• For a given logical address space of size 2^m and page size 2^n, the logical address divides into a page number of m − n bits and a page offset of n bits:

  page number | page offset
  p (m − n bits) | d (n bits)
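The p/d split is just a shift and a mask. A minimal sketch, where n is the number of offset bits:

```python
def split_address(addr, n):
    """Split a logical address into (page number, offset) for page size 2**n."""
    page = addr >> n                 # high-order m - n bits
    offset = addr & ((1 << n) - 1)   # low-order n bits
    return page, offset

# 32-byte logical memory with 4-byte pages (m = 5, n = 2):
print(split_address(13, 2))  # (3, 1): page 3, offset 1
```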
Paging Hardware
Paging Example
[Figure: 32-byte memory and 4-byte pages – logical pages 0–3 are mapped through the page table to physical frames 0–7.]
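As a concrete sketch of the translation in this 32-byte example. The page-to-frame mapping below is an assumed illustration, not fixed by the slides:

```python
PAGE_SIZE = 4                  # 32-byte memory, 4-byte pages
page_table = [5, 6, 1, 2]      # assumed mapping: page i -> frame page_table[i]

def to_physical(logical):
    """Translate a logical address via the page table."""
    page, offset = divmod(logical, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset

print(to_physical(0))   # page 0, offset 0 -> frame 5 -> physical 20
print(to_physical(13))  # page 3, offset 1 -> frame 2 -> physical 9
```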
Page Size
• If process size is independent of page size, we expect internal fragmentation to average one-half page per process. This consideration suggests that small page sizes are desirable.
• However, overhead is involved in each page-table entry, and this overhead is reduced as the size of the pages increases.
• Also, disk I/O is more efficient when the amount of data being transferred is larger. Generally, page sizes have grown over time as processes, data sets, and main memory have become larger.
• Today, pages are typically between 4 KB and 8 KB in size.
P291
Page Size
• Usually, each page-table entry is 4 bytes long, but that size can vary as well.
• A 32-bit entry can point to one of 2^32 physical page frames. If the frame size is 4 KB, then a system with 4-byte entries can address 2^44 bytes (or 16 TB) of physical memory.
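The arithmetic behind the 16 TB figure can be checked directly:

```python
frames = 2 ** 32        # a 32-bit entry can name 2**32 frames
frame_size = 4 * 1024   # 4 KB per frame is 2**12 bytes
total = frames * frame_size

print(total == 2 ** 44)   # True: 2**32 * 2**12 = 2**44 bytes
print(total // 2 ** 40)   # 16 (terabytes)
```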
Free Frames
[Figure: free-frame list before allocation and after allocation]
User's view of memory
• There is a clear separation between the user's view of memory and the actual physical memory.
• The user program views memory as one single space, containing only this one program.
• In fact, the user program is scattered throughout physical memory, which also holds other programs.
• The difference between the user's view of memory and the actual physical memory is reconciled by the address-translation hardware.
P291
User's view of memory
• The logical addresses are translated into physical addresses. This mapping is hidden from the user and is controlled by the operating system.
• Notice that the user process by definition is unable to access memory it does not own. It has no way of addressing memory outside of its page table, and the table includes only those pages that the process owns.
Frame table
• Since the operating system is managing physical memory, it must be aware of the allocation details of physical memory: which frames are allocated, which frames are available, how many total frames there are, and so on.
• This information is generally kept in a data structure called a frame table.
• The frame table has one entry for each physical page frame, indicating whether the frame is free or allocated and, if it is allocated, to which page of which process or processes.
P292
Context-switch time
• If a user makes a system call (to do I/O, for example) and provides an address as a parameter (a buffer, for instance), that address must be mapped to produce the correct physical address.
• The operating system maintains a copy of the page table for each process, just as it maintains a copy of the instruction counter and register contents.
• This copy is also used by the CPU dispatcher to define the hardware page table when a process is to be allocated the CPU.
• Paging therefore increases the context-switch time.
P292
Implementation of Page Table
• In the simplest case, the page table is implemented as a set of dedicated registers.
• These registers should be built with very high-speed logic to make the paging-address translation efficient.
• Every access to memory must go through the paging map, so efficiency is a major consideration.
• The CPU dispatcher reloads these registers, just as it reloads the other registers.
• The use of registers for the page table is satisfactory only if the page table is reasonably small (for example, 256 entries).
P292
Implementation of Page Table
• The page table is kept in main memory.
• The page-table base register (PTBR) points to the page table.
• The page-table length register (PTLR) indicates the size of the page table.
• In this scheme every data/instruction access requires two memory accesses: one for the page table and one for the data/instruction.
P293
TLB
• The two-memory-access problem can be solved by the use of a special fast-lookup hardware cache called associative memory or translation look-aside buffer (TLB).
• The TLB is associative high-speed memory. Each entry in the TLB consists of two parts: a key (or tag) and a value. When the associative memory is presented with an item, the item is compared with all keys simultaneously. If the item is found, the corresponding value field is returned.
• The search is fast; the hardware, however, is expensive. Typically, the number of entries in a TLB is small, often numbering between 64 and 1,024.
Associative Memory
• Associative memory – parallel search over (page #, frame #) pairs.
Address translation for (p, d):
• If p is in an associative register, get the frame # out.
• Otherwise, get the frame # from the page table in memory.
How to use the TLB
• The TLB contains only a few of the page-table entries.
• When a logical address is generated by the CPU, its page number is presented to the TLB.
• If the page number is found, its frame number is immediately available and is used to access memory. The whole task may take less than 10 percent longer than it would if an unmapped memory reference were used.
How to use the TLB
• If the page number is not in the TLB (known as a TLB miss), a memory reference to the page table must be made. When the frame number is obtained, we can use it to access memory. In addition, we add the page number and frame number to the TLB, so that they will be found quickly on the next reference.
• If the TLB is already full of entries, the operating system must select one for replacement.
• Every time a new page table is selected (for instance, with each context switch), the TLB must be flushed (or erased) to ensure that the next executing process does not use the wrong translation information.
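The lookup/fill/flush behavior above can be sketched as a toy TLB. FIFO replacement is an assumption here; the slides leave the replacement policy open:

```python
from collections import OrderedDict

class TLB:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # page number -> frame number

    def lookup(self, page):
        return self.entries.get(page)  # None signals a TLB miss

    def fill(self, page, frame):
        # On a miss, cache the translation; evict the oldest entry if full
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)
        self.entries[page] = frame

    def flush(self):
        # Done at each context switch so stale translations are never used
        self.entries.clear()

tlb = TLB(capacity=2)
tlb.fill(3, 7)
print(tlb.lookup(3))  # 7 (hit)
print(tlb.lookup(4))  # None (miss: walk the page table, then fill)
tlb.flush()
print(tlb.lookup(3))  # None (flushed at context switch)
```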
Paging Hardware With TLB
Effective Access Time
• Associative lookup = ε time units.
• Assume the memory cycle time is 1 microsecond.
• Hit ratio – the percentage of times that a page number is found in the associative registers; the ratio is related to the number of associative registers.
• Hit ratio = α.
• Effective Access Time (EAT):
  EAT = (1 + ε)α + (2 + ε)(1 − α)
      = 2 + ε − α
P294
Effective Access Time
• An 80-percent hit ratio means that we find the desired page number in the TLB 80 percent of the time.
• If it takes 20 nanoseconds to search the TLB and 100 nanoseconds to access memory,
• then a mapped-memory access takes 120 nanoseconds when the page number is in the TLB.
• If we fail to find the page number in the TLB (20 nanoseconds), then we must first access memory for the page table and frame number (100 nanoseconds) and then access the desired byte in memory (100 nanoseconds), for a total of 220 nanoseconds.
EAT = 0.80 × 120 + 0.20 × 220 = 140 nanoseconds
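The EAT computation above can be verified as a one-liner:

```python
def eat(hit_ratio, tlb_ns, mem_ns):
    """Effective access time: a hit costs one memory access, a miss costs two."""
    hit = tlb_ns + mem_ns        # TLB search + one memory access
    miss = tlb_ns + 2 * mem_ns   # TLB search + page-table access + data access
    return hit_ratio * hit + (1 - hit_ratio) * miss

print(eat(0.80, 20, 100))  # 140.0 nanoseconds, matching the slide
```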
Memory Protection
• Memory protection is implemented by associating a protection bit with each frame.
• One bit can define a page to be read-write or read-only.
• We can easily expand this approach to provide a finer level of protection:
• We can create hardware to provide read-only, read-write, or execute-only protection,
• or, by providing separate protection bits for each kind of access, we can allow any combination of these accesses.
P295
Memory Protection
• A valid-invalid bit is attached to each entry in the page table:
• "valid" indicates that the associated page is in the process's logical address space, and is thus a legal page.
• Rarely does a process use all of its address range. In fact, many processes use only a small fraction of the address space available to them.
• It would be wasteful in these cases to create a page table with entries for every page in the address range. Most of this table would be unused but would take up valuable memory space.
Valid (v) or Invalid (i) Bit In A Page Table
Memory Protection
• Some systems provide hardware in the form of a page-table length register (PTLR) to indicate the size of the page table.
• This value is checked against every logical address to verify that the address is in the valid range for the process.
• Failure of this test causes an error trap to the operating system.
P296
Shared Pages
• Shared code:
• One copy of read-only (reentrant) code is shared among processes (e.g., text editors, compilers, window systems).
• Shared code must appear in the same location in the logical address space of all processes.
• Private code and data:
• Each process keeps a separate copy of the code and data.
• The pages for the private code and data can appear anywhere in the logical address space.
P296
Shared Pages Example
Structure of the Page Table
• Hierarchical Paging
• Hashed Page Tables
• Inverted Page Tables
Hierarchical Page Tables
• A large logical address space needs a large page table.
• The page table must be stored contiguously in main memory.
• It is very difficult to find a contiguous space larger than the page size to store the page table.
• Break up the logical address space into multiple page tables.
• A simple technique is a two-level page table.
P297
Two-Level Page-Table Scheme
Two-Level Paging Example
• A logical address (on a 32-bit machine with 4K page size) is divided into:
• a page number consisting of 20 bits
• a page offset consisting of 12 bits
• Since the page table is paged, the page number is further divided into:
• a 10-bit page number
• a 10-bit page offset
P298
Two-Level Paging Example
• Thus, a logical address is as follows:

  page number | page offset
    p1 | p2   |     d
    10 | 10   |    12

where p1 is an index into the outer page table, and p2 is the displacement within the page of the outer page table. (A very elegant design!)
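Extracting p1, p2, and d is again just shifts and masks:

```python
def split_two_level(addr):
    """32-bit logical address -> (p1, p2, d) with the 10/10/12 split."""
    d = addr & 0xFFF            # low 12 bits: page offset
    p2 = (addr >> 12) & 0x3FF   # next 10 bits: index into an inner page table
    p1 = (addr >> 22) & 0x3FF   # high 10 bits: index into the outer page table
    return p1, p2, d

addr = (1 << 22) | (2 << 12) | 3
print(split_two_level(addr))  # (1, 2, 3)
```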
Address-Translation Scheme
Three-Level Paging Scheme
P299
Hashed Page Tables
• Common in address spaces > 32 bits.
• The virtual page number is hashed into a page table. This page table contains a chain of elements hashing to the same location.
• Virtual page numbers are compared in this chain, searching for a match. If a match is found, the corresponding physical frame is extracted.
P300
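A hashed page table with chaining can be sketched as follows. The table size and hash function (simple modulo) are illustrative assumptions:

```python
TABLE_SIZE = 8
table = [[] for _ in range(TABLE_SIZE)]   # each slot holds a chain of entries

def insert(vpn, frame):
    """Map virtual page number vpn to frame, chaining on collision."""
    table[vpn % TABLE_SIZE].append((vpn, frame))

def lookup(vpn):
    """Walk the chain at the hashed slot, comparing virtual page numbers."""
    for entry_vpn, frame in table[vpn % TABLE_SIZE]:
        if entry_vpn == vpn:
            return frame
    return None   # no match in the chain: the page is not mapped

insert(3, 10)
insert(11, 20)     # 11 % 8 == 3: collides with vpn 3, chained in the same slot
print(lookup(11))  # 20 (found by walking the chain)
print(lookup(5))   # None
```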
Hashed Page Table
Inverted Page Table
• Only one page table is in the system.
• There is one entry for each real page of memory.
• Each entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page.
• This decreases the memory needed to store each page table, but increases the time needed to search the table when a page reference occurs.
P301
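The inverted-table search, one entry per physical frame keyed by owning process and page, can be sketched as a linear scan. The table contents below are illustrative:

```python
# inverted_table[frame] = (pid, page) currently stored in that physical frame
inverted_table = [(1, 0), (2, 5), (1, 3), (2, 1)]

def frame_of(pid, page):
    """Search every frame's entry for a (pid, page) match."""
    for frame, entry in enumerate(inverted_table):
        if entry == (pid, page):
            return frame
    return None   # no entry matches: page fault

print(frame_of(1, 3))  # 2 (process 1's page 3 lives in frame 2)
print(frame_of(3, 0))  # None (fault)
```

The scan makes the space saving concrete: one table for the whole machine, at the cost of searching it on every reference (in practice this table is itself hashed).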
Inverted Page Table Architecture
Segmentation
• Paging enforces a separation of the user's view of memory from the actual physical memory.
• The user's view of memory is not the same as the actual physical memory.
• A program is a collection of segments. A segment is a logical unit such as: main program, procedure, function, method, object, local variables, global variables, common block, stack, symbol table, arrays.
P302
User's View of a Program
Logical View of Segmentation
[Figure: segments 1–4 in user space are mapped to scattered, noncontiguous regions of physical memory space.]
Segmentation Architecture
• A logical address consists of a two-tuple: <segment-number, offset>.
• Segment table – maps two-dimensional logical addresses to one-dimensional physical addresses; each table entry has:
• base – contains the starting physical address where the segment resides in memory.
• limit – specifies the length of the segment.
• Segment-table base register (STBR) points to the segment table's location in memory.
• Segment-table length register (STLR) indicates the number of segments used by a program; a segment number s is legal if s < STLR.
P303
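Segment translation with the STLR and limit checks can be sketched as follows. The (base, limit) values are illustrative:

```python
def seg_translate(s, offset, segment_table, stlr):
    """Translate <segment-number, offset> using (base, limit) table entries."""
    if s >= stlr:
        # Segment number is not legal: s must satisfy s < STLR
        raise MemoryError("trap: segment number out of range")
    base, limit = segment_table[s]
    if offset >= limit:
        # Offset must fall within the segment's length
        raise MemoryError("trap: offset beyond segment limit")
    return base + offset

table = [(1400, 1000), (6300, 400)]         # illustrative (base, limit) pairs
print(seg_translate(1, 52, table, stlr=2))  # 6352
```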
Segmentation Hardware
P304
Segmentation Architecture (Cont.)
• Protection:
• With each entry in the segment table associate:
  • validation bit = 0 ⇒ illegal segment
  • read/write/execute privileges
• Protection bits are associated with segments; code sharing occurs at the segment level.
• Since segments vary in length, memory allocation is a dynamic storage-allocation problem.
• A segmentation example is shown in the following diagram.
Example of Segmentation
Homework: 4, 5, 6, 7, 9, 11, 12, 13
End of Chapter 8