CENG 334 – Operating Systems 06 - Memory Asst. Prof. Yusuf Sahillioğlu, Computer Eng. Dept., Turkey
Memory Management2 / 105
Program must be brought (from disk) into memory to run
Main memory and registers are only storage CPU can access directly
Register access in one CPU clock cycle (perform multiplication a * b)
Main memory can take many cycles (read the operands from memory, or write result back to memory)
Cache sits between main memory and CPU registers (2-3 cycles)
Instructions that are executed and data that is operated on
Protection of memory required to ensure correct operation
Memory Management4 / 105
When code is generated (or assembly program is written) we use memory addresses for variables, functions, and branching/jumping
Those addresses can be physical or logical (=virtual) memory addresses
Physical: discontinuous locations in main memory.
Logical:
Memory Management5 / 105
Logical: each process is given its own continuous logical memory space = own view of memory with its own address space
Logical addresses are divided into fixed-size pages.
Memory Management6 / 105
Key advantage of logical addressing (= paging): eliminates the issue of external fragmentation.
Since the CPU translates logical page-based addresses to physical frame-based addresses, there is no need for the physical frames to be contiguous
Memory Management7 / 105
Physical address of a variable is 0x0734432. That variable has to sit there while the program is executing: no relocation.
Logical address of a variable is 7 for myArray[7]. It does not have to sit at physical address 0x0000007 (7 in hexadecimal).
Memory Management10 / 105
With physical addressing only, we cannot have a multiprogramming environment. We cannot load a program to an arbitrary position.
Early systems had this physical addressing idea. Thank god we don’t.
Memory Management11 / 105
Logical address space concept. A program uses logical addresses. Logical address space has to be mapped somewhere in
physical (main) memory.
Memory Management12 / 105
Logical addresses provide: multiprogramming environment, relocatable code
Binding: mapping logical addresses to physical addresses.
physicalAddr = logicalAddr + base
Logical address space is bound to a physical address space
Memory Management15 / 105
An example
Memory Management Unit (MMU) converts logical address 28 into physical address 52 (28 + 24 = 52, so M[52] is accessed) at execution time.
Memory Management16 / 105
Hardware device that at run time maps virtual (logical) to physical address
In the previous simple example we used 1 relocation register: base. More complicated schemes exist.
The user program deals with logical addresses; it never sees the real physical addresses
Execution-time binding occurs when reference is made to location in memory
Logical address bound to physical addresses
Memory Management18 / 105
Another memory management idea: Swapping
Assume 10 programs are loaded into memory and memory is filled up
A process can be swapped temporarily out of memory to a backing store (disk), and then brought back into memory for continued execution
Swapping is started if more than a threshold amount of memory is allocated
Disabled again once memory demand is reduced below the threshold
Memory Management20 / 105
Contiguous allocation (continuous): allocate physical space that is equal to process’ logical address space
Main memory is usually divided into two partitions:
Resident operating system, usually held in low memory with the interrupt vector
User processes, held in high memory
Relocation registers are used to protect user processes from each other, and from changing operating-system code and data
Base register contains the value of the smallest physical address
Limit register contains the range of logical addresses (size of the program): each logical address must be less than the limit register
MMU maps logical addresses dynamically
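The base/limit check described above can be sketched in a few lines. This is only an illustration, not how any particular MMU is built: the function name `relocate` and the limit value 100 are assumptions; base 24 and logical address 28 come from the earlier example.

```python
def relocate(logical_addr: int, base: int, limit: int) -> int:
    """Map a logical address to a physical one using base/limit registers."""
    # Protection: every logical address must be less than the limit register.
    if not 0 <= logical_addr < limit:
        raise MemoryError(f"logical address {logical_addr} is outside limit {limit}")
    # Relocation: physicalAddr = logicalAddr + base.
    return logical_addr + base


# limit=100 is a made-up program size for the demo.
print(relocate(28, base=24, limit=100))  # 52
```

An address at or beyond the limit raises an error instead of reaching memory, which is how one process is kept out of another's partition.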
Memory Management22 / 105
Contiguous allocation
After a while we see partitions, some of which are empty (hole).
Memory Management23 / 105
Contiguous allocation Multiple-partition allocation
Degree of multiprogramming limited by number of partitions
Hole: block of available memory; holes of various size are scattered throughout memory
When a process arrives, it is allocated memory from a hole large enough to accommodate it
Process exiting frees its partition, adjacent free partitions combined
Operating system maintains information about: a) allocated partitions b) free partitions (holes)
Memory Management24 / 105
Contiguous allocation
How to satisfy a request of size n from a list of free holes?
First-fit: Allocate the first hole that is big enough
Best-fit: Allocate the smallest hole that is big enough; must search entire list, unless ordered by size. Produces the smallest leftover hole
Worst-fit: Allocate the largest hole; must also search entire list. Produces the largest leftover hole
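A minimal sketch of the three hole-selection policies, assuming the free list is just a Python list of hole sizes. The function names and the sample hole sizes are illustrative, not from the slides. Each function returns the index of the chosen hole, or None if no hole is big enough.

```python
from typing import List, Optional


def first_fit(holes: List[int], n: int) -> Optional[int]:
    # Stop at the first hole that is big enough.
    for i, size in enumerate(holes):
        if size >= n:
            return i
    return None


def best_fit(holes: List[int], n: int) -> Optional[int]:
    # Scan the whole list and keep the smallest hole that still fits.
    fitting = [(size, i) for i, size in enumerate(holes) if size >= n]
    return min(fitting)[1] if fitting else None


def worst_fit(holes: List[int], n: int) -> Optional[int]:
    # Scan the whole list and keep the largest hole.
    fitting = [(size, i) for i, size in enumerate(holes) if size >= n]
    return max(fitting)[1] if fitting else None


holes = [100, 500, 200, 300, 600]  # made-up hole sizes
print(first_fit(holes, 212), best_fit(holes, 212), worst_fit(holes, 212))  # 1 3 4
```

For a request of 212, first-fit takes the 500 hole, best-fit the 300 hole (leftover 88), and worst-fit the 600 hole (leftover 388).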
Memory Management25 / 105
Fragmentation: There will be useless holes that cannot accommodate any process contiguously
External Fragmentation: external to allocated partitions you have unused space. Total memory space exists to satisfy a request, but it is not contiguous
Reduce external fragmentation by compaction: shuffle memory contents to place all free memory together in 1 large block
Internal Fragmentation: allocated memory may be slightly larger than requested memory; this size difference is memory internal to a partition, but not being used
Memory Management26 / 105
More advanced idea for memory management: Paging
Used for implementing virtual memory, which allows a program whose size is > physical memory size to be run
Also good for eliminating external fragmentation
Allows logical and physical address spaces to be noncontiguous
High utilization of memory space
Memory Management27 / 105
More advanced idea for memory management: Paging
Divide physical memory into fixed-sized blocks called frames
Size is a power of 2, between 512 bytes and 16 Mbytes
Divide logical memory into blocks of the same size called pages
Keep track of all free frames
To run a program of size N pages, need to find N free frames and load the program
Set up a page table to translate logical to physical addresses
Not a simple translation anymore: phyAddr != logAddr + base
Memory Management29 / 105
More advanced idea for memory management: Paging Divide physical memory into fixed-size blocks, called (page)
frames.
Page frame is a container that can hold a content which is a page.
Memory Management30 / 105
More advanced idea for memory management: Paging Divide physical memo into fixed-size (4K) blocks, called
(page) frames. Frame0 has address space from 0 to 4095, frame1 has 4096
to 8191, ..
Memory Management31 / 105
More advanced idea for memory management: Paging Divide logical address space into fixed-size blocks, called
pages, whose size is equal to the page frame size (4K) Frame0 has address space from 0 to 4095, frame1 has 4096
to 8191, ..
Memory Management34 / 105
More advanced idea for memory management: Paging
Divide logical address space into fixed-size blocks, called pages, whose size is equal to the page frame size (4K)
When a program is loaded into memory, the allocation does not have to be contiguous
Info is kept in a Page Table, determined by OS, for each process
Conversion logical->physical done by HW (CPU)
Memory Management35 / 105
More advanced idea for memory management: Paging
Conversion logical->physical done by HW (CPU)
Assume pageSize = 4 bytes
pageNumber = 1 & offset = 3 for h
LA = 7; PA = ?
PA = 4*6 + 3 = 27
Memory Management36 / 105
More advanced idea for memory management: Paging
Conversion logical->physical done by HW (CPU)
Assume pageSize = 4 bytes
_ _ _ _ //4-bit logical address
First _ _ for page number
Next _ _ for offset (displacement) inside page
Logical address of h is 0 1 1 1
Memory Management37 / 105
More advanced idea for memory management: Paging
Conversion logical->physical done by HW (CPU)
Assume pageSize = 4 bytes
LA for f = 5; PA = ?
Logical address is 0 1 0 1 (5 in binary)
Page number = 01 -> 1 in decimal
PA = 110 01 (Frame number = 6)
Offset will not change ‘cos it is a relative position; copy it from the LA
110 01 = 25 in decimal, which is the PA for f
Memory Management38 / 105
More advanced idea for memory management: Paging
Conversion logical->physical done by HW (CPU)
Assume pageSize = 4 bytes
LA for l = 11; PA = ?
Logical address is 1 0 1 1 (11 in binary)
Page number = 10 -> 2 in decimal
PA = 001 11 (Frame number = 1)
Offset will not change ‘cos it is a relative position; copy it from the LA
001 11 = 7 in decimal, which is the PA for l
Memory Management39 / 105
More advanced idea for memory management: Paging
Conversion logical->physical done by HW (CPU)
Assume pageSize = 4 bytes
LA for n = 13; PA = ?
Logical address is 1 1 0 1 (13 in binary)
Page number = 11 -> 3 in decimal
PA = 010 01 (Frame number = 2)
Offset will not change ‘cos it is a relative position; copy it from the LA
010 01 = 9 in decimal, which is the PA for n
Memory Management40 / 105
More advanced idea for memory management: Paging
Conversion logical->physical done by HW (CPU)
In general, the address generated by the CPU is divided into:
Page number (p): used as an index into a page table which contains base address of each page in physical memory
Page offset (d): combined with base address to define the physical memory address that is sent to the memory unit
For given logical address space 2^m and page size 2^n, p is the high-order m - n bits and d is the low-order n bits
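The worked examples for h, f, l and n can be reproduced with a short sketch of the p/d split. The page-table entries for pages 1, 2 and 3 (frames 6, 1, 2) are taken from those examples; the entry for page 0 (frame 5) and the name `page_translate` are assumptions made only so the table is complete.

```python
OFFSET_BITS = 2               # page size = 2^2 = 4 bytes
PAGE_SIZE = 1 << OFFSET_BITS

# page number -> frame number; the page 0 entry is an assumed value.
page_table = {0: 5, 1: 6, 2: 1, 3: 2}


def page_translate(logical_addr: int) -> int:
    p = logical_addr >> OFFSET_BITS       # high-order bits: page number
    d = logical_addr & (PAGE_SIZE - 1)    # low-order bits: offset, copied unchanged
    frame = page_table[p]                 # page table lookup
    return (frame << OFFSET_BITS) | d     # frame number followed by the offset


for name, la in [("h", 7), ("f", 5), ("l", 11), ("n", 13)]:
    print(name, la, "->", page_translate(la))
# h 7 -> 27, f 5 -> 25, l 11 -> 7, n 13 -> 9
```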
Memory Management41 / 105
More advanced idea for memory management: Paging Conversion logical->physical done by HW (CPU) In general
Memory Management42 / 105
More advanced idea for memory management: Paging Conversion logical->physical done by HW (CPU)
Must be very fast ‘cos done for every memory reference
At least 1 memory access to fetch the instruction
Plus potential memory operation(s) for that instruction (LOAD)
Setting up the page table done by SW (OS)
When a program is loaded into memory, the OS knows into which frames the pages of the program are loaded
Memory Management43 / 105
More advanced idea for memory management: Paging We will have free and used frames in memory at any time t
Memory Management44 / 105
More advanced idea for memory management: Paging 8-byte long program (each byte is an instruction, not a char, but
anyway)
Memory Management45 / 105
Implementation of Page Table
Page table is kept in main memory, per process
Page-table base register (PTBR) points to the page table (loaded at context switch)
Page-table length register (PTLR) indicates size of the page table (loaded at context switch)
In this scheme every data/instruction access requires two memory accesses: 1 for the page table (‘cos table is in memory) and 1 for the data/instruction (by physical address)
Memory Management46 / 105
Implementation of Page Table
Access to Page Table in memory (for logical->physical conversion)
Access to that physical address in memory (2nd access)
Memory Management47 / 105
Implementation of Page Table
The two memory access problem can be solved by the use of a special fast-lookup hardware cache called associative memory or translation look-aside buffer (TLB)
Some TLBs store address-space identifiers (ASIDs) in each TLB entry: uniquely identifies each process to provide address-space protection for that process
After we learn the frame of a page, we store this association in 1 entry of the TLB.
A page has 4096 instructions so it is likely that I’ll access the same page again soon (in the next instruction); keep that page->frame mapping in the cache.
Without ASIDs you have to flush (erase) the TLB at every context switch (the mapping 0->17 of P1 may not work for P2).
TLBs typically small (64 to 1,024 entries)
Memory Management48 / 105
TLB associative memory
Associative memory: parallel search
Address translation (p, d)
If p is in an associative register, get frame # out
Otherwise get frame # from page table in memory
Memory Management50 / 105
Effective memory access time w/ Paging HW with TLB
Associative lookup = e time units //e = epsilon
Assume memory access (cycle) time is 1 msec (>> e)
Hit ratio = alpha: percentage of times that a page number is found in the TLB
Effective Access Time (EAT)
EAT = HIT + MISS
= (1 + e)alpha + (2 + e)(1 - alpha)
= 2 + e - alpha msecs //e << alpha so ignore it: EAT ~ 2 - alpha
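As a quick numeric check of the EAT formula, here is a small sketch. The function name `eat_with_tlb` and the sample hit ratio 0.8 are assumptions for illustration; the memory access time is 1 unit as on the slide.

```python
def eat_with_tlb(alpha: float, mem: float = 1.0, eps: float = 0.0) -> float:
    """Effective access time with a TLB (same unit as `mem`)."""
    hit = (mem + eps) * alpha              # TLB hit: lookup + 1 memory access
    miss = (2 * mem + eps) * (1 - alpha)   # TLB miss: lookup + page table + data
    return hit + miss


# With e ignored this is 2 - alpha; alpha = 0.8 is a made-up hit ratio.
print(f"{eat_with_tlb(0.8):.2f}")  # 1.20
```

A higher hit ratio pushes EAT toward 1 memory access; a hit ratio of 0 gives the full 2 accesses.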
Memory Management51 / 105
Memory protection with paging scheme
Memory protection implemented by associating a protection bit with each frame to indicate if read-only or read-write access is allowed
Valid-invalid bit attached to each entry in the page table:
“valid” indicates that the associated page is in the process’ logical address space, and is thus a legal page
“invalid” indicates that the page is not in the process’ logical address space
Memory Management53 / 105
Shared pages
Shared code
One copy of read-only (reentrant) code shared among processes (e.g., text editors, compilers, window systems)
Similar to multiple threads sharing the same process space
Also useful for interprocess communication if sharing of read-write pages is allowed
Private code and data
Each process keeps a separate copy of the code and data
The pages for the private code and data can appear anywhere in the logical address space
Memory Management55 / 105
Structure of the page table is important
1D page table can grow to a large size if you have a large address space
4GB of logical memory (32-bit systems can address up to 2^32 = 4GB)
Each page is 4KB
Then you have 4GB / 4KB = ~1M (million) pages -> 1M entries needed in a 1D page table
Each entry 4 bytes -> 4MB page table per process; too large!
Memory Management56 / 105
Solutions to 1D page table problem: hierarchical paging, hashed page tables, inverted page tables
Memory Management57 / 105
Hierarchical page tables
Break up the logical address space into multiple page tables
Some portion of the logical address space will be mapped by some page table, some portion by another page table, and so on
This idea replaces the single page table responsible for mapping the whole logical address space
You usually use a small portion of your logical address space -> need small page tables to map only those portions.
Unused portion stored on disk (brought to memory when necessary)
A simple technique is a two-level page table
Memory Management58 / 105
Hierarchical page tables
Two-level page table scheme (2D): page the page table
[Figure: two-level page table, with level 1 and level 2 labeled]
Memory Management59 / 105
Hierarchical page tables
Two-level page table scheme (2D): page the page table
32 bit logical address:
d = 10 bits -> page size 2^10 = 1K (each page stores 1K data)
p2 = 10 bits -> second-level page table can have at most 1K entries
p1 = 12 bits -> first-level page table can have at most 4K entries
p1 part is used as an index to outer page table; p2 to 2nd level page table
page number           | page offset
p1         p2         | d
12 bits    10 bits    | 10 bits
Memory Management60 / 105
Hierarchical page tables
Two-level page table scheme (2D): page the page table
From outer page table (idx: p1), get the address of the 2nd level page table.
From 2nd level page table (idx: p2), get the page frame number in PA.
From page frame in physical memory (idx: d), get the desired content.
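The three-step walk can be sketched as follows, assuming the 12/10/10 split of a 32-bit logical address. Nested dictionaries stand in for the outer and 2nd-level tables; the names `split` and `walk` and the sample mapping are made up for the demo.

```python
P1_BITS, P2_BITS, D_BITS = 12, 10, 10   # 32-bit logical address: p1 | p2 | d


def split(logical_addr: int):
    d = logical_addr & ((1 << D_BITS) - 1)                  # low 10 bits
    p2 = (logical_addr >> D_BITS) & ((1 << P2_BITS) - 1)    # middle 10 bits
    p1 = logical_addr >> (D_BITS + P2_BITS)                 # high 12 bits
    return p1, p2, d


def walk(outer_table: dict, logical_addr: int) -> int:
    p1, p2, d = split(logical_addr)
    second_level = outer_table[p1]   # step 1: outer table gives the 2nd-level table
    frame = second_level[p2]         # step 2: 2nd-level table gives the frame number
    return (frame << D_BITS) | d     # step 3: offset d selects the byte in the frame


# Hypothetical mapping: outer entry 3 -> table whose entry 5 -> frame 42.
outer = {3: {5: 42}}
la = (3 << 20) | (5 << 10) | 7
print(split(la), walk(outer, la))  # (3, 5, 7) 43015
```

Only the 2nd-level tables that are actually used need to exist; an unmapped p1 simply has no entry in the outer table.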
Memory Management61 / 105
Hierarchical page tables
Two-level page table scheme (2D): page the page table
Benefit: reduce page table space needed for a program.
Logical address length = 32 bits
Page size = 4K (4096 bytes)
Logical address division: 10, 10, 12
Program has 4GB (2^32) address space
But only uses bottom and top portions (20 MB)
Don’t need a page table for the unused part
Need only 1 top-level page table
Need ?? second-level page tables
Memory Management62 / 105
Hierarchical page tables
Two-level page table scheme (2D): page the page table
2nd-level page table has 2^10 entries
Each entry can map a page of 4K size
2^10 * 2^12 = 2^22 = 4MB of logical address space can be mapped by a single 2nd-level page table!
Use this fact to find the answer for the prev slide..
Memory Management63 / 105
Hierarchical page tables
Two-level page table scheme (2D): page the page table
Each 4MB can be mapped by 1 2nd-level page table -> need 20/4 = 5 second-level page tables!
Memory Management64 / 105
Hierarchical page tables
Two-level page table scheme (2D): page the page table
Benefit: the 1D page table covering the whole address space (4GB per process) takes 4MB; this reduces to only 24KB per process using the 2-level page table scheme: (1 top-level + 5 second-level) tables * 2^10 entries * 4 bytes.
Assume each entry (in both top-level and 2nd-level tables) is 4 bytes.
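The 4MB vs 24KB comparison can be checked with a few lines of arithmetic. This sketch assumes the 10/10/12 address division and that the used 20MB splits into whole 4MB regions, so exactly 5 second-level tables are needed; all variable names are illustrative.

```python
KB, MB, GB = 2**10, 2**20, 2**30

ENTRY_SIZE = 4            # bytes per page table entry
PAGE_SIZE = 4 * KB
ADDRESS_SPACE = 4 * GB    # 32-bit logical address space

# 1D table: one entry per page of the whole address space.
one_level_bytes = (ADDRESS_SPACE // PAGE_SIZE) * ENTRY_SIZE

# 2-level: each 2nd-level table has 2^10 entries and maps 2^10 * 4KB = 4MB.
entries_per_table = 2**10
mapped_per_table = entries_per_table * PAGE_SIZE
used = 20 * MB
second_level_tables = -(-used // mapped_per_table)   # ceiling division
two_level_bytes = (1 + second_level_tables) * entries_per_table * ENTRY_SIZE

print(one_level_bytes // MB, "MB vs", two_level_bytes // KB, "KB")  # 4 MB vs 24 KB
```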
Memory Management65 / 105
Hierarchical page tables
Three-level page table scheme (3D) for 64 bit logical address space.
Same idea but have 2 outer page tables now.
Every process has to have outer page table in memory: 2^42 too big.
2^32 still huge for top-level page table (4GB per process).
Memory Management66 / 105
Hashed page tables
Solves the huge top-level page table problem of hierarchical page tables
Arises after exceeding a threshold address space length
Reduce large p into a number in, say, [0, 1K]
Virtual Memory67 / 105
Utilize memory management techniques to implement Virtual Memory
Again separation of the logical memory and physical memory
Virtual memory size can be much bigger than the physical memory
Idea
Just bring a small portion of the program into memory initially
Bring the rest whenever that part is needed (and evict the unused part)
Initially bring the part that includes the main()
Then jump to a long function and bring it
Benefit
Execute more programs in parallel ‘cos each needs less storage in physical memory
Logical address space can be much larger than physical memory
Virtual Memory69 / 105
Typical virtual address space layout
Compiler sets this up
It places the code/instructions
Then data (global variables)
Then heap section to allocate storage for pointers, e.g. malloc returns space to you from the heap section (then you can write/read something to/from there)
Then run-time stack for called functions
Function A calls B, which calls C, .. stack grows
Virtual Memory70 / 105
Virtual memory can be implemented in the following way
Demand Paging
Bring pages into memory when they are used, i.e., allocate memory for pages when they are used
Virtual Memory71 / 105
Demand Paging: bring a page into memory only when it is needed
Less I/O needed, no unnecessary I/O
Less memory needed
Faster response
More users (each user program needs less space in physical memory)
Page is needed -> reference to it
invalid reference -> abort
not-in-memory -> bring to memory
Lazy swapper: never swap a page into memory unless the page will be needed
Swapper that deals with pages is a pager
Virtual Memory72 / 105
Valid-invalid bit: who is currently in memory?
Initially all invalid: i
Page[2] in memory but Page[n-1] is not
During address translation, if bit is i -> page fault
Virtual Memory73 / 105
Page table when some pages not in main memory
Instructions in Page[0] may be using an operand stored in Page[4], which makes a page fault! Suspend the program and bring it in. Schedule another program during suspension
All pages are on disk all the time (must fit, otherwise can’t run)
Virtual Memory74 / 105
Page fault: not being able to find a page in memory
When triggered?
While CPU is executing an instruction (w/ a memory operand/address at a different page not in physical memory)
While CPU is fetching an instruction (the page containing the next instruction to execute not in main/physical memory)
Handling?
Get an empty frame to load the new page into
Load the page to that empty frame (disk I/O)
Reset page table (new index, validation bit)
Restart the instruction that caused the page fault
Virtual Memory76 / 105
Performance of demand paging
Page fault rate 0 <= p <= 1
p = 0 means no page fault
p = 1 means every reference is a page fault (page not in memory)
Effective Access Time to Memory: EAT
EAT = (1-p) * memoryAccess + p * (pageFaultOverhead + swap page out + swap page in + restartOverhead)
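To get a feel for how sensitive EAT is to the page fault rate, here is a small sketch of the formula. The four fault-related terms are lumped into one `fault_service` argument, and the numbers (200 ns memory access, 8 ms fault service) are assumed sample values, not from the slides.

```python
def demand_paging_eat(p: float, memory_access: float, fault_service: float) -> float:
    """EAT = (1-p) * memoryAccess + p * faultService, all in the same time unit.

    fault_service stands for pageFaultOverhead + swap out + swap in + restartOverhead.
    """
    return (1 - p) * memory_access + p * fault_service


# Assumed sample values in nanoseconds.
memory_access_ns = 200
fault_service_ns = 8_000_000
for p in (0.0, 0.001, 1.0):
    print(p, round(demand_paging_eat(p, memory_access_ns, fault_service_ns), 1))
```

With these numbers, even 1 fault per 1000 references makes the effective access time roughly 40 times slower than a plain memory access, since disk I/O dominates.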
Virtual Memory78 / 105
COW: Copy-on-Write
Just another benefit of Virtual Memory
Used during process creation (fork()) for fast child creation
Virtual Memory79 / 105
COW: Copy-on-Write
After fork()
Child has its own address space that’s duplicated from the parent
Child has its own memory whose content is initially the same as the parent’s
Since they’re the same, initially we can have the child share the pages of the parent
No need to copy everything as long as child & parent are just reading those pages
Do the copying when 1 of the processes modifies/writes a page
Virtual Memory82 / 105
Page replacement: what happens if there is no free frame to put the new page in?
Find some not-in-use page in memory and swap it out
Which page to remove? An algorithm which results in the min # of page faults is preferable
With page replacement the same page may be brought into memory 1+ times
Virtual Memory83 / 105
Prevent over-allocation by 1 process by giving each process a fixed # frames to play with
Modify page-fault service routine to include page-replacement over those frames
Only modified pages are written back to disk (I/O) while being removed/replaced
Attach a modify (dirty) bit to each page in memory to do this, i.e., to reduce the overhead of page transfers
Separation b/w logical and physical memory achieved w/ page replacement
Compiler writers or assembly coders can now use a large virtual memory on a smaller physical memory
Virtual Memory85 / 105
Basic page replacement
OS finds the location of the desired page on disk
Find a free frame
If there is one, use it
If there is none, use a page replacement policy to select a victim frame; if the victim is modified, write it back to disk
See policies at Onur hoca’s cool paging.js demo: Optimum, FIFO, Second chance, Least recently used (LRU)
Bring the desired page into the free frame; update page table
Restart the process at the instruction that caused the page fault
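To compare the policies named above, here is a minimal fault-counting sketch for FIFO, LRU and Optimum (second chance is left out to keep it short). It is not the paging.js demo; the function names, the 3-frame allocation and the reference string are assumptions for illustration.

```python
from collections import OrderedDict, deque


def fifo_faults(refs, nframes):
    frames, order, faults = set(), deque(), 0
    for page in refs:
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            frames.remove(order.popleft())   # victim: page loaded earliest
        frames.add(page)
        order.append(page)
    return faults


def lru_faults(refs, nframes):
    frames, faults = OrderedDict(), 0        # ordered from least to most recently used
    for page in refs:
        if page in frames:
            frames.move_to_end(page)         # mark as most recently used
            continue
        faults += 1
        if len(frames) == nframes:
            frames.popitem(last=False)       # victim: least recently used
        frames[page] = True
    return faults


def optimal_faults(refs, nframes):
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            future = refs[i + 1:]
            # victim: page whose next use is farthest away (or never comes)
            victim = max(frames, key=lambda q: future.index(q) if q in future else len(future))
            frames.remove(victim)
        frames.append(page)
    return faults


refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(refs, 3), lru_faults(refs, 3), optimal_faults(refs, 3))  # 15 12 9
```

Optimum needs to know the future references, so it serves only as a lower bound to judge the practical policies against.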
Virtual Memory90 / 105
Other page replacement algorithms to know: Optimum, FIFO, Second chance, Least recently used (LRU)
See Onur hoca’s demo: paging.js
Virtual Memory91 / 105
Allocation of frames
So far we learnt which page to remove/replace in case of a page fault
Now learn how many frames we should allocate to a process
If the process doesn’t have enough pages in memory then page fault rate is high, which leads to
Low CPU utilization (more I/O to retrieve pages from disk)
OS thinks that it needs to increase the degree of multiprogramming (‘cos CPU seems to be underutilized)
Another process added to the system, which makes it even worse
Thrashing: a process is busy doing I/O (swapping pages in and out)
Virtual Memory92 / 105
Thrashing
Initially 1 process utilizes half of the CPU (half I/O)
As # processes increases, utilization increases as 1+ of them need the CPU
After a while, not enough frames in memory causes page faults
Virtual Memory93 / 105
To prevent thrashing, use demand paging: don’t bring the whole program into memory at once. Just bring the needed pages.
This works ‘cos we have locality in program execution
Some set of instructions (loops) executed repeatedly
Pages storing those instructions heavily accessed for a while
When does thrashing occur? SUM size-of-locality > total memory size
No thrashing: Guarantee SUM size-of-locality < total memory size
Virtual Memory94 / 105
Working-set model
An algorithm to decide how many frames to give to each process
Can also be used as a page replacement algorithm
Look Delta references back to learn the heavily used pages: WS
Alloc |WS| frames as those pages are likely to be used again (locality)
Virtual Memory95 / 105
Working-set model
How to use as a page replacer?
If you have only 2 frames allocated and in those frames you have a page p1 in WS and another page p2 not in WS -> remove p2
Virtual Memory96 / 105
Working-set model
WSSi (working set size of Process Pi) = total # of pages referenced in the most recent Delta (varies in time)
if Delta too small -> will not encompass/cover entire locality
if Delta too large -> will encompass several localities
if Delta = INF -> will encompass entire program
D = SUM WSSi = total demand frames (approximation of locality)
if D > m -> Thrashing (m = total # of available frames)
Policy: if D > m, then suspend or swap out one of the processes
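A small sketch of the working-set computation and the D > m test. Delta is counted in memory references here, which is one common reading; the function names and the reference string are made up for the demo.

```python
def working_set(refs, t, delta):
    """Pages referenced in the most recent `delta` references ending at time t."""
    start = max(0, t - delta + 1)
    return set(refs[start:t + 1])


def is_thrashing(wss_list, m):
    """D = SUM WSSi; thrashing is expected when D exceeds the m available frames."""
    return sum(wss_list) > m


refs = [1, 2, 1, 3, 4, 4, 4, 3, 4, 4]   # made-up reference string of one process
print(sorted(working_set(refs, t=4, delta=5)))   # [1, 2, 3, 4]
print(sorted(working_set(refs, t=9, delta=5)))   # [3, 4]
print(is_thrashing([4, 2, 3], m=8))              # True: D = 9 > 8
```

The same process needs 4 frames at t=4 but only 2 at t=9, which is why WSSi is said to vary in time.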
Virtual Memory97 / 105
Page-Fault Frequency (PFF) scheme
Alternative to the Working-set model for frame allocation
Dynamically tune memory size of process based on # page faults
Monitor page fault rate for each process (faults per sec)
If page fault rate above threshold, give process more memory
Should cause process to fault less
Doesn't always work! Recall Belady's Anomaly
If page fault rate below threshold, reduce memory allocation
Virtual Memory98 / 105
You can understand from fault curve where the locality starts and ends
# page faults increases as pages (of this locality) are brought in
When needed pages (of locality) in memory, # page faults decreases
Virtual Memory100 / 105
Memory-mapped files: treat file disk blocks as memory pages
Memory-mapped file I/O allows file I/O to be treated as routine memory access by mapping a disk block to a page in memory
A file is initially read using demand paging (map some blocks of file into pages of virtual memory). A page-sized portion of the file is read from the file system into a physical page. Subsequent reads/writes to/from the file are treated as ordinary memory accesses
Simplifies and speeds up file access by driving file I/O through memory rather than read() and write() system calls
Also allows several processes to map the same file, allowing the pages in memory to be shared
Virtual Memory101 / 105
Memory-mapped files: treat file disk blocks as memory pages
Process maps file to some portion of its logical addr space via mmap()
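A minimal sketch of the idea using Python's standard `mmap` module, which wraps the OS mapping facility (mmap() on Unix-like systems). The temporary file and its contents are made up for the demo; file bytes are read and written with ordinary slicing instead of read()/write() calls on the mapped region.

```python
import mmap
import os
import tempfile

# Create a small scratch file to map (demo-only content).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello paging")
    path = f.name

with open(path, "r+b") as f:
    # Length 0 maps the whole file into the process' address space.
    with mmap.mmap(f.fileno(), 0) as m:
        first = m[:5]         # read through memory access
        m[0:5] = b"HELLO"     # write through memory access (same length)
        m.flush()             # push the modified page back to the file

with open(path, "rb") as f:
    data = f.read()
os.remove(path)

print(first, data)  # b'hello' b'HELLO paging'
```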
Virtual Memory102 / 105
Memory-mapped files: treat file disk blocks as memory pages
Shared memory can be implemented this way (no disk access at all)
Virtual Memory103 / 105
Allocating kernel memory
Allocate memory to dynamic kernel objects/structures
Static allocation during loading/booting is easy
So far we learnt allocating memory to processes
Allocate frames in physical memory to the pages of the processes
Dynamic kernel objects?
Creation of a process initiates a PCB structure in kernel
Creation of a semaphore, queues of condition variables
Virtual Memory105 / 105
Allocating kernel memory
Why is dynamic kernel memory allocation a problem?
‘cos kernel objects are much smaller than processes’ objects
Semaphore: 8/16 bytes
PCB: 1200 bytes
Much less than the page size
Can’t give a whole page to a semaphore (wasteful)
Some frames are reserved for dynamically allocated kernel objects
Solution
We could use first/best-fit heap management but it causes external fragmentation
Better techniques: buddy system allocator, slab allocator