CS162 Operating Systems and Systems Programming
Midterm Review
March 18, 2013
CS162 ©UCB Fall 2013
Overview
• Synchronization, Critical Sections
• Deadlock
• CPU Scheduling
• Memory Multiplexing, Address Translation
• Caches, TLBs
• Page Allocation and Replacement
• Kernel/User Interface
Synchronization, Critical Sections
Definitions
• Synchronization: using atomic operations to ensure cooperation between threads
• Mutual Exclusion: ensuring that only one thread does a particular thing at a time
– One thread excludes the other while doing its task
• Critical Section: piece of code that only one thread can execute at once
– Critical section is the result of mutual exclusion
– Critical section and mutual exclusion are two ways of describing the same thing
Locks: using interrupts
• Key idea: maintain a lock variable and impose mutual exclusion only during operations on that variable

int value = FREE;

Acquire() {
  disable interrupts;
  if (value == BUSY) {
    put thread on wait queue;
    go to sleep();  // Enable interrupts?
  } else {
    value = BUSY;
  }
  enable interrupts;
}

Release() {
  disable interrupts;
  if (anyone on wait queue) {
    take thread off wait queue;
    place on ready queue;
  } else {
    value = FREE;
  }
  enable interrupts;
}
Better locks: using test&set
• test&set(&address) {  /* most architectures */
    result = M[address];
    M[address] = 1;
    return result;
  }

int guard = 0;
int value = FREE;

Acquire() {
  // Short busy-wait time
  while (test&set(guard));
  if (value == BUSY) {
    put thread on wait queue;
    go to sleep() & guard = 0;
  } else {
    value = BUSY;
    guard = 0;
  }
}

Release() {
  // Short busy-wait time
  while (test&set(guard));
  if (anyone on wait queue) {
    take thread off wait queue;
    place on ready queue;
  } else {
    value = FREE;
  }
  guard = 0;
}
• Does this busy wait? If so, how long does it wait?
• Why is this better than the version that disables interrupts?
• Remember that a thread can be context-switched while holding a lock.
• What implications does this have on the performance of the system?
Semaphores
• Definition: a semaphore has a non-negative integer value and supports the following two operations:
– P(): an atomic operation that waits for the semaphore to become positive, then decrements it by 1
  » Think of this as the wait() operation
– V(): an atomic operation that increments the semaphore by 1, waking up a waiting P, if any
  » Think of this as the signal() operation
– You cannot read the integer value! The only methods are P() and V().
– Unlike a lock, there is no owner (thus no priority donation). The thread that calls V() may never call P(). Example: producer/consumer
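As a concrete sketch of the producer/consumer pattern, Python's threading module provides counting semaphores: P() corresponds to acquire() and V() to release(). The buffer size and all names below are illustrative, not from the slides.

```python
import threading
from collections import deque

# Bounded producer/consumer sketch using counting semaphores.
# 'empty' counts free slots, 'full' counts filled slots; 'mutex' guards the buffer.
buffer = deque()
CAPACITY = 2
empty = threading.Semaphore(CAPACITY)  # P(empty) blocks when the buffer is full
full = threading.Semaphore(0)          # P(full) blocks when the buffer is empty
mutex = threading.Lock()

def produce(item):
    empty.acquire()        # P(empty): wait for a free slot
    with mutex:
        buffer.append(item)
    full.release()         # V(full): wake one waiting consumer, if any

def consume():
    full.acquire()         # P(full): wait for an item
    with mutex:
        item = buffer.popleft()
    empty.release()        # V(empty): wake one waiting producer, if any
    return item
```

Note that no thread ever "owns" either semaphore: producers only V(full) and consumers only V(empty), matching the point above.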
Condition Variables
• Condition Variable: a queue of threads waiting for something inside a critical section
– Key idea: allow sleeping inside critical section by atomically releasing lock at time we go to sleep
– Contrast to semaphores: Can’t wait inside critical section
• Operations:
– Wait(&lock): Atomically release lock and go to sleep. Re-acquire lock later, before returning.
– Signal(): Wake up one waiter, if any
– Broadcast(): Wake up all waiters
• Rule: Must hold lock when doing condition variable ops!
Complete Monitor Example (with condition variable)
• Here is an (infinite) synchronized queue:

Lock lock;
Condition dataready;
Queue queue;

AddToQueue(item) {
  lock.Acquire();           // Get Lock
  queue.enqueue(item);      // Add item
  dataready.signal();       // Signal any waiters
  lock.Release();           // Release Lock
}

RemoveFromQueue() {
  lock.Acquire();           // Get Lock
  while (queue.isEmpty()) {
    dataready.wait(&lock);  // If nothing, sleep
  }
  item = queue.dequeue();   // Get next item
  lock.Release();           // Release Lock
  return(item);
}
Mesa vs. Hoare monitors
• Need to be careful about the precise definition of signal and wait. Consider a piece of our dequeue code:

while (queue.isEmpty()) {
  dataready.wait(&lock);  // If nothing, sleep
}
item = queue.dequeue();   // Get next item

– Why didn’t we do this?

if (queue.isEmpty()) {
  dataready.wait(&lock);  // If nothing, sleep
}
item = queue.dequeue();   // Get next item

• Answer: depends on the type of scheduling
– Hoare-style (most textbooks):
  » Signaler gives lock, CPU to waiter; waiter runs immediately
  » Waiter gives up lock, processor back to signaler when it exits critical section or if it waits again
– Mesa-style (most real operating systems):
  » Signaler keeps lock and processor
  » Waiter placed on ready queue with no special priority
  » Practically, need to check condition again after wait
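Python's threading.Condition has Mesa semantics: a signaled waiter merely becomes runnable, so the condition must be rechecked in a while loop after wait() returns. A minimal sketch of the synchronized queue with illustrative names:

```python
import threading
from collections import deque

# Mesa-style monitor sketch: lock + condition variable sharing that lock.
lock = threading.Lock()
dataready = threading.Condition(lock)
queue = deque()

def add_to_queue(item):
    with lock:
        queue.append(item)
        dataready.notify()       # wake one waiter, if any (signaler keeps the lock)

def remove_from_queue():
    with lock:
        while not queue:         # Mesa semantics: recheck after every wakeup
            dataready.wait()     # atomically releases lock, re-acquires before returning
        return queue.popleft()
```

An `if` instead of the `while` would break here for exactly the reason above: another thread may empty the queue between the notify and the waiter actually running.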
Reader/Writer
• Two types of lock: read lock and write lock (i.e. shared lock and exclusive lock)
• Shared locks are an example where you might want to do broadcast/wakeAll on a condition variable. If you do it in other situations, correctness is generally preserved, but it is inefficient: one thread gets in, and then the rest of the woken threads go back to sleep.
Read/Writer

Reader() {
  // check into system
  lock.Acquire();
  while ((AW + WW) > 0) {
    WR++;
    okToRead.wait(&lock);
    WR--;
  }
  AR++;
  lock.Release();

  // read-only access
  AccessDbase(ReadOnly);

  // check out of system
  lock.Acquire();
  AR--;
  if (AR == 0 && WW > 0)
    okToWrite.signal();    // What if we remove this line?
  lock.Release();
}

Writer() {
  // check into system
  lock.Acquire();
  while ((AW + AR) > 0) {
    WW++;
    okToWrite.wait(&lock);
    WW--;
  }
  AW++;
  lock.Release();

  // read/write access
  AccessDbase(ReadWrite);

  // check out of system
  lock.Acquire();
  AW--;
  if (WW > 0) {
    okToWrite.signal();
  } else if (WR > 0) {
    okToRead.broadcast();
  }
  lock.Release();
}
Read/Writer Revisited

Reader() {
  // check into system
  lock.Acquire();
  while ((AW + WW) > 0) {
    WR++;
    okToRead.wait(&lock);
    WR--;
  }
  AR++;
  lock.Release();

  // read-only access
  AccessDbase(ReadOnly);

  // check out of system
  lock.Acquire();
  AR--;
  if (AR == 0 && WW > 0)
    okToWrite.broadcast();  // What if we turn signal to broadcast?
  lock.Release();
}

Writer() {
  // check into system
  lock.Acquire();
  while ((AW + AR) > 0) {
    WW++;
    okToWrite.wait(&lock);
    WW--;
  }
  AW++;
  lock.Release();

  // read/write access
  AccessDbase(ReadWrite);

  // check out of system
  lock.Acquire();
  AW--;
  if (WW > 0) {
    okToWrite.signal();
  } else if (WR > 0) {
    okToRead.broadcast();
  }
  lock.Release();
}
Read/Writer Revisited

Reader() {
  // check into system
  lock.Acquire();
  while ((AW + WW) > 0) {
    WR++;
    okContinue.wait(&lock);
    WR--;
  }
  AR++;
  lock.Release();

  // read-only access
  AccessDbase(ReadOnly);

  // check out of system
  lock.Acquire();
  AR--;
  if (AR == 0 && WW > 0)
    okContinue.signal();
  lock.Release();
}

Writer() {
  // check into system
  lock.Acquire();
  while ((AW + AR) > 0) {
    WW++;
    okContinue.wait(&lock);
    WW--;
  }
  AW++;
  lock.Release();

  // read/write access
  AccessDbase(ReadWrite);

  // check out of system
  lock.Acquire();
  AW--;
  if (WW > 0) {
    okToWrite.signal();
  } else if (WR > 0) {
    okContinue.broadcast();
  }
  lock.Release();
}

What if we turn okToWrite and okToRead into okContinue?
Read/Writer Revisited
(Same code as the previous slide, with all waits on okContinue.)
• R1 arrives
• W1, R2 arrive while R1 reads
• R1 signals R2
Read/Writer Revisited

Reader() {
  // check into system
  lock.Acquire();
  while ((AW + WW) > 0) {
    WR++;
    okContinue.wait(&lock);
    WR--;
  }
  AR++;
  lock.Release();

  // read-only access
  AccessDbase(ReadOnly);

  // check out of system
  lock.Acquire();
  AR--;
  if (AR == 0 && WW > 0)
    okContinue.broadcast();  // Need to change to broadcast! Why?
  lock.Release();
}

Writer() {
  // check into system
  lock.Acquire();
  while ((AW + AR) > 0) {
    WW++;
    okContinue.wait(&lock);
    WW--;
  }
  AW++;
  lock.Release();

  // read/write access
  AccessDbase(ReadWrite);

  // check out of system
  lock.Acquire();
  AW--;
  if (WW > 0) {
    okToWrite.signal();
  } else if (WR > 0) {
    okContinue.broadcast();
  }
  lock.Release();
}
Example Synchronization Problems
• Fall 2012 Midterm Exam #3
• Spring 2013 Midterm Exam #2
• Hard Synchronization Problems from Past Exams
– http://inst.eecs.berkeley.edu/~cs162/fa13/hand-outs/synch-problems.html
Deadlock
Four requirements for Deadlock
• Mutual exclusion
– Only one thread at a time can use a resource
• Hold and wait
– Thread holding at least one resource is waiting to acquire additional resources held by other threads
• No preemption
– Resources are released only voluntarily by the thread holding the resource, after the thread is finished with it
• Circular wait
– There exists a set {T1, …, Tn} of waiting threads
  » T1 is waiting for a resource that is held by T2
  » T2 is waiting for a resource that is held by T3
  » …
  » Tn is waiting for a resource that is held by T1
Resource Allocation Graph Examples
[Figures: a simple resource allocation graph; an allocation graph with deadlock; an allocation graph with a cycle, but no deadlock]
• Recall:
– request edge: directed edge Ti → Rj
– assignment edge: directed edge Rj → Ti
Deadlock Detection Algorithm
• Only one of each type of resource ⇒ look for loops
• More general deadlock detection algorithm:
– Let [X] represent an m-ary vector of non-negative integers (quantities of resources of each type):
  [FreeResources]: current free resources of each type
  [RequestX]: current requests from thread X
  [AllocX]: current resources held by thread X
– See if tasks can eventually terminate on their own:

  [Avail] = [FreeResources]
  Add all nodes to UNFINISHED
  do {
    done = true
    Foreach node in UNFINISHED {
      if ([Requestnode] <= [Avail]) {
        remove node from UNFINISHED
        [Avail] = [Avail] + [Allocnode]
        done = false
      }
    }
  } until(done)

– Nodes left in UNFINISHED ⇒ deadlocked
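The pseudocode above translates almost directly into Python. In this sketch, resource vectors are plain lists indexed by resource type, and the function/variable names are illustrative:

```python
# Deadlock detection sketch: repeatedly retire any thread whose request
# vector fits within the currently available resources, reclaiming its
# allocation; whatever remains in 'unfinished' is deadlocked.
def detect_deadlock(free, requests, allocs):
    avail = list(free)
    unfinished = set(requests)          # all thread names start unfinished
    done = False
    while not done:
        done = True
        for node in list(unfinished):
            if all(r <= a for r, a in zip(requests[node], avail)):
                unfinished.remove(node)  # node can finish; reclaim its resources
                avail = [a + h for a, h in zip(avail, allocs[node])]
                done = False
    return unfinished                    # set of deadlocked threads (empty if none)
```

For example, two threads each holding one resource and requesting the other's resource (with nothing free) are reported as deadlocked; freeing one unit breaks the cycle.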
Banker’s algorithm
• Method of deadlock avoidance.
• Process specifies max resources it will ever request. Process can initially request less than max and later ask for more. Before a request is granted, the Banker’s algorithm checks that the system is “safe” if the requested resources are granted.
• Roughly: some process could request its specified max and eventually finish; when it finishes, that process frees up its resources. A system is safe if there exists some sequence by which processes can request their max, terminate, and free up their resources so that all other processes can finish (in the same way).
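The safety check described above can be sketched as follows. This is the textbook algorithm, not code from the course; all names are illustrative, and each process's declared maximum and current allocation are per-resource-type lists:

```python
# Banker's safety check sketch: the system is "safe" if some ordering lets
# every process obtain its declared maximum, finish, and release everything.
def is_safe(avail, max_claim, alloc):
    work = list(avail)                  # resources currently available
    unfinished = set(max_claim)
    progress = True
    while unfinished and progress:
        progress = False
        for p in list(unfinished):
            # need = what p might still request up to its declared maximum
            need = [m - a for m, a in zip(max_claim[p], alloc[p])]
            if all(n <= w for n, w in zip(need, work)):
                # p could run to its max and terminate, freeing alloc[p]
                work = [w + a for w, a in zip(work, alloc[p])]
                unfinished.remove(p)
                progress = True
    return not unfinished               # safe iff every process can finish
```

To check whether a specific request may be granted, one would tentatively grant it and accept only if the resulting state still passes this test.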
Example Deadlock Problems
• Spring 2004 Midterm Exam #4
• Spring 2011 Midterm Exam #2
CPU Scheduling
Scheduling Assumptions
• CPU scheduling was a big area of research in the early 70’s
• Many implicit assumptions for CPU scheduling:
– One program per user
– One thread per program
– Programs are independent
• In general unrealistic, but they simplify the problem
– For instance: is “fair” about fairness among users or programs?
  » If I run one compilation job and you run five, you get five times as much CPU on many operating systems
• The high-level goal: dole out CPU time to optimize some desired parameters of the system
[Figure: timeline of CPU time multiplexed among USER1, USER2, and USER3]
Scheduling Metrics
• Waiting Time: time the job is waiting in the ready queue
– Time between job’s arrival in the ready queue and launching the job
• Service (Execution) Time: time the job is running
• Response (Completion) Time:
– Time between job’s arrival in the ready queue and job’s completion
– Response time is what the user sees:
  » Time to echo a keystroke in editor
  » Time to compile a program

  Response Time = Waiting Time + Service Time

• Throughput: number of jobs completed per unit of time
– Throughput related to response time, but not the same thing:
  » Minimizing response time will lead to more context switching than if you only maximized throughput
Scheduling Policy Goals/Criteria
• Minimize Response Time
– Minimize elapsed time to do an operation (or job)
• Maximize Throughput
– Two parts to maximizing throughput:
  » Minimize overhead (for example, context switching)
  » Efficient use of resources (CPU, disk, memory, etc.)
• Fairness
– Share CPU among users in some equitable way
– Fairness is not minimizing average response time:
  » Better average response time by making system less fair
– Tradeoff: fairness gained by hurting average response time!
First-Come, First-Served (FCFS) Scheduling
• First-Come, First-Served (FCFS)
– Also “First In, First Out” (FIFO) or “run until done”
  » In early systems, FCFS meant one program scheduled until done (including I/O)
  » Now, means keep CPU until thread blocks
• Example:
    Process  Burst Time
    P1       24
    P2       3
    P3       3
– Suppose processes arrive in the order: P1, P2, P3. The Gantt chart for the schedule is:

    0        24   27   30
    |   P1   | P2 | P3 |

– Waiting time for P1 = 0; P2 = 24; P3 = 27
– Average waiting time: (0 + 24 + 27)/3 = 17
– Average completion time: (24 + 27 + 30)/3 = 27
• Convoy effect: short process behind long process
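The arithmetic in the example can be checked with a short script (function name is illustrative; all jobs are assumed to arrive at t=0):

```python
# Recompute the FCFS example: bursts run to completion in arrival order.
def fcfs_times(bursts):
    waits, completions, t = [], [], 0
    for b in bursts:
        waits.append(t)        # waiting time = start time when all arrive at t=0
        t += b
        completions.append(t)  # completion time = finish time
    return waits, completions

waits, completions = fcfs_times([24, 3, 3])           # P1, P2, P3
avg_wait = sum(waits) / len(waits)                    # (0 + 24 + 27)/3 = 17
avg_completion = sum(completions) / len(completions)  # (24 + 27 + 30)/3 = 27
```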
Round Robin (RR)
• FCFS scheme: potentially bad for short jobs!
– Depends on submit order
– If you are first in line at the supermarket with milk, you don’t care who is behind you; on the other hand…
• Round Robin scheme
– Each process gets a small unit of CPU time (time quantum), usually 10-100 milliseconds
– After quantum expires, the process is preempted and added to the end of the ready queue
– n processes in ready queue and time quantum is q:
  » Each process gets 1/n of the CPU time
  » In chunks of at most q time units
  » No process waits more than (n-1)q time units
• Performance
– q large ⇒ FCFS
– q small ⇒ interleaved
– q must be large with respect to context switch time, otherwise overhead is too high (all overhead)
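A minimal round-robin sketch makes the "q large ⇒ FCFS" point concrete. This is an illustrative simulator, assuming all jobs arrive at t=0 and context switches are free:

```python
from collections import deque

# Round-robin sketch: each job runs at most q units, then returns to the
# back of the ready queue. Returns the completion time of each job.
def round_robin(bursts, q):
    ready = deque(enumerate(bursts))   # (job id, remaining time), arrival order
    t = 0
    completion = [0] * len(bursts)
    while ready:
        job, remaining = ready.popleft()
        run = min(q, remaining)
        t += run
        if remaining > run:
            ready.append((job, remaining - run))  # preempted: back of the queue
        else:
            completion[job] = t                   # job finished within its quantum
    return completion
```

With the FCFS example bursts [24, 3, 3] and q=4, the short jobs finish at t=7 and t=10 instead of waiting behind the 24-unit job; with a very large q the schedule degenerates to FCFS.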
What if we Knew the Future?
• Could we always mirror best FCFS?
• Shortest Job First (SJF):
– Run whatever job has the least amount of computation to do
• Shortest Remaining Time First (SRTF):
– Preemptive version of SJF: if a job arrives and has a shorter time to completion than the remaining time on the current job, immediately preempt CPU
• These can be applied either to a whole program or the current CPU burst of each program
– Idea is to get short jobs out of the system
– Big effect on short jobs, only small effect on long ones
– Result is better average response time
Discussion
• SJF/SRTF are the best you can do at minimizing average response time
– Provably optimal (SJF among non-preemptive, SRTF among preemptive)
– Since SRTF is always at least as good as SJF, focus on SRTF
• Comparison of SRTF with FCFS and RR
– What if all jobs are the same length?
  » SRTF becomes the same as FCFS (i.e., FCFS is the best you can do if all jobs are the same length)
– What if jobs have varying length?
  » SRTF (and RR): short jobs not stuck behind long ones
Summary
• Scheduling: selecting a process from the ready queue and allocating the CPU to it
• FCFS Scheduling:
– Run threads to completion in order of submission
– Pros: Simple (+)
– Cons: Short jobs get stuck behind long ones (-)
• Round-Robin Scheduling:
– Give each thread a small amount of CPU time when it executes; cycle between all ready threads
– Pros: Better for short jobs (+)
– Cons: Poor when jobs are the same length (-)
Summary (cont’d)
• Shortest Job First (SJF) / Shortest Remaining Time First (SRTF):
– Run whatever job has the least amount of computation to do / least remaining amount of computation to do
– Pros: Optimal (average response time)
– Cons: Hard to predict the future, unfair
• Multi-Level Feedback Scheduling:
– Multiple queues of different priorities
– Automatic promotion/demotion of process priority in order to approximate SJF/SRTF
• Lottery Scheduling:
– Give each thread a number of tokens (short tasks get more tokens)
– Reserve a minimum number of tokens for every thread to ensure forward progress/fairness
Example CPU Scheduling Problems
• Spring 2011 Midterm Exam #4
• Fall 2012 Midterm Exam #4
• Spring 2013 Midterm Exam #5
Memory Multiplexing, Address Translation
Important Aspects of Memory Multiplexing
• Controlled overlap:
– Processes should not collide in physical memory
– Conversely, would like the ability to share memory when desired (for communication)
• Protection:
– Prevent access to private memory of other processes
  » Different pages of memory can be given special behavior (read only, invisible to user programs, etc.)
  » Kernel data protected from user programs
  » Programs protected from themselves
• Translation:
– Ability to translate accesses from one address space (virtual) to a different one (physical)
– When translation exists, processor uses virtual addresses, physical memory uses physical addresses
– Side effects:
  » Can be used to avoid overlap
  » Can be used to give uniform view of memory to programs
Why Address Translation?
[Figure: two programs, each with its own virtual address space (code, data, heap, stack), are mapped by separate translation maps (Translation Map 1 and Translation Map 2) into one shared physical address space, alongside OS code, OS data, and the OS heap & stacks]
Addr. Translation: Segmentation vs. Paging
[Figure, segmentation: a virtual address is split into seg # and offset; the seg # indexes a table of (Base, Limit, Valid) entries; the offset is checked against the limit (too large raises an error) and added to the base to form the physical address.
Figure, paging: a virtual address is split into virtual page # and offset; the page table at PageTablePtr maps the page # to a physical page # with permission bits (V, R, W); a permission check can raise an access error; the physical page # concatenated with the offset forms the physical address]
Review: Address Segmentation
[Figure: the virtual memory view (code, data, heap, stack, addresses 0000 0000 through 1111 1111) is mapped into the physical memory view through the segment table below; a virtual address is split into seg # and offset]

Seg #   base        limit
00      0001 0000   10 0000
01      0101 0000   10 0000
10      0111 0000   1 1000
11      1011 0000   1 0000
Review: Address Segmentation
[Same figure and segment table as the previous slide]
What happens if the stack grows to 1110 0000?
Review: Address Segmentation
[Same figure and segment table as the previous slide]
No room to grow!! Buffer overflow error
Review: Paging
[Figure: the virtual memory view (code at 0000 0000, data at 0100 0000, heap at 1000 0000, stack up to 1111 1111) is mapped page by page into the physical memory view; a virtual address is split into page # and offset]

Page Table (VPN → PPN):
11111 → 11101    01111 → null
11110 → 11100    01110 → null
11101 → null     01101 → null
11100 → null     01100 → null
11011 → null     01011 → 01101
11010 → null     01010 → 01100
11001 → null     01001 → 01011
11000 → null     01000 → 01010
10111 → null     00111 → null
10110 → null     00110 → null
10101 → null     00101 → null
10100 → null     00100 → null
10011 → null     00011 → 00101
10010 → 10000    00010 → 00100
10001 → 01111    00001 → 00011
10000 → 01110    00000 → 00010
Review: Paging
[Same figure and page table as the previous slide]
What happens if the stack grows to 1110 0000?
Review: Paging
[Same figure as the previous slide, with two new stack pages allocated: entries 11101 → 10111 and 11100 → 10110 are now valid]
Allocate new pages where there is room!
Challenge: table size is equal to the # of pages in virtual memory!
Design a Virtual Memory system
Would you choose Segmentation or Paging?
Design a Virtual Memory system
How would you choose your Page Size?
Design a Virtual Memory system
How would you choose your Page Size?
• Page Table Size
• TLB Size
• Internal Fragmentation
• Disk Access
Design a Virtual Memory system
You have a 32-bit address space and 2 KB pages. In a single-level page table, how many bits would you use for the index and how many for the offset?
Design a Virtual Memory system
2 KB = 2^11 bytes ⇒ 11-bit offset
32-bit address space - 11-bit offset = 21 bits ⇒ 21-bit Virtual Page Number

  31                      11 10          0
  |  Virtual Page Number    | Byte Offset |
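A quick way to sanity-check this split in code (illustrative helper, assuming byte addressing):

```python
# With 2 KB pages in a 32-bit address space, the low 11 bits are the byte
# offset and the high 21 bits are the virtual page number (VPN).
PAGE_SIZE = 2 * 1024
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # log2(2048) = 11
VPN_BITS = 32 - OFFSET_BITS                # 32 - 11 = 21

def split_address(vaddr):
    vpn = vaddr >> OFFSET_BITS             # top 21 bits index the page table
    offset = vaddr & (PAGE_SIZE - 1)       # low 11 bits select the byte
    return vpn, offset
```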
Design a Virtual Memory system
[Figure: the page table is an array indexed by virtual page number (VPN 0, VPN 1, VPN 2, …), with one PTE per entry]
What does each page table entry (PTE) look like?
Design a Virtual Memory system
What does each page table entry (PTE) look like?
sizeof(VPN) == sizeof(PPN) == 21 bits ⇒ 21-bit Physical Page Number
Don’t forget the 4 auxiliary bits:
• Valid (resident in memory)
• Read (has read permission)
• Write (has write permission)
• Execute (has execute permission)
= 25-bit PTE; 32 - 25 = 7 unused bits

  | PPN (21 bits) | V R W E (4 bits) | Unused (7 bits) |
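One plausible way to pack and unpack such a PTE in code. The slide only fixes the field widths (21 + 4 + 7 bits); the exact bit positions below are an assumption for illustration:

```python
# 32-bit PTE sketch: [ PPN: bits 31..11 | V R W E: bits 10..7 | unused: bits 6..0 ]
def pack_pte(ppn, v, r, w, e):
    assert 0 <= ppn < 2**21            # PPN must fit in 21 bits
    return (ppn << 11) | (v << 10) | (r << 9) | (w << 8) | (e << 7)

def unpack_pte(pte):
    return (pte >> 11,                 # PPN
            (pte >> 10) & 1,           # Valid
            (pte >> 9) & 1,            # Read
            (pte >> 8) & 1,            # Write
            (pte >> 7) & 1)            # Execute
```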
Design a Virtual Memory system
If you only had 16 MB of physical memory, how many bits of the PPN field would actually be used?
Design a Virtual Memory system
If you only had 16 MB of physical memory, how many bits of the PPN field would actually be used?
16 MB = 2^24 bytes. With an 11-bit offset, 24 - 11 = 13-bit Physical Page Number.

  | PPN (13 bits) | V R W E (4 bits) | Unused (15 bits) |
Design a Virtual Memory system
How large would the page table be?
Design a Virtual Memory system
How large would the page table be?
2^21 entries, each a 4-byte PTE (PPN, V, R, W, E, unused)
Total size: 2^21 × 4 B = 8 MB
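Checking the arithmetic in code:

```python
# One PTE per virtual page: 2^21 entries of 4 bytes each is 8 MB per process.
NUM_ENTRIES = 2 ** 21                     # 21-bit VPN => 2^21 virtual pages
PTE_BYTES = 4                             # 25 bits used, rounded up to 4 bytes
table_bytes = NUM_ENTRIES * PTE_BYTES
table_mb = table_bytes // (1024 * 1024)   # 8
```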
Review: Two-Level Paging
[Figure: the virtual memory view (code, data, heap, stack) is translated through a two-level page table into the physical memory view; a virtual address is split into page1 #, page2 #, and offset]

Page Table (level 1):      Page Tables (level 2):
111 → table A              A: 11 → 11101   B: 11 → null
110 → null                    10 → 11100      10 → 10000
101 → null                    01 → 10111      01 → 01111
100 → table B                 00 → 10110      00 → 01110
011 → null
010 → table C              C: 11 → 01101   D: 11 → 00101
001 → null                    10 → 01100      10 → 00100
000 → table D                 01 → 01011      01 → 00011
                              00 → 01010      00 → 00010
Review: Two-Level Paging
[Same figure and tables as the previous slide, showing translation of virtual address 1001 0000]
In the best case, total size of page tables ≈ number of pages used by the program. But requires one additional memory access!
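The two-level walk can be sketched as follows. The data structures and names are illustrative: each level is a plain list, and `None` stands in for a null (invalid) entry, so unused regions of the address space cost only one null level-1 slot:

```python
# Two-level lookup sketch: the level-1 table maps the top page-number bits
# to a level-2 table; the level-2 table maps the remaining page-number bits
# to a physical page number. Each level is one extra memory access.
def translate(level1, vpn1, vpn2, offset, page_size):
    level2 = level1[vpn1]
    if level2 is None:
        raise MemoryError("page fault: level-1 entry invalid")
    ppn = level2[vpn2]
    if ppn is None:
        raise MemoryError("page fault: level-2 entry invalid")
    return ppn * page_size + offset      # physical frame base + offset
```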
X86_64: Four-level page table!
[Figure: a 48-bit virtual address is split into four 9-bit indices (virtual P4, P3, P2, P1 index) plus a 12-bit offset; starting at PageTablePtr, each 9-bit index selects an 8-byte entry in the next-level table; the final physical page # concatenated with the 12-bit offset gives a 40-50 bit physical address]
4096-byte pages (12-bit offset); page tables are also 4 KB (pageable)
Review: Inverted Table
[Figure: the virtual memory view (code, data, heap, stack) is translated into the physical memory view through an inverted table: hash(virtual page #) → physical page #]

Inverted Table (only pages in use have entries):
11111 → 11101    01011 → 01101
11110 → 11100    01010 → 01100
11101 → 10111    01001 → 01011
11100 → 10110    01000 → 01010
10010 → 10000    00011 → 00101
10001 → 01111    00010 → 00100
10000 → 01110    00001 → 00011
                 00000 → 00010

Total size of page table ≈ number of pages used by program. But the hash is more complex.
Address Translation Comparison

Segmentation
  Advantages: fast context switching (segment mapping maintained by CPU)
  Disadvantages: external fragmentation
Paging (single-level page)
  Advantages: no external fragmentation
  Disadvantages: large size (table size ~ virtual memory); internal fragmentation
Paged segmentation; Multi-level pages
  Advantages: no external fragmentation; table size ~ memory used by program
  Disadvantages: multiple memory references per page access; internal fragmentation
Inverted Table
  Disadvantages: hash function more complex
Example Memory Multiplexing, Address Translation Problems
• Spring 2004 Midterm Exam #5
• Fall 2009 Midterm Exam #4
• Spring 2013 Midterm Exam #3b
Caches, TLBs
Review: Sources of Cache Misses
• Compulsory (cold start): first reference to a block
– “Cold” fact of life: not a whole lot you can do about it
– Note: when running “billions” of instructions, compulsory misses are insignificant
• Capacity:
– Cache cannot contain all blocks accessed by the program
– Solution: increase cache size
• Conflict (collision):
– Multiple memory locations mapped to the same cache location
– Solutions: increase cache size, or increase associativity
• Two others:
– Coherence (invalidation): other process (e.g., I/O) updates memory
– Policy: due to non-optimal replacement policy
Where does a Block Get Placed in a Cache?
• Example: block 12 placed in an 8-block cache (32-block address space)
– Direct mapped: block 12 (01100) can go only into block 4 (12 mod 8); address split as tag 01, index 100
– Set associative (4 sets): block 12 can go anywhere in set 0 (12 mod 4); address split as tag 011, index 00
– Fully associative: block 12 can go anywhere; the whole block address 01100 is the tag
Review: Caching Applied to Address Translation
• Problem: address translation is expensive (especially multi-level)
• Solution: cache address translation (TLB)
– Instruction accesses spend a lot of time on the same page (since accesses are sequential)
– Stack accesses have definite locality of reference
– Data accesses have less page locality, but still some…
[Figure: the CPU sends a virtual address to the TLB; on a hit, the cached physical address goes straight to physical memory; on a miss, the MMU translates, the result is saved in the TLB, and the (untranslated) data read or write proceeds to physical memory]
TLB organization
• How big does a TLB actually have to be?
– Usually small: 128-512 entries
– Not very big; can support higher associativity
• TLB usually organized as a fully-associative cache
– Lookup is by virtual address
– Returns physical address
• What happens when fully-associative is too slow?
– Put a small (4-16 entry) direct-mapped cache in front
– Called a “TLB Slice”
• When does TLB lookup occur?
– Before cache lookup?
– In parallel with cache lookup?
Reducing translation time further
• As described, TLB lookup is in serial with cache lookup:
[Figure: the virtual address (virtual page no. + 10-bit offset) feeds a TLB lookup that yields V, access rights, and PA; the physical address (physical page no. + 10-bit offset) then goes to the cache]
• Machines with TLBs go one step further: they overlap TLB lookup with cache access
– Works because the offset is available early
Overlapping TLB & Cache Access
• Here is how this might work with a 4K cache:
[Figure: the 20-bit page # goes to an associative TLB lookup while the 12-bit displacement (10-bit index + 2-bit offset selecting 4 bytes) indexes the 1K-entry 4K cache in parallel; the PA from the TLB is compared with the cache tag to produce hit/miss and data]
• What if cache size is increased to 8KB?
– Overlap not complete
– Need to do something else. See CS152/252
• Another option: Virtual Caches
– Tags in cache are virtual addresses
– Translation only happens on cache misses
Example Caches, TLB Problems
• Spring 2011 Midterm Exam #6
• Spring 2012 Midterm Exam #6
• Spring 2013 Midterm Exam #3a
Putting Everything Together
Paging & Address Translation
[Figure: a virtual address (virtual P1 index, virtual P2 index, offset) walks a two-level page table starting at PageTablePtr; the resulting physical page # plus offset forms the physical address into physical memory]
Translation Look-aside Buffer
[Figure: the same two-level walk, with a TLB consulted first; on a hit, the TLB supplies the physical page # directly, and on a miss, the page tables are walked and the TLB is filled]
Caching
[Figure: the physical address produced by the TLB/page-table walk is split into tag, index, and byte fields to look up the cache: the index selects a cache line, the tag is compared, and the byte field selects data within the block]
Page Allocation and Replacement
Demand Paging
• Modern programs require a lot of physical memory
– Memory per system growing faster than 25%-30%/year
• But they don’t use all their memory all of the time
– 90-10 rule: programs spend 90% of their time in 10% of their code
– Wasteful to require all of a user’s code to be in memory
• Solution: use main memory as a cache for disk
[Figure: the caching hierarchy — processor (datapath + on-chip cache), second-level cache (SRAM), main memory (DRAM), secondary storage (disk), tertiary storage (tape)]
Demand Paging Mechanisms
• PTE helps us implement demand paging
– Valid ⇒ page in memory, PTE points at physical page
– Not valid ⇒ page not in memory; use info in PTE to find it on disk when necessary
• Suppose user references page with invalid PTE?
– Memory Management Unit (MMU) traps to OS
  » Resulting trap is a “Page Fault”
– What does OS do on a Page Fault?
  » Choose an old page to replace
  » If old page modified (“D=1”), write contents back to disk
  » Change its PTE and any cached TLB entry to be invalid
  » Load new page into memory from disk
  » Update page table entry, invalidate TLB for new entry
  » Continue thread from original faulting location
– TLB for new page will be loaded when thread is continued!
– While pulling pages off disk for one process, OS runs another process from ready queue
  » Suspended process sits on wait queue
What Factors Lead to Misses?
• Compulsory Misses:
– Pages that have never been paged into memory before
– How might we remove these misses?
» Prefetching: loading them into memory before needed
» Need to predict the future somehow! More later.
• Capacity Misses:
– Not enough memory. Must somehow increase size.
– Can we do this?
» One option: increase amount of DRAM (not a quick fix!)
» Another option: if multiple processes are in memory, adjust the percentage of memory allocated to each one!
• Conflict Misses:
– Technically, conflict misses don't exist in virtual memory, since it is a "fully-associative" cache
• Policy Misses:
– Caused when pages were in memory, but kicked out prematurely because of the replacement policy
– How to fix? Better replacement policy
Replacement Policies (Con't)
• LRU (Least Recently Used):
– Replace the page that hasn't been used for the longest time
– Programs have locality, so if something is not used for a while, it is unlikely to be used in the near future.
– Seems like LRU should be a good approximation to MIN.
• How to implement LRU? Use a list!
– On each use, remove page from list and place at head
– LRU page is at tail

Head → Page 6 → Page 7 → Page 1 → Page 2 ← Tail (LRU)

• Problems with this scheme for paging?
– List operations complex
» Many instructions for each hardware access
• In practice, people approximate LRU (more later)
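The list-based scheme can be sketched with an ordered dictionary standing in for the linked list. The `LRU` class name and its API are invented for illustration; this is a minimal sketch, not how hardware or a kernel would implement it.

```python
from collections import OrderedDict

# List-based LRU sketch: most-recently-used pages sit at the "head"
# (here, the end of the OrderedDict); the LRU victim sits at the tail.
class LRU:
    def __init__(self, nframes):
        self.nframes = nframes
        self.pages = OrderedDict()      # ordered least- to most-recently used

    def access(self, page):
        """Touch a page; return True if the access page-faulted."""
        if page in self.pages:
            self.pages.move_to_end(page)   # remove from list, place at head
            return False
        if len(self.pages) == self.nframes:
            self.pages.popitem(last=False) # evict the LRU page at the tail
        self.pages[page] = True
        return True

lru = LRU(3)
faults = sum(lru.access(p) for p in "ABCABDADBCB")
print(faults)   # 5 faults on this stream
```

The per-access list manipulation in `access` is exactly the "many instructions per hardware access" cost the slide warns about.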
Example: FIFO
• Suppose we have 3 page frames, 4 virtual pages, and the following reference stream:
– A B C A B D A D B C B
• Consider FIFO page replacement:

Ref:     A  B  C  A  B  D  A  D  B  C  B
Page 1:  A  A  A  A  A  D  D  D  D  C  C
Page 2:     B  B  B  B  B  A  A  A  A  A
Page 3:        C  C  C  C  C  C  B  B  B

– FIFO: 7 faults.
– When referencing D, replacing A is a bad choice, since we need A again right away
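The FIFO trace above can be checked with a short simulation (the function name is invented for illustration):

```python
from collections import deque

# FIFO page replacement: on a fault with full frames, evict the page
# that arrived earliest, regardless of how recently it was used.
def fifo_faults(refs, nframes):
    resident, order, faults = set(), deque(), 0
    for page in refs:
        if page in resident:
            continue
        faults += 1
        if len(resident) == nframes:
            resident.discard(order.popleft())  # evict the oldest arrival
        resident.add(page)
        order.append(page)
    return faults

print(fifo_faults("ABCABDADBCB", 3))   # 7, matching the slide
```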
Example: MIN
• Suppose we have the same reference stream:
– A B C A B D A D B C B
• Consider MIN page replacement:

Ref:     A  B  C  A  B  D  A  D  B  C  B
Page 1:  A  A  A  A  A  A  A  A  A  A  A
Page 2:     B  B  B  B  B  B  B  B  B  B
Page 3:        C  C  C  D  D  D  D  C  C

– MIN: 5 faults
– Look for the page not referenced for the farthest time in the future.
• What will LRU do?
– Same decisions as MIN here, but that won't always be true!
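MIN (Belady's optimal algorithm) can also be simulated, since the whole reference stream is known in advance; the function name is invented for illustration:

```python
# MIN: on a fault with full frames, evict the resident page whose next
# reference is farthest in the future (or that is never used again).
def min_faults(refs, nframes):
    resident, faults = set(), 0
    for i, page in enumerate(refs):
        if page in resident:
            continue
        faults += 1
        if len(resident) == nframes:
            def next_use(p):
                nxt = refs.find(p, i + 1)
                return nxt if nxt != -1 else len(refs)  # never used again
            resident.discard(max(resident, key=next_use))
        resident.add(page)
    return faults

print(min_faults("ABCABDADBCB", 3))   # 5, matching the slide
```

MIN needs the future reference stream, so it cannot be implemented in a real OS; it serves as the lower bound that LRU and its approximations are measured against.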
Implementing LRU & Second Chance
• Perfect LRU:
– Timestamp page on each reference
– Keep list of pages ordered by time of reference
– Too expensive to implement in reality for many reasons
• Second Chance Algorithm:
– Approximates LRU
» Replace an old page, not the oldest page
– FIFO with "use" bit
• Details:
– A "use" bit per physical page
– On page fault, check page at head of queue
» If use bit = 1 ⇒ clear bit, and move page to tail (give the page a second chance!)
» If use bit = 0 ⇒ replace page
– Moving pages to tail still complex
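The queue-plus-use-bit scheme can be sketched as follows; the `SecondChance` class name and API are invented for illustration:

```python
from collections import deque

# Second Chance: a FIFO queue plus a per-page "use" bit.  On a fault,
# head pages with use=1 get the bit cleared and move to the tail; the
# first page found with use=0 is the victim.
class SecondChance:
    def __init__(self, nframes):
        self.nframes = nframes
        self.queue = deque()   # head = oldest page
        self.use = {}          # page -> use bit

    def access(self, page):
        """Touch a page; return True if the access page-faulted."""
        if page in self.use:
            self.use[page] = 1              # hardware sets the use bit on reference
            return False
        if len(self.queue) == self.nframes:
            while True:
                victim = self.queue.popleft()
                if self.use[victim]:
                    self.use[victim] = 0
                    self.queue.append(victim)   # second chance!
                else:
                    del self.use[victim]        # replace this page
                    break
        self.queue.append(page)
        self.use[page] = 0
        return True

sc = SecondChance(3)
faults = sum(sc.access(p) for p in "ABCABDADBCB")
print(faults)   # 5 on this stream
```

The `while` loop cannot spin forever: each pass clears a use bit, so within one sweep of the queue some page has use = 0.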
Clock Algorithm
• Clock Algorithm: more efficient implementation of the second chance algorithm
– Arrange physical pages in a circle with a single clock hand
• Details:
– On page fault:
» Check use bit: 1 ⇒ used recently; clear bit and leave the page alone
» Check use bit: 0 ⇒ selected candidate for replacement
» Advance clock hand (not real time)
– Will it always find a page, or loop forever?
» It always finds one: even if every use bit is set, the hand clears bits as it sweeps, so within one full revolution some page has use = 0 (degenerating to FIFO)
Clock Replacement Illustration
• Max page table size 4
• Invariant: clock hand points at oldest page
• Event trace (u = use bit):
– Page B arrives ⇒ B u:0
– Page A arrives ⇒ A u:0
– Access page A ⇒ A u:1
– Page D arrives ⇒ D u:0
– Page C arrives ⇒ C u:0 (all four frames full: B, A, D, C; hand at B)
– Page F arrives ⇒ hand finds B with u:0, so B is replaced by F (F u:0)
– Access page D ⇒ D u:1
– Page E arrives ⇒ hand sweeps: A u:1→0, D u:1→0, then C (u:0) is replaced by E
• Final frames: F u:0, A u:0, D u:0, E u:0
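The event trace above can be replayed with a small clock simulation; the `Clock` class name and API are invented for illustration:

```python
# Clock sketch: frames in a circle, one hand; the hand clears use bits
# as it sweeps and replaces the first page it finds with use = 0.
class Clock:
    def __init__(self, nframes):
        self.nframes = nframes
        self.frames = []    # circular list of [page, use bit]
        self.hand = 0

    def access(self, page):
        for entry in self.frames:
            if entry[0] == page:
                entry[1] = 1            # reference sets the use bit
                return
        if len(self.frames) < self.nframes:
            self.frames.append([page, 0])   # free frame, no replacement yet
            return
        while self.frames[self.hand][1]:    # use=1: clear and advance
            self.frames[self.hand][1] = 0
            self.hand = (self.hand + 1) % self.nframes
        self.frames[self.hand] = [page, 0]  # use=0: replace this page
        self.hand = (self.hand + 1) % self.nframes

clk = Clock(4)
for ref in ["B", "A", "A", "D", "C", "F", "D", "E"]:  # the slide's event trace
    clk.access(ref)
print([page for page, _ in clk.frames])   # ['F', 'A', 'D', 'E']
```

The final frame contents match the illustration: F replaced B, and E replaced C after the hand cleared the use bits on A and D.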
Thrashing
• If a process does not have "enough" pages, the page-fault rate is very high. This leads to:
– low CPU utilization
– operating system spends most of its time swapping pages to disk
• Thrashing ≡ a process is busy swapping pages in and out
• Questions:
– How do we detect thrashing?
– What is the best response to thrashing?
Locality In A Memory-Reference Pattern
• Program memory access patterns have temporal and spatial locality
– The group of pages accessed along a given time slice is called the "Working Set"
– The Working Set defines the minimum number of pages needed for the process to behave well
• Not enough memory for Working Set ⇒ Thrashing
– Better to swap out the process?
Working-Set Model
• Δ ≡ working-set window ≡ fixed number of page references
– Example: 10,000 accesses
• WSi (working set of Process Pi) = total set of pages referenced in the most recent Δ (varies in time)
– if Δ too small, will not encompass entire locality
– if Δ too large, will encompass several localities
– if Δ = ∞, will encompass entire program
• D = Σ|WSi| ≡ total demand frames
• if D > physical memory ⇒ Thrashing
– Policy: if D > physical memory, then suspend/swap out processes
– This can improve overall system behavior by a lot!
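The working-set definition above can be sketched directly; the function name and the sample reference stream are invented for illustration:

```python
# Working set WS(t, delta): the set of pages touched in the most recent
# delta references, ending at time t.
def working_set(refs, delta, t):
    return set(refs[max(0, t - delta + 1): t + 1])

refs = ["A", "B", "A", "C", "A", "B", "D", "D", "D", "E"]
print(working_set(refs, 4, 9))    # small window: only the current locality
print(working_set(refs, 100, 9))  # huge window: encompasses the whole stream
```

Summing |WS| across all processes gives D; if D exceeds the number of physical frames, the policy above says to suspend or swap out processes rather than thrash.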
Kernel/User Interface
Dual-Mode Operation
• Can an application modify its own translation maps or PTE bits?
– If it could, it could get access to all of physical memory
– Has to be restricted somehow
• To assist with protection, hardware provides at least two modes (Dual-Mode Operation):
– "Kernel" mode (or "supervisor" or "protected")
– "User" mode (normal program mode)
– Mode set with bits in a special control register only accessible in kernel mode
• Intel processors actually have four "rings" of protection:
– PL (Privilege Level) from 0 – 3
» PL0 has full access, PL3 has least
– Typical OS kernels on Intel processors use only PL0 ("kernel") and PL3 ("user")
How to get from Kernel→User
• What does the kernel do to create a new user process?
– Allocate and initialize process control block
– Read program off disk and store in memory
– Allocate and initialize translation map
» Point at code in memory so program can execute
» Possibly point at statically initialized data
– Run Program:
» Set machine registers
» Set hardware pointer to translation table
» Set processor status word for user mode
» Jump to start of program
• How does kernel switch between processes (we learned about this!)?
– Same saving/restoring of registers as before
– Save/restore hardware pointer to translation map
User→Kernel (System Call)
• Can't let inmate (user) get out of padded cell on own
– Would defeat purpose of protection!
– So, how does the user program get back into the kernel?
• System call: voluntary procedure call into the kernel
– Hardware for controlled User→Kernel transition
– Can any kernel routine be called?
» No! Only specific ones
– System call ID encoded into system call instruction
» Index forces well-defined interface with kernel
• Example system calls:
– I/O: open, close, read, write, lseek
– Files: delete, mkdir, rmdir, chown
– Process: fork, exit, join
– Network: socket create, select
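The I/O calls in the list map to thin user-level wrappers in most languages; in Python, for instance, `os.open`/`os.write`/`os.lseek`/`os.read`/`os.close` each trap into the kernel via the corresponding system call. A minimal POSIX-style example (file name and directory are arbitrary):

```python
import os
import tempfile

# Each os.* call below crosses the user/kernel boundary once via the
# named system call.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
fd = os.open(path, os.O_CREAT | os.O_RDWR)   # open: returns a file descriptor
os.write(fd, b"hello kernel")                # write
os.lseek(fd, 0, os.SEEK_SET)                 # lseek: back to the start
data = os.read(fd, 5)                        # read
os.close(fd)                                 # close
print(data)                                  # b'hello'
```

The file descriptor returned by `open` is the index into the kernel's per-process table; user code never touches kernel data structures directly, only these narrow entry points.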
User→Kernel (Exceptions: Traps and Interrupts)
• A system call instruction causes a synchronous exception (or "trap")
– In fact, often called a software "trap" instruction
• Other sources of synchronous exceptions:
– Divide by zero, illegal instruction, bus error (bad address, e.g. unaligned access)
– Segmentation fault (address out of range)
– Page fault
• Interrupts are asynchronous exceptions
– Examples: timer, disk ready, network, etc.
– Interrupts can be disabled, traps cannot!
• SUMMARY – On system call, exception, or interrupt:
– Hardware enters kernel mode with interrupts disabled
– Saves PC, then jumps to appropriate handler in kernel
– For some processors (x86), processor also saves registers, changes stack, etc.
Questions?