1
ECE7995 Computer Storage and Operating System Design
Lecture 3: Processes and threads (I)
2
Why Multiprogramming and Timesharing? (revisit)
3
Protection (revisit)
• OS must protect/isolate applications from each other, and the OS from applications
• Three techniques:
  Preemption: granted resources can be revoked
  Interposition: must go through the OS to access resources
  Privilege: user/kernel modes
4
Mode Switching (revisit)
• User → Kernel mode switch: for reasons external or internal to the CPU
• External (aka hardware) interrupts: timer/clock chip, I/O device, network card, keyboard, mouse
  asynchronous (with respect to the executing program)
• Internal interrupts (aka software interrupts, traps, or exceptions) are synchronous
  System call (process wants to enter kernel to obtain services) – intended
  Fault/exception (division by zero, privileged instruction in user mode) – usually unintended
• Kernel → User mode switch on iret instruction
5
Mode Switch (revisit)
[Diagram: Processes 1 and 2 alternate between user mode and kernel mode]
• Timer interrupt: P1 is preempted, context switch to P2
• System call (trap): P2 starts I/O operation, blocks; context switch to process 1
• I/O device interrupt: P2’s I/O completes; switch back to P2
• Timer interrupt: P2 still has time left, no context switch
6
Overview
• Process concept
• Process Scheduling
• Thread concept
7
Process
These are all possible definitions:
• A program in execution; process execution must progress in sequential fashion
• An instance of a program running on a computer
• Schedulable entity (*)
• Unit of resource ownership
• Unit of protection
• Execution sequence (*) + current state (*) + set of resources
(*) can be said of threads as well
8
Process in Memory
A process includes:
• text section
• program counter
• stack
• data section
• heap
9
Data placement, seen from C/C++
Q.: where is each of these variables stored?
int a;
static int b;
int c = 5;
struct S {
int t;
};
struct S s;
void func(int d)
{
static int e;
int f;
struct S w;
int *g = new int[10];
}
A.: On stack: d, f, w (including w.t), g
On (global) data section: a, b, c, s (including s.t), e
On the heap: g[0]…g[9]
10
Process State
As a process executes, it changes state
• new: The process is being created
• running: Instructions are being executed
• waiting: The process is waiting for some event to occur
• ready: The process is waiting to be assigned to a processor
• terminated: The process has finished execution
11
Process Control Block (PCB)
PCB records information associated with each process:
• Process identifier (pid)
• Value of registers, including stack pointer
  Program counter, CPU registers
• Information needed by scheduler
  Process state, other CPU scheduling information
• Resources held by process
  Memory-management information, I/O status information
• Accounting information
12
Context Switching
• Multiprogramming: switch to another process if the current process is (momentarily) blocked
• Time-sharing: switch to another process periodically to make sure all processes make progress
• This switch is called a context switch
• Understand how it works
  how it interacts with user/kernel mode switching
  how it maintains the illusion of each process having the CPU to itself (a process must not notice being switched in and out!)
13
CPU Switch From Process to Process
1) Save the current process’s execution state to its PCB
2) Update current’s PCB as needed
3) Choose next process N
4) Update N’s PCB as needed
5) Restore N’s execution state from its PCB
May involve reprogramming MMU
14
Process Scheduling Queues
• Processes are linked in multiple queues:
  Job queue – set of all processes in the system
  Ready queue – set of all processes residing in main memory, ready and waiting to execute
  Device queues (wait queues) – set of processes waiting for an I/O device or other events
• Processes migrate among the various queues
15
Ready Queue And Various I/O Device Queues
16
Representation of Process Scheduling
17
Schedulers
• Long-term scheduler (or job scheduler) – selects which processes should be brought into the ready queue
  The long-term scheduler controls the degree of multiprogramming
  A good mix contains both kinds of processes:
  – I/O-bound process – spends more time doing I/O than computation; many short CPU bursts
  – CPU-bound process – spends more time doing computation; few very long CPU bursts
• Short-term scheduler (or CPU scheduler) – selects which process should be executed next and allocates CPU
18
CPU Scheduling
• Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them
• CPU scheduling decisions may take place when a process:
1. Switches from running to waiting state
2. Switches from running to ready state
3. Switches from waiting to ready
4. Terminates
• Scheduling under 1 and 4 is nonpreemptive
• All other scheduling is preemptive
19
Preemptive vs Nonpreemptive Scheduling
Q.: when is the scheduler asked to pick a thread from the ready queue?
Nonpreemptive:
• Only on RUNNING → BLOCKED transition
• Or RUNNING → EXIT
• Or voluntary yield: RUNNING → READY
Preemptive:
• Also on BLOCKED → READY transition
• Also on timer
[State diagram: RUNNING → BLOCKED (process must wait for event); BLOCKED → READY (event arrived); READY → RUNNING (scheduler picks process); RUNNING → READY (process preempted)]
20
Dispatcher
• Dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves:
  switching context
  switching to user mode
  jumping to the proper location in the user program to restart that program
• Dispatch latency – time it takes for the dispatcher to stop one process and start another running
21
Static vs Dynamic Scheduling
• Static scheduling
  The arrival and execution times of all jobs are known in advance. The scheduler creates a schedule, then executes it
  – Used in statically configured systems, such as embedded real-time systems
• Dynamic/online scheduling
  Jobs are not known in advance; the scheduler must make an online decision whenever a job arrives or leaves
  – Execution time may or may not be known
  – Behavior can be modeled by making assumptions about the nature of the arrival process
22
Alternating Sequence of CPU And I/O Bursts
23
CPU Scheduling Model
Process alternates between CPU bursts and I/O bursts
[Diagram: on the same CPU, an I/O-bound process (many short CPU bursts, frequent I/O) interleaves with a CPU-bound process (few long CPU bursts); each process cycles through CPU, I/O, and waiting]
24
CPU Scheduling Terminology
• A job (sometimes called a task, or a job instance)
Activity that’s schedulable: a process/thread, or a collection of processes that are scheduled together
• Arrival time: time when job arrives
• Start time: time when job actually starts
• Finish time: time when job is done
• Completion time (aka Turn-around time)
Finish time – Arrival time
• Response time
Time when user sees response – Arrival time
• Execution time (aka cost): time a job needs to execute
25
CPU Scheduling Terminology (cont’d)
• Waiting time = time when job was ready-to-run but didn’t run because the CPU scheduler picked another job
• Blocked time = time when job was blocked while an I/O device is in use
• Completion time = Execution time + Waiting time + Blocked time
26
CPU Scheduling Goals
• Minimize latency
  Can mean completion time
  Can mean response time
• Maximize throughput
  Throughput: number of finished jobs per time unit
  Implies minimizing overhead (of context switching and of the scheduling algorithm itself)
  Requires efficient use of non-CPU resources
• Fairness
  Minimize variance in waiting time/completion time
27
Scheduling Constraints
Reaching those goals is difficult, because
• Goals are conflicting:
  – Latency vs. throughput
  – Fairness vs. low overhead
• Scheduler must operate with incomplete knowledge– Execution time may not be known– I/O device use may not be known
• Scheduler must make decision fast– Approximate best solution from huge solution space
28
Round Robin (RR)
• Each process gets a small unit of CPU time (time quantum), usually 10-100 milliseconds. After this time has elapsed, the process is preempted and added to the end of the ready queue.
• If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once. No process waits more than (n-1)q time units.
• No more unfairness to short jobs or starvation for long jobs!
• Performance
  q large ⇒ approaches FIFO
  q small ⇒ q must still be large with respect to context-switch time, otherwise overhead is too high
29
Example of RR with Time Quantum = 20
Process   CPU Burst Time
P1        53
P2        17
P3        68
P4        24
The schedule is:
| P1 | P2 | P3 | P4 | P1 | P3 | P4 | P1 | P3 | P3 |
0    20   37   57   77   97   117  121  134  154  162
30
Round Robin – Cost of Time Slicing
• Context switching incurs a cost
  Direct cost (executing the scheduler & context switch) + indirect cost (cache & TLB misses)
• Long time slices ⇒ lower overhead, but approaches FCFS if processes finish before the timeslice expires
• Short time slices ⇒ lots of context switches, high overhead
• Typical cost: context switch < 10 µs
• Time slices typically around 100 ms
  Linux: 100 ms default, adjusted to between 10 ms & 300 ms
  Note: time slice length != interval between timer interrupts
  Timer frequency usually 1000 Hz
31
Multi-Level Feedback Queue Scheduling
• Objectives:
  preference for short jobs (tends to lead to good I/O utilization)
  longer timeslices for CPU-bound jobs (reduces context-switching overhead)
• Challenge: don’t know the type of each process – the algorithm needs to figure it out
• Solutions: use multiple queues
  queue determines priority
  usually combined with static priorities (nice values)
  many variations of this
32
MLFQS
[Diagram: queues numbered 4 (MAX, highest priority) down to 1 (MIN, lowest priority); lower-priority queues have longer timeslices]
• Processes start in the highest queue
• Higher-priority queues are served before lower-priority ones; within the highest-priority queue, round-robin
• Processes that use up their time slice move down
• Processes that starve move up
• Only ready processes are in these queues; blocked processes leave the queue and reenter the same queue on unblock
33
Case Study: Linux Scheduler
• Variant of MLFQS
• 140 priorities
  0-99 “realtime”
  100-139 nonrealtime
• Dynamic priority computed from static priority (nice) plus “interactivity bonus”
[Diagram: priorities 0-99: “realtime” processes scheduled based on static priority (SCHED_FIFO, SCHED_RR); priorities 100-139: processes scheduled based on dynamic priority (SCHED_NORMAL), with nice=-20 mapping to 100, the default nice=0 to 120, and nice=19 to 139]
34
Linux Scheduler (cont’d)
Instead of a recomputation loop, recompute priority at the end of each timeslice
• dyn_prio = nice + interactivity bonus (-5…5)
  Interactivity bonus depends on sleep_avg
• sleep_avg measures time a process was blocked
2 priority arrays (“active” & “expired”) in each runqueue (Linux calls ready queues “runqueues”)
• Finds highest-priority ready thread quickly
• Switching active & expired arrays at end of epoch is a simple pointer swap
35
Linux Timeslice Computation
• Principle: choose a timeslice as long as possible while keeping good system response time
• Various tweaks:
  “interactive processes” are reinserted into the active array even after their timeslice expires
  – unless processes in the expired array are starving
  processes with long timeslices are round-robin’d with others of equal priority at sub-timeslice granularity
Q: Does a very long quantum duration degrade the response time of interactive applications?
36
Linux SMP Load Balancing
• One runqueue per CPU
• Periodically, the lengths of the runqueues on different CPUs are compared
  Processes are migrated to balance load
• Migrating requires locks on both runqueues
37
Scheduling Summary
• OS must schedule all resources in a system
  CPU, disk, network, etc.
• CPU scheduling indirectly affects the scheduling of other devices
• Goals: (1) Minimize latency (2) Maximize throughput (3) Provide fairness
• In Practice: some theory, lots of tweaking
38
Single and Multithreaded Processes
39
Benefits of Multithreading
• Responsiveness
• Resource Sharing
• Economy
• Utilization of MP Architectures
40
Thread Implementations
• User threads
  Thread management done by a user-level threads library
  Three primary thread libraries:
  ― POSIX Pthreads
  ― Win32 threads
  ― Java threads
• Kernel threads
  Supported by the kernel
  Examples:
  – Windows XP/2000
  – Solaris
  – Linux
  – Tru64 UNIX
  – Mac OS X
41
An example: Pthreads

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *print_message_function( void *ptr );

int main()
{
    pthread_t thread1, thread2;
    char *message1 = "Thread 1";
    char *message2 = "Thread 2";
    int iret1, iret2;

    /* Create independent threads each of which will execute the function */
    iret1 = pthread_create( &thread1, NULL, print_message_function, (void*) message1);
    iret2 = pthread_create( &thread2, NULL, print_message_function, (void*) message2);

    /* Wait till threads are complete before main continues. Unless we */
    /* wait we run the risk of executing an exit which will terminate  */
    /* the process and all threads before the threads have completed.  */
    pthread_join( thread1, NULL);
    pthread_join( thread2, NULL);

    printf("Thread 1 returns: %d\n", iret1);
    printf("Thread 2 returns: %d\n", iret2);
    exit(0);
}

void *print_message_function( void *ptr )
{
    char *message = (char *) ptr;
    printf("%s \n", message);
    return NULL;
}

Compile with: cc pthread1.c -lpthread
42
Multithreading Models
• Many-to-One: many user-level threads mapped to a single kernel thread
  Examples:
  ―Solaris Green Threads
  ―GNU Portable Threads
  Pros: efficient; no need for OS support for threading
  Cons: concurrency lost when one thread blocks in the kernel; cannot use MP
• One-to-One: each user-level thread maps to a kernel thread
  Examples:
  ― Windows NT/XP/2000
  ― Linux
  ― Solaris 9 and later
  Con: the number of kernel threads may have to be limited to reduce system cost
43
Multithreading Models (cont’d)
• Many-to-Many: allows many user-level threads to be mapped to many kernel threads
  Allows the operating system to create a sufficient number of kernel threads
  Examples:
  ― Solaris prior to version 9
  ― Windows NT/2000 with the ThreadFiber package
44
Thread Pools
• Create a number of threads in a pool where they await work
• Advantages:
  Usually slightly faster to service a request with an existing thread than to create a new thread
  Allows the number of threads in the application(s) to be bound by the size of the pool
45
Linux Threads
• Linux treats threads as lightweight processes
• Each process (or thread) has its own task_struct structure (the PCB in Linux), and is identified by its own process ID (PID)
• However, Unix programmers expect threads in the same group (or process) to share a common PID. Linux uses tgid in the PCB to record the PID of the thread group leader, so all the threads in a group share the same identifier.
• Thread creation is done through the clone() system call
  clone() is a variant of fork() that allows a child task to share certain resources with its parent via flag parameters (CLONE_VM, CLONE_FILES …)