practical session 2, processes and scheduling

Operating Systems

Practical Session 2, Processes and Scheduling


A quick recap

Quality criteria measures:
1. Throughput – The number of completed processes per time unit.
2. Turnaround time – The time interval between the process submission and its completion.
3. Waiting time – The sum of all time intervals in which the process was in the ready queue.
4. Response time – The time taken between submitting a command and the generation of first output.
5. CPU utilization – Percentage of time in which the CPU is not idle.


Two types of scheduling:

Preemptive scheduling
A task may be interrupted and rescheduled to run at a later time (for example, the scheduler may preempt it upon the arrival of a "more important" task).

Non-preemptive scheduling (cooperative)
Task switching can only be performed through explicitly defined system services (for example: task termination, an explicit call to yield(), an I/O operation which changes the process state to blocked, etc.).

Scheduling algorithms:

1. FCFS (First-Come, First-Served).
• Non-preemptive.
• Convoy effect.

2. SJF (Shortest Job First).
• Provably optimal with respect to average turnaround time (when all jobs are available at the same time).
• No way of knowing the length of the next CPU burst.
• Can approximate according to: T_{n+1} = α·t_n + (1 - α)·T_n, where t_n is the length of the most recent CPU burst, T_n is the previous estimate, and 0 ≤ α ≤ 1.
• Preemptive (Shortest Remaining Time First) or non-preemptive.

3. Round Robin.
• When using large time slices it imitates FCFS.
• When using time slices which are close to the context switch time, more CPU time is wasted on switches.
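The SJF burst-length approximation can be sketched in a few lines of C. This is a minimal illustration of exponential averaging, T_{n+1} = α·t_n + (1 - α)·T_n; the function name and parameter names are hypothetical and do not come from any particular kernel.

```c
#include <assert.h>

/* Exponential averaging of CPU burst lengths (illustrative sketch).
 * alpha         weight given to the most recent burst, in [0, 1]
 * last_burst    measured length of the most recent CPU burst (t_n)
 * prev_estimate previous prediction (T_n)
 * Returns the prediction for the next burst (T_{n+1}). */
double next_burst_estimate(double alpha, double last_burst, double prev_estimate)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    return alpha * last_burst + (1.0 - alpha) * prev_estimate;
}
```

With alpha = 1 only the last burst counts, and with alpha = 0 the history is never updated; values in between trade responsiveness against stability.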


4. Guaranteed scheduling.
• Constantly calculates the ratio between how much CPU time the process has had since its creation and how much CPU time it is entitled to.
• Guarantees 1/n of CPU time per process / user.

5. Priority scheduling.
• A generalization of SJF (How?).

6. Multi Level Queue scheduling.
• Partition the ready queue.
• Each partition employs its own scheduling scheme.
• A process from a lower priority group may run only if there is no higher priority process.
• May cause starvation!


7. Dynamic Multi Level scheduling.
• Takes into account the time spent waiting (the notion of aging to prevent starvation).
i. Highest Response Ratio Next: run the process with the highest response ratio, (waiting time + expected service time) / expected service time.
ii. Feedback scheduling.
• Demote processes running longer.
• Combine with aging to prevent starvation.

8. Two Level scheduling.
• Involves one scheduler for Memory-CPU operations, and another scheduler for Memory-Disk operations.


The Completely Fair Scheduler (Linux Kernel 2.6.23)

CFS tries to model an “ideal, precise multitasking CPU” – one that could run multiple processes simultaneously, giving each equal processing power.

Obviously, this is purely theoretical, so how can we model it?

We may not be able to have one CPU run things simultaneously, but we can measure how much runtime each task has had and try to ensure that every task gets its fair share of time.

The Completely Fair Scheduler (cont.)

The runtime each task has received is held in its vruntime variable, and is recorded at the nanosecond level.

A lower vruntime indicates that the task has had less time to compute, and therefore has more need of the processor.

CFS uses a Red-Black tree to store, sort, and schedule tasks.

The CFS Tree & Scheduling Algorithm

The key for each node is the vruntime of the corresponding task.

To pick the next task to run, simply take the leftmost node.

The task accounts for its time with the CPU by adding its execution time to the virtual runtime, and is then inserted back into the tree if it is still runnable.

http://www.ibm.com/developerworks/linux/library/l-completely-fair-scheduler/


Priorities
• CFS doesn't use priorities directly but instead uses them as a decay factor for the time a task is permitted to execute.
• Lower-priority tasks have higher factors of decay, whereas higher-priority tasks have lower factors of decay.

Priority is inverse to its effect: a higher-priority task accumulates vruntime more slowly, since it needs more CPU time.

Likewise, a low-priority task will have its vruntime increase more quickly, causing it to be preempted earlier.
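The CFS selection and accounting rules described above can be sketched as follows. This is a simplified illustration, not the kernel's code: a linear scan over an array stands in for the red-black tree, and struct task, pick_next, account and BASE_WEIGHT are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task record for this sketch. */
struct task {
    unsigned long long vruntime; /* weighted runtime so far, in ns */
    unsigned int weight;         /* larger weight = higher priority */
};

#define BASE_WEIGHT 1024u /* assumed weight of a default-priority task */

/* Return the index of the task with the smallest vruntime (the
 * "leftmost node"). A linear scan stands in for the tree lookup;
 * ties go to the lower index. */
size_t pick_next(const struct task *tasks, size_t n)
{
    size_t best = 0;

    assert(n > 0);
    for (size_t i = 1; i < n; i++)
        if (tasks[i].vruntime < tasks[best].vruntime)
            best = i;
    return best;
}

/* Charge delta_ns of real execution time to a task. A higher weight
 * makes vruntime grow more slowly, so the task is picked again sooner. */
void account(struct task *t, unsigned long long delta_ns)
{
    assert(t->weight > 0);
    t->vruntime += delta_ns * BASE_WEIGHT / t->weight;
}
```

For example, after both a weight-1024 task and a weight-2048 task have run for 1000 ns, their vruntime values are 1000 and 500, so the heavier task is chosen next.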

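Five-state Process Model

[Figure: five-state process model. States: New, Ready, Running, Blocked, Exit. Transitions: Admit (New → Ready), Dispatch (Ready → Running), Time-out (Running → Ready), Event Wait (Running → Blocked), Event Occurs (Blocked → Ready), Release (Running → Exit).]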

proc.h

enum procstate { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

// Per-process state
struct proc {
  uint sz;                     // Size of process memory (bytes)
  pde_t* pgdir;                // Page table
  char *kstack;                // Bottom of kernel stack for this process
  enum procstate state;        // Process state
  volatile int pid;            // Process ID
  struct proc *parent;         // Parent process
  struct trapframe *tf;        // Trap frame for current syscall
  struct context *context;     // swtch() here to run process
  void *chan;                  // If non-zero, sleeping on chan
  int killed;                  // If non-zero, have been killed
  struct file *ofile[NOFILE];  // Open files
  struct inode *cwd;           // Current directory
  char name[16];               // Process name (debugging)
};

Creation of a Process (fork)

proc.c

// Create a new process copying p as the parent.
// Sets up stack to return as if from system call.
// Caller must set state of returned proc to RUNNABLE.
int
fork(void)
{
  int i, pid;
  struct proc *np;

  // Allocate process.
  if((np = allocproc()) == 0)
    return -1;

  // Copy process state from p.
  if((np->pgdir = copyuvm(proc->pgdir, proc->sz)) == 0){
    kfree(np->kstack);
    np->kstack = 0;
    np->state = UNUSED;
    return -1;
  }
  np->sz = proc->sz;
  np->parent = proc;
  *np->tf = *proc->tf;

  // Clear %eax so that fork returns 0 in the child.
  np->tf->eax = 0;

  for(i = 0; i < NOFILE; i++)
    if(proc->ofile[i])
      np->ofile[i] = filedup(proc->ofile[i]);
  np->cwd = idup(proc->cwd);

  pid = np->pid;
  np->state = RUNNABLE;
  safestrcpy(np->name, proc->name, sizeof(proc->name));
  return pid;
}

Transition to a Running state

proc.c

void
scheduler(void)
{
  struct proc *p;

  for(;;){
    // Enable interrupts on this processor.
    sti();

    // Loop over process table looking for process to run.
    acquire(&ptable.lock);
    for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){
      if(p->state != RUNNABLE)
        continue;

      // Switch to chosen process.  It is the process's job
      // to release ptable.lock and then reacquire it
      // before jumping back to us.
      proc = p;
      switchuvm(p);
      p->state = RUNNING;
      swtch(&cpu->scheduler, proc->context);
      switchkvm();

      // Process is done running for now.
      // It should have changed its p->state before coming back.
      proc = 0;
    }
    release(&ptable.lock);

  }
}

Transition to a Running state

swtch.S

# Context switch
#
#   void swtch(struct context **old, struct context *new);
#
# Save current register context in old
# and then load register context from new.

.globl swtch
swtch:
  movl 4(%esp), %eax
  movl 8(%esp), %edx

  # Save old callee-save registers
  pushl %ebp
  pushl %ebx
  pushl %esi
  pushl %edi

  # Switch stacks
  movl %esp, (%eax)
  movl %edx, %esp

  # Load new callee-save registers
  popl %edi
  popl %esi
  popl %ebx
  popl %ebp
  ret

Transition to a Ready state

proc.c

// Give up the CPU for one scheduling round.
void
yield(void)
{
  acquire(&ptable.lock);  //DOC: yieldlock
  proc->state = RUNNABLE;
  sched();
  release(&ptable.lock);
}

void
sched(void)
{
  int intena;

  if(!holding(&ptable.lock))
    panic("sched ptable.lock");
  if(cpu->ncli != 1)
    panic("sched locks");
  if(proc->state == RUNNING)
    panic("sched running");
  if(readeflags()&FL_IF)
    panic("sched interruptible");
  intena = cpu->intena;
  swtch(&proc->context, cpu->scheduler);
  cpu->intena = intena;
}

Transition to a Blocked state

proc.c

// Atomically release lock and sleep on chan.
// Reacquires lock when awakened.
void
sleep(void *chan, struct spinlock *lk)
{
  if(proc == 0)
    panic("sleep");

  if(lk == 0)
    panic("sleep without lk");

  if(lk != &ptable.lock){  //DOC: sleeplock0
    acquire(&ptable.lock);  //DOC: sleeplock1
    release(lk);
  }

  // Go to sleep.
  proc->chan = chan;
  proc->state = SLEEPING;
  sched();

  // Tidy up.
  proc->chan = 0;

  // Reacquire original lock.
  if(lk != &ptable.lock){  //DOC: sleeplock2
    release(&ptable.lock);
    acquire(lk);
  }
}

Transition to a Ready state

proc.c

// Wake up all processes sleeping on chan.
// The ptable lock must be held.
static void
wakeup1(void *chan)
{
  struct proc *p;

  for(p = ptable.proc; p < &ptable.proc[NPROC]; p++)
    if(p->state == SLEEPING && p->chan == chan)
      p->state = RUNNABLE;
}

// Wake up all processes sleeping on chan.
void
wakeup(void *chan)
{
  acquire(&ptable.lock);
  wakeup1(chan);
  release(&ptable.lock);
}

Warm up (1)

• Why bother with multiprogramming?
• Assume processes in a given system wait for I/O 60% of the time.
1. What is the approximate CPU utilization with one process running?
2. What is the approximate CPU utilization with three processes running?

Answers:

1. If a process is blocked on I/O 60% of the time, then CPU utilization is only 40%.

2. At a given moment, the probability that all three processes are blocked on I/O is (0.6)^3 = 0.216 (assuming the processes block independently). That means that the CPU utilization is 1 - (0.6)^3 = 0.784, or roughly 78%.
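The same estimate generalizes to n processes as 1 - p^n, where p is the fraction of time a process waits for I/O. A minimal sketch, assuming the processes block independently (the function name cpu_utilization is hypothetical):

```c
#include <assert.h>

/* Approximate CPU utilization with n processes, each waiting for I/O a
 * fraction io_fraction of the time, assuming independent blocking:
 * the CPU is idle only when all n processes are blocked at once. */
double cpu_utilization(double io_fraction, int n)
{
    double all_blocked = 1.0;

    assert(io_fraction >= 0.0 && io_fraction <= 1.0 && n >= 0);
    for (int i = 0; i < n; i++)
        all_blocked *= io_fraction;
    return 1.0 - all_blocked;
}
```

Calling cpu_utilization(0.6, 1) gives about 0.4 and cpu_utilization(0.6, 3) gives about 0.784, matching the two answers.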


Warm up (2)

• Assume a single-CPU machine with a non-preemptive scheduler attempts to schedule n independent processes. How many possible schedules exist?

• Answer: This is exactly like ordering a set of n different characters forming a word of length n. That is, there are n! different possible schedules.


Round Robin

The following processes require scheduling (each requires the given number of Time Units, or TUs):

• PA – 6 TU
• PB – 3 TU
• PC – 1 TU
• PD – 7 TU

If RR scheduling is used, what quantum size should be used to achieve minimal average turnaround time? (Assume 0 cost for context switches.)

1. Quantum = 1:

Time: 1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17
Run:  A  B  C  D  A  B  D  A  B  D  A  D  A  D  A  D  D

(15+9+3+17)/4 = 11 TU

2. Quantum = 2:

Time: 1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17
Run:  A  A  B  B  C  D  D  A  A  B  D  D  A  A  D  D  D

(14+10+5+17)/4 = 11.5 TU

3. Quantum = 3: 10.75 TU
4. Quantum = 4: 11.5 TU
5. Quantum = 5: 12.25 TU
6. Quantum = 6: 10.5 TU
7. Quantum = 7: 10.5 TU


Turnaround time depends on the size of the time quantum used. Note that it does not necessarily improve as time quantum size increases!
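The averages for each quantum size can be reproduced with a short simulation. The sketch below assumes, as in the exercise, that all jobs arrive at time 0 in the given order and that context switches cost nothing; rr_avg_turnaround and MAX_JOBS are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_JOBS 16 /* arbitrary cap for this sketch */

/* Average turnaround time under round robin for jobs that all arrive at
 * time 0 in array order, with zero context-switch cost. Because no job
 * arrives later, cycling over the array is equivalent to a FIFO queue. */
double rr_avg_turnaround(const int *burst, size_t n, int quantum)
{
    int remaining[MAX_JOBS];
    size_t left = n;
    long clock = 0, total = 0;

    assert(n > 0 && n <= MAX_JOBS && quantum > 0);
    for (size_t i = 0; i < n; i++) {
        assert(burst[i] > 0);
        remaining[i] = burst[i];
    }
    while (left > 0) {
        for (size_t i = 0; i < n; i++) {
            int slice;

            if (remaining[i] == 0)
                continue; /* job already finished */
            slice = remaining[i] < quantum ? remaining[i] : quantum;
            clock += slice;
            remaining[i] -= slice;
            if (remaining[i] == 0) {
                total += clock; /* turnaround = completion time */
                left--;
            }
        }
    }
    return (double)total / (double)n;
}
```

For the bursts {6, 3, 1, 7} this returns 11 for a quantum of 1, 10.75 for a quantum of 3 and 10.5 for a quantum of 6 or 7, which shows that the average does not change monotonically with the quantum size.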


Non-preemptive scheduling
(Taken from Tanenbaum)

Assume 5 different jobs arrive at a computer center at roughly the same time (same clock tick). Their expected run times are 10, 6, 2, 4 and 8 TU. Their (externally determined) priorities are 3, 5, 2, 1 and 4 respectively. For each of the following scheduling algorithms, determine the mean process turnaround time. Ignore process switching overhead. All jobs are completely CPU bound.

1. Priority scheduling (non-preemptive). [Higher number means higher priority]
2. First come first served (in order 10, 6, 2, 4, 8) (non-preemptive).
3. Shortest job first (non-preemptive).

PID  Priority  Time
P1   3         10
P2   5         6
P3   2         2
P4   1         4
P5   4         8

Answers:
1. Priority scheduling: (6+14+24+26+30)/5 = 20
2. FCFS: (10+16+18+22+30)/5 = 19.2
3. SJF: (2+6+12+20+30)/5 = 14
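Each of the three answers is the mean of the running sums of the run times, taken in the order the algorithm executes the jobs. A minimal sketch of that calculation, assuming all jobs arrive at time 0 (mean_turnaround is a hypothetical helper name):

```c
#include <assert.h>
#include <stddef.h>

/* Mean turnaround time for jobs that all arrive at time 0 and run to
 * completion, one after another, in the order given by run_time[]. */
double mean_turnaround(const int *run_time, size_t n)
{
    long clock = 0, total = 0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        clock += run_time[i]; /* completion time of job i */
        total += clock;
    }
    return (double)total / (double)n;
}
```

Passing the run times in priority order {6, 8, 10, 2, 4} gives 20, in arrival order {10, 6, 2, 4, 8} gives 19.2, and in shortest-first order {2, 4, 6, 8, 10} gives 14.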


Preemptive dynamic priorities scheduling

(Taken from Silberschatz, 5-9)

Consider the following preemptive priority scheduling algorithm with dynamically changing priorities: When a process is waiting for the CPU (in the ready queue, but not running), its priority changes at rate α; when it is running, its priority changes at rate β. All processes are given a priority of 0 when they enter the ready queue. The parameters α and β can be set. Higher priority processes take higher values.

1. What is the algorithm that results from β > α > 0?
2. What is the algorithm that results from α < β < 0?
3. Is there a starvation problem in 1? In 2? Explain.


1. β > α > 0.
To get a better feeling for the problem, we will create an example: P1, P2, P3 arrive one after the other and each lasts 3 TU; α=1 and β=2 (* marks the running process):

Time  1    2    3    4    5    6    7    8    9
P1    0*   2*   4*
P2         0    1    2*   4*   6*
P3              0    1    2    3    4*   6*   8*

The resulting schedule is FCFS.
Slightly more formally: If a process is running, it must have the highest priority value. While it is running, its priority value increases at a rate greater than that of any waiting process. As a result, it will continue its run until it completes (or waits on I/O, for example). All processes in the waiting queue increase their priority at the same rate, hence the one which arrived earliest will have the highest priority once the CPU is available.


2. α < β < 0.
We will use (almost) the same example problem as before, but this time α=-2, β=-1 (* marks the running process):

Time  1    2    3    4    5    6    7    8    9
P1    0*   -1   -3   -5   -7   -9   -11  -13* -14*
P2         0*   -1   -3   -5   -7*  -8*
P3              0*   -1*  -2*

The resulting schedule is LIFO.
More formally: If a process is running, it must have the highest priority value. While it is running, its priority value decreases at a lower rate than that of any waiting process. As a result, it will continue its run until it completes (or waits on I/O, for example), or until a new process with priority 0 is introduced. As before, all processes in the waiting queue decrease their priority at the same rate, hence the one which arrived latest will have the highest priority once the CPU is available.


3. In the first case it is easy to see that there is no starvation problem. When the k-th process is introduced, it will wait for at most (k-1)·max{time_i} Time Units. This number might be large, but it is still finite.

This is not true for the second case. Consider the following scenario: P1 is introduced and receives CPU time. While it is still working, a second process, P2, arrives. According to the scheduling algorithm in the second case, P2 will receive the CPU and P1 will have to wait. As long as new processes keep arriving before P1 gets a chance to complete its run, P1 will never complete its task.
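Both cases can be checked with a small tick-by-tick simulation. The sketch below is an illustration under simplifying assumptions: time advances in whole units starting at 1, a process enters the ready queue with priority 0 at its arrival time, and ties go to the process with the lower index. The names simulate_dynamic_priority and MAX_PROCS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROCS 16 /* arbitrary cap for this sketch */

/* Simulate the dynamic-priority scheduler in whole time units.
 * arrival[i] and length[i] describe process i; finish[i] receives the
 * time unit in which process i completes. Waiting processes change
 * priority by alpha per unit, the running process by beta. */
void simulate_dynamic_priority(const int *arrival, const int *length, size_t n,
                               double alpha, double beta, int *finish)
{
    double priority[MAX_PROCS];
    int remaining[MAX_PROCS];
    size_t left = n;

    assert(n <= MAX_PROCS);
    for (size_t i = 0; i < n; i++) {
        assert(length[i] > 0);
        priority[i] = 0.0;
        remaining[i] = length[i];
        finish[i] = 0;
    }
    for (int t = 1; left > 0; t++) {
        size_t best = n; /* n means "no ready process yet" */

        /* Pick the ready process with the highest priority value. */
        for (size_t i = 0; i < n; i++) {
            if (remaining[i] == 0 || arrival[i] > t)
                continue;
            if (best == n || priority[i] > priority[best])
                best = i;
        }
        /* Update priorities: beta for the runner, alpha for waiters. */
        for (size_t i = 0; i < n; i++) {
            if (remaining[i] == 0 || arrival[i] > t)
                continue;
            priority[i] += (i == best) ? beta : alpha;
        }
        if (best != n && --remaining[best] == 0) {
            finish[best] = t;
            left--;
        }
    }
}
```

With arrivals at times 1, 2, 3 and lengths of 3 TU each, α=1 and β=2 finish P1, P2, P3 at times 3, 6, 9 (FCFS), while α=-2 and β=-1 finish them at times 9, 7, 5 (LIFO).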


Dynamic Multi Level scheduling

An OS keeps two queues, Q1 and Q2. Each queue implements the round robin (RR) algorithm for all processes it holds. The OS prioritizes processes in Q1 over those in Q2. When a process is created or returns from an I/O operation, it enters Q1. A process enters Q2 if it just finished running and used up its whole time quantum. A process returning from I/O enters Q1 and has precedence over a process which has not started running yet.

In our problem, we have the following processes:
• Process P1 – arrival time = 0, req.: 1 TU CPU, 1 TU I/O, 3 TU CPU.
• Process P2 – arrival time = 2, req.: 2 TU CPU, 2 TU I/O, 2 TU CPU.
• Process P3 – arrival time = 3, req.: 1 TU CPU, 3 TU I/O, 3 TU CPU.

Draw the Gantt table and compute the average TA and RT (turnaround and response time). Assume that the time quantum in Q1 is 1 TU, and the time quantum in Q2 is 2 TU. Further assume that the system has preemption.

Computing the RT is based on the start of each process's I/O operation (you may think of the I/O as printing to stdout, with the user waiting for this printout).

The Gantt table (column t covers the interval from t to t+1; "." marks a process waiting in a ready queue):

Time  0    1    2    3    4    5    6    7    8    9    10   11   12
P1    CPU  I/O  CPU  .    .    CPU  CPU
P2              .    CPU  .    .    .    CPU  I/O  I/O  CPU  .    CPU
P3                   .    CPU  I/O  I/O  I/O  CPU  CPU  .    CPU

Avg. TA: (7+11+9)/3 = 9
Avg. RT: (1+6+2)/3 = 3


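Multi-core Scheduling

Assume we have a computer with two processors (C1, C2). At any given moment both processors are working, unless no more jobs are waiting.
A set of 13 processes of three types arrives at the system simultaneously, as follows:
• A processes – short processes that finish after one time unit (a single process).
• B processes – slightly longer processes that finish after two time units (7 processes).
• C processes – processes that finish after three time units (5 processes).

For each of the following two algorithms, compute:
– What is the average turnaround time, and which algorithm is better by this measure?
– How much CPU time is used, and which algorithm is better by this measure?
– How long does it take to complete the computation, and which algorithm is better by this measure?

The algorithms:
– Processes of types A and B are directed to processor C1, which runs SJF on them. All C processes are directed to processor C2. If one processor finishes its work before the other, it handles the remaining processes according to the SJF principle.
– The processes are dispatched to the two processors according to SJF.

We draw a Gantt table for the two processors.

For the first algorithm: C1 runs A and then the seven B processes, with completion times 1, 3, 5, 7, 9, 11, 13, 15 (sum 64); C2 runs the five C processes, with completion times 3, 6, 9, 12, 15 (sum 45).

For the second algorithm: C1 runs A, B, B, B, C, C, C, with completion times 1, 3, 5, 7, 10, 13, 16 (sum 55); C2 runs B, B, B, B, C, C, with completion times 2, 4, 6, 8, 11, 14 (sum 45).

Now the questions can be answered easily:
– For the first algorithm, avg. TA = (64+45)/13 = 109/13 = 8.38. For the second, avg. TA = (55+45)/13 = 100/13 = 7.69. Therefore the second algorithm is clearly better by this measure (SJF across both processors is preferable to affinity).
– Clearly the CPU time used is the same in both methods: 30 time units. In the first method it is split 15+15, and in the second 16+14.
– Surprisingly, although the second algorithm is better in terms of turnaround time, it finishes later, after 16 time units, whereas the first algorithm finishes after 15 time units.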
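The totals for both algorithms can be reproduced by handing each job, in shortest-first order, to whichever processor becomes free first. The sketch below assumes all jobs arrive at time 0 and that the input array is already sorted by length; struct result, run_sjf and MAX_CPUS are hypothetical names. The first algorithm is modelled as two independent single-processor runs, which is valid here because both processors finish at the same time and neither has to take over leftover work.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS 8 /* arbitrary cap for this sketch */

struct result {
    long total_turnaround; /* sum of completion times of all jobs */
    long makespan;         /* time at which the last job completes */
};

/* Run jobs (already sorted shortest first, all arriving at time 0) on
 * `cpus` identical processors. Each job goes to the processor that
 * becomes free earliest; ties go to the lower-numbered processor. */
struct result run_sjf(const int *sorted_len, size_t n, size_t cpus)
{
    long free_at[MAX_CPUS] = {0};
    struct result r = {0, 0};

    assert(cpus > 0 && cpus <= MAX_CPUS);
    for (size_t i = 0; i < n; i++) {
        size_t c = 0;

        for (size_t k = 1; k < cpus; k++)
            if (free_at[k] < free_at[c])
                c = k;
        free_at[c] += sorted_len[i];
        r.total_turnaround += free_at[c];
        if (free_at[c] > r.makespan)
            r.makespan = free_at[c];
    }
    return r;
}
```

Running all 13 jobs on two processors gives a total turnaround of 100 and a makespan of 16; running the A and B jobs on one processor and the C jobs on another gives totals of 64 and 45 with a makespan of 15 each.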
