1 ECE7995 Computer Storage and Operating System Design Lecture 3: Processes and threads (I)

Post on 19-Dec-2015


Page 1

ECE7995 Computer Storage and Operating System Design

Lecture 3: Processes and threads (I)

Page 2

Why Multiprogramming and Timesharing? (revisit)

Page 3

Protection (revisit)

• OS must protect/isolate applications from each other, and the OS from applications
• Three techniques:
  - Preemption: granted resources can be revoked
  - Interposition: access to resources must go through the OS
  - Privilege: user/kernel modes

Page 4

Mode Switching (revisit)

• User → Kernel mode switches happen for reasons external or internal to the CPU
• External (aka hardware) interrupts: timer/clock chip, I/O device, network card, keyboard, mouse
  - asynchronous (with respect to the executing program)
• Internal interrupts (aka software interrupts, traps, or exceptions) are synchronous
  - System call (process wants to enter the kernel to obtain services) – intended
  - Fault/exception (division by zero, privileged instruction in user mode) – usually unintended
• Kernel → User mode switch on the iret instruction

Page 5

Mode Switch (revisit)

[Figure: timeline of Process 1, Process 2, and the kernel, alternating between user mode and kernel mode:
• Timer interrupt: P1 is preempted; context switch to P2
• System call (trap): P2 starts an I/O operation and blocks; context switch to Process 1
• I/O device interrupt: P2's I/O completes; switch back to P2
• Timer interrupt: P2 still has time left; no context switch]

Page 6

Overview

• Process concept

• Process Scheduling

• Thread concept

Page 7

Process

These are all possible definitions:
• A program in execution; process execution must progress in sequential fashion
• An instance of a program running on a computer
• Schedulable entity (*)
• Unit of resource ownership
• Unit of protection
• Execution sequence (*) + current state (*) + set of resources

(*) can be said of threads as well

Page 8

Process in Memory

A process includes:
• text section
• program counter
• stack
• data section
• heap

Page 9

Data placement, seen from C/C++

Q.: where are these variables stored?

int a;
static int b;
int c = 5;
struct S {
    int t;
};
struct S s;

void func(int d)
{
    static int e;
    int f;
    struct S w;
    int *g = new int[10];
}

A.: On the stack: d, f, w (including w.t), g
    In the (global) data section: a, b, c, s (including s.t), e
    On the heap: g[0]…g[9]

Page 10

Process State

As a process executes, it changes state

• new: The process is being created

• running: Instructions are being executed

• waiting: The process is waiting for some event to occur

• ready: The process is waiting to be assigned to a processor

• terminated: The process has finished execution

Page 11

Process Control Block (PCB)

The PCB records information associated with each process:
• Process identifier (pid)
• Values of CPU registers, including the stack pointer and program counter
• Information needed by the scheduler: process state, other CPU-scheduling information
• Resources held by the process: memory-management information, I/O status information
• Accounting information
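The fields above can be collected into a struct. A minimal sketch, assuming invented field names (no real kernel uses exactly this layout):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative only: the field names here are made up for this sketch,
 * not taken from any real kernel's task structure. */
enum proc_state { NEW, READY, RUNNING, WAITING, TERMINATED };

struct pcb {
    int             pid;        /* process identifier */
    enum proc_state state;      /* information needed by the scheduler */
    uint64_t        pc;         /* saved program counter */
    uint64_t        sp;         /* saved stack pointer */
    uint64_t        regs[16];   /* saved general-purpose registers */
    int             priority;   /* other CPU-scheduling information */
    uint64_t        cpu_time;   /* accounting: CPU time consumed */
};

/* Initialize a PCB for a newly created process. */
void pcb_init(struct pcb *p, int pid)
{
    memset(p, 0, sizeof *p);
    p->pid = pid;
    p->state = NEW;
}
```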

Page 12

Context Switching

• Multiprogramming: switch to another process if the current process is (momentarily) blocked
• Time-sharing: switch to another process periodically to make sure all processes make progress
• This switch is called a context switch
• Understand:
  - how it works
  - how it interacts with user/kernel mode switching
  - how it maintains the illusion of each process having the CPU to itself (a process must not notice being switched in and out!)

Page 13

CPU Switch From Process to Process

1) Save the current process's execution state to its PCB
2) Update current's PCB as needed
3) Choose the next process N
4) Update N's PCB as needed
5) Restore N's execution state from its PCB

May involve reprogramming the MMU
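The five steps can be acted out in C if the "execution state" is reduced to a single saved program counter; toy_pcb, cpu_pc, and context_switch are invented names for this sketch, not real kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the five context-switch steps above. */
typedef struct {
    int      pid;
    uint64_t saved_pc;  /* execution state kept in the PCB */
    int      running;   /* 1 = running, 0 = ready */
} toy_pcb;

static uint64_t cpu_pc;     /* stands in for the CPU's live state */
static toy_pcb *current;    /* process currently on the CPU */

/* Switch the CPU from `current` to the chosen process `next`. */
void context_switch(toy_pcb *next)
{
    current->saved_pc = cpu_pc;  /* 1) save current's execution state */
    current->running = 0;        /* 2) update current's PCB */
    /* 3) `next` was already chosen by the scheduler (passed in) */
    next->running = 1;           /* 4) update N's PCB */
    cpu_pc = next->saved_pc;     /* 5) restore N's execution state */
    current = next;
}
```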

Page 14

Process Scheduling Queues

• Processes are linked into multiple queues:
  - Job queue – set of all processes in the system
  - Ready queue – set of all processes residing in main memory, ready and waiting to execute
  - Device queues (wait queues) – sets of processes waiting for an I/O device or other events
• Processes migrate among the various queues

Page 15

Ready Queue And Various I/O Device Queues

Page 16

Representation of Process Scheduling

Page 17

Schedulers

• Long-term scheduler (or job scheduler) – selects which processes should be brought into the ready queue
  - The long-term scheduler controls the degree of multiprogramming
  - It should admit a good mix of processes, which can be described as either:
    – I/O-bound – spends more time doing I/O than computation; many short CPU bursts
    – CPU-bound – spends more time doing computation; few, very long CPU bursts
• Short-term scheduler (or CPU scheduler) – selects which process should be executed next and allocates the CPU

Page 18

CPU Scheduling

• Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them

• CPU scheduling decisions may take place when a process:

1. Switches from running to waiting state

2. Switches from running to ready state

3. Switches from waiting to ready

4. Terminates

• Scheduling under 1 and 4 is nonpreemptive
• All other scheduling is preemptive

Page 19

Preemptive vs Nonpreemptive Scheduling

Q.: when is the scheduler asked to pick a thread from the ready queue?

Nonpreemptive:
• Only on a RUNNING → BLOCKED transition
• Or RUNNING → EXIT
• Or a voluntary yield: RUNNING → READY

Preemptive:
• Also on a BLOCKED → READY transition
• Also on a timer interrupt

[Figure: state diagram with READY, RUNNING, and BLOCKED. Scheduler picks process: READY → RUNNING; process preempted: RUNNING → READY; process must wait for event: RUNNING → BLOCKED; event arrived: BLOCKED → READY.]

Page 20

Dispatcher

• The dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves:
  - switching context
  - switching to user mode
  - jumping to the proper location in the user program to restart that program
• Dispatch latency – the time it takes for the dispatcher to stop one process and start another running

Page 21

Static vs Dynamic Scheduling

• Static scheduling
  - The arrival and execution times of all jobs are known in advance: create a schedule, then execute it
  - Used in statically configured systems, such as embedded real-time systems
• Dynamic/online scheduling
  - Jobs are not known in advance; the scheduler must make an online decision whenever a job arrives or leaves
  - Execution time may or may not be known
  - Behavior can be modeled by making assumptions about the nature of the arrival process

Page 22

Alternating Sequence of CPU And I/O Bursts

Page 23

CPU Scheduling Model

A process alternates between CPU bursts and I/O bursts.

[Figure: scheduling on the same CPU. An I/O-bound process (P1) has many short CPU bursts separated by I/O waits; a CPU-bound process (P2) has few, long CPU bursts. The timeline interleaves P1 and P2 as each alternates between CPU, I/O, and waiting.]

Page 24

CPU Scheduling Terminology

• A job (sometimes called a task, or a job instance)
  - A schedulable activity: a process/thread, or a collection of processes that are scheduled together
• Arrival time: time when the job arrives
• Start time: time when the job actually starts
• Finish time: time when the job is done
• Completion time (aka turn-around time) = Finish time – Arrival time
• Response time = Time when the user sees a response – Arrival time
• Execution time (aka cost): time a job needs to execute

Page 25

CPU Scheduling Terminology (cont’d)

• Waiting time = time when the job was ready to run but didn't run because the CPU scheduler picked another job
• Blocked time = time when the job was blocked, e.g. while an I/O device was in use
• Completion time = Execution time + Waiting time + Blocked time
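These definitions can be checked against each other with two one-liners; the function names are invented for this sketch, and all times are in the same unit:

```c
#include <assert.h>

/* Completion (turn-around) time = finish time - arrival time. */
long completion_time(long finish, long arrival)
{
    return finish - arrival;
}

/* The same quantity decomposed as execution + waiting + blocked time. */
long completion_from_parts(long exec_t, long wait_t, long blocked_t)
{
    return exec_t + wait_t + blocked_t;
}
```

For example, a job arriving at time 0 and finishing at time 162, with 68 units of execution, 94 of waiting, and 0 blocked, gives the same answer both ways.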

Page 26

CPU Scheduling Goals

• Minimize latency
  - Can mean completion time
  - Can mean response time
• Maximize throughput
  - Throughput: number of finished jobs per time unit
  - Implies minimizing overhead (of context switching and of the scheduling algorithm itself)
  - Requires efficient use of non-CPU resources
• Fairness
  - Minimize variance in waiting time/completion time

Page 27

Scheduling Constraints

Reaching those goals is difficult, because:
• The goals conflict:
  – Latency vs. throughput
  – Fairness vs. low overhead
• The scheduler must operate with incomplete knowledge:
  – Execution time may not be known
  – I/O device use may not be known
• The scheduler must make decisions fast:
  – Approximate the best solution from a huge solution space

Page 28

Round Robin (RR)

• Each process gets a small unit of CPU time (a time quantum), usually 10-100 milliseconds. After this time has elapsed, the process is preempted and added to the end of the ready queue.
• If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once. No process waits more than (n-1)q time units.
• No more unfairness to short jobs or starvation of long jobs!
• Performance:
  - q large ⇒ behaves like FIFO
  - q small ⇒ q must be large with respect to the context-switch time, otherwise overhead is too high

Page 29

Example of RR with Time Quantum = 20

Process   CPU Burst Time
P1        53
P2        17
P3        68
P4        24

The schedule is:

| P1 | P2 | P3 | P4 | P1 | P3 | P4  | P1  | P3  | P3  |
0    20   37   57   77   97   117   121   134   154   162
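The schedule above can be reproduced by a small simulation. A sketch, assuming all four jobs arrive at time 0 and ignoring context-switch cost (rr_simulate is an invented name):

```c
#include <assert.h>

/* Round-robin simulation of the example above (quantum = 20).
 * finish[i] receives the finish time of process i. */
#define NPROC 4

void rr_simulate(const int burst[NPROC], int quantum, int finish[NPROC])
{
    int remaining[NPROC];
    int queue[64];               /* FIFO of process indices */
    int head = 0, tail = 0, t = 0;

    for (int i = 0; i < NPROC; i++) {
        remaining[i] = burst[i];
        queue[tail++] = i;       /* everyone is ready at time 0 */
    }
    while (head < tail) {
        int p = queue[head++];
        int run = remaining[p] < quantum ? remaining[p] : quantum;
        t += run;
        remaining[p] -= run;
        if (remaining[p] > 0)
            queue[tail++] = p;   /* preempted: back of the queue */
        else
            finish[p] = t;       /* done */
    }
}
```

Running it with the bursts from the slide yields the same finish times as the Gantt chart: P1 at 134, P2 at 37, P3 at 162, P4 at 121.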

Page 30

Round Robin – Cost of Time Slicing

• Context switching incurs a cost:
  - direct cost (executing the scheduler & the context switch itself) + indirect cost (cache & TLB misses)
• Long time slices ⇒ lower overhead, but approaches FCFS if processes finish before the time slice expires
• Short time slices ⇒ lots of context switches, high overhead
• Typical cost: context switch < 10 us
• Time slice typically around 100 ms
  - Linux: 100 ms default, adjusted to between 10 ms & 300 ms
  - Note: time slice length != interval between timer interrupts (timer frequency usually 1000 Hz)

Page 31

Multi-Level Feedback Queue Scheduling

• Objectives:preference for short jobs (tends to lead to good I/O utilization)

longer timeslices for CPU bound jobs (reduces context-switching overhead)

• Challenge: Don’t know type of each process – algorithm needs to figure out

• Solutions: use multiple queuesqueue determines priority

usually combined with static priorities (nice values)

many variations of this
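The multi-queue bookkeeping can be sketched as follows; the queue layout and function names are invented, with queue 0 as the highest priority:

```c
#include <assert.h>

/* Minimal sketch of multi-level feedback queue decisions. */
#define NQUEUES 4

typedef struct {
    int count[NQUEUES];          /* ready processes per priority level */
} mlfq;

/* Pick the highest-priority nonempty queue; -1 if all are empty. */
int mlfq_pick(const mlfq *m)
{
    for (int q = 0; q < NQUEUES; q++)
        if (m->count[q] > 0)
            return q;
    return -1;
}

/* A process at level q used up its entire time slice: move it down. */
int mlfq_demote(int q)
{
    return q < NQUEUES - 1 ? q + 1 : q;
}

/* A process at level q has been starving: move it up. */
int mlfq_promote(int q)
{
    return q > 0 ? q - 1 : q;
}
```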

Page 32

MLFQS

[Figure: a stack of four queues ordered from MIN to MAX priority; lower-priority queues have longer time slices. Processes that use up their time slice move down; processes that starve move up.]

• Processes start in the highest-priority queue
• Higher-priority queues are served before lower-priority ones; within the highest-priority queue, round-robin
• Only ready processes are in these queues – blocked processes leave the queue and reenter the same queue on unblock

Page 33

Case Study: Linux Scheduler

• Variant of MLFQS
• 140 priorities (0–139)
  - 0–99: “realtime”
  - 100–139: non-realtime
• Dynamic priority computed from static priority (nice) plus an “interactivity bonus”

[Figure: the priority range 0–139. Priorities 0–99: “realtime” processes, scheduled based on static priority (SCHED_FIFO, SCHED_RR). Priorities 100–139: processes scheduled based on dynamic priority (SCHED_NORMAL); nice=-20 maps to 100, nice=0 (the default) to 120, and nice=19 to 139.]

Page 34

Linux Scheduler (cont’d)

Instead of a recomputation loop, priority is recomputed at the end of each time slice:
• dyn_prio = static priority (nice) + interactivity bonus (-5…+5)
  - the interactivity bonus depends on sleep_avg, which measures the time a process was blocked
• 2 priority arrays (“active” & “expired”) in each runqueue (Linux calls ready queues “runqueues”)
  - finds the highest-priority ready thread quickly
  - switching the active & expired arrays at the end of an epoch is a simple pointer swap
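The priority computation can be modeled roughly as below. This follows the slide's formula, not the kernel's exact computation; here a negative bonus means a numerically lower, i.e. better, priority:

```c
#include <assert.h>

/* Simplified model: nice in [-20, 19] maps around static priority 120,
 * an interactivity bonus in [-5, +5] is added, and the result is
 * clamped to the SCHED_NORMAL range 100..139. */
int dyn_prio(int nice, int bonus)
{
    int prio = 120 + nice + bonus;
    if (prio < 100) prio = 100;
    if (prio > 139) prio = 139;
    return prio;
}
```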

Page 35

Linux Timeslice Computation

• Principle: choose a time slice as long as possible while keeping good system response time
• Various tweaks:
  - “interactive processes” are reinserted into the active array even after their time slice expires
    – unless processes in the expired array are starving
  - processes with long time slices are round-robin’d with others of equal priority at sub-timeslice granularity

Q: Does a very long quantum duration degrade the response time of interactive applications?

Page 36

Linux SMP Load Balancing

• One runqueue per CPU
• Periodically, the lengths of the runqueues on different CPUs are compared
  - processes are migrated to balance the load
• Migrating a process requires locks on both runqueues

Page 37

Scheduling Summary

• The OS must schedule all resources in a system: CPU, disk, network, etc.
• CPU scheduling indirectly affects the scheduling of other devices
• Goals: (1) minimize latency, (2) maximize throughput, (3) provide fairness
• In practice: some theory, lots of tweaking

Page 38

Single and Multithreaded Processes

Page 39

Benefits of Multithreading

• Responsiveness

• Resource Sharing

• Economy

• Utilization of MP Architectures

Page 40

Thread Implementations

• User threads: thread management done by a user-level threads library
  Three primary thread libraries:
  ― POSIX Pthreads
  ― Win32 threads
  ― Java threads
• Kernel threads: supported by the kernel
  Examples:
  – Windows XP/2000
  – Solaris
  – Linux
  – Tru64 UNIX
  – Mac OS X

Page 41

An example: Pthreads

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *print_message_function(void *ptr);

int main()
{
    pthread_t thread1, thread2;
    char *message1 = "Thread 1";
    char *message2 = "Thread 2";
    int iret1, iret2;

    /* Create independent threads, each of which will execute the function */
    iret1 = pthread_create(&thread1, NULL, print_message_function, (void *) message1);
    iret2 = pthread_create(&thread2, NULL, print_message_function, (void *) message2);

    /* Wait till the threads are complete before main continues. Unless we
       wait, we run the risk of executing an exit, which would terminate
       the process and all threads before the threads have completed. */
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    printf("Thread 1 returns: %d\n", iret1);
    printf("Thread 2 returns: %d\n", iret2);
    exit(0);
}

void *print_message_function(void *ptr)
{
    char *message = (char *) ptr;
    printf("%s \n", message);
    return NULL;
}

Compile with: cc pthread1.c -lpthread

Page 42

Multithreading Models

• Many-to-One: many user-level threads mapped to a single kernel thread
  Examples:
  ― Solaris Green Threads
  ― GNU Portable Threads
  Pros: efficient; no need for OS support for threading
  Cons: concurrency is lost when a thread blocks in the kernel; cannot use multiprocessor architectures
• One-to-One: each user-level thread maps to a kernel thread
  Examples:
  ― Windows NT/XP/2000
  ― Linux
  ― Solaris 9 and later
  Cons: the number of kernel threads must be limited to keep system cost down

Page 43

Multithreading Models (cont’d)

• Many-to-Many: allows many user-level threads to be mapped to many kernel threads
  - Allows the operating system to create a sufficient number of kernel threads
  Examples:
  ― Solaris prior to version 9
  ― Windows NT/2000 with the ThreadFiber package

Page 44

Thread Pools

• Create a number of threads in a pool, where they await work
• Advantages:
  - Usually slightly faster to service a request with an existing thread than to create a new thread
  - Allows the number of threads in the application(s) to be bounded by the size of the pool
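A minimal fixed-size pool can be sketched with Pthreads. The structure and names (tpool, tpool_submit, …) are invented for this sketch, and error handling is omitted:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Worker threads are created once and pull tasks from a shared queue,
 * so servicing a request never creates a new thread. */
#define QCAP 64

typedef struct { void (*fn)(void *); void *arg; } task;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    task       queue[QCAP];
    int        head, tail, count, shutdown;
    pthread_t *workers;
    int        nworkers;
} tpool;

static void *worker(void *p)
{
    tpool *tp = p;
    for (;;) {
        pthread_mutex_lock(&tp->lock);
        while (tp->count == 0 && !tp->shutdown)
            pthread_cond_wait(&tp->nonempty, &tp->lock);
        if (tp->count == 0 && tp->shutdown) {
            pthread_mutex_unlock(&tp->lock);
            return NULL;
        }
        task t = tp->queue[tp->head];
        tp->head = (tp->head + 1) % QCAP;
        tp->count--;
        pthread_mutex_unlock(&tp->lock);
        t.fn(t.arg);                       /* run the task outside the lock */
    }
}

void tpool_init(tpool *tp, int nworkers)
{
    pthread_mutex_init(&tp->lock, NULL);
    pthread_cond_init(&tp->nonempty, NULL);
    tp->head = tp->tail = tp->count = tp->shutdown = 0;
    tp->nworkers = nworkers;
    tp->workers = malloc(nworkers * sizeof(pthread_t));
    for (int i = 0; i < nworkers; i++)
        pthread_create(&tp->workers[i], NULL, worker, tp);
}

/* Enqueue a task; an existing worker picks it up. Assumes queue not full. */
void tpool_submit(tpool *tp, void (*fn)(void *), void *arg)
{
    pthread_mutex_lock(&tp->lock);
    tp->queue[tp->tail] = (task){fn, arg};
    tp->tail = (tp->tail + 1) % QCAP;
    tp->count++;
    pthread_cond_signal(&tp->nonempty);
    pthread_mutex_unlock(&tp->lock);
}

/* Drain remaining tasks, then join all workers. */
void tpool_shutdown(tpool *tp)
{
    pthread_mutex_lock(&tp->lock);
    tp->shutdown = 1;
    pthread_cond_broadcast(&tp->nonempty);
    pthread_mutex_unlock(&tp->lock);
    for (int i = 0; i < tp->nworkers; i++)
        pthread_join(tp->workers[i], NULL);
    free(tp->workers);
}

/* Example task used below: locked increment of an int counter. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static void demo_count(void *arg)
{
    pthread_mutex_lock(&demo_lock);
    (*(int *)arg)++;
    pthread_mutex_unlock(&demo_lock);
}
```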

Page 45

Linux Threads

• Linux treats threads as lightweight processes
• Each process (or thread) has its own task_struct structure (the PCB in Linux) and is identified by its own process ID (PID)
• However, Unix programmers expect threads in the same group (or process) to share a common PID. Linux uses tgid in the PCB to record the PID of the thread-group leader, so all threads in a group share the same identifier.
• Thread creation is done through the clone() system call
  - clone() is a variant of fork() that allows a child task to share certain resources with its parent, selected via flag parameters (CLONE_VM, CLONE_FILES, …)
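A sketch of clone() sharing the parent's address space via CLONE_VM (Linux-only); the stack size and helper names are arbitrary choices, and downward stack growth is assumed:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* With CLONE_VM the child shares the parent's memory, so the parent
 * can observe the child's write to `shared` after waiting for it. */
#define CHILD_STACK (64 * 1024)

static int shared = 0;

static int child_fn(void *arg)
{
    (void)arg;
    shared = 42;              /* visible to the parent because of CLONE_VM */
    return 0;
}

int run_clone_demo(void)
{
    char *stack = malloc(CHILD_STACK);
    if (!stack)
        return -1;
    /* The stack grows down on most architectures: pass the top of the block. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK, CLONE_VM | SIGCHLD, NULL);
    if (pid < 0)
        return -1;
    waitpid(pid, NULL, 0);    /* the SIGCHLD flag lets us wait as after fork() */
    free(stack);
    return shared;
}
```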