COSC 3407: Operating Systems
Lecture 6: Synchronization
TRANSCRIPT
This lecture:
– More concurrency examples
– Synchronization primitives
A Larger Concurrent Program Example: ATM bank server
– Suppose we wanted to implement a server process that handles requests from an ATM network to do things like deposit and withdraw money from bank accounts.

    BankServer() {
        while (TRUE) {
            ReceiveRequest(&op, &acctId, &amount);
            ProcessRequest(op, acctId, amount);
        }
    }

    ProcessRequest(op, acctId, amount) {
        if (op == deposit) Deposit(acctId, amount);
        else if …
    }

    Deposit(acctId, amount) {
        acct = GetAccount(acctId);   /* May involve disk I/O */
        acct->balance += amount;
        StoreAccount(acct);          /* Involves disk I/O */
    }
ATM bank server example (cont.)
Suppose we had a multiprocessor?
– Could run each invocation of ProcessRequest in a separate thread to get parallelism.
Suppose we only had one CPU?
– We'd still like to overlap I/O with computation.
Without threads we would have to rewrite the code to look something like:

    BankServer() {
        while (TRUE) {
            event = WaitForNextEvent();
            if (event == ATMRequest)
                StartOnRequest();
            else if (event == AcctAvail)
                ContinueRequest();
            else if (event == AcctStored)
                FinishRequest();
        }
    }
ATM bank server example (cont.)
With threads, one could get overlapped I/O and computation without having to "deconstruct" the code into fragments that run when the appropriate asynchronous event has occurred.
Problem: in the threaded version, shared state can get corrupted.
Race condition: several threads access and manipulate the same data concurrently, and the outcome of the execution depends on the particular order of access.

    Thread 1 running Deposit:       Thread 2 running Deposit:
    load  r1, acct->balance
                                    load  r1, acct->balance
                                    add   r1, amount2
                                    store r1, acct->balance
    add   r1, amount1
    store r1, acct->balance
Interleaved Schedules
The dispatcher can choose to run each thread to completion or until it blocks. It can time-slice in whatever size chunks it wants. If running on a multiprocessor, instructions may be interleaved one at a time.
Threaded programs must work correctly for any interleaving of thread instruction sequences.
– Cooperating threaded programs are inherently non-deterministic and non-reproducible.
– This makes them hard to debug and maintain unless they are designed very carefully.
Example: Therac-25
Machine for radiation therapy
– Software control of electron accelerator and electron/X-ray production
– Software control of dosage
Software errors caused the death of several patients
– A series of race conditions on shared variables and poor software design
– "They determined that data entry speed during editing was the key factor in producing the error condition: If the prescription data was edited at a fast pace, the overdose occurred."
Space Shuttle Example
Original Space Shuttle launch aborted 20 minutes before scheduled launch.
Shuttle has five computers:
– Four run the "Primary Avionics Software System" (PASS)
  » Asynchronous and real-time
  » Runs all of the control systems
  » Results synchronized and compared every 3 to 4 ms
– The fifth computer is the "Backup Flight System" (BFS)
  » Stays synchronized in case it is needed
  » Written by a completely different team than PASS
Countdown aborted because BFS disagreed with PASS
– A 1/67 chance that PASS was out of sync one cycle
– Bug due to modifications in initialization code of PASS
  » A delayed init request placed into timer queue
  » As a result, timer queue not empty at expected time to force use of hardware clock
– Bug not found during extensive simulation
Race Conditions
Race condition: a timing-dependent error involving shared state.
– Whether it happens depends on how the threads are scheduled.
Race conditions are *hard* because:
– You must make sure all possible schedules are safe, and the number of possible schedule permutations is huge.
  » Some bad schedules? Some that will work sometimes?
– They are intermittent. Timing-dependent means small changes (printf's, a different machine) can hide the bug.

Example, a push onto a fixed-size stack:

    if (n == stack_size)   /* A */
        return full;       /* B */
    stack[n] = v;          /* C */
    n = n + 1;             /* D */
More Race Condition Fun
Two threads, A and B, compete with each other; one tries to increment a shared counter, the other tries to decrement the counter.
For this example, assume that memory load and memory store are atomic, but incrementing and decrementing are not atomic.

    Thread A:               Thread B:
    i = 0;                  i = 0;
    while (i < 10)          while (i > -10)
        i = i + 1;              i = i - 1;
    print "A won!";         print "B won!";
More Race Condition Fun
Questions:
1. Who wins? Could be either.
2. Is it guaranteed that someone wins? Why not?
3. What if both threads have their own CPU, running in parallel at exactly the same speed? Is it guaranteed that it goes on forever?
   » In fact, if they start at the same time, with A 1/2 an instruction ahead, B will win quickly.
4. Could this happen on a uniprocessor?
   » Yes! Unlikely, but if you depend on it not happening, it will happen, your system will break, and it will be very difficult to figure out why.
Hand Simulation: Multiprocessor Example
Inner loop looks like this:

    Thread A                    Thread B
    r1=0    load  r1, M[i]
                                r1=0    load  r1, M[i]
    r1=1    add   r1, r1, 1
                                r1=-1   sub   r1, r1, 1
    M[i]=1  store r1, M[i]
                                M[i]=-1 store r1, M[i]

Hand simulation:
– And we're off. A gets off to an early start.
– B says "hmph, better go fast" and tries really hard.
– A goes ahead and writes "1".
– B goes and writes "-1".
– A says "HUH??? I could have sworn I put a 1 there."
Controlling Race Conditions
Synchronization: using atomic operations to ensure cooperation between threads.
Mutual exclusion: ensuring that only one thread does a particular thing at a time.
– One thread doing it excludes the other, and vice versa.
Critical section: a piece of code that only one thread can execute at once.
– Only one thread at a time will get into the section of code.
Critical Section Requirements
Mutual exclusion
– At most one thread is in the critical section.
Progress
– If thread T is outside the critical section, then T cannot prevent thread S from entering the critical section.
Bounded waiting (no starvation)
– If thread T is waiting on the critical section, then T will eventually enter the critical section.
  » Assumes threads eventually leave critical sections.
Performance
– The overhead of entering and exiting the critical section is small with respect to the work being done within it.
Locks: Making Code Atomic
Lock: a shared variable with two operations:
– "acquire" (lock): acquire exclusive access to the lock; wait if the lock is already acquired.
– "release" (unlock): release exclusive access to the lock.
How to use? Bracket the critical section in lock/unlock:

    Lock(shared);
    criticalSection();
    Unlock(shared);

– Access is "mutually exclusive": locks used this way are called "mutexes".
– What have we done? Bootstrapped big atomic units from smaller ones (locks).
Implementing Locks: Solution #1

    int turn = 0;   /* shared, id of whose turn it is */

    lock(int t) {
        while (turn != t)
            Thread.yield();
    }
    /* critical section */
    unlock(int t) {
        turn = 1 - t;
    }
    /* remainder section */

The mutual-exclusion requirement is satisfied. The progress requirement is not satisfied:
– Strict alternation of threads.
– If turn == 0 and thread T1 is ready to enter its critical section, then T1 cannot do so, even though T0 may be in its noncritical section.
Implementing Locks: Solution #2

    bool flag[2] = { false, false };   /* shared */

    lock(int t) {
        int other = 1 - t;
        flag[t] = true;
        while (flag[other] == true)
            Thread.yield();
    }
    /* critical section */
    unlock(int t) {
        flag[t] = false;
    }
    /* remainder section */
Implementing Locks: Solution #2 (cont.)
The mutual-exclusion requirement is satisfied. The progress requirement is still not met:
– Suppose thread T0 sets flag[0] to true.
– Before executing its while loop, a context switch takes place and thread T1 sets flag[1] to true.
– When both threads perform the while statement, they will loop forever, as flag[other] is true for each of them.
Implementing Locks: Solution #3

    bool flag[2] = { false, false };   /* shared */
    int turn = 0;                      /* shared, id of whose turn it is */

    lock(int t) {
        int other = 1 - t;
        flag[t] = true;
        turn = other;
        while ((flag[other] == true) && (turn == other))
            Thread.yield();
    }
    /* critical section */
    unlock(int t) {
        flag[t] = false;
    }
    /* remainder section */
Implementing Locks
Solution #3 works, but it's really unsatisfactory:
– Really complicated: even for this simple an example, it is hard to convince yourself it really works.
There's a better way:
1. Have hardware provide better (higher-level) primitives than atomic load and store.
2. Build even higher-level programming abstractions on this new hardware support. For example, why not use locks as an atomic building block (how we do this is covered in the next lecture):
– Lock::Acquire: wait until the lock is free, then grab it.
– Lock::Release: unlock, waking up a waiter if any.
These must be atomic operations: if two threads are waiting for the lock and both see it's free, only one grabs it!