concurrent queues and stacks companion slides for the art of multiprocessor programming by maurice...

Post on 27-Mar-2015

219 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Concurrent Queues and Stacks

Companion slides forThe Art of Multiprocessor Programming

by Maurice Herlihy & Nir Shavit

Art of Multiprocessor Programming 22

The Five-Fold Path

• Coarse-grained locking

• Fine-grained locking

• Optimistic synchronization

• Lazy synchronization

• Lock-free synchronization

Art of Multiprocessor Programming 33

Another Fundamental Problem

• We told you about – Sets implemented by linked lists

• Next: queues

• Next: stacks

Art of Multiprocessor Programming 44

Queues & Stacks

• pool of items

Art of Multiprocessor Programming 55

Queues

deq()/enq( )

Total orderFirst in

First out

Art of Multiprocessor Programming 66

Stacks

pop()/

push( )

Total orderLast in

First out

Art of Multiprocessor Programming 77

Bounded

• Fixed capacity

• Good when resources an issue

Art of Multiprocessor Programming 88

Unbounded

• Unlimited capacity

• Often more convenient

Art of Multiprocessor Programming 9

Blockingzzz …

Block on attempt to remove from empty stack or queue

Art of Multiprocessor Programming 10

Blockingzzz …

Block on attempt to add to full bounded stack or queue

Art of Multiprocessor Programming 11

Non-BlockingOuch!

Throw exception on attempt to remove from empty stack or queue

Art of Multiprocessor Programming 1212

This Lecture

• Queue– Bounded, blocking, lock-based– Unbounded, non-blocking, lock-free

• Stack– Unbounded, non-blocking lock-free– Elimination-backoff algorithm

Art of Multiprocessor Programming 1313

Queue: Concurrency

enq(x) y=deq()

enq() and deq() work at different

ends of the object

tail head

Art of Multiprocessor Programming 1414

Concurrency

enq(x)

Challenge: what if the queue is empty

or full?

y=deq()ta

ilhead

Art of Multiprocessor Programming 1515

Bounded Queue

Sentinel

head

tail

Art of Multiprocessor Programming 1616

Bounded Queue

head

tail

First actual item

Art of Multiprocessor Programming 1717

Bounded Queue

head

tail

Lock out other deq() calls

deqLock

Art of Multiprocessor Programming 1818

Bounded Queue

head

tail

Lock out other enq() calls

deqLock

enqLock

Art of Multiprocessor ProgrammingArt of Multiprocessor Programming 19191919

Not Done Yet

headhead

tailtail

deqLockdeqLock

enqLockenqLock

Need to tell whether queue is full or

empty

Art of Multiprocessor Programming 2020

Not Done Yet

head

tail

deqLock

enqLock

Max size is 8 items

size

1

Art of Multiprocessor Programming 2121

Not Done Yet

head

tail

deqLock

enqLock

Incremented by enq()Decremented by deq()

size

1

Art of Multiprocessor Programming 2222

Enqueuer

head

tail

deqLock

enqLock

size

1

Lock enqLock

Art of Multiprocessor Programming 2323

Enqueuer

head

tail

deqLock

enqLock

size

1

Read size

OK

Art of Multiprocessor Programming 2424

Enqueuer

head

tail

deqLock

enqLock

size

1

No need to lock tail

Art of Multiprocessor Programming 2525

Enqueuer

head

tail

deqLock

enqLock

size

1

Enqueue Node

Art of Multiprocessor Programming 2626

Enqueuer

head

tail

deqLock

enqLock

size

12

getAndincrement()

Art of Multiprocessor Programming 2727

Enqueuer

head

tail

deqLock

enqLock

size

8Release lock

2

Art of Multiprocessor Programming 2828

Enqueuer

head

tail

deqLock

enqLock

size

2

If queue was empty, notify waiting dequeuers

Art of Multiprocessor Programming 2929

Unsuccesful Enqueuer

head

tail

deqLock

enqLock

size

8

Uh-oh

Read size

Art of Multiprocessor Programming 3030

Dequeuer

head

tail

deqLock

enqLock

size

2

Lock deqLock

Art of Multiprocessor Programming 3131

Dequeuer

head

tail

deqLock

enqLock

size

2

Read sentinel’s next field

OK

Art of Multiprocessor Programming 3232

Dequeuer

head

tail

deqLock

enqLock

size

2

Read value

Art of Multiprocessor Programming 3333

Dequeuer

head

tail

deqLock

enqLock

size

2

Make first Node new sentinel

Art of Multiprocessor Programming 3434

Dequeuer

head

tail

deqLock

enqLock

size

1

Decrement size

Art of Multiprocessor Programming 3535

Dequeuer

head

tail

deqLock

enqLock

size

1Release deqLock

Art of Multiprocessor Programming 3636

Unsuccesful Dequeuer

head

tail

deqLock

enqLock

size

0

Read sentinel’s next field

uh-oh

Art of Multiprocessor Programming 3737

Bounded Queue

public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition();}

Art of Multiprocessor Programming 3838

Bounded Queue

public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition();}

enq & deq locks

Art of Multiprocessor Programming 3939

Digression: Monitor Locks

• Java synchronized objects and ReentrantLocks are monitors

• Allow blocking on a condition rather than spinning

• Threads: – acquire and release lock– wait on a condition

Art of Multiprocessor Programming 4040

public interface Lock { void lock(); void lockInterruptibly() throws InterruptedException; boolean tryLock(); boolean tryLock(long time, TimeUnit unit); Condition newCondition(); void unlock;}

The Java Lock Interface

Acquire lock

Art of Multiprocessor Programming 4141

public interface Lock { void lock(); void lockInterruptibly() throws InterruptedException; boolean tryLock(); boolean tryLock(long time, TimeUnit unit); Condition newCondition(); void unlock;}

The Java Lock Interface

Release lock

Art of Multiprocessor Programming 4242

public interface Lock { void lock(); void lockInterruptibly() throws InterruptedException; boolean tryLock(); boolean tryLock(long time, TimeUnit unit); Condition newCondition(); void unlock;}

The Java Lock Interface

Try for lock, but not too hard

Art of Multiprocessor Programming 4343

public interface Lock { void lock(); void lockInterruptibly() throws InterruptedException; boolean tryLock(); boolean tryLock(long time, TimeUnit unit); Condition newCondition(); void unlock;}

The Java Lock Interface

Create condition to wait on

Art of Multiprocessor Programming 4444

The Java Lock Interface

public interface Lock { void lock(); void lockInterruptibly() throws InterruptedException; boolean tryLock(); boolean tryLock(long time, TimeUnit unit); Condition newCondition(); void unlock;}

Never mind what this method does

Art of Multiprocessor Programming 4545

Lock Conditions

public interface Condition { void await(); boolean await(long time, TimeUnit unit); … void signal(); void signalAll(); }

Art of Multiprocessor Programming 4646

public interface Condition { void await(); boolean await(long time, TimeUnit unit); … void signal(); void signalAll(); }

Lock Conditions

Release lock and wait on condition

Art of Multiprocessor Programming 4747

public interface Condition { void await(); boolean await(long time, TimeUnit unit); … void signal(); void signalAll(); }

Lock Conditions

Wake up one waiting thread

Art of Multiprocessor Programming 4848

public interface Condition { void await(); boolean await(long time, TimeUnit unit); … void signal(); void signalAll(); }

Lock Conditions

Wake up all waiting threads

Art of Multiprocessor Programming 4949

Await

• Releases lock associated with q

• Sleeps (gives up processor)

• Awakens (resumes running)

• Reacquires lock & returns

q.await()

Art of Multiprocessor Programming 5050

Signal

• Awakens one waiting thread– Which will reacquire lock

q.signal();

Art of Multiprocessor Programming 5151

Signal All

• Awakens all waiting threads– Which will each reacquire lock

q.signalAll();

Art of Multiprocessor Programming 5252

A Monitor Lock

Cri

tica

l S

ecti

on

waiting roomlock()

unlock()

Art of Multiprocessor Programming 5353

Unsuccessful Deq

Cri

tica

l S

ecti

on

waiting roomlock()

await()deq()

Oh no, empty!

Art of Multiprocessor Programming 5454

Another One

Cri

tica

l S

ecti

on

waiting roomlock()

await()deq()

Oh no, empty!

Art of Multiprocessor Programming 5555

Enqueuer to the Rescue

Cri

tica

l S

ecti

on

waiting roomlock()

signalAll()enq( )

unlock()

Yawn!Yawn!

Art of Multiprocessor Programming 5656

Yawn!

Monitor Signalling

Cri

tica

l S

ecti

on

waiting room

Yawn!

Awakened thread might still lose lock to outside contender…

Art of Multiprocessor Programming 5757

Dequeuers Signalled

Cri

tica

l S

ecti

on

waiting room

Found it

Yawn!

Art of Multiprocessor Programming 5858

Yawn!

Dequeuers Signaled

Cri

tica

l S

ecti

on

waiting room

Still empty!

Art of Multiprocessor Programming 5959

Dollar Short + Day Late

Cri

tica

l S

ecti

on

waiting room

Art of Multiprocessor Programming 6060

public class Queue<T> {

int head = 0, tail = 0; T[QSIZE] items;

public synchronized T deq() { while (tail – head == 0) wait(); T result = items[head % QSIZE]; head++; notifyAll(); return result; } …}}

Java Synchronized Methods

Art of Multiprocessor Programming 6161

public class Queue<T> {

int head = 0, tail = 0; T[QSIZE] items;

public synchronized T deq() { while (tail – head == 0) wait(); T result = items[head % QSIZE]; head++; notifyAll(); return result; } …}}

Java Synchronized Methods

Each object has an implicit lock with an implicit condition

Art of Multiprocessor Programming 6262

public class Queue<T> {

int head = 0, tail = 0; T[QSIZE] items;

public synchronized T deq() { while (tail – head == 0) wait(); T result = items[head % QSIZE]; head++; notifyAll(); return result; } …}}

Java Synchronized Methods

Lock on entry, unlock on return

Art of Multiprocessor Programming 6363

public class Queue<T> {

int head = 0, tail = 0; T[QSIZE] items;

public synchronized T deq() { while (tail – head == 0) wait(); T result = items[head % QSIZE]; head++; this.notifyAll(); return result; } …}}

Java Synchronized Methods

Wait on implicit condition

Art of Multiprocessor Programming 6464

public class Queue<T> {

int head = 0, tail = 0; T[QSIZE] items;

public synchronized T deq() { while (tail – head == 0) this.wait(); T result = items[head % QSIZE]; head++; notifyAll(); return result; } …}}

Java Synchronized Methods

Signal all threads waiting on condition

Art of Multiprocessor Programming 6565

(Pop!) The Bounded Queue

public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition();}

Art of Multiprocessor Programming 6666

Bounded Queue Fields

public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition();}

Enq & deq locks

Art of Multiprocessor Programming 6767

Bounded Queue Fields

public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition();}

Enq lock’s associated condition

Art of Multiprocessor Programming 6868

Bounded Queue Fields

public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition();}

size: 0 to capacity

Art of Multiprocessor Programming 6969

Bounded Queue Fields

public class BoundedQueue<T> { ReentrantLock enqLock, deqLock; Condition notEmptyCondition, notFullCondition; AtomicInteger size; Node head; Node tail; int capacity; enqLock = new ReentrantLock(); notFullCondition = enqLock.newCondition(); deqLock = new ReentrantLock(); notEmptyCondition = deqLock.newCondition();}

Head and Tail

Art of Multiprocessor Programming 7070

Enq Method Part Onepublic void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == Capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } …}

Art of Multiprocessor Programming 7171

public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } …}

Enq Method Part One

Lock and unlock enq lock

Art of Multiprocessor Programming 7272

public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } …}

Enq Method Part One

Wait while queue is full …

Art of Multiprocessor Programming 7373

public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } …}

Enq Method Part One

when await() returns, you might still fail the test !

Art of Multiprocessor Programming 7474

public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } …}

Be Afraid

After the loop: how do we know the queue won’t become full again?

Art of Multiprocessor Programming 7575

public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } …}

Enq Method Part One

Add new node

Art of Multiprocessor Programming 7676

public void enq(T x) { boolean mustWakeDequeuers = false; enqLock.lock(); try { while (size.get() == capacity) notFullCondition.await(); Node e = new Node(x); tail.next = e; tail = tail.next; if (size.getAndIncrement() == 0) mustWakeDequeuers = true; } finally { enqLock.unlock(); } …}

Enq Method Part One

If queue was empty, wake frustrated dequeuers

Art of Multiprocessor Programming 7777

Beware Lost Wake-Ups

Cri

tica

l S

ecti

on

waiting roomlock()

Queue empty so signal ()

enq( )

unlock()

Yawn!

Art of Multiprocessor Programming 7878

Lost Wake-Up

Cri

tica

l S

ecti

on

waiting roomlock()

enq( )

unlock()

Yawn!

Queue not empty so no

need to signal

Art of Multiprocessor Programming 7979

Lost Wake-Up

Cri

tica

l S

ecti

on

waiting room

Yawn!

Art of Multiprocessor Programming 8080

Lost Wake-Up

Cri

tica

l S

ecti

on

waiting room

Found it

Art of Multiprocessor Programming 8181

What’s Wrong Here?

Cri

tica

l S

ecti

on

waiting room

Still waiting ….!

Art of Multiprocessor Programming 82

Solution to Lost Wakeup

• Always use– signalAll() and notifyAll()

• Not– signal() and notify()

82

Art of Multiprocessor Programming 8383

Enq Method Part Deux

public void enq(T x) { … if (mustWakeDequeuers) { deqLock.lock(); try { notEmptyCondition.signalAll(); } finally { deqLock.unlock(); } } }

Art of Multiprocessor Programming 8484

Enq Method Part Deux

public void enq(T x) { … if (mustWakeDequeuers) { deqLock.lock(); try { notEmptyCondition.signalAll(); } finally { deqLock.unlock(); } } }Are there dequeuers to be signaled?

Art of Multiprocessor Programming 8585

public void enq(T x) { … if (mustWakeDequeuers) { deqLock.lock(); try { notEmptyCondition.signalAll(); } finally { deqLock.unlock(); } } }

Enq Method Part Deux

Lock and unlock deq lock

Art of Multiprocessor Programming 8686

public void enq(T x) { … if (mustWakeDequeuers) { deqLock.lock(); try { notEmptyCondition.signalAll(); } finally { deqLock.unlock(); } } }

Enq Method Part Deux

Signal dequeuers that queue is no longer empty

Art of Multiprocessor Programming 8787

The Enq() & Deq() Methods

• Share no locks– That’s good

• But do share an atomic counter– Accessed on every method call– That’s not so good

• Can we alleviate this bottleneck?

Art of Multiprocessor Programming 8888

Split the Counter

• The enq() method– Increments only– Cares only if value is capacity

• The deq() method– Decrements only– Cares only if value is zero

Art of Multiprocessor Programming 8989

Split Counter

• Enqueuer increments enqSize• Dequeuer decrements deqSize• When enqueuer runs out

– Locks deqLock– computes size = enqSize - DeqSize

• Intermittent synchronization– Not with each method call– Need both locks! (careful …)

Art of Multiprocessor Programming 9090

A Lock-Free Queue

Sentinel

head

tail

Art of Multiprocessor Programming 9191

Compare and Set

CAS

Art of Multiprocessor Programming 9292

Enqueue

head

tail

enq( )

Art of Multiprocessor Programming 9393

Enqueue

head

tail

Art of Multiprocessor Programming 9494

Logical Enqueue

head

tail

CAS

Art of Multiprocessor Programming 9595

Physical Enqueue

head

tailCAS

Art of Multiprocessor Programming 9696

Enqueue

• These two steps are not atomic

• The tail field refers to either– Actual last Node (good)– Penultimate Node (not so good)

• Be prepared!

Art of Multiprocessor Programming 9797

Enqueue

• What do you do if you find– A trailing tail?

• Stop and help fix it– If tail node has non-null next field– CAS the queue’s tail field to tail.next

• As in the universal construction

Art of Multiprocessor Programming 9898

When CASs Fail

• During logical enqueue– Abandon hope, restart– Still lock-free (why?)

• During physical enqueue– Ignore it (why?)

Art of Multiprocessor Programming 9999

Dequeuer

head

tail

Read value

Art of Multiprocessor Programming 100100

Dequeuer

head

tail

Make first Node new sentinel

CAS

Art of Multiprocessor Programming 101101

Memory Reuse?

• What do we do with nodes after we dequeue them?

• Java: let garbage collector deal?

• Suppose there is no GC, or we prefer not to use it?

Art of Multiprocessor Programming 102102

Dequeuer

head

tail

CAS

Can recycle

Art of Multiprocessor Programming 103103

Simple Solution

• Each thread has a free list of unused queue nodes

• Allocate node: pop from list

• Free node: push onto list

• Deal with underflow somehow …

Art of Multiprocessor Programming 104104

Why Recycling is Hard

Free pool

head tail

Want to redirect

head from gray to red

zzz…

Art of Multiprocessor Programming 105105

Both Nodes Reclaimed

Free pool

zzz

head tail

Art of Multiprocessor Programming 106106

One Node Recycled

Free pool

Yawn!

head tail

Art of Multiprocessor Programming 107107

Why Recycling is Hard

Free pool

CAShead tail

OK, here I go!

Art of Multiprocessor Programming 108108

Recycle FAIL

Free pool

zOMG what went wrong?

head tail

Art of Multiprocessor Programming 109109

The Dreaded ABA Problemhead tail

Head reference has value AThread reads value A

Head reference has value AThread reads value A

Art of Multiprocessor Programming 110110

Dreaded ABA continued

zzz

head tail

Head reference has value BNode A freed

Head reference has value BNode A freed

Art of Multiprocessor Programming 111111

Dreaded ABA continued

Yawn!

head tail

Head reference has value A again

Node A recycled and reinitialized

Head reference has value A again

Node A recycled and reinitialized

Art of Multiprocessor Programming 112112

Dreaded ABA continued

CAShead tail

CAS succeeds because references match,even though reference’s meaning has changed

CAS succeeds because references match,even though reference’s meaning has changed

Art of Multiprocessor Programming 113113

The Dreaded ABA FAIL

• Is a result of CAS() semantics– I blame Sun, Intel, AMD, …

• Not with Load-Locked/Store-Conditional– Good for IBM?

Art of Multiprocessor Programming 114114

Dreaded ABA – A Solution

• Tag each pointer with a counter

• Unique over lifetime of node

• Pointer size vs word size issues

• Overflow?– Don’t worry be happy?– Bounded tags?

• AtomicStampedReference class

Art of Multiprocessor Programming 115115

Atomic Stamped Reference

• AtomicStampedReference class– Java.util.concurrent.atomic package

address S

Stamp

Reference

Can get reference & stamp atomicallyCan get reference & stamp atomically

Art of Multiprocessor Programming 116116

Concurrent Stack

• Methods– push(x) – pop()

• Last-in, First-out (LIFO) order

• Lock-Free!

Art of Multiprocessor Programming 117117

Empty Stack

Top

Art of Multiprocessor Programming 118118

Push

Top

Art of Multiprocessor Programming 119119

Push

TopCAS

Art of Multiprocessor Programming 120120

Push

Top

Art of Multiprocessor Programming 121121

Push

Top

Art of Multiprocessor Programming 122122

Push

Top

Art of Multiprocessor Programming 123123

Push

TopCAS

Art of Multiprocessor Programming 124124

Push

Top

Art of Multiprocessor Programming 125125

Pop

Top

Art of Multiprocessor Programming 126126

Pop

TopCAS

Art of Multiprocessor Programming 127127

Pop

TopCAS

mine!

Art of Multiprocessor Programming 128128

Pop

TopCAS

Art of Multiprocessor Programming 129129

Pop

Top

Art of Multiprocessor Programming 130130

public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff(); }}

Lock-free Stack

Art of Multiprocessor Programming 131131

public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public Boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) }public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff()}}

Lock-free Stack

tryPush attempts to push a node

Art of Multiprocessor Programming 132132

public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff()}}

Lock-free Stack

Read top value

Art of Multiprocessor Programming 133133

public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff()}}

Lock-free Stack

current top will be new node’s successor

Art of Multiprocessor Programming 134134

public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff()}}

Lock-free Stack

Try to swing top, return success or failure

Art of Multiprocessor Programming 135135

public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff()}}

Lock-free Stack

Push calls tryPush

Art of Multiprocessor Programming 136136

public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff()}}

Lock-free Stack

Create new node

Art of Multiprocessor Programming 137137

public class LockFreeStack { private AtomicReference top = new AtomicReference(null); public boolean tryPush(Node node){ Node oldTop = top.get(); node.next = oldTop; return(top.compareAndSet(oldTop, node)) } public void push(T value) { Node node = new Node(value); while (true) { if (tryPush(node)) { return; } else backoff.backoff()}}

Lock-free Stack

If tryPush() fails,back off before retrying

Art of Multiprocessor Programming 138138

Lock-free Stack

• Good– No locking

• Bad– Without GC, fear ABA– Without backoff, huge contention at top – In any case, no parallelism

Art of Multiprocessor Programming 139139

Big Question

• Are stacks inherently sequential?

• Reasons why– Every pop() call fights for top item

• Reasons why not– Stay tuned …

Art of Multiprocessor Programming 140140

Elimination-Backoff Stack

• How to– “turn contention into parallelism”

• Replace familiar– exponential backoff

• With alternative– elimination-backoff

Art of Multiprocessor Programming 141141

Observation

Push( )

Pop()

linearizable stack

After an equal number of pushes and pops, stack stays the same

After an equal number of pushes and pops, stack stays the same

Yes!

Art of Multiprocessor Programming 142142

Idea: Elimination Array

Push( )

Pop()

stack

Pick at random

Pick at random

Elimination Array

Art of Multiprocessor Programming 143143

Push Collides With Pop

Push( )

Pop()

stack

continue

continue

No need to access stack

No need to access stack

Yes!

Art of Multiprocessor Programming 144144

No Collision

Push( )

Pop()

stack

If no collision, access stack

If no collision, access stack

If pushes collide or pops collide access stack

If pushes collide or pops collide access stack

Art of Multiprocessor Programming 145145

Elimination-Backoff Stack

• Lock-free stack + elimination array

• Access Lock-free stack, – If uncontended, apply operation – if contended, back off to elimination array

and attempt elimination

Art of Multiprocessor Programming 146146

Elimination-Backoff Stack

Push( )

Pop()TopCAS

If CAS fails, back off

Art of Multiprocessor Programming 147147

Dynamic Range and Delay

Push( )

Pick random range and max waiting time based on level of contention

encountered

Pick random range and max waiting time based on level of contention

encountered

Art of Multiprocessor Programming 148148

Linearizability

• Un-eliminated calls– linearized as before

• Eliminated calls:– linearize pop() immediately after matching

push()

• Combination is a linearizable stack

Art of Multiprocessor Programming 149149

Un-Eliminated Linearizability

push(v1)

timetime

linearizable

push(v1)

pop(v1)pop(v1)

Art of Multiprocessor Programming 150150

Eliminated Linearizability

pop(v2)push(v1)

push(v2)

timetime

push(v2)

pop(v2)push(v1)

pop(v1)

Collision Point

Red calls are eliminated

pop(v1)

linearizable

Art of Multiprocessor Programming 151151

Backoff Has Dual Effect

• Elimination introduces parallelism

• Backoff to array cuts contention on lock-free stack

• Elimination in array cuts down number of threads accessing lock-free stack

Art of Multiprocessor Programming 152152

public class EliminationArray { private static final int duration = ...; private static final int timeUnit = ...; Exchanger<T>[] exchanger; public EliminationArray(int capacity) { exchanger = new Exchanger[capacity]; for (int i = 0; i < capacity; i++) exchanger[i] = new Exchanger<T>(); … } …}

Elimination Array

Art of Multiprocessor Programming 153153

public class EliminationArray { private static final int duration = ...; private static final int timeUnit = ...; Exchanger<T>[] exchanger; public EliminationArray(int capacity) { exchanger = new Exchanger[capacity]; for (int i = 0; i < capacity; i++) exchanger[i] = new Exchanger<T>(); … } …}

Elimination Array

An array of ExchangersAn array of Exchangers

Art of Multiprocessor Programming 154154

public class Exchanger<T> { AtomicStampedReference<T> slot = new AtomicStampedReference<T>(null, 0);

Digression: A Lock-Free Exchanger

Art of Multiprocessor Programming 155155

public class Exchanger<T> { AtomicStampedReference<T> slot = new AtomicStampedReference<T>(null, 0);

A Lock-Free Exchanger

Atomically modifiable reference + status

Atomically modifiable reference + status

Art of Multiprocessor Programming 156156

Atomic Stamped Reference

• AtomicStampedReference class– Java.util.concurrent.atomic package

• In C or C++:

address Sreferencereference

stampstamp

Art of Multiprocessor Programming 157157

Extracting Reference & Stamp

public T get(int[] stampHolder);

Art of Multiprocessor Programming 158158

Extracting Reference & Stamp

public T get(int[] stampHolder);

Returns reference to object of type T

Returns reference to object of type T

Returns stamp at array index 0!

Returns stamp at array index 0!

Art of Multiprocessor Programming 159159

Exchanger Status

enum Status {EMPTY, WAITING, BUSY};

Art of Multiprocessor Programming 160160

Exchanger Status

enum Status {EMPTY, WAITING, BUSY};

Nothing yetNothing yet

enum Status {EMPTY, WAITING, BUSY};

Art of Multiprocessor Programming 161161

Exchange Status

Nothing yetNothing yet

One thread is waiting for rendez-vous

One thread is waiting for rendez-vous

Art of Multiprocessor Programming 162162

Exchange Status

enum Status {EMPTY, WAITING, BUSY};

Nothing yetNothing yet

One thread is waiting for rendez-vous

One thread is waiting for rendez-vous

Other threads busy with rendez-vous

Other threads busy with rendez-vous

Art of Multiprocessor Programming 163163

public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {EMPTY}; while (true) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch(stamp) { case EMPTY: … // slot is free case WAITING: … // someone waiting for me case BUSY: … // others exchanging } }

The Exchange

Art of Multiprocessor Programming 164164

public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {EMPTY}; while (true) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch(stamp) { case EMPTY: … // slot is free case WAITING: … // someone waiting for me case BUSY: … // others exchanging } }

The Exchange

Item and timeoutItem and timeout

Art of Multiprocessor Programming 165165

public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {EMPTY}; while (true) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch(stamp) { case EMPTY: … // slot is free case WAITING: … // someone waiting for me case BUSY: … // others exchanging } }

The Exchange

Array holds statusArray holds status

Art of Multiprocessor Programming 166166

public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {0}; while (true) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch(stamp) { case EMPTY: // slot is free case WAITING: // someone waiting for me case BUSY: // others exchanging } }}

The Exchange

Loop until timeoutLoop until timeout

Art of Multiprocessor Programming 167167

public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {0}; while (true) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch(stamp) { case EMPTY: // slot is free case WAITING: // someone waiting for me case BUSY: // others exchanging } }}

The Exchange

Get other’s item and statusGet other’s item and status

Art of Multiprocessor Programming 168168

public T Exchange(T myItem, long nanos) throws TimeoutException { long timeBound = System.nanoTime() + nanos; int[] stampHolder = {0}; while (true) { if (System.nanoTime() > timeBound) throw new TimeoutException(); T herItem = slot.get(stampHolder); int stamp = stampHolder[0]; switch(stamp) { case EMPTY: … // slot is free case WAITING: … // someone waiting for me case BUSY: … // others exchanging } }}

The Exchange

An Exchanger has three possible statesAn Exchanger has three possible states

Art of Multiprocessor Programming 169169

Lock-free Exchanger

Art of Multiprocessor Programming 170170

Lock-free Exchanger

CAS

Art of Multiprocessor Programming 171171

Lock-free Exchanger

Art of Multiprocessor Programming 172172

Lock-free Exchanger

In search of partner …

Art of Multiprocessor Programming 173173

Lock-free Exchanger

Slot

Still waiting …

Try to exchange item and set

status to BUSY

CAS

Art of Multiprocessor Programming 174174

Lock-free Exchanger

Slot

Partner showed up, take item and reset to EMPTY

item status

Art of Multiprocessor Programming 175175

EMPTY

Lock-free Exchanger

Slot

item status

Partner showed up, take item and reset to EMPTY

Art of Multiprocessor Programming 176176

case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; }} break;

Exchanger State EMPTY

Art of Multiprocessor Programming 177177

case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; }} break;

Exchanger State EMPTY

Try to insert myItem and change state to WAITING

Try to insert myItem and change state to WAITING

Art of Multiprocessor Programming 178178

case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; }} break;

Exchanger State EMPTY

Spin until either myItem is taken or timeout

Spin until either myItem is taken or timeout

Art of Multiprocessor Programming 179179

case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; }} break;

Exchanger State EMPTY

myItem was taken, so return herItem

that was put in its place

myItem was taken, so return herItem

that was put in its place

Art of Multiprocessor Programming 180180

case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; }} break;

Exchanger State EMPTY

Otherwise we ran out of time, try to reset status to EMPTY

and time out

Otherwise we ran out of time, try to reset status to EMPTY

and time out

Art of Multiprocessor Programming 181Art of Multiprocessor Programming© Herlihy-Shavit

2007

181

case EMPTY: // slot is free if (slot.compareAndSet(herItem, myItem, WAITING, BUSY)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.compareAndSet(myItem, null, WAITING, EMPTY)){throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; }} break;

Exchanger State EMPTY

If reset failed,someone showed up after all,

so take that item

If reset failed,someone showed up after all,

so take that item

Art of Multiprocessor Programming 182182

case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; }} break;

Exchanger State EMPTY

Clear slot and take that item Clear slot and take that item

Art of Multiprocessor Programming 183183

case EMPTY: // slot is free if (slot.CAS(herItem, myItem, EMPTY, WAITING)) { while (System.nanoTime() < timeBound){ herItem = slot.get(stampHolder); if (stampHolder[0] == BUSY) { slot.set(null, EMPTY); return herItem; }} if (slot.CAS(myItem, null, WAITING, EMPTY)){ throw new TimeoutException(); } else { herItem = slot.get(stampHolder); slot.set(null, EMPTY); return herItem; }} break;

Exchanger State EMPTY

If initial CAS failed,then someone else changed status

from EMPTY to WAITING,so retry from start

If initial CAS failed,then someone else changed status

from EMPTY to WAITING,so retry from start

Art of Multiprocessor Programming 184184

case WAITING: // someone waiting for me if (slot.CAS(herItem, myItem, WAITING, BUSY)) return herItem; break;case BUSY: // others in middle of exchanging break;default: // impossible break; } } }}

States WAITING and BUSY

Art of Multiprocessor Programming 185185

case WAITING: // someone waiting for me if (slot.CAS(herItem, myItem, WAITING, BUSY)) return herItem; break;case BUSY: // others in middle of exchanging break;default: // impossible break; } } }}

States WAITING and BUSY

someone is waiting to exchange,so try to CAS my item in

and change state to BUSY

someone is waiting to exchange,so try to CAS my item in

and change state to BUSY

Art of Multiprocessor Programming 186186

case WAITING: // someone waiting for me if (slot.CAS(herItem, myItem, WAITING, BUSY)) return herItem; break;case BUSY: // others in middle of exchanging break;default: // impossible break; } } }}

States WAITING and BUSY

If successful, return other’s item,otherwise someone else took it,

so try again from start

If successful, return other’s item,otherwise someone else took it,

so try again from start

Art of Multiprocessor Programming 187187

case WAITING: // someone waiting for me if (slot.CAS(herItem, myItem, WAITING, BUSY)) return herItem; break;case BUSY: // others in middle of exchanging break;default: // impossible break; } } }}

States WAITING and BUSY

If BUSY,other threads exchanging,

so start again

If BUSY,other threads exchanging,

so start again

Art of Multiprocessor Programming 188188

The Exchanger Slot

• Exchanger is lock-free

• Because the only way an exchange can fail is if others repeatedly succeeded or no-one showed up

• The slot we need does not require symmetric exchange

Art of Multiprocessor Programming 189189

public class EliminationArray {…public T visit(T value, int range) throws TimeoutException { int slot = random.nextInt(range); int nanodur = convertToNanos(duration, timeUnit)); return (exchanger[slot].exchange(value, nanodur) }}

Back to the Stack: the Elimination Array

Art of Multiprocessor Programming 190190

public class EliminationArray {…public T visit(T value, int range) throws TimeoutException { int slot = random.nextInt(range); int nanodur = convertToNanos(duration, timeUnit)); return (exchanger[slot].exchange(value, nanodur) }}

Elimination Array

visit the elimination arraywith fixed value and range

visit the elimination arraywith fixed value and range

Art of Multiprocessor Programming 191191

public class EliminationArray {…public T visit(T value, int range) throws TimeoutException { int slot = random.nextInt(range); int nanodur = convertToNanos(duration, timeUnit)); return (exchanger[slot].exchange(value, nanodur) }}

Elimination Array

Pick a random array entryPick a random array entry

Art of Multiprocessor Programming 192192

public class EliminationArray {…public T visit(T value, int range) throws TimeoutException { int slot = random.nextInt(range); int nanodur = convertToNanos(duration, timeUnit)); return (exchanger[slot].exchange(value, nanodur) }}

Elimination Array

Exchange value or time outExchange value or time out

Art of Multiprocessor Programming 193193

public void push(T value) {... while (true) { if (tryPush(node)) { return; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null) { return; }}

Elimination Stack Push

Art of Multiprocessor Programming 194194

public void push(T value) {... while (true) { if (tryPush(node)) { return; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null) { return; }}

Elimination Stack Push

First, try to pushFirst, try to push

Art of Multiprocessor Programming 195195

public void push(T value) {... while (true) { if (tryPush(node)) { return; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null) { return; }}

Elimination Stack Push

If I failed, backoff & try to eliminateIf I failed, backoff & try to eliminate

Art of Multiprocessor Programming 196196

public void push(T value) {... while (true) { if (tryPush(node)) { return; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null) { return; }}

Elimination Stack Push

Value pushed and range to tryValue pushed and range to try

Art of Multiprocessor Programming 197197

public void push(T value) {... while (true) { if (tryPush(node)) { return; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null) { return; }}

Elimination Stack Push

Only pop() leaves null,so elimination was successful

Only pop() leaves null,so elimination was successful

Art of Multiprocessor Programming 198198

public void push(T value) {... while (true) { if (tryPush(node)) { return; } else try { T otherValue = eliminationArray.visit(value,policy.range); if (otherValue == null) { return; }}

Elimination Stack Push

Otherwise, retry push() on lock-free stackOtherwise, retry push() on lock-free stack

Art of Multiprocessor Programming 199199

public T pop() { ... while (true) { if (tryPop()) { return returnNode.value; } else try { T otherValue = eliminationArray.visit(null,policy.range; if (otherValue != null) { return otherValue; } }}}

Elimination Stack Pop

Art of Multiprocessor Programming 200200

public T pop() { ... while (true) { if (tryPop()) { return returnNode.value; } else try { T otherValue = eliminationArray.visit(null,policy.range; if ( otherValue != null) { return otherValue; } }}}

Elimination Stack Pop

If value not null, other thread is a push(),so elimination succeeded

If value not null, other thread is a push(),so elimination succeeded

Art of Multiprocessor Programming 201201

Summary• We saw both lock-based and lock-free

implementations of • queues and stacks• Don’t be quick to declare a data

structure inherently sequential– Linearizable stack is not inherently

sequential (though it is in worst case)

• ABA is a real problem, pay attention

Art of Multiprocessor Programming 202202

         This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

• You are free:– to Share — to copy, distribute and transmit the work – to Remix — to adapt the work

• Under the following conditions:– Attribution. You must attribute the work to “The Art of

Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work).

– Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.

• For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to– http://creativecommons.org/licenses/by-sa/3.0/.

• Any of the above conditions can be waived if you get permission from the copyright holder.

• Nothing in this license impairs or restricts the author's moral rights.

Art of Multiprocessor Programming 203

top related