non-blocking data structures and ......art of multiprocessor programming 59 read-only transactions...

93
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 27 November 2015

Upload: others

Post on 30-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

NON-BLOCKING DATA STRUCTURES

AND TRANSACTIONAL MEMORY

Tim Harris, 27 November 2015

Page 2: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Lecture 8

Problems with locks

Atomic blocks and composition

Hardware transactional memory

Software transactional memory

Page 3: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Transactional Memory

Companion slides for

The Art of Multiprocessor Programming

by Maurice Herlihy & Nir Shavit

Page 4: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

4

Our Vision for the Future

In this course, we covered ….

Best practices …

New and clever ideas …

And common-sense observations.

Art of Multiprocessor

Programming

Page 5: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

5

Our Vision for the Future

In this course, we covered ….

Best practices …

New and clever ideas …

And common-sense observations.

Nevertheless …

Concurrent programming is still too hard …

Here we explore why this is ….

And what we can do about it.

Art of Multiprocessor

Programming

Page 6: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

6

Locking

Art of Multiprocessor

Programming

Page 7: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

7

Coarse-Grained Locking

Easily made correct …

But not scalable.

Art of Multiprocessor

Programming

Page 8: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

8

Fine-Grained Locking

Can be tricky …

Art of Multiprocessor

Programming

Page 9: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

9

Locks are not Robust

If a thread holding

a lock is delayed …

No one else can

make progress Art of Multiprocessor

Programming

Page 10: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Locking Relies on Conventions

• Relation between

– Locks and objects

– Exists only in programmer’s mind

/*

* When a locked buffer is visible to the I/O layer

* BH_Launder is set. This means before unlocking

* we must clear BH_Launder, mb() on alpha and then

* clear BH_Lock, so no reader can see BH_Launder set

* on an unlocked buffer and then risk to deadlock.

*/

Actual comment

from Linux Kernel (hat tip: Bradley Kuszmaul)

Art of Multiprocessor

Programming

Page 11: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

11

Simple Problems are hard

enq(x) enq(y) double-ended queue

No interference if

ends “far apart” Interference OK if

queue is small Clean solution is

publishable result: [Michael & Scott PODC 97]

Art of Multiprocessor

Programming

Page 12: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

12

Locks Not Composable

Transfer item from one

queue to another Must be atomic :

No duplicate or missing items

Page 13: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

13

Locks Not Composable

Lock source

Lock target

Unlock source

& target

Page 14: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

14

Locks Not Composable

Lock source

Lock target

Unlock source &

target

Methods cannot provide

internal synchronization Objects must expose

locking protocols to clients

Clients must devise and

follow protocols Abstraction broken!

Page 15: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

15

Monitor Wait and Signal

zzz

Empty

buffer Yes!

Art of Multiprocessor

Programming

If buffer is empty,

wait for item to show up

Page 16: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

16

Wait and Signal do not Compose

empty

empty zzz…

Art of Multiprocessor

Programming

Wait for either?

Page 17: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

17 17

The Transactional Manifesto

• Current practice inadequate

– to meet the multicore challenge

• Research Agenda

– Replace locking with a transactional API

– Design languages or libraries

– Implement efficient run-time systems

Page 18: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

18 18

Transactions

Block of code ….

Atomic: appears to happen

instantaneously

Serializable: all appear to

happen in one-at-a-time

order Commit: takes effect

(atomically)

Abort: has no effect

(typically restarted)

Page 19: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

19 19

atomic {

x.remove(3);

y.add(3);

}

atomic {

y = null;

}

Atomic Blocks

Page 20: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

20 20

atomic {

x.remove(3);

y.add(3);

}

atomic {

y = null;

}

Atomic Blocks

No data race

Page 21: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

21 21

public void LeftEnq(item x) {

Qnode q = new Qnode(x);

q.left = left;

left.right = q;

left = q;

}

A Double-Ended Queue

Write sequential Code

Page 22: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

22 22

public void LeftEnq(item x)

atomic {

Qnode q = new Qnode(x);

q.left = left;

left.right = q;

left = q;

}

}

A Double-Ended Queue

Page 23: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

23 23

public void LeftEnq(item x) {

atomic {

Qnode q = new Qnode(x);

q.left = left;

left.right = q;

left = q;

}

}

A Double-Ended Queue

Enclose in atomic block

Page 24: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

24 24

Warning

• Not always this simple

– Conditional waits

– Enhanced concurrency

– Complex patterns

• But often it is…

Page 25: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

25

Composition?

Page 26: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

26

Composition?

public void Transfer(Queue<T> q1, q2)

{

atomic {

T x = q1.deq();

q2.enq(x);

}

}

Trivial or what?

Page 27: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

27 27

public T LeftDeq() {

atomic {

if (left == null)

retry;

}

}

Conditional Waiting

Roll back transaction

and restart when

something changes

Page 28: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

28 28

Composable Conditional Waiting

atomic {

x = q1.deq();

} orElse {

x = q2.deq();

}

Run 1st method. If it retries … Run 2nd method. If it retries …

Entire statement retries

Page 29: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

29 29

Hardware Transactional

Memory • Exploit Cache coherence

• Already almost does it

– Invalidation

– Consistency checking

• Speculative execution

– Branch prediction =

optimistic synch!

Page 30: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

30 30

HW Transactional Memory

Interconnect

caches

memory

read active

T

Page 31: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

31 31

Transactional Memory

read

active T T

active

caches

memory

Page 32: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

32 32

Transactional Memory

active T T

active committed

caches

memory

Page 33: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

33 33

Transactional Memory write

active

committed

T D caches

memory

Page 34: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

34 34

Rewind

active T T

active write aborted

D caches

memory

Page 35: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

35 35

Transaction Commit

• At commit point

– If no cache conflicts, we win.

• Mark transactional entries

– Read-only: valid

– Modified: dirty (eventually written back)

• That’s all, folks!

– Except for a few details …

Page 36: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

36 36

Not all Skittles and Beer

• Limits to

– Transactional cache size

– Scheduling quantum

• Transaction cannot commit if it is

– Too big

– Too slow

– Actual limits platform-dependent

Page 37: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

HTM Strengths &

Weaknesses

• Ideal for lock-free data structures

Page 38: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

HTM Strengths &

Weaknesses

• Ideal for lock-free data structures

• Practical proposals have limits on

– Transaction size and length

– Bounded HW resources

– Guarantees vs best-effort

Page 39: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

HTM Strengths &

Weaknesses • Ideal for lock-free data structures

• Practical proposals have limits on

– Transaction size and length

– Bounded HW resources

– Guarantees vs best-effort

• On fail

– Diagnostics essential

– Try again in software?

Page 40: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Composition

Locks don’t compose, transactions do.

Composition necessary for Software Engineering.

But practical HTM doesn’t really support

composition! Why we need STM

Page 41: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Transactional Consistency

• Memory Transactions are collections of

reads and writes executed atomically

• They should maintain consistency

– External: with respect to the interleavings

of other transactions (linearizability).

– Internal: the transaction itself should

operate on a consistent state.

Page 42: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

External Consistency

Application

Memory

X

Y

4

2

8

4

Invariant x = 2y

Transaction A:

Write x

Write y

Transaction B:

Read x

Read y

Compute z = 1/(x-y) = 1/2

Page 43: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

43

A Simple Lock-Based STM

• STMs come in different forms

– Lock-based

– Lock-free

• Here : a simple lock-based STM

• Lets start by Guaranteeing External

Consistency

Page 44: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

44

Synchronization

• Transaction keeps

– Read set: locations & values read

– Write set: locations & values to be written

• Deferred update

– Changes installed at commit

• Lazy conflict detection

– Conflicts detected at commit

Page 45: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

45 45

STM: Transactional Locking

Map

Application

Memory

V#

V#

V#

Array of

version #s &

locks

Page 46: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

46 46

Reading an Object Mem

Locks

V#

V#

V#

V#

V#

Add version numbers

& values to read set

Page 47: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

47 47

To Write an Object Mem

Locks

V#

V#

V#

V#

V#

Add version numbers &

new values to write set

Page 48: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

48 48

To Commit Mem

Locks

V#

V#

V#

V#

V#

X

Y

V#+1

V#+1

Acquire write locks

Check version numbers

unchanged Install new values

Increment version numbers

Unlock.

Page 49: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Encounter Order Locking (Undo Log)

1. To Read: load lock + location

2. Check unlocked add to Read-Set

3. To Write: lock location, store value

4. Add old value to undo-set

5. Validate read-set v#’s unchanged

6. Release each lock with v#+1

V# 0 V# 0

V# 0

V# 0

V# 0

V# 0

V# 0

X V# 1

V# 0 Y V# 1

V# 0 V# 0

Mem Locks

V#+1 0

V#+1 0

V# 0

V# 0

V# 0

V#+1 0

V# 0

V# 0

V# 0

V# 0

V#+1 0

V# 0

X

Y

Quick read of values freshly

written by the reading transaction

Page 50: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Commit Time Locking (Write Buff)

1. To Read: load lock + location

2. Location in write-set? (Bloom Filter)

3. Check unlocked add to Read-Set

4. To Write: add value to write set

5. Acquire Locks

6. Validate read/write v#’s unchanged

7. Release each lock with v#+1

V# 0 V# 0

V# 0

V# 0

V# 0

V# 0

V# 0

V# 0

V# 0 V# 0

V# 0 V# 0

Mem Locks

V#+1 0

V# 0

V# 0

Hold locks for very short duration

V# 1

V# 1

V# 1 X

Y

V#+1 0

V# 1 V#+1 0

V# 0

V#+1 0

V# 0

V# 0

V# 0

V# 0

V#+1 0

V# 0

X

Y

Page 51: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

COM vs. ENC High Load

ENC

Hand

MCS

COM

Red-Black Tree 20% Delete 20% Update 60% Lookup

Page 52: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

COM vs. ENC Low Load

COM ENC

Hand

MCS

Red-Black Tree 5% Delete 5% Update 90% Lookup

Page 53: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Problem: Internal

Inconsistency

• A Zombie is an active transaction destined to

abort.

• If Zombies see inconsistent states bad things

can happen

Page 54: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

54

Internal Consistency

x

y

4

2

8

4

Invariant: x = 2y

Transaction A: reads x = 4

Transaction B: writes

8 to x, 16 to y, aborts A )

Transaction A: (zombie)

reads y = 4

computes 1/(x-y)

Divide by zero FAIL!

Page 55: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

55

Solution: The Global Clock

(The TL2 Algorithm)

• Have one shared global clock

• Incremented by (small subset of) writing

transactions

• Read by all transactions

• Used to validate that state worked on is

always consistent

Page 56: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

100

Art of Multiprocessor

Programming

56 56

Read-Only Transactions Mem

Locks

12

32

56

19

17

100

Shared Version

Clock

Private Read

Version (RV)

Copy version clock to local

read version clock

Page 57: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

100

Art of Multiprocessor

Programming

57 57

Read-Only Transactions Mem

Locks

12

32

56

19

17

100

Shared Version

Clock

Private Read

Version (RV)

Copy version clock to local

read version clock Read lock, version #, and

memory

Page 58: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

58 58

Read-Only Transactions Mem

Locks

12

32

56

19

17

100

Shared Version

Clock

Private Read

Version (RV)

Copy version clock to local

read version clock Read lock, version #, and

memory, check version # less

than read clock

100

On Commit:

check unlocked &

version #s less than

local read clock

Page 59: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

59 59

Read-Only Transactions Mem

Locks

12

32

56

19

17

100

Shared Version

Clock

Private Read

Version (RV)

Copy version clock to local

read version clock Read lock, version #, and

memory On Commit:

check unlocked &

version #s less than

local read clock 100

We have taken a snapshot without

keeping an explicit read set!

Page 60: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Example Execution: Read Only

Trans

1. RV Shared Version Clock

2. On Read: read lock, read mem,

read lock: check unlocked,

unchanged, and v# <= RV

3. Commit.

87 0 87 0

34 0

88 0

V# 0

44 0

V# 0

34 0

99 0 99 0

50 0 50 0

Mem Locks

Reads form a snapshot of memory.

No read set!

100 Shared Version Clock

87 0

34 0

99 0

50 0

87 0

34 0

88 0

V# 0

44 0

V# 0

99 0

50 0

100 RV

Page 61: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

100

Art of Multiprocessor

Programming

61 61

Ordinary (Writing) Transactions Mem

Locks

12

32

56

19

17

100

Shared Version

Clock

Private Read

Version (RV)

Copy version clock to local

read version clock

Page 62: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

100

Art of Multiprocessor

Programming

62 62

Ordinary Transactions Mem

Locks

12

32

56

19

17

100

Shared Version

Clock

Private Read

Version (RV)

Copy version clock to local

read version clock On read/write, check:

Unlocked & version # < RV

Add to R/W set

Page 63: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

63 63

On Commit Mem

Locks

100

Shared Version

Clock

100

12

32

56

19

17 Private Read

Version (RV)

Acquire write locks

Page 64: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

64 64

On Commit Mem

Locks

100

Shared Version

Clock

100 101

12

32

56

19

17 Private Read

Version (RV)

Acquire write locks

Increment Version Clock

Page 65: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

65 65

On Commit Mem

Locks

100

Shared Version

Clock

100 101

12

32

56

19

17 Private Read

Version (RV)

Acquire write locks

Increment Version Clock

Check version numbers ≤ RV

Page 66: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

66 66

On Commit Mem

Locks

100

Shared Version

Clock

100 101

12

32

56

19

17 Private Read

Version (RV)

Acquire write locks

Increment Version Clock

Check version numbers ≤ RV Update memory x

y

Page 67: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

67 67

On Commit Mem

Locks

100

Shared Version

Clock

100 101

12

32

56

19

17 Private Read

Version (RV)

Acquire write locks

Increment Version Clock

Check version numbers ≤ RV Update memory

Update write version #s

x

y

101

101

Page 68: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Example: Writing Trans

1. RV Shared Version Clock

2. On Read/Write: check

unlocked and v# <= RV then

add to Read/Write-Set

3. Acquire Locks

4. WV = F&I(VClock)

5. Validate each v# <= RV

6. Release locks with v# WV

Reads+Inc+Writes

=serializable

100 Shared Version Clock

87 0 87 0

34 0

88 0

44 0

V# 0

34 0

99 0 99 0

50 0 50 0

Mem Locks

87 0

34 0

99 0

50 0

34 1

99 1

87 0

X

Y

Commit

121 0

121 0

50 0

87 0

121 0

88 0

V# 0

44 0

V# 0

121 0

50 0

100 RV

100 120 121

X

Y

Page 69: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

69

TM Design Issues

• Implementation

choices

• Language design

issues

• Semantic issues

Page 70: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

70

Granularity

• Object

– managed languages, Java, C#, …

– Easy to control interactions between

transactional & non-trans threads

• Word

– C, C++, …

– Hard to control interactions between

transactional & non-trans threads

Page 71: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

71

Direct/Deferred Update

• Deferred

– modify private copies & install on commit

– Commit requires work

– Consistency easier

• Direct

– Modify in place, roll back on abort

– Makes commit efficient

– Consistency harder

Page 72: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

72

Conflict Detection

• Eager

– Detect before conflict arises

– “Contention manager” module resolves

• Lazy

– Detect on commit/abort

• Mixed

– Eager write/write, lazy read/write …

Page 73: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

73

Conflict Detection

• Eager detection may abort transactions

that could have committed.

• Lazy detection discards more

computation.

Page 74: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

74

Contention Management &

Scheduling

• How to resolve

conflicts?

• Who moves forward

and who rolls back?

• Lots of empirical

work but formal

work in infancy

Page 75: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

75

Contention Manager Strategies

• Exponential backoff

• Priority to

– Oldest?

– Most work?

– Non-waiting?

• None Dominates

• But needed anyway Judgment of Solomon

Page 76: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

76

I/O & System Calls?

• Some I/O revocable

– Provide transaction-

safe libraries

– Undoable file

system/DB calls

• Some not

– Opening cash

drawer

– Firing missile

Page 77: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

77

I/O & System Calls

• One solution: make transaction irrevocable

– If transaction tries I/O, switch to irrevocable mode.

• There can be only one …

– Requires serial execution

• No explicit aborts

– In irrevocable transactions

Page 78: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

78

Exceptions

int i = 0;

try {

atomic {

i++;

node = new Node();

}

} catch (Exception e) {

print(i);

}

Page 79: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

79

Exceptions

int i = 0;

try {

atomic {

i++;

node = new Node();

}

} catch (Exception e) {

print(i);

}

Throws OutOfMemoryException!

Page 80: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

80

Exceptions

int i = 0;

try {

atomic {

i++;

node = new Node();

}

} catch (Exception e) {

print(i);

}

Throws OutOfMemoryException!

What is

printed?

Page 81: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

81

Unhandled Exceptions

• Aborts transaction

– Preserves invariants

– Safer

• Commits transaction

– Like locking semantics

– What if exception object refers to values

modified in transaction?

Page 82: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

82

Nested Transactions

atomic void foo() {

bar();

}

atomic void bar() {

}

Page 83: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

83

Nested Transactions

• Needed for modularity

– Who knew that cosine() contained a transaction?

• Flat nesting

– If child aborts, so does parent

• First-class nesting

– If child aborts, partial rollback of child only

Page 84: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Hatin’ on TM

STM is too inefficient

Page 85: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Hatin’ on TM

Requires radical change in programming style

Page 86: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Hatin’ on TM

Erlang-style shared nothing only true path to salvation

Page 87: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Hatin’ on TM

There is nothing wrong with what we do today.

Page 88: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Gartner Hype Cycle

Hat tip: Jeremy Kemp

You are here

Page 89: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

89

• Multicore forces us to rethink

almost everything

I, for one, Welcome our new

Multicore Overlords …

Page 90: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

90

• Multicore forces us to rethink

almost everything

• Standard approaches too

complex

I, for one, Welcome our new

Multicore Overlords …

Page 91: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

91

• Multicore forces us to rethink

almost everything

• Standard approaches won’t

scale

• Transactions might make life

simpler…

I, for one, Welcome our new

Multicore Overlords …

Page 92: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Art of Multiprocessor

Programming

92

• Multicore forces us to rethink almost

everything

• Standard approaches won’t scale

• Transactions might …

• Multicore programming

Plenty more to do…

Maybe you will save us…

I, for one, Welcome our new

Multicore Overlords …

Page 93: NON-BLOCKING DATA STRUCTURES AND ......Art of Multiprocessor Programming 59 Read-Only Transactions Mem Locks 12 32 56 19 17 100 Shared Version Clock Private Read Version (RV) Copy

Thanks ! תודה

Art of Multiprocessor

Programming

93