concurrency - github pagesthreads, shared memory, & synchronization howdolockswork? data races...

19
7/6/2015 1 1 Motivation & examples Threads, shared memory, & synchronization How do locks work? Data races (a lowerlevel property) How do data race detectors work? Atomicity (a higherlevel property) Concurrency exceptions & Summary Extra: Doublechecked locking 2 3 “From my perspective, parallelism is the biggest challenge since highlevel programming languages. It’s the biggest thing in 50 years because industry is betting its future that parallel programming will be useful. “Industry is building parallel hardware, assuming people can use it.  And I think there's a chance they'll fail since the software is not necessarily in place.  So this is a gigantic challenge facing the computer science community.  If we miss this opportunity, it's going to be bad for the industry.” —David Patterson, ACM Queue interview, 2006 4 Imperative programs Java, C#, C, C++, Python, Ruby Threads Shared, mutable state Synchronization primitives: Lock acquire & release Monitor wait & notify Thread start & join 5 T1: t = x; t = t + 1; x = t; T2: t = x; t = t + 1; x = t; int x = 1; 6

Upload: others

Post on 09-Mar-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

1

1

Motivation & examples Threads, shared memory, & synchronization

How do locks work?

Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level property) Concurrency exceptions & SummaryExtra: Double‐checked locking

2

3

“From my perspective, parallelism is the biggest challenge since high‐level programming languages. It’s the biggest thing in 50 years because industry is betting its future that parallel programming will be useful.

“Industry is building parallel hardware, assuming people can use it.  And I think there's a chance they'll fail since the software is not necessarily in place.  So this is a gigantic challenge facing the computer science community.  If we miss this opportunity, it's going to be bad for the industry.”

—David Patterson, ACM Queue interview, 2006

4

Imperative programs

Java, C#, C, C++, Python, Ruby

Threads

Shared, mutable state

Synchronization primitives:

▪ Lock acquire & release

▪ Monitor wait & notify

▪ Thread start & join

5

T1:

t = x;t = t + 1;x = t;

T2:

t = x;t = t + 1;x = t;

int x = 1;

6

Page 2: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

2

T1:

synchronized (m) {t = x;t = t + 1;x = t;

}

T2:

synchronized (m) {t = x;t = t + 1;x = t;

}

int x = 1;

7

T1:

t = x;t = t + 1;x = t;

T2:

t = x;t = t * 2;x = t;

int x = 1;

8

T1:

synchronized (m) {t = x;t = t + 1;x = t;

}

T2:

synchronized (m) {t = x;t = t * 2;x = t;

}

int x = 1;

9

T1:

data = 42;flag = true;

T2:

if (flag) {

int t = data;

print(t);

}

Initially:

int data = 0;

boolean flag = false;

10

Motivation & examples Threads, shared memory, & synchronization

How do locks work?

Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level property) Concurrency exceptions & SummaryExtra: Double‐checked locking

11

Each thread: has its own stack shares memory with other threads in same process (same virtual address space)

Compiler compiles code as though it were single‐threaded!

Synchronization operations (e.g., lock acquire and release) order accesses to shared memory, providing: mutual exclusion ordering and visibility

12

Page 3: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

3

T1:

synchronized (m) {t = x;t = t + 1;x = t;

}

T2:

synchronized (m) {t = x;t = t * 2;x = t;

}

int x = 1;

13

T1:

acquire(m);t = x;t = t + 1;x = t;

release(m);

T2:

acquire(m);t = x;t = t * 2;x = t;

release(m);

int x = 1;

14

T1:

acquire(m.lockBit);t = x;t = t + 1;x = t;

release(m.lockBit);

T2:

acquire(m.lockBit);t = x;t = t * 2;x = t;

release(m.lockBit);

int x = 1;

15

Locking (& other) bits

m

T1:

while (m.lockBit != 0) {}m.lockBit = 1;t = x;t = t + 1;x = t;

m.lockBit = 0;

T2:

while (m.lockBit != 0) {}m.lockBit = 1;t = x;t = t * 2;x = t;

m.lockBit = 0;

int x = 1;

16

Possible implementation of locks?

T1:

while (!TAS(&m.lockBit,0,1)) {}t = x;t = t + 1;x = t;

m.lockBit = 0;

T2:

while (!TAS(&m.lockBit,0,1)) {}t = x;t = t * 2;x = t;

m.lockBit = 0;

int x = 1;

17

Need an atomic operation like test‐and‐set (TAS)

T1:

while (!TAS(&m.lockBit,0,1)) {}t = x;t = t + 1;x = t;

memory_fence;m.lockBit = 0;

T2:

while (!TAS(&m.lockBit,0,1)) {}t = x;t = t * 2;x = t;

memory_fence;m.lockBit = 0;

int x = 1;

18

• Fence needed for visibility (related to happens‐before relationship, discussed later)

• Also: compiler obeys “roach motel” rules: can move operations into but not out of atomic blocks

Page 4: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

4

T1:

while (!TAS(&m.lockBit,0,1)) {}t = x;t = t + 1;x = t;

memory_fence;m.lockBit = 0;

T2:

while (!TAS(&m.lockBit,0,1)) {}t = x;t = t * 2;x = t;

memory_fence;m.lockBit = 0;

int x = 1;

19

• Java locks are reentrant, so more than one bit is actually used (to keep track of nesting depth)

• Also, spin (non‐blocking) locks are converted to blocking locks if there’s contention

Motivation & examples Threads, shared memory, & synchronization

How do locks work?

Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level property) Concurrency exceptions & SummaryExtra: Double‐checked locking

20

50 million people 

Energy Management System Alarm and Event Processing Routine (1 MLOC)

http://www.securityfocus.com/news/8412

Energy Management System Alarm and Event Processing Routine (1 MLOC)

Post‐mortem analysis: 8 weeks"This fault was so deeply embedded, it took them weeks of poring through millions of lines of code and data to find it.”  –Ralph DiNicola, FirstEnergy

http://www.securityfocus.com/news/8412

Page 5: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

5

Race condition

Two threads writing to data structure simultaneously

Usually occurs without error 

Small window for causing data corruption

http://www.securityfocus.com/news/8412

Two accesses to same variable At least one is a write

Not well‐synchronized

(not ordered by happens‐before relationship)

Or: accesses can happen simultaneously

26

Modern language memory models (via compiler+hardware) guarantee the following relationship:

Data race freedom  Sequential consistency

However: Data race Weak or undefined semantics!

Sequential consistency: instructions appear to execute in an order that respect’s program order

27

T1:

data = 42;flag = true;

T2:

if (flag)

t = data;

Initially:

int data = 0;

boolean flag = false;

28

T1:

data = 42;flag = true;

T2:

if (flag)

t = data;

Initially:

int data = 0;

boolean flag = false;

29

T1:

flag = true;

data = 42;

T2:

if (flag)

t = data;

Initially:

int data = 0;

boolean flag = false;

30

Page 6: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

6

T1:

data = 42;flag = true;

T2:

t2 = data;

if (flag)

t = t2;

Initially:

int data = 0;

boolean flag = false;

31

T1:

data = ...;synchronized (m) {flag = true;

}

T2:

boolean f;

synchronized (m) {

f = flag;

}

if (f)

... = data;

Initially:

int data = 0;

boolean flag = false;

32

T1:

data = ...;acquire(m);flag = true;

release(m);

T2:

boolean f;

acquire(m);

f = flag;

release(m);

if (f)

... = data;

Initially:

int data = 0;

boolean flag = false;

33

T1:

data = ...;flag = true;

T2:

if (flag)

... = data;

Initially:

int data = 0;

volatile boolean flag = false;

34

T1:

data = 42;flag = true;

T2:

while (!flag) { }

print(data);

Initially:

int data = 0;

boolean flag = false;

35

T1:

data = 42;flag = true;

T2:

boolean f;

do {

f = flag;

} while (!f);

int d = data;

print(d);

Initially:

int data = 0;

boolean flag = false;

36

Page 7: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

7

T1:

flag = true;

data = 42;

T2:

boolean f;

do {

f = flag;

} while (!f);

int d = data;

print(d);

Initially:

int data = 0;

boolean flag = false;

37

T1:

data = 42;flag = true;

T2:

int d = data;

boolean f;

do {

f = flag;

} while (!f);

print(d);

Initially:

int data = 0;

boolean flag = false;

38

Motivation & examples Threads, shared memory, & synchronization

How do locks work?

Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level property) Concurrency exceptions & SummaryExtra: Double‐checked locking

39

Two accesses to same variable (one is a write) 

One access doesn’t happen before the other Program order

Synchronization order▪ Acquire‐release▪ Wait‐notify

▪ Fork‐join▪ Volatile read‐write

Thread A

write  x

unlock m

Thread B Two accesses to same variable (one is a write) 

One access doesn’t happen before the other Program order

Synchronization order▪ Acquire‐release▪ Wait‐notify

▪ Fork‐join▪ Volatile read‐write

Thread A

write  x

unlock m

Thread B

lock m

write x

Two accesses to same variable (one is a write) 

One access doesn’t happen before the other Program order

Synchronization order▪ Acquire‐release▪ Wait‐notify

▪ Fork‐join▪ Volatile read‐write

Page 8: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

8

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

Two accesses to same variable (one is a write) 

One access doesn’t happen before the other Program order

Synchronization order▪ Acquire‐release▪ Wait‐notify

▪ Fork‐join▪ Volatile read‐write

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

Two accesses to same variable (one is a write) 

One access doesn’t happen before the other

Program order

Synchronization order

▪ Acquire‐release

▪ Wait‐notify

▪ Fork‐join

▪ Volatile read‐write

How do dynamic data race detectors work?

Vector clocks: check happens‐before Lockset: assume and check locking discipline Other strategies

Each thread T maintains its own logical clock ‘c’

Initially, c = 0 when T starts

Incremented at synchronization release operations

▪ unlock m, volatile write, Thread.join()

Vector clock is a vector of logical clocks

5 2A B

)( iff tV (t)VtVV 2121

5 6 7A B C

1 2 3A B C

4 5 3A B C

4 5 3A B C

true

true

4 5 3A B C

4 2 3A B C

false

Thread A Thread B

5 2 3 4A B A B

Vector clocks

Page 9: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

9

Thread A Thread B

5 2 3 4A B A B

Vector clocks

Thread A’s logical time Thread B’s logical time

Thread A Thread B

5 2 3 4A B A B

Vector clocks

Last logical time “received” from B

Last logical time “received” from A

5 2 3 4A B A B

Thread A

write x

Thread B

5 2

5 2 3 4A B A B

Thread A

write x

unlock m

Thread B

5 2

6 2Increment clock

5 2

5 2 3 4A B A B

Thread A

write x

unlock m

Thread B

lock m

5 2

6 2Increment clock

5 2 3 4A B A B

Thread A

write x

unlock m

Thread B

lock m

5 2

6 2Increment clock

Joinclocks5 4

5 2

Page 10: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

10

5 2 3 4A B A B

Thread A

write x

unlock m

Thread B

lock m

5 2

6 2

5 4

5 2

n = # of threads

O(n) time

5 2 3 4A B A B

Thread A

write x

unlock m

Thread B

lock m

write x

5 2

6 2

5 4

5 2

5 4

5 2 3 4A B A B

Thread A

write x

unlock m

read x

Thread B

lock m

write x

5 2

6 2

5 4

5 2

5 4

6 2 Happens before?

5 2 3 4A B A B

Thread A

write x

unlock m

read x

Thread B

lock m

write x

5 2

6 2

5 4

5 2

5 4

6 2

Soundness

No false negatives, i.e., all data races in thatexecution will be reported.

Precision

No false positives, i.e., all reports are true data races.

These are not standard terms across all domains, e.g., architects mightrefer to these properties as complete and sound.

Tracks happens‐before: sound & precise

80X slowdown

Each analysis step: O(n) time                 (n = # of threads)

FastTrack [Flanagan & Freund ’09]

Reads & writes (97%): O(1) time

Synchronization (3%): O(n) time

8X slowdown

Page 11: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

11

Tracks happens‐before: sound & precise

80X slowdown

Each analysis step: O(n) time                 (n = # of threads)

FastTrack [Flanagan & Freund ’09]

Reads & writes (97%): O(1) time

Synchronization (3%): O(n) time

8X slowdown

In a data‐race‐free (DRF) program, writes are totally ordered

Can maintain only the “last” writer

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

5 2 3 4A B A B

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

5 2 3 4A B A B

5@A

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

5 2 3 4A B A B

5@A

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

5 2 3 4A B A B

6 2

5@A

5 2

Page 12: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

12

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

5 2 3 4A B A B

6 2

5@A

5 2

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

5 2 3 4A B A B

6 2

5 4

5@A

5 2

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

5 2 3 4A B A B

5 4

5@A

6 2Happens before?5 2

5@A

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

5 2 3 4A B A B

5 4

6 2

4@B

5 2

5@A

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

5 2 3 4A B A B

5 4

6 2

Happens before?

4@B

5 2

5@A

Thread A

write  x

unlock m

read x

Thread B

lock m

write x

5 2 3 4A B A B

5 4

6 2

Happens before?

4@B

5 2

Page 13: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

13

Keeps track of the locks associated with each thread and program variable

Keeps track of the locks associated with each thread and program variable

Thread A

lock mwrite  xlock nwrite yunlock nunlock m

read x

LocksetA

L = { } 

L = {m}

L = {m, n}

L = {m}

L = { }

Two accesses from different threads with non‐intersecting locksets form a data race

Can detect more data races than happens‐before tracking

A property referred to as coverage

Thread A

y = y + 1

lock mv = v + 1unlock m

Thread B

lock mv = v + 1unlock m

y = y + 1

Thread A

y = y + 1

lock mv = v + 1unlock m

Thread B

lock mv = v + 1unlock m

y = y + 1

No data race with happens‐before

Thread A

y = y + 1lock mv = v + 1unlock m

Thread B

lock mv = v + 1unlock my = y + 1

Page 14: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

14

Thread A

y = y + 1lock mv = v + 1unlock m

Thread B

lock mv = v + 1unlock my = y + 1

Data race with happens‐before

Data race with lockset

Thread A

y = y + 1

lock mv = v + 1unlock m

Thread B

lock mv = v + 1unlock m

y = y + 1

No data race with happens‐before

Thread A

y = y + 1

lock mv = v + 1unlock m

Thread B

lock mv = v + 1unlock m

y = y + 1

Data race with lockset

Thread A

y = y + 1

lock mv = v + 1unlock m

Thread B

lock mv = v + 1unlock m

y = y + 1

Lockset

Ly = { }

Lv = {m}

Lv = {m}

Ly = { }

Thread A

x = new Object();flag = true;

Thread B

while (!flag);x.print();

volatile boolean flag = false;

Thread A

x = new Object();flag = true;

Thread B

while (!flag);

while (!flag);x.print();

volatile boolean flag = false;

B is blocked

Page 15: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

15

Lockset algorithms

Generally unsound and imprecise

Better coverage

Vector clock algorithms

Sound and precise

Limited coverage

Motivation & examples Threads, shared memory, & synchronization

How do locks work?

Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level property) Concurrency exceptions & SummaryExtra: Double‐checked locking

86

Operations appear to happen all at once or not at all

Serializability – execution equivalent to some serial execution of atomic blocks

87

T1:

t = x;t = t + 1;x = t;

T2:

t = x;t = t * 2;x = t;

int x = 1;

88

T1:

t = x;

t = t + 1;

x = t;

T2:

t = x;

t = t * 2;

x = t;

int x = 1;

89

T1:synchronized(m) {t = x;

}

t = t + 1;

synchronized(m) {x = t;

}

T2:

synchronized(m) {t = x;

}

t = t * 2;

synchronized(m) {x = t;

}

int x = 1;

90Still an atomicity violation

Page 16: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

16

T1:

synchronized (m) {t = x;t = t + 1;x = t;

}

T2:

synchronized (m) {t = x;t = t * 2;x = t;

}

int x = 1;

91

T1:

synchronized (m) {t = x;t = t + 1;x = t;

}

T2:

synchronized (m) {t = x;t = t * 2;x = t;

}

int x = 1;

92

T1:

synchronized (m) {t = x;t = t + 1;x = t;

}

T2:

synchronized (m) {t = x;t = t * 2;x = t;

}

int x = 1;

93

class Vector {synchronized boolean contains(Object o) { ... }synchronized void add(Object o) { ... }

}

94

class Vector {synchronized boolean contains(Object o) { ... }synchronized void add(Object o) { ... }

}

class Set {Vector vector;void add(Object o) {if (!vector.contains(o)) {vector.add(o);

}}

}

95

class Vector {synchronized boolean contains(Object o) { ... }synchronized void add(Object o) { ... }

}

class Set {Vector vector;synchronized void add(Object o) {if (!vector.contains(o)) {vector.add(o);

}}

}

96

Page 17: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

17

class Vector {synchronized boolean contains(Object o) { ... }synchronized void add(Object o) { ... }

}

class Set {Vector vector;void add(Object o) {atomic {if (!vector.contains(o)) {vector.add(o);

}}

}}

97

Motivation & examples Threads, shared memory, & synchronization

How do locks work?

Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level property) Concurrency exceptions & SummaryExtra: Double‐checked locking

98

Java provides memory & type safety

Buffer overflows, dangling pointers, array out‐of‐bounds, double frees, some memory leaks

How are these handled?  With exceptions?

99

Java provides memory & type safety

Buffer overflows, dangling pointers, array out‐of‐bounds, double frees, some memory leaks

How are these handled?  With exceptions?

Should languages (and the runtime systems & hardware that support them) provide 

concurrency correctness?

Check & enforce: atomicity, SC/DRF, determinism100

General‐purpose parallel software: hard & unsolved

Challenging semantics for parallel programs

Understand how to write correct, scalable programs  one of few experts

101

Motivation & examples Threads, shared memory, & synchronization

How do locks work?

Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level property) Concurrency exceptions & SummaryExtra: Double‐checked locking

102

Page 18: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

18

class Movie {Vector<String> comments;

addComment(String s) {if (comments == null) {

comments = new Vector<String>();}comments.add(s);

}}

103

class Movie {Vector<String> comments;

addComment(String s) {synchronized (this) {

if (comments == null) {comments = new Vector<String>();}

}comments.add(s);

}}

104

class Movie {Vector<String> comments;

addComment(String s) {if (comments == null) {

synchronized (this) {if (comments == null) {

comments = new Vector<String>();}

}}comments.add(s);

}}

105

addComment(String s) {if (comments == null) {synchronized (this) {if (comments == null) {comments =new Vector<String>();

}}}

comments.add(s);}

addComment(String s) {

if (comments == null) {

}

comments.add(s);

}

106

addComment(String s) {if (comments == null) {synchronized (this) {if (comments == null) {Vector temp =alloc Vector;temp.<init>();comments = temp;}}}

comments.add(s);}

addComment(String s) {

if (comments == null) {

}

comments.add(s);

}

107

addComment(String s) {if (comments == null) {synchronized (this) {if (comments == null) {Vector temp =alloc Vector;temp.<init>();comments = temp;}}}

comments.add(s);}

addComment(String s) {

if (comments == null) {

}

comments.add(s);

}

108

Page 19: Concurrency - GitHub PagesThreads, shared memory, & synchronization Howdolockswork? Data races (a lower‐level property) How do data race detectors work? Atomicity (a higher‐level

7/6/2015

19

addComment(String s) {if (comments == null) {synchronized (this) {if (comments == null) {Vector temp =alloc Vector;comments = temp;

temp.<init>();}}}comments.add(s);}

addComment(String s) {

if (comments == null) {

}

comments.add(s);

}

109