A Methodology for Creating Fast Wait-Free Data Structures Alex Kogan and Erez Petrank Computer Science Technion, Israel

Post on 04-Feb-2016

Page 1: A Methodology for Creating Fast Wait-Free Data Structures

A Methodology for Creating Fast Wait-Free Data Structures

Alex Kogan and Erez Petrank

Computer Science, Technion, Israel

Page 2: A Methodology for Creating Fast Wait-Free Data Structures

Concurrency & (Non-blocking) synchronization

Concurrent data structures require (fast and scalable) synchronization

Non-blocking synchronization: no thread is blocked waiting for another thread to complete
    no locks / critical sections

Page 3: A Methodology for Creating Fast Wait-Free Data Structures

Lock-free (LF) algorithms

Among all threads trying to apply operations on the data structure, one will succeed

Opportunistic approach
    read some part of the data structure
    make an attempt to apply an operation
    when failed, retry

Many scalable and efficient algorithms

Global progress: all but one thread may starve
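
The opportunistic read / attempt / retry pattern can be sketched with a CAS loop. This is a minimal illustration on a shared counter, not one of the queues discussed in the talk; the class name LockFreeCounter is hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: a lock-free counter built on compare-and-set (CAS).
final class LockFreeCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Lock-free increment: some thread always succeeds, but a given
    // thread may lose the race an unbounded number of times.
    int increment() {
        while (true) {
            int current = value.get();                 // read part of the data structure
            int next = current + 1;
            if (value.compareAndSet(current, next)) {  // attempt to apply the operation
                return next;
            }
            // CAS failed because another thread succeeded first: retry
        }
    }

    int get() {
        return value.get();
    }

    public static void main(String[] args) throws InterruptedException {
        LockFreeCounter counter = new LockFreeCounter();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        System.out.println(counter.get()); // prints 4000
    }
}
```

The retry loop has no bound on its iteration count, which is exactly why the guarantee is global progress rather than per-thread progress.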

Page 4: A Methodology for Creating Fast Wait-Free Data Structures

Wait-free (WF) algorithms

A thread completes its operation in a bounded number of steps, regardless of what other threads are doing

Particularly important property in several domains
    e.g., real-time systems and operating systems

Commonly regarded as too inefficient and too complicated to design

Page 5: A Methodology for Creating Fast Wait-Free Data Structures

The overhead of wait-freedom

Much of the overhead is because of helping
    key mechanism employed by most WF algorithms
    controls the way threads help each other with their operations

Can we eliminate the overhead?

The goal: average-case efficiency of lock-freedom and worst-case bound of wait-freedom

Page 6: A Methodology for Creating Fast Wait-Free Data Structures

Why is helping slow?

A thread helps others immediately when it starts its operation

All threads help others in exactly the same order
    contention
    redundant work

Each operation has to be applied exactly once
    usually results in a higher number of expensive atomic operations

                     Lock-free MS-queue    Wait-free KP-queue
                     (PODC, 1996)          (PPoPP, 2011)
# CASs in enqueue    2                     3
# CASs in dequeue    1                     4

Page 7: A Methodology for Creating Fast Wait-Free Data Structures

Reducing the overhead of helping

Main observation: “bad” cases happen, but are very rare
    Typically a thread can complete without any help, if only it had a chance to do that …

Main ideas:
    Ask for help only when you really need it
        i.e., after trying several times to apply the operation
    Help others only after giving them a chance to proceed on their own
        delayed helping

Page 8: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path methodology

Fast path: start the operation by running its (customized) lock-free implementation

Slow path: upon several failures, switch to a (customized) wait-free implementation
    notify others that you need help
    keep trying

Delayed helping: once in a while, threads on the fast path check whether their help is needed and provide help

Page 9: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path generic scheme

Start

Do I need to help?
    yes: help someone, then continue
    no: continue

Apply my op using the fast path (at most N times)

Success?
    yes: return
    no: apply my op using the slow path (until success), then return

Different threads may run on the two paths concurrently!
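
The control flow of the generic scheme can be sketched as a small driver. The class FastPathSlowPath, the method run, and the value chosen for MAX_FAILURES are assumptions made for illustration; the three callbacks stand in for the customized helping check, lock-free attempt, and wait-free algorithm, which are not shown here.

```java
import java.util.function.BooleanSupplier;

// Hypothetical driver for the generic scheme; not the authors' code.
final class FastPathSlowPath {
    static final int MAX_FAILURES = 3; // illustrative value: bound on fast-path attempts

    // Runs one operation. Returns true if it completed on the fast path,
    // false if it had to switch to the slow path.
    static boolean run(Runnable helpIfNeeded,
                       BooleanSupplier fastPathAttempt,
                       Runnable slowPath) {
        helpIfNeeded.run();                        // "Do I need to help?" is decided inside
        for (int trials = 0; trials < MAX_FAILURES; trials++) {
            if (fastPathAttempt.getAsBoolean()) {  // one attempt of the lock-free algorithm
                return true;                       // success: return
            }
        }
        slowPath.run();                            // assumed to return only once the op is applied
        return false;
    }
}
```

A caller would pass the data structure's own attempt and slow-path routines, for example `FastPathSlowPath.run(helper, () -> tryEnqueueOnce(x), () -> wfEnqueue(x))`, where those method names are placeholders.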

Page 10: A Methodology for Creating Fast Wait-Free Data Structures


Fast-path-slow-path: queue example

Fast path (MS-queue)

Slow path (KP-queue)

Page 11: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path: queue example
Internal structures

state

Thread ID    phase    pending    enqueue    node
0            9        true       false      null
1            4        true       true       null
2            9        false      false      null

Page 12: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path: queue example
Internal structures

state (same array as on page 11)

phase: counts the number of operations on the slow path

Page 13: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path: queue example
Internal structures

state (same array as on page 11)

pending: is there a pending operation on the slow path?

Page 14: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path: queue example
Internal structures

state (same array as on page 11)

enqueue, node: what is the pending operation?
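
One way to picture the state array is as an array of immutable per-thread descriptors that are swapped atomically. The sketch below uses the field names from the slides (phase, pending, enqueue, node); the class names StateArray and OpDesc, the method names, and the use of Object for the node type are assumptions.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch of the per-thread "state" array; not the authors' code.
final class StateArray {
    // Immutable operation descriptor, one slot per thread.
    static final class OpDesc {
        final long phase;       // counts this thread's operations on the slow path
        final boolean pending;  // is there a pending slow-path operation?
        final boolean enqueue;  // true for enqueue, false for dequeue
        final Object node;      // node involved in the operation (type is an assumption)

        OpDesc(long phase, boolean pending, boolean enqueue, Object node) {
            this.phase = phase;
            this.pending = pending;
            this.enqueue = enqueue;
            this.node = node;
        }
    }

    private final AtomicReferenceArray<OpDesc> state;

    StateArray(int numThreads) {
        state = new AtomicReferenceArray<>(numThreads);
        for (int i = 0; i < numThreads; i++) {
            state.set(i, new OpDesc(0, false, true, null));
        }
    }

    // Called by thread tid when it enters the slow path: bump the phase
    // and publish a pending descriptor so that others can help.
    long announce(int tid, boolean enqueue, Object node) {
        long phase = state.get(tid).phase + 1;
        state.set(tid, new OpDesc(phase, true, enqueue, node));
        return phase;
    }

    // True while the operation announced with this phase is still unfinished.
    boolean isStillPending(int tid, long phase) {
        OpDesc d = state.get(tid);
        return d.pending && d.phase == phase;
    }

    // Marks the operation as done; the CAS ensures only one thread
    // (owner or helper) performs the transition.
    boolean complete(int tid, long phase) {
        OpDesc cur = state.get(tid);
        if (!cur.pending || cur.phase != phase) {
            return false;
        }
        return state.compareAndSet(tid, cur,
                new OpDesc(cur.phase, false, cur.enqueue, cur.node));
    }
}
```

Keeping descriptors immutable means a helper that reads a slot sees a consistent (phase, pending, enqueue, node) tuple without any locking.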

Page 15: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path: queue example
Internal structures

helpRecords

Thread ID    curTid    lastPhase    nextCheck
0            1         4            3
1            0         5            8
2            0         9            0

Page 16: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path: queue example
Internal structures

helpRecords (same array as on page 15)

curTid: ID of the next thread that I will try to help

Page 17: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path: queue example
Internal structures

helpRecords (same array as on page 15)

lastPhase: phase number of that thread at the time the record was created

Page 18: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path: queue example
Internal structures

helpRecords (same array as on page 15)

nextCheck: decremented with each of my operations; when this counter reaches 0, I check whether my help is needed

HELPING_DELAY controls the frequency of helping checks
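
The interplay of curTid, lastPhase, and nextCheck can be sketched as follows. This is a single-threaded illustration under stated assumptions: the class layout, the reset policy (advance to the next thread ID round-robin), and the plain arrays standing in for the shared state array are all hypothetical, and HELPING_DELAY is given an arbitrary value.

```java
// Hypothetical sketch of one thread's help record; not the authors' code.
final class HelpRecord {
    static final int HELPING_DELAY = 3; // illustrative value

    private final int numThreads;
    int curTid = -1;   // ID of the next thread that I will try to help
    long lastPhase;    // that thread's phase when this record was created
    int nextCheck;     // operations left before the next helping check

    HelpRecord(int numThreads, long[] phases) {
        this.numThreads = numThreads;
        reset(phases);
    }

    // Move on to the next thread, snapshot its phase, restart the countdown.
    void reset(long[] phases) {
        curTid = (curTid + 1) % numThreads;
        lastPhase = phases[curTid];
        nextCheck = HELPING_DELAY;
    }

    // Called at the start of each of my operations. Returns the ID of a
    // thread that needs help, or -1 if no help is due right now.
    // phases/pending stand in for reads of the shared state array.
    int helpIfNeeded(long[] phases, boolean[] pending) {
        if (nextCheck > 0) {
            nextCheck--;          // not time to check yet
            return -1;
        }
        int target = curTid;
        // Help only if the target is still stuck on the same operation
        // it was running when the record was created (delayed helping).
        boolean needsHelp = pending[target] && phases[target] == lastPhase;
        reset(phases);
        return needsHelp ? target : -1;
    }
}
```

Comparing the stored lastPhase with the target's current phase is what gives the other thread a chance to finish on its own: if its phase has moved on, it completed the earlier operation without help.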

Page 19: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path: queue example
Fast path

1. help_if_needed()
2. int trials = 0
   while (trials++ < MAX_FAILURES) {
       apply_op_with_customized_LF_alg    (finish if succeeded)
   }
3. switch to slow path

LF algorithm customization is required to synchronize operations run on the two paths

MAX_FAILURES controls the number of trials on the fast path

Page 20: A Methodology for Creating Fast Wait-Free Data Structures

Fast-path-slow-path: queue example
Slow path

1. my phase++
2. announce my operation (in state)
3. apply_op_with_customized_WF_alg    (until finished)

WF algorithm customization is required to synchronize operations run on the two paths

Page 21: A Methodology for Creating Fast Wait-Free Data Structures

Performance evaluation

32-core Ubuntu server with OpenJDK 1.6
    8 × 2.3 GHz quad-core AMD 8356 processors

The queue is initially empty

Each thread iteratively performs (100k times):
    Enqueue-Dequeue benchmark: enqueue and then dequeue

Measure completion time as a function of the number of threads
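
The enqueue-dequeue benchmark can be sketched as below. This is not the authors' harness: the class name EnqDeqBenchmark and its structure are assumptions, and the JDK's ConcurrentLinkedQueue (documented as based on the Michael and Scott algorithm) is used only as a readily available stand-in for a lock-free queue.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical benchmark sketch: each thread enqueues and then dequeues,
// repeated a fixed number of times, on an initially empty queue.
final class EnqDeqBenchmark {
    // Returns the completion time in nanoseconds.
    static long run(Queue<Integer> queue, int numThreads, int iterations) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                try {
                    start.await();               // release all threads together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < iterations; i++) {
                    queue.offer(id);             // enqueue
                    queue.poll();                // and then dequeue
                }
            });
            workers[t].start();
        }
        long begin = System.nanoTime();
        start.countDown();
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        }
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 8; n *= 2) {
            long ns = run(new ConcurrentLinkedQueue<>(), n, 100_000);
            System.out.printf("%d threads: %.3f s%n", n, ns / 1e9);
        }
    }
}
```

A serious measurement would also need JVM warm-up runs and repeated trials; the sketch only shows the shape of the workload.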

Page 22: A Methodology for Creating Fast Wait-Free Data Structures

Performance evaluation

[Figure: completion time (sec, 0 to 140) as a function of the number of threads (1 to 61) for MS-queue and KP-queue]

Page 23: A Methodology for Creating Fast Wait-Free Data Structures

Performance evaluation

[Figure: completion time (sec, 0 to 140) as a function of the number of threads for MS-queue, KP-queue, and fast WF (0, 0); the pair in parentheses is (MAX_FAILURES, HELPING_DELAY)]

Page 24: A Methodology for Creating Fast Wait-Free Data Structures

Performance evaluation

[Figure: completion time (sec, 0 to 140) as a function of the number of threads for MS-queue, KP-queue, fast WF (0,0), fast WF (3,3), fast WF (10,10), and fast WF (20,20)]

Page 25: A Methodology for Creating Fast Wait-Free Data Structures

The impact of configuration parameters

[Figure: completion time (sec, 0 to 140) as a function of the number of threads for fast WF (0,0) and fast WF (10,10); the pair in parentheses is (MAX_FAILURES, HELPING_DELAY)]

Page 26: A Methodology for Creating Fast Wait-Free Data Structures

The use of the slow path

[Figure, two panels (enqueue, dequeue): % of ops on the slow path (0 to 100) as a function of the number of threads (1 to 57); callouts: MAX_FAILURES, HELPING_DELAY]

Page 27: A Methodology for Creating Fast Wait-Free Data Structures

Tuning performance parameters

Why not just always use large values for both parameters (MAX_FAILURES, HELPING_DELAY)?
    (almost) always eliminates the slow path

Lemma: the number of steps required for a thread to complete an operation on the queue in the worst case is O(MAX_FAILURES + HELPING_DELAY * n^2)

→ Tradeoff between average-case performance and worst-case completion time bound
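
As a rough illustration of the tradeoff, the expression inside the bound can be evaluated for two of the configurations used in the evaluation. The sketch assumes that n denotes the number of threads, takes n = 32 as a hypothetical value, and ignores the constant hidden by the O-notation.

```latex
\begin{align*}
(\text{MAX\_FAILURES}, \text{HELPING\_DELAY}) = (3, 3):   &\quad 3 + 3 \cdot 32^2 = 3075 \\
(\text{MAX\_FAILURES}, \text{HELPING\_DELAY}) = (20, 20): &\quad 20 + 20 \cdot 32^2 = 20500
\end{align*}
```

Under these assumptions the worst-case bound grows roughly in proportion to HELPING_DELAY, so larger parameter values buy average-case speed at the price of a looser worst-case guarantee.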

Page 28: A Methodology for Creating Fast Wait-Free Data Structures

Summary

A novel methodology for creating fast wait-free data structures
    key ideas: two execution paths + delayed helping
    good performance when the fast path is extensively utilized
    concurrent operations can proceed on both paths in parallel

Can be used in other scenarios
    e.g., running real-time and non-real-time threads side by side

Page 29: A Methodology for Creating Fast Wait-Free Data Structures

Thank you!

Questions?