lock-free resizeable concurrent tries

Lock-Free Resizeable Concurrent Tries Aleksandar Prokopec, Phil Bagwell, Martin Odersky LAMP, École Polytechnique Fédérale de Lausanne Switzerland


Post on 05-Feb-2016


Page 1: Lock-Free Resizeable Concurrent Tries

Lock-Free Resizeable Concurrent Tries

Aleksandar Prokopec, Phil Bagwell, Martin Odersky

LAMP, École Polytechnique Fédérale de Lausanne

Switzerland

Page 2: Lock-Free Resizeable Concurrent Tries

Motivation

xs.foreach { x => doSomething(x) }

Page 3: Lock-Free Resizeable Concurrent Tries

Motivation

xs.foreach { x => doSomething(x) }

ys = xs.map { x => x * (-1) }

Page 4: Lock-Free Resizeable Concurrent Tries

Motivation

ys = new ConcurrentMap
xs.foreach { x => ys.insert(x * (-1)) }
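The slide's idea, a `foreach` whose body writes into a shared concurrent map, can be sketched with the Scala standard library's `scala.collection.concurrent.TrieMap`. This is only an illustration under assumptions: the two-thread split, the `negateAll` helper, and the use of `put(key, value)` in place of the slide's `insert` are not from the talk.

```scala
import scala.collection.concurrent.TrieMap

object MotivationSketch {
  // Negate every element of xs, writing the results into one shared
  // concurrent map from two threads (hypothetical helper, not from the talk).
  def negateAll(xs: Vector[Int]): TrieMap[Int, Int] = {
    val ys = TrieMap.empty[Int, Int]
    val (left, right) = xs.splitAt(xs.length / 2)
    val workers = Seq(left, right).map { part =>
      // Each worker inserts its share; the map itself handles the concurrency.
      new Thread(() => part.foreach { x => ys.put(x, x * (-1)) })
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    ys
  }
}
```

Because the map is thread-safe, neither worker needs an external lock around `put`.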

Page 5: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

Page 6: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

0 = 000000₂

Page 7: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

0

Page 8: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

0

16 = 010000₂

Page 9: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

0 16

Page 10: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

0 16

4 = 000100₂

Page 11: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

16

0

4 = 000100₂

Page 12: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

16

0 4

Page 13: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

16

0 4

12 = 001100₂

Page 14: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

16

0 4

12 = 001100₂

Page 15: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

16

0 4 12

Page 16: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

16 33

0 4 12

Page 17: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

16 33

0 4 12

48

Page 18: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

16

0 4 12

48

33 37

Page 19: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

16

4 12

48

33 37

0 3
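Judging from the figures, the walkthrough above appears to consume each 6-bit key two bits at a time, most significant bits first, so every level is a 4-way branch. The sketch below computes the branch taken at each level under that inferred assumption; `chunk`, `path`, and the constants are hypothetical names, not taken from the talk.

```scala
object ChunkSketch {
  // Assumption inferred from the slides: 6-bit keys, 2 bits per level,
  // read from the most significant bits downwards.
  val KeyBits = 6
  val BitsPerLevel = 2

  // Branch index (0..3) taken by `key` at `level`, where level 0 is the root.
  def chunk(key: Int, level: Int): Int =
    (key >>> (KeyBits - BitsPerLevel * (level + 1))) & 0x3

  // Branch indices from the root down to the deepest level.
  def path(key: Int): List[Int] =
    (0 until KeyBits / BitsPerLevel).map(chunk(key, _)).toList
}
```

Under this reading, 0 and 4 take the same branch at the root and separate one level down, while 0 and 3 share two levels and only separate at the third, which matches the order in which the slides push keys into deeper nodes.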

Page 20: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

4 12 16 20 25 33 37

0 1 8 93

48 57

Page 21: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

4 12 16 20 25 33 37

0 1 8 93

48 57

Too much space!

Page 22: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

4 12 16 20 25 33 37

0 1 8 93

48 57

Page 23: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

4 12 16 20 25 33 37

0 1 8 93

48 57

Linear search at every level - slow!

Page 24: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

4 12 16 20 25 33 37

0 1 8 93

48 57

Solution – bitmap index!

Relying on BITPOP instruction.

Page 25: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

[Figure: the node holding 48 and 57 stored as a compact two-element array, indexed through the bitmap 1 0 1 0]

BITPOP(((1 << ((hc >> lev) & 0x1F)) - 1) & BMP)
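The bitmap-index expression can be tried out directly on the JVM, where `Integer.bitCount` plays the role of BITPOP. The following is a minimal sketch for a 32-way node; `flag`, `pos`, and `isPresent` are illustrative names, and `bmp` stands for the node's 32-bit bitmap.

```scala
object BitmapIndex {
  // One-hot mask selecting the branch for hash code `hc` at bit offset `lev`
  // (5 bits per level, so 32 possible branches).
  def flag(hc: Int, lev: Int): Int = 1 << ((hc >>> lev) & 0x1F)

  // Position in the compact array: the number of occupied branches
  // that sit below this branch in the bitmap.
  def pos(hc: Int, lev: Int, bmp: Int): Int =
    Integer.bitCount((flag(hc, lev) - 1) & bmp)

  // Whether the branch is occupied at all.
  def isPresent(hc: Int, lev: Int, bmp: Int): Boolean =
    (bmp & flag(hc, lev)) != 0
}
```

For example, with the bitmap 1010₂ (branches 1 and 3 occupied), branch 1 maps to array position 0 and branch 3 maps to array position 1, so the array needs only two slots instead of 32.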

Page 26: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

4 12 16 20 25 33 37

0 1 8 93

48 57

For 32-way tries – 32-bit bitmap.

Page 27: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

4 12 16 20 25 33 37

0 1 8 93

48 57

Page 28: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

4 12 16 20 25 33 37

0 1 93

48 57

Page 29: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

4 9 12 16 20 25 33 37

0 1 3

48 57

Remove compresses the trie.

Page 30: Lock-Free Resizeable Concurrent Tries

Hash Array Mapped Tries (HAMT)

• advantages:
  • low space consumption and shrinking
  • no contiguous memory region required
  • fast – logarithmic complexity, but with a low constant factor
  • used as efficient immutable maps
  • no global resize phase – real time applications, potentially more scalable concurrent operations?

Page 31: Lock-Free Resizeable Concurrent Tries

Concurrent Trie (Ctrie)

• goals:
  • thread-safe concurrent trie
  • maintain the advantages of HAMT
  • rely solely on CAS instructions
  • ensure lock-freedom and linearizability

• lookup – probably same as for HAMT

Page 32: Lock-Free Resizeable Concurrent Tries

CAS instruction

CAS(address, expected_value, new_value)

Atomically replaces the value at the address with the new_value if it is equal to the expected_value.

Returns true if successful, false otherwise.

May fail spuriously.
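On the JVM, the CAS contract described above is available through `java.util.concurrent.atomic`. The sketch below uses `AtomicInteger.compareAndSet` to show one successful and one failed CAS; the `demo` helper is a hypothetical name. Note that `compareAndSet` itself is not specified to fail spuriously (the weak variants are), so the spurious-failure remark applies to CAS in general rather than to this particular call.

```scala
import java.util.concurrent.atomic.AtomicInteger

object CasSketch {
  // Returns (result of first CAS, result of second CAS, final value).
  def demo(): (Boolean, Boolean, Int) = {
    val cell = new AtomicInteger(5)
    val first = cell.compareAndSet(5, 6)  // expected value matches: cell becomes 6
    val second = cell.compareAndSet(5, 7) // expected value is stale: cell stays 6
    (first, second, cell.get)
  }
}
```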

Page 33: Lock-Free Resizeable Concurrent Tries

Lock-freedom

If multiple threads execute an operation, at least one of them will complete the operation within a finite number of steps.

Page 34: Lock-Free Resizeable Concurrent Tries

Lock-freedom

If multiple threads execute an operation, at least one of them will complete the operation within a finite number of steps.

do {
  a = READ(addr)
  b = a + 1
} while (!CAS(addr, a, b))

Page 35: Lock-Free Resizeable Concurrent Tries

Lock-freedom

If multiple threads execute an operation, at least one of them will complete the operation within a finite number of steps.

def counter()
  do {
    a = READ(addr)
    b = a + 1
  } while (!CAS(addr, a, b))
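The counter pseudocode translates almost line for line into Scala on top of `AtomicInteger`. This is a sketch rather than the talk's code: `increment` is an assumed name, and the retry is written as tail recursion instead of a `do`/`while` loop. A failed CAS means some other thread's CAS succeeded, which is why at least one thread always makes progress.

```scala
import java.util.concurrent.atomic.AtomicInteger

object CounterSketch {
  // Lock-free increment: read, compute, then publish with CAS; retry on failure.
  @annotation.tailrec
  def increment(addr: AtomicInteger): Int = {
    val a = addr.get                 // a = READ(addr)
    val b = a + 1                    // b = a + 1
    if (addr.compareAndSet(a, b)) b  // CAS(addr, a, b) succeeded
    else increment(addr)             // another thread won the race: retry
  }
}
```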

Page 36: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12 16 20 25 33 37

0 1 3

48 57

17 = 010001₂

Page 37: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12 16 20 25 33 37

0 1 3

48 57

17 = 010001₂

16 17

1) allocate

Page 38: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12 20 25 33 37

0 1 3

48 57

17 = 010001₂

16 17

2) CAS

Page 39: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12 20 25 33 37

0 1 3

48 57

17 = 010001₂

16 17

Page 40: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12 33 37

0 1 3

48 57

18 = 010010₂

16 17

20 25

Page 41: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12 33 37

0 1 3

48 57

18 = 010010₂

16 17

20 25

1) allocate

16 17 18

Page 42: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12 33 37

0 1 3

48 57

18 = 010010₂

20 25

2) CAS

16 17 18

Page 43: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12 33 37

0 1 3

48 57

18 = 010010₂

20 25

2) CAS

16 17 18

Unless…

Page 44: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12 33 37

0 1 3

48 57

18 = 010010₂

16 17

20 25

T1-1) allocate

16 17 18

Unless…

28 = 011100₂

T1

T2

Page 45: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12

0 1 3

18 = 010010₂

16 17

20 25

T1-1) allocate

16 17 18

Unless…

28 = 011100₂

T1

T2

20 25 28

T2-1) allocate

Page 46: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12

0 1 3

18 = 010010₂

16 17

20 25

T1-1) allocate

16 17 18

28 = 011100₂

T1

T2

20 25 28

T2-2) CAS

Page 47: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12

0 1 3

18 = 010010₂

16 17

20 25

T1-2) CAS

16 17 18

28 = 011100₂

T1

T2

20 25 28

T2-2) CAS

Page 48: Lock-Free Resizeable Concurrent Tries

Insertion

4 9 12

0 1 3

18 = 010010₂

16 17

20 25

16 17 18

28 = 011100₂

T1

T2

20 25 28

Lost insert!

Page 49: Lock-Free Resizeable Concurrent Tries

Insertion – 2nd attempt

4 9 12

0 1 3 16 17

20 25

Solution: I-nodes

Page 50: Lock-Free Resizeable Concurrent Tries

Insertion – 2nd attempt

4 9 12

0 1 3 16 17

20 25

18 = 010010₂

28 = 011100₂

T1

T2

Page 51: Lock-Free Resizeable Concurrent Tries

Insertion – 2nd attempt

4 9 12

0 1 3 16 17

T1

T2

20 25

18 = 010010₂

28 = 011100₂

16 17 18

20 25 28

T2-1) allocate

T1-1) allocate

Page 52: Lock-Free Resizeable Concurrent Tries

Insertion – 2nd attempt

4 9 12

0 1 3 16 17

T1

T2

20 25

16 17 18

20 25 28

T2-2) CAS

T1-2) CAS

Page 53: Lock-Free Resizeable Concurrent Tries

Insertion – 2nd attempt

4 9 12

0 1 3 16 17 18

20 25 28

Page 54: Lock-Free Resizeable Concurrent Tries

Insertion – 2nd attempt

4 9 12

0 1 3 16 17 18

20 25 28

Idea: once added to the Ctrie, I-nodes remain present.
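Roughly, the lost insert happens when an update is CASed into a node that another thread has just replaced with a copy, so the update is no longer reachable. An I-node avoids this by acting as a stable indirection cell: copies of the parent keep pointing at the same I-node, and all updates to the child go through it. The two-level model below is a simplified sketch with hypothetical names (`CNode`, `INode`, `insert`), not the implementation from the paper.

```scala
import java.util.concurrent.atomic.AtomicReference

object INodeSketch {
  // Immutable content node: its keys plus references to child I-nodes.
  final case class CNode(keys: Set[Int], children: Vector[INode])

  // Indirection node: a mutable cell that stays in place once linked in.
  final class INode(init: CNode) {
    val main = new AtomicReference[CNode](init)

    // Add a key to the content node currently under this I-node.
    @annotation.tailrec
    final def insert(k: Int): Unit = {
      val old = main.get
      // 1) allocate: copy the content node; child I-nodes are shared, not copied
      val updated = old.copy(keys = old.keys + k)
      // 2) CAS: publish the copy, retrying if another thread got there first
      if (!main.compareAndSet(old, updated)) insert(k)
    }
  }
}
```

If one thread inserts 18 through the child I-node while another inserts 28 through the parent I-node, both updates survive in any interleaving, because the parent's new copy still references the same child I-node.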

Page 55: Lock-Free Resizeable Concurrent Tries

Remove

4 9 12

0 1 3 16 17 18

20 25 28

Idea: same logic as insert.

Page 56: Lock-Free Resizeable Concurrent Tries

Remove

4 9 12

0 1 3 16 17 18

20 25 28

Page 57: Lock-Free Resizeable Concurrent Tries

Remove

4 9 12

0 1 3 16 17 18

20 25 28

16 18

1) allocate

Page 58: Lock-Free Resizeable Concurrent Tries

Remove

4 9 12

0 1 3 16 17 18

20 25 28

16 18

2) CAS

Page 59: Lock-Free Resizeable Concurrent Tries

Remove

4 9 12

0 1 3 16 18

20 25 28

Page 60: Lock-Free Resizeable Concurrent Tries

Remove

4 9 12

0 1 3 18

20 25 28

Page 61: Lock-Free Resizeable Concurrent Tries

Remove

4 9 12

0 1 3 18

20 25

Page 62: Lock-Free Resizeable Concurrent Tries

Remove

4 9 12

0 1 18

20 25

Page 63: Lock-Free Resizeable Concurrent Tries

Remove

4 9

0 1 18

20 25

Page 64: Lock-Free Resizeable Concurrent Tries

Remove

4 9

1 18

20 25

Page 65: Lock-Free Resizeable Concurrent Tries

Remove

4 9

1 18

20

Page 66: Lock-Free Resizeable Concurrent Tries

Remove

9

1 18

20

Page 67: Lock-Free Resizeable Concurrent Tries

Remove

1 18

Ctrie is not compact => could be faster

Page 68: Lock-Free Resizeable Concurrent Tries

Remove – 2nd attempt

4 9 12

0 1 3 18

20 25 28

3) allocate

18 20 25 28

Page 69: Lock-Free Resizeable Concurrent Tries

Remove – 2nd attempt

4 9 12

0 1 3 18

20 25 28

4) CAS

18 20 25 28

Page 70: Lock-Free Resizeable Concurrent Tries

Remove – 2nd attempt

4 9 12

0 1 3 18

20 25 28

4) CAS

18 20 25 28

Not correct.

Page 71: Lock-Free Resizeable Concurrent Tries

Remove – 2nd attempt

4 9 12

0 1 3

T1-3) allocate

18 20 25 28

18

20 25 28

T2-1) allocate

17 18

T1 – compress

T2 – insert 17

Page 72: Lock-Free Resizeable Concurrent Tries

Remove – 2nd attempt

4 9 12

0 1 3

T1-4) CAS

18 20 25 28

18

20 25 28

T2-2) CAS

17 18

T1 – compress

T2 – insert 17

Page 73: Lock-Free Resizeable Concurrent Tries

Remove – 3rd attempt

4 9 12

0 1 3 18

20 25 28

Idea: disallow insertions as you do compression

Page 74: Lock-Free Resizeable Concurrent Tries

Remove – 3rd attempt

4 9 12

0 1 3

T1-3) allocate

18

20 25 28

T2-1) allocate

17 18

Idea: disallow insertions as you do compression

T-node

18

Page 75: Lock-Free Resizeable Concurrent Tries

Remove – 3rd attempt

4 9 12

0 1 3

T1-4) CAS

18

20 25 28

T2-2) CAS

17 18

Idea: disallow insertions as you do compression

18

Page 76: Lock-Free Resizeable Concurrent Tries

Remove – 3rd attempt

4 9 12

0 1 3

T1-4) CAS

18

20 25 28

T2-2) CAS failed - repeat

17 18

Idea: disallow insertions as you do compression

18
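The effect of the T-node can be modelled in a few lines: once a remove leaves a single key behind, the node is marked with a tombstone, and any insert that finds the tombstone backs off instead of writing. The sketch below is a simplified, single-node model with hypothetical names (`TNode`, `tryInsert`, `remove`); in the full algorithm a thread that sees the tombstone would also help compress the parent before retrying, which is omitted here.

```scala
import java.util.concurrent.atomic.AtomicReference

object TNodeSketch {
  sealed trait Main
  final case class CNode(keys: Set[Int]) extends Main
  // Tombstone: holds the single remaining key and accepts no further updates.
  final case class TNode(key: Int) extends Main

  final class INode(init: Main) {
    val main = new AtomicReference[Main](init)

    // Returns false when the node is tombed; the caller would then help
    // compress the parent and retry from the root (not modelled here).
    @annotation.tailrec
    final def tryInsert(k: Int): Boolean = main.get match {
      case TNode(_) => false
      case old @ CNode(keys) =>
        if (main.compareAndSet(old, CNode(keys + k))) true else tryInsert(k)
    }

    // Removes k; if exactly one key remains, installs a tombstone instead.
    @annotation.tailrec
    final def remove(k: Int): Unit = main.get match {
      case TNode(_) => ()
      case old @ CNode(keys) =>
        val rest = keys - k
        val next: Main = if (rest.size == 1) TNode(rest.head) else CNode(rest)
        if (!main.compareAndSet(old, next)) remove(k)
    }
  }
}
```

With this model, removing 16 from a node holding 16 and 18 leaves a tombstone for 18, and a later attempt to insert 17 into that node is rejected rather than racing with the compression.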

Page 77: Lock-Free Resizeable Concurrent Tries

Remove – 3rd attempt

4 9 12

0 1 3

T1-5) allocate

20 25 28

T2-1) do the same as T1, then repeat

Idea: disallow insertions as you do compression

18

18 20 25 28

Page 78: Lock-Free Resizeable Concurrent Tries

Remove – 3rd attempt

4 9 12

0 1 3

T1-6) CAS

20 25 28

18

18 20 25 28

Is this still lock-free?

Page 79: Lock-Free Resizeable Concurrent Tries

Remove – 3rd attempt

4 9 12

0 1 3

T1-6) CAS

20 25 28

18

18 20 25 28

Is this still lock-free?

Yes - roughly, whoever sees the T-node will help remove it, and there is a finite number of T-nodes (full proof in the paper).

Page 80: Lock-Free Resizeable Concurrent Tries

Remove – 3rd attempt

4 9 12

0 1 3

T1-6) CAS

20 25 28

18

18 20 25 28

Is this linearizable?

Page 81: Lock-Free Resizeable Concurrent Tries

Remove – 3rd attempt

4 9 12

0 1 3

T1-6) CAS

20 25 28

18

18 20 25 28

Is this linearizable?

Yes – roughly, the CAS instruction which makes the new value reachable is the linearization point (see paper for full list).

Page 82: Lock-Free Resizeable Concurrent Tries

Evaluation – quad core i7

Page 83: Lock-Free Resizeable Concurrent Tries

Evaluation – UltraSPARC T2

Page 84: Lock-Free Resizeable Concurrent Tries

Evaluation – 4x 8-core i7

Page 85: Lock-Free Resizeable Concurrent Tries

Summary

• pseudocode and implementation for a concurrent hash trie

• properties proven:
  • correctness
  • linearizability
  • lock-freedom
  • compactness

• performance evaluation – scalable insertion and remove

Page 86: Lock-Free Resizeable Concurrent Tries

Future work

• concurrent memory pool to avoid GC
• lock-free size, iterator and clear operations running in O(1)

Page 87: Lock-Free Resizeable Concurrent Tries

Thank you!