Lock-Free Resizeable Concurrent Tries
Aleksandar Prokopec, Phil Bagwell, Martin Odersky
LAMP, École Polytechnique Fédérale de Lausanne, Switzerland
Motivation
xs.foreach { x => doSomething(x) }
ys = xs.map { x => x * (-1) }
ys = new ConcurrentMap
xs.foreach { x => ys.insert(x * (-1)) }
Hash Array Mapped Tries (HAMT)
[Figure sequence: keys are inserted into an empty trie one at a time: 0 = 000000₂, 16 = 010000₂, 4 = 000100₂, 12 = 001100₂, then 33, 48, 37 and 3. The finished example trie is drawn in three rows: 4 12 16 20 25 33 37, then 0 1 8 9 3, then 48 57.]
Too much space!
Linear search at every level - slow!
Solution – bitmap index! Relying on BITPOP instruction.
[Figure: the node holding 48 and 57, shown with the bitmap 1 0 1 0 and then with the value 10.]
BITPOP(((1 << ((hc >> lev) & 0x1F)) - 1) & BMP)
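The BITPOP expression above can be tried out directly. The following is a minimal sketch, assuming a 32-way trie with 5 hash bits per level, `lev` given as a bit offset, and `Integer.bitCount` standing in for BITPOP; the object and method names are made up for illustration.

```scala
object BitmapIndex {
  // Bit of the bitmap that belongs to hash code `hc` at bit offset `lev`.
  // `>>>` is used instead of `>>`; after masking with 0x1F both give the same result.
  def flag(hc: Int, lev: Int): Int = 1 << ((hc >>> lev) & 0x1F)

  // True if the node described by `bmp` has an entry for this hash code.
  def contains(hc: Int, lev: Int, bmp: Int): Boolean = (bmp & flag(hc, lev)) != 0

  // Position in the compact array: the number of set bits below the flag,
  // i.e. BITPOP(((1 << ((hc >> lev) & 0x1F)) - 1) & BMP).
  def position(hc: Int, lev: Int, bmp: Int): Int =
    Integer.bitCount((flag(hc, lev) - 1) & bmp)
}
```

With the bitmap 1 0 1 0 (the integer 10), slots 1 and 3 are occupied, so `BitmapIndex.position(3, 0, 10)` yields array index 1 and `BitmapIndex.position(1, 0, 10)` yields 0.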
For 32-way tries – 32-bit bitmap.
[Figure sequence: 8 is removed from the example trie, leaving the rows 4 9 12 16 20 25 33 37, then 0 1 3, then 48 57.]
Remove compresses the trie.
Hash Array Mapped Tries (HAMT)
• advantages:
  • low space consumption and shrinking
  • no contiguous memory region required
  • fast – logarithmic complexity, but with a low constant factor
  • used as efficient immutable maps
  • no global resize phase – real time applications, potentially more scalable concurrent operations?
Concurrent Trie (Ctrie)
• goals:
  • thread-safe concurrent trie
  • maintain the advantages of HAMT
  • rely solely on CAS instructions
  • ensure lock-freedom and linearizability
• lookup – probably same as for HAMT
CAS instruction
CAS(address, expected_value, new_value)
Atomically replaces the value at the address with the new_value if it is equal to the expected_value.
Returns true if successful, false otherwise.
May fail spuriously.
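As a small illustration of these semantics, here is a sketch that uses `AtomicReference.compareAndSet` from `java.util.concurrent.atomic` as a stand-in for the CAS instruction. That choice is an assumption made for the example: this JDK method does not fail spuriously (the weak variants of the same class may), and it compares references rather than values.

```scala
import java.util.concurrent.atomic.AtomicReference

object CasDemo {
  // Returns (result of first CAS, result of second CAS, final content of the cell).
  def run(): (Boolean, Boolean, String) = {
    val cell = new AtomicReference[String]("a")
    val expected = cell.get
    // The cell still holds `expected`, so the value is replaced.
    val first = cell.compareAndSet(expected, "b")
    // The cell now holds "b", so this CAS fails and leaves the cell unchanged.
    val second = cell.compareAndSet(expected, "c")
    (first, second, cell.get)
  }
}
```

`CasDemo.run()` yields `(true, false, "b")`: only the CAS whose expected value matched took effect.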
Lock-freedom
If multiple threads execute an operation, at least one of them will complete the operation within a finite number of steps.
def counter()
  do {
    a = READ(addr)
    b = a + 1
  } while (!CAS(addr, a, b))
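The counter loop can be written as runnable Scala. This is a sketch that assumes `AtomicInteger` plays the role of the memory location `addr`; the class name `LockFreeCounter` is hypothetical.

```scala
import java.util.concurrent.atomic.AtomicInteger

final class LockFreeCounter {
  private val addr = new AtomicInteger(0)

  // Retry until our CAS wins. A failed CAS means some other thread's CAS
  // succeeded in the meantime, so the system as a whole always makes progress.
  @annotation.tailrec
  def increment(): Int = {
    val a = addr.get                  // a = READ(addr)
    val b = a + 1
    if (addr.compareAndSet(a, b)) b   // CAS succeeded: done
    else increment()                  // another thread got in first: retry
  }

  def value: Int = addr.get
}
```

Four threads that each call `increment()` 1000 times leave `value` at 4000, without any lock being taken.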
Insertion
[Figure sequence: inserting 17 = 010001₂ into the example trie (4 9 12 16 20 25 33 37, 0 1 3, 48 57).]
1) allocate: a new node holding 16 and 17
2) CAS: the new node takes the place of 16
[Figure sequence: inserting 18 = 010010₂.]
1) allocate: an updated copy 16 17 18 of the node 16 17
2) CAS: the copy replaces the node 16 17
Unless…
[Figure sequence: two threads insert at the same time. T1 inserts 18 = 010010₂ into the node 16 17 and T2 inserts 28 = 011100₂ into the node 20 25.]
T1-1) allocate 16 17 18
T2-1) allocate 20 25 28
T2-2) CAS
T1-2) CAS
Lost insert!
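The interleaving above can be replayed step by step in a single thread. The sketch below uses a deliberately simplified model, which is an assumption and not the slides' data structure: each node has a set of keys and one mutable child slot, and the node 20 25 is taken to hang below the node 16 17.

```scala
import java.util.concurrent.atomic.AtomicReference

object LostInsert {
  // Simplified node: some keys plus a single mutable slot pointing to one child.
  final class Node(val keys: Set[Int], childNode: Node) {
    val child = new AtomicReference[Node](childNode)
  }

  // Every key reachable from `n`.
  def reachable(n: Node): Set[Int] =
    if (n == null) Set.empty[Int] else n.keys ++ reachable(n.child.get)

  def run(): Set[Int] = {
    val lower = new Node(Set(20, 25), null)
    val upper = new Node(Set(16, 17), lower)
    val rootSlot = new AtomicReference[Node](upper)

    // T1-1) allocate: copy of `upper` with 18 added; it captures the current child.
    val upperCopy = new Node(upper.keys + 18, upper.child.get)
    // T2-1) allocate: copy of `lower` with 28 added.
    val lowerCopy = new Node(lower.keys + 28, null)
    // T2-2) CAS on the slot inside the old `upper` node: succeeds.
    val t2ok = upper.child.compareAndSet(lower, lowerCopy)
    // T1-2) CAS on the root slot: also succeeds, because that slot never changed.
    val t1ok = rootSlot.compareAndSet(upper, upperCopy)
    assert(t1ok && t2ok)
    reachable(rootSlot.get)
  }
}
```

Both CAS operations report success, yet `LostInsert.run()` returns the keys 16, 17, 18, 20 and 25: T2 wrote 28 into a node that T1 had already copied, so 28 is no longer reachable.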
Insertion – 2nd attempt
Solution: I-nodes
[Figure sequence: the same two concurrent inserts with I-nodes in place. T1 inserts 18 = 010010₂ and T2 inserts 28 = 011100₂.]
T1-1) allocate 16 17 18
T2-1) allocate 20 25 28
T1-2) CAS
T2-2) CAS
[Figure: afterwards the trie holds both 16 17 18 and 20 25 28.]
Idea: once added to the Ctrie, I-nodes remain present.
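A sketch of why the indirection helps, under the same simplifying assumptions as before (keys in a set, one child per node, the node 20 25 below the node 16 17). The names `CNode` and `INode` are chosen for illustration: the content nodes are immutable, and each I-node is the only mutable cell above its content.

```scala
import java.util.concurrent.atomic.AtomicReference

object INodeInsert {
  // Immutable content node: keys plus an optional child I-node.
  final case class CNode(keys: Set[Int], child: Option[INode])

  // I-node: the only mutable cell. Copies of a CNode keep pointing at the same I-nodes.
  final class INode(init: CNode) {
    val main = new AtomicReference[CNode](init)
  }

  // Insert below one I-node: read, copy with the key added, CAS, retry on failure.
  @annotation.tailrec
  def insertHere(in: INode, key: Int): Unit = {
    val old = in.main.get
    val updated = old.copy(keys = old.keys + key)
    if (!in.main.compareAndSet(old, updated)) insertHere(in, key)
  }

  def reachable(in: INode): Set[Int] = {
    val cn = in.main.get
    cn.keys ++ cn.child.map(reachable).getOrElse(Set.empty[Int])
  }

  // The interleaving from the lost-insert slides, replayed in one thread.
  def run(): Set[Int] = {
    val lower = new INode(CNode(Set(20, 25), None))
    val upper = new INode(CNode(Set(16, 17), Some(lower)))
    val upperOld = upper.main.get
    val upperNew = upperOld.copy(keys = upperOld.keys + 18)   // T1-1) allocate
    val lowerOld = lower.main.get
    val lowerNew = lowerOld.copy(keys = lowerOld.keys + 28)   // T2-1) allocate
    val t2ok = lower.main.compareAndSet(lowerOld, lowerNew)   // T2-2) CAS
    val t1ok = upper.main.compareAndSet(upperOld, upperNew)   // T1-2) CAS
    assert(t1ok && t2ok)
    reachable(upper)
  }
}
```

Here T1's copy still refers to the same I-node that T2 updated, so `INodeInsert.run()` returns all six keys, including 28.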
Remove
Idea: same logic as insert.
[Figure sequence: removing 17 from the node 16 17 18.]
1) allocate: an updated copy 16 18 of the node
2) CAS
[Figure sequence: further keys are removed one at a time (16, 28, 3, 12, 0, 25, 4, then 9 and 20) until only 1 and 18 remain.]
Ctrie is not compact => could be faster
Remove – 2nd attempt
[Figure sequence: a remove has left 18 alone in its node, next to the node 20 25 28.]
3) allocate: a copy 18 20 25 28 that absorbs the single key
4) CAS
Not correct.
[Figure sequence: T1 – compress, T2 – insert 17.]
T1-3) allocate 18 20 25 28
T2-1) allocate 17 18
T1-4) CAS
T2-2) CAS
Remove – 3rd attempt
Idea: disallow insertions as you do compression
[Figure sequence: T1 compresses while T2 inserts 17, as before, but T1 first turns the node holding 18 into a T-node.]
T1-3) allocate T-node 18
T2-1) allocate 17 18
T1-4) CAS
T2-2) CAS failed - repeat
T1-5) allocate 18 20 25 28
T2-1) do the same as T1, then repeat
T1-6) CAS
Is this still lock-free?
Yes - roughly, whoever sees the T-node will help remove it, and there is a finite number of T-nodes (full proof in the paper).
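The helping idea can be sketched in a toy two-level model. Everything here is an assumption made for illustration: there is no hashing, the names `SNode`, `CNode`, `INode` and `TNode` are chosen for the example, and the retry after helping simply inserts at the parent instead of restarting a full descent. It is not the algorithm from the paper.

```scala
import java.util.concurrent.atomic.AtomicReference

object TombHelping {
  sealed trait Branch
  final case class SNode(key: Int) extends Branch        // a single key
  final class INode(init: Main) extends Branch {         // mutable indirection
    val main = new AtomicReference[Main](init)
  }
  sealed trait Main
  final case class CNode(branches: Vector[Branch]) extends Main
  final case class TNode(key: Int) extends Main          // tomb: no more updates here

  // Helping step: replace every tombed child of `parent` by its single key.
  // If the CAS fails, another thread changed `parent` first; callers re-read anyway.
  def clean(parent: INode): Unit = parent.main.get match {
    case old @ CNode(bs) =>
      val compressed: Vector[Branch] = bs.map {
        case in: INode =>
          in.main.get match {
            case TNode(k) => SNode(k)
            case _        => in
          }
        case other => other
      }
      parent.main.compareAndSet(old, CNode(compressed))
      ()
    case _ => ()
  }

  // Append `key` directly below `in`, retrying on CAS failure.
  @annotation.tailrec
  def insertAt(in: INode, key: Int): Unit = in.main.get match {
    case old @ CNode(bs) =>
      if (!in.main.compareAndSet(old, CNode(bs :+ SNode(key)))) insertAt(in, key)
    case TNode(_) => () // not reached in this sketch: the root is never tombed
  }

  // Insert below `child`; if `child` is tombed, help compress, then insert at `parent`.
  @annotation.tailrec
  def insert(parent: INode, child: INode, key: Int): Unit = child.main.get match {
    case old @ CNode(bs) =>
      if (!child.main.compareAndSet(old, CNode(bs :+ SNode(key)))) insert(parent, child, key)
    case TNode(_) =>
      clean(parent)
      insertAt(parent, key)
  }

  def keys(in: INode): Set[Int] = in.main.get match {
    case CNode(bs) => bs.flatMap {
      case SNode(k)   => Set(k)
      case sub: INode => keys(sub)
    }.toSet
    case TNode(k) => Set(k)
  }

  def run(): Set[Int] = {
    val child = new INode(CNode(Vector(SNode(18))))
    val root = new INode(CNode(Vector(SNode(4), SNode(9), SNode(12), child)))
    val seenByT2 = child.main.get                               // T2 reads the node
    val tombed = child.main.compareAndSet(seenByT2, TNode(18))  // T1 installs the T-node
    val stale = child.main.compareAndSet(seenByT2, CNode(Vector(SNode(17), SNode(18))))
    assert(tombed && !stale)                                    // T2-2) CAS failed
    insert(root, child, 17)                                     // T2 repeats: helps, then inserts
    keys(root)
  }
}
```

T2 never waits for T1: on seeing the T-node it performs the compression itself and then completes its own insert, so `TombHelping.run()` ends with the keys 4, 9, 12, 17 and 18 all reachable.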
Is this linearizable?
Yes – roughly, the CAS instruction which makes the new value reachable is the linearization point (see paper for full list).
Evaluation – quad core i7
Evaluation – UltraSPARC T2
Evaluation – 4x 8-core i7
Summary
• pseudocode and implementation for a concurrent hash trie
• properties proven:
  • correctness
  • linearizability
  • lock-freedom
  • compactness
• performance evaluation – scalable insertion and remove
Future work
• concurrent memory pool to avoid GC
• lock-free size, iterator and clear operations running in O(1)
Thank you!