RCUArray: An RCU-like Parallel-Safe Distributed Resizable Array
By Louis Jenkins
The Problem: Parallel-Safe Resizing
• Not inherently thread-safe to access memory while it is being resized
  • Memory has to be ‘moved’ from the smaller storage into larger storage
• Concurrent loads and stores can result in undefined behavior
  • Stores after memory is moved can be lost entirely
  • Loads and stores after the smaller storage is reclaimed can produce undefined behavior
• Why not just synchronize access?
  • Not scalable
• What do we need?
  1. Allow concurrent access to both smaller and larger storage
  2. Ensure safe memory management of smaller storage
  3. Ensure that stores to old memory are visible in larger storage
(Diagram: a Load and a Store accessing the storage.)
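The lost-store hazard described on this slide can be illustrated with a small, single-threaded sketch. It is not taken from the talk: the helper `naive_resize` and the variable names are hypothetical, and Python cannot reproduce the undefined behavior of touching reclaimed memory, so only the lost store is shown.

```python
# Hypothetical sketch: a naive resize that "moves" elements into larger storage.
def naive_resize(old, new_capacity):
    """Copy every element of `old` into a freshly allocated, larger list."""
    new = [0] * new_capacity
    new[:len(old)] = old
    return new

old_storage = [1, 2, 3, 4]
stale_ref = old_storage          # a task that still holds the smaller storage

new_storage = naive_resize(old_storage, 8)

# A store that arrives after the move still lands in the smaller storage...
stale_ref[0] = 99

# ...so a load from the larger storage never observes it: the store is lost.
store_was_lost = new_storage[0] != 99
```

Requirement 3 on the slide exists to rule out exactly this outcome: a store made through the old view must be visible through the new one.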
Read-Copy-Update (RCU)
• Synchronization strategy that favors performance of readers over writers
  • Read the current snapshot 𝑠
  • Copy 𝑠 to create 𝑠′
  • Update applied to 𝑠′, 𝑠′ becomes new current snapshot
• Not applicable in all situations
  • Must be safe to access at least two different snapshots of the same data
(Diagram: snapshot 𝑆 = 𝑏1 alongside its copy 𝑆′ = 𝑏1, 𝑏2, each with a Reader.)
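The read, copy, update sequence can be sketched as follows. This is a minimal illustration under assumptions, not the talk's implementation: the class names `Snapshot` and `RCUCell` are hypothetical, a snapshot is modeled as a tuple of references to shared blocks, and publishing relies on CPython rebinding an attribute atomically.

```python
import threading

class Snapshot:
    """Immutable tuple of block references; the blocks themselves are shared."""
    def __init__(self, blocks):
        self.blocks = tuple(blocks)

class RCUCell:
    """Holds the current snapshot (hypothetical helper, not from the talk)."""
    def __init__(self, snapshot):
        self._current = snapshot
        self._writer_lock = threading.Lock()   # writers exclude other writers only

    def read(self):
        # Readers take no lock: they use whichever snapshot is current.
        return self._current

    def update(self, new_block):
        with self._writer_lock:
            s = self._current                              # Read the current snapshot s
            s_prime = Snapshot(s.blocks + (new_block,))    # Copy s to s', apply the update
            self._current = s_prime                        # s' becomes the current snapshot
        return s_prime

b1 = [0] * 4
cell = RCUCell(Snapshot([b1]))

old = cell.read()        # a reader that started before the update
cell.update([0] * 4)     # the writer appends b2
new = cell.read()        # a reader that started after the update

# Both snapshots stay usable, and both refer to the same block b1.
old.blocks[0][2] = 7
```

Because the copy duplicates only the references, a store made through the old snapshot is still seen through the new one, which is what makes it safe to have two snapshots of the same data in use at once.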
Read-Copy-Update (RCU) vs. Reader-Writer Locks
Read-Copy-Update
• Readers Concurrent with Readers
• Writers Mutually Exclusive with Writers
• Readers Concurrent with Writers
Reader-Writer Locks
• Readers Concurrent with Readers
• Writers Mutually Exclusive with Writers
• Readers Mutually Exclusive with Writers
Distributed RCU
• Privatization and Snapshots
  • Each node in the cluster has its own local snapshot
  • All local snapshots point to the same block
• Reader Concurrency
  • Readers will read from local snapshot only
  • All readers regardless of node will see same block
  • All stores to 𝑏1 are seen by any snapshot or node
• Writer Mutual Exclusion
  • Use a distributed lock
  • Perform each update local to each node
• Results
  • Fast and parallel-safe loads/stores across multiple nodes
  • Allow for loads and stores to be immediately visible
  • 40x faster resizing than naïve Block Distribution at 32 nodes
(Diagram: Locales #0 to #3 each hold a local snapshot 𝑆′ = 𝑏1, 𝑏2; every snapshot refers to the same blocks 𝑏1 and 𝑏2, and each locale has its own Reader.)
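A single-process simulation can make the privatized-snapshot idea concrete. Everything here is an assumption made for illustration: `Locale` and `DistributedRCUSketch` are hypothetical names, an ordinary `threading.Lock` stands in for the distributed lock, and blocks are plain Python lists rather than memory spread across a cluster.

```python
import threading

class Locale:
    """One simulated node with its own privatized snapshot."""
    def __init__(self, locale_id, blocks):
        self.id = locale_id
        self.snapshot = tuple(blocks)

class DistributedRCUSketch:
    def __init__(self, num_locales, block_size):
        self.block_size = block_size
        first_block = [0] * block_size
        # Every local snapshot points to the same block b1.
        self.locales = [Locale(i, [first_block]) for i in range(num_locales)]
        self.cluster_lock = threading.Lock()   # stand-in for a distributed lock

    def load(self, locale_id, index):
        snap = self.locales[locale_id].snapshot       # local snapshot only
        return snap[index // self.block_size][index % self.block_size]

    def store(self, locale_id, index, value):
        snap = self.locales[locale_id].snapshot       # local snapshot only
        snap[index // self.block_size][index % self.block_size] = value

    def grow(self):
        with self.cluster_lock:                       # writers are mutually exclusive
            new_block = [0] * self.block_size
            for locale in self.locales:               # update performed per node
                locale.snapshot = locale.snapshot + (new_block,)

arr = DistributedRCUSketch(num_locales=4, block_size=4)
arr.store(0, 1, 42)             # store issued on locale 0
seen_on_3 = arr.load(3, 1)      # load issued on locale 3

arr.grow()
arr.store(2, 5, 7)              # index 5 lives in the newly appended block
seen_on_1 = arr.load(1, 5)
```

Readers never consult another node's snapshot, yet a store through one snapshot is visible through all of them because the snapshots share the underlying blocks.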
RCUArray – Resizing Example
1. Set of readers 𝑅 begin using snapshot 𝑠 (𝑠 holds block 𝑏1)
2. Writer acquires cluster lock
3. Writer clones 𝑠 to create 𝑠′
4. Writer appends block 𝑏2 to 𝑠′
5. Writer updates current snapshot to 𝑠′
6. Set of readers 𝑅′ begin accessing 𝑠′
7. Readers 𝑅 finish using 𝑠
8. Reclaim 𝑠
9. Writer releases cluster lock
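The resizing sequence can be walked through in a deterministic, single-threaded sketch. The reader counter used to decide when 𝑠 may be reclaimed is an assumption chosen for illustration and may differ from the mechanism RCUArray uses; the names `Snapshot` and `RCUArraySketch` are hypothetical, and `reader_enter` is not race-free under real concurrency.

```python
import threading

class Snapshot:
    def __init__(self, blocks):
        self.blocks = list(blocks)   # references to shared blocks
        self.readers = 0             # readers currently using this snapshot
        self.reclaimed = False

class RCUArraySketch:
    def __init__(self, block_size=4):
        self.block_size = block_size
        self.current = Snapshot([[0] * block_size])
        self.cluster_lock = threading.Lock()   # stand-in for the cluster lock

    def reader_enter(self):
        snap = self.current
        snap.readers += 1
        return snap

    def reader_exit(self, snap):
        snap.readers -= 1

    def begin_resize(self):
        self.cluster_lock.acquire()                    # writer acquires cluster lock
        old = self.current
        new = Snapshot(old.blocks)                     # writer clones s to create s'
        new.blocks.append([0] * self.block_size)       # writer appends block b2 to s'
        self.current = new                             # current snapshot becomes s'
        return old

    def try_finish_resize(self, old):
        if old.readers > 0:                            # readers R still use s
            return False
        old.blocks = None                              # reclaim s (b1 lives on in s')
        old.reclaimed = True
        self.cluster_lock.release()                    # writer releases cluster lock
        return True

arr = RCUArraySketch()
r = arr.reader_enter()                  # readers R begin using s
old = arr.begin_resize()                # lock, clone, append, publish
r_prime = arr.reader_enter()            # readers R' begin accessing s'
blocked = not arr.try_finish_resize(old)
arr.reader_exit(r)                      # readers R finish using s
finished = arr.try_finish_resize(old)   # reclaim s, release the lock
```

Only the snapshot object is reclaimed; block 𝑏1 remains reachable through 𝑠′, so readers 𝑅′ are unaffected.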
Network Atomics vs Remote Execution Atomics
• In Chapel, pointers to potentially remote memory are widened to 128 bits
  • 64-bit address, 32-bit locale id, 32-bit sub-locale id (NUMA)
• Cray’s Aries NIC only supports 64-bit network atomic operations
  • Atomics via remote execution prove to be significantly slower than network atomics
• Distributed wait-free algorithms can scale with network atomics
  • Must have a low constant bound on inter-node communication
(Plots: “Network Execution 26x faster (32 Nodes)”; “Network Execution 20x faster (32 Nodes)”)
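The size problem can be seen by packing the three fields into one integer. The field widths come from the slide; the field order and the helper names `pack_wide_pointer` and `unpack_wide_pointer` are assumptions for illustration and do not describe Chapel's actual in-memory layout.

```python
ADDRESS_BITS = 64
LOCALE_BITS = 32
SUBLOCALE_BITS = 32

def pack_wide_pointer(address, locale_id, sublocale_id):
    """Pack the fields into one integer (field order is an assumption)."""
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in 64 bits")
    if not 0 <= locale_id < (1 << LOCALE_BITS):
        raise ValueError("locale id does not fit in 32 bits")
    if not 0 <= sublocale_id < (1 << SUBLOCALE_BITS):
        raise ValueError("sub-locale id does not fit in 32 bits")
    return (
        (sublocale_id << (ADDRESS_BITS + LOCALE_BITS))
        | (locale_id << ADDRESS_BITS)
        | address
    )

def unpack_wide_pointer(wide):
    """Recover (address, locale_id, sublocale_id) from a packed value."""
    address = wide & ((1 << ADDRESS_BITS) - 1)
    locale_id = (wide >> ADDRESS_BITS) & ((1 << LOCALE_BITS) - 1)
    sublocale_id = wide >> (ADDRESS_BITS + LOCALE_BITS)
    return address, locale_id, sublocale_id

wide = pack_wide_pointer(0x7FFF00001000, locale_id=3, sublocale_id=1)

# A 64-bit network atomic can only operate on values that fit in 64 bits.
fits_in_64_bits = wide < (1 << 64)
```

Any pointer with a non-zero locale or sub-locale id exceeds 64 bits, which is why such a value cannot be the direct operand of a 64-bit network atomic.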
RCUArray as a Dynamic Heap
• Replacing Wide Pointers
  • Blocks have locality information
  • 64 bits vs 128 bits
  • Network Atomics
• Recycling Memory
  • Each node recycles indices to local blocks
• Dynamic Heap
  • Parallel-Safe and Fast Resizing
  • Distributed across multiple locales
  • Great as a per data-structure heap
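Index recycling can be sketched as a per-node free list over the blocks that node owns. The class `LocalHeapSketch`, its method names, and the free-list policy are hypothetical; the slide only states that each node recycles indices to its local blocks.

```python
class LocalHeapSketch:
    """One node's view: indices into its local blocks act as 64-bit handles."""
    def __init__(self, block_size):
        self.block_size = block_size
        self.blocks = []    # blocks local to this node
        self.free = []      # recycled and not-yet-used indices

    def _grow(self):
        # Append one block and make all of its indices available.
        base = len(self.blocks) * self.block_size
        self.blocks.append([None] * self.block_size)
        self.free.extend(reversed(range(base, base + self.block_size)))

    def alloc(self, value):
        if not self.free:
            self._grow()
        index = self.free.pop()
        self.blocks[index // self.block_size][index % self.block_size] = value
        return index

    def get(self, index):
        return self.blocks[index // self.block_size][index % self.block_size]

    def release(self, index):
        # Clear the slot and recycle its index for a later allocation.
        self.blocks[index // self.block_size][index % self.block_size] = None
        self.free.append(index)

heap = LocalHeapSketch(block_size=2)
a = heap.alloc("x")
b = heap.alloc("y")
c = heap.alloc("z")      # the first block is full, so the heap grows
heap.release(b)
d = heap.alloc("w")      # reuses the index that was just released
```

An index like `d` is a small integer rather than a 128-bit wide pointer, so it fits in a 64-bit word and can be the operand of a 64-bit network atomic.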
Conclusion
• Chapel makes RCU easier…
  • Lots of abstractions and language constructs
    • Privatization
    • Parallel remote tasks
  • Including Distributed RCU…
• RCUArray as a distribution
  • Exploring implementation under Domain map Standard Interface (DSI)
• Memory Management Related Efforts
  • Current efforts to add Quiescent State-Based “Garbage Collector” into language
    • 75% finished runtime changes… but on hold
  • Plans to introduce an Epoch-Based “Garbage Collector” as a Chapel module…