Colorama: Architectural Support for Data-Centric Synchronization
Luis Ceze, Pablo Montesinos, Christoph von Praun, and Josep Torrellas, HPCA 2007
Shimin Chen, LBA Reading Group Presentation
Motivation
- Synchronization is a challenging step in parallel programming
- Transactional Memory is helpful but still complicated
  - Programmers have to reason non-locally
  - Code-centric approach
- Data-Centric Synchronization (DCS) is desirable
  - Associate synchronization constraints with data structures
  - Which data items should be in the same critical section
  - System automatically inserts sync operations into code
  - Reason locally
What’s New?
- Existing DCS proposals are SW-only (S-DCS)
  - Cannot handle C/C++ pointer aliasing
  - Unrealistic
- New proposal: hardware DCS (H-DCS)
  - Colorama
  - HW primitives to start and exit critical sections
  - Independent of the underlying sync mechanisms
Outline
- Introduction
- Data-Centric Synchronization (DCS)
- Architectures of Colorama
- Programming with Colorama
- Evaluation
- Conclusion
Data-Centric Synchronization (DCS)
- Data consistency domain
  - Two threads cannot access the same domain at the same time
  - For example: X and Y are in the same domain
    - If a thread is accessing X, no other thread can access X or Y
- System needs to automatically infer entry and exit points of critical sections
  - Entry: access to data in a domain
  - Exit: define a simple, clear exit policy and let programmers write code to conform to this policy
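The X/Y example above can be sketched as a small single-process model. This is only an illustration under stated assumptions: the names `Domain`, `access`, and `leave` are hypothetical, thread IDs are plain strings, and a denied access simply returns False instead of blocking or aborting.

```python
class Domain:
    """A data consistency domain: at most one thread is inside at a time."""
    def __init__(self, name):
        self.name = name
        self.owner = None  # ID of the thread currently inside, or None


def access(domain_of, item, thread_id):
    """Return True if thread_id may touch item; the first access enters the domain."""
    domain = domain_of[item]
    if domain.owner is None:
        domain.owner = thread_id  # entry point is inferred from the access itself
    return domain.owner == thread_id


def leave(domain, thread_id):
    """Exit point: release the domain if thread_id currently holds it."""
    if domain.owner == thread_id:
        domain.owner = None


# X and Y are placed in the same domain, as in the slide's example.
shared = Domain("d0")
domain_of = {"X": shared, "Y": shared}

t1_x = access(domain_of, "X", "T1")         # T1 enters the domain via X
t2_y_blocked = access(domain_of, "Y", "T2")  # T2 is refused Y while T1 is inside
leave(shared, "T1")
t2_y_after = access(domain_of, "Y", "T2")    # now T2 can enter
```

The point of the sketch is that the programmer only states which items share a domain; the enter and leave calls are what the system is expected to insert automatically.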
Software DCS (S-DCS)
- Vaziri et al.’s Atomic Sets
- Compiler and language extensions to Java
- Data consistency domain: atomic set, a subset of fields of a Java class
- Entry point: compiler analysis
- Exit policy: insert exit point
  - In the same method as the entry point, and
  - Right before method return
Colorama: Hardware DCS
- Data consistency domain: color
  - Data item belongs to a domain: colored
- Entry point: detected by HW
- Exit policy: driven by compiler
- Examples: [figures not included in transcript]
Structures Overview
- Every colored data item has an entry in the Palette (details next)
- Per-thread: all 3 structures have the same number of entries
  - Owned colors array: current critical sections
  - CAB, CRB: used for exit policy
Palette
- Palette is based on the Mondrian Memory Protection system (Witchel et al., ASPLOS’02): the white part of the figure
- Extended with a color ID: the gray part of the figure
- Figure labels: SW managed, HW
Entry Point
- HW monitors each load and store
  - Check cached Palette for the mem op
  - Check owned colors array
  - Trigger a user-level SW handler if accessing a colored region not owned
- Handler for entry point:
  - Add color ID into owned colors array
  - Start critical section (e.g., begin transaction)
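The entry-point steps above can be modeled in software roughly as follows. This is a sketch, not the hardware design: the `Palette` here is a sorted list of non-overlapping address ranges (the real Palette layout follows MMP and is not shown in the transcript), and `ThreadState`, `memory_access`, and the log format are hypothetical names chosen for illustration.

```python
from bisect import bisect_right


class Palette:
    """Hypothetical model: maps non-overlapping address ranges to color IDs."""
    def __init__(self):
        self._starts = []   # sorted region start addresses
        self._regions = []  # (start, end_exclusive, color_id), same order

    def color_region(self, start, size, color_id):
        i = bisect_right(self._starts, start)
        self._starts.insert(i, start)
        self._regions.insert(i, (start, start + size, color_id))

    def lookup(self, addr):
        """Return the color ID covering addr, or None if addr is uncolored."""
        i = bisect_right(self._starts, addr) - 1
        if i >= 0:
            start, end, color = self._regions[i]
            if start <= addr < end:
                return color
        return None


class ThreadState:
    def __init__(self):
        self.owned_colors = []  # colors of critical sections currently open
        self.log = []

    def entry_handler(self, color):
        # Handler for entry point: record ownership, then start the section.
        self.owned_colors.append(color)
        self.log.append(("begin", color))


def memory_access(palette, thread, addr):
    """Model of the per-load/store check: Palette lookup, then owned-colors check."""
    color = palette.lookup(addr)
    if color is not None and color not in thread.owned_colors:
        thread.entry_handler(color)
    return color


palette = Palette()
palette.color_region(0x1000, 64, color_id=1)
t = ThreadState()
memory_access(palette, t, 0x1008)  # first touch of color 1: handler runs
memory_access(palette, t, 0x1010)  # same color, already owned: no handler
memory_access(palette, t, 0x2000)  # uncolored address: nothing happens
```

Only the first access to a color pays for the handler; later accesses to an owned color hit the fast path of the check.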
Exit Policy
- Exit a critical section when the thread returns from the subroutine where the critical section was entered
Implementing Exit Policy
- Color acquire bitmap register (CAB) and color release bitmap register (CRB)
- CAB automatically set by HW at entry points
- Compiler generates the following code:
  - Subroutine prologue: Push CAB; CAB ← 0
  - Subroutine epilogue: CRB ← CAB; Pop CAB
- Upon write to CRB: HW triggers user-level handler
  - Handler: remove color ID from owned colors array, exit critical section
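A small simulation may help show how the prologue and epilogue code implements the exit policy. It is a sketch under assumptions: bit i of CAB/CRB is taken to correspond to slot i of the owned colors array (the slides only say the structures have the same number of entries), and the class and method names are hypothetical.

```python
class Thread:
    def __init__(self, entries=8):
        self.owned = [None] * entries  # owned colors array, one slot per bit
        self.cab = 0                   # color acquire bitmap register
        self.stack = []                # stands in for the program stack
        self.log = []

    def enter(self, color):
        """Entry point: take a free slot and set the matching CAB bit."""
        if color in self.owned:
            return
        slot = self.owned.index(None)  # first free slot; ValueError if full
        self.owned[slot] = color
        self.cab |= 1 << slot
        self.log.append(("begin", color))

    def write_crb(self, value):
        """Write to CRB: the handler releases every color whose bit is set."""
        for slot in range(len(self.owned)):
            if value & (1 << slot):
                self.log.append(("end", self.owned[slot]))
                self.owned[slot] = None

    def prologue(self):
        self.stack.append(self.cab)  # Push CAB
        self.cab = 0                 # CAB <- 0

    def epilogue(self):
        self.write_crb(self.cab)     # CRB <- CAB
        self.cab = self.stack.pop()  # Pop CAB


t = Thread()
t.prologue()   # outer subroutine starts
t.enter(1)     # outer touches data of color 1
t.prologue()   # inner subroutine starts
t.enter(2)     # inner touches data of color 2
t.epilogue()   # inner returns: only color 2 is released
t.epilogue()   # outer returns: color 1 is released
```

Because the prologue clears CAB, each subroutine releases only the colors it acquired itself, which is exactly the "exit where entered" policy.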
Handling Pointers as Subroutine Arguments
- Perform multiple operations on a structure together
- Propose “colorcheck” instruction
Using Locks as Sync Mechanisms
- Colorama can also use locks
- Two potential problems:
  - Longer critical sections, thus maybe more contention
  - May deadlock
- See evaluations
Correctness
- Critical sections of the same color are serialized
- Correctly colored programs are data-race free
- Possible programming errors:
  - Fail to color shared data structures
  - Use different colors for data that should be protected together
Compatibility Issues
- Legacy libraries that do not use Colorama
  - OK if they explicitly protect lib data using locks, etc.
  - Colorama protects application data outside of lib
- Cases that require extensions to Colorama
  - Worker thread executes an infinite loop that processes incoming requests
  - Needs to release lock, wait, acquire lock in the same loop
  - Colorama extensions: getcolorid etc.
Complete API
Setup
- Evaluation is based on analyzing applications using a Pin-based tool
Is the Exit Policy Suitable?
- Matched: lock acquire & release in the same subroutine
- Almost all dynamic and 95% of static critical sections are matched
- Answer: Yes
Critical Section Size Increase
- How often are multiple independent critical sections in the same subroutine?
  - Potential deadlocks
  - 1% dynamic and 4% static
  - Detailed analysis shows that the resulting lock order is always the same, thus no deadlocks
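The slides do not describe how the lock-order analysis was done. As a generic illustration of the idea (not the authors' tool), one can build a lock-order graph from observed nested acquisitions and check it for cycles: if the graph is acyclic, all threads acquire locks in a consistent order and this pattern cannot deadlock. The trace format and function names below are assumptions.

```python
def lock_order_edges(traces):
    """Each trace lists locks in acquisition order, all still held (nested).
    Returns edges (a, b) meaning 'b was acquired while a was held'."""
    edges = set()
    for trace in traces:
        for i, held in enumerate(trace):
            for later in trace[i + 1:]:
                if held != later:
                    edges.add((held, later))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed lock-order graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    WHITE, GRAY, BLACK = 0, 1, 2
    state = {node: WHITE for node in graph}

    def visit(node):
        state[node] = GRAY
        for succ in graph[node]:
            if state[succ] == GRAY:
                return True  # back edge: inconsistent lock order
            if state[succ] == WHITE and visit(succ):
                return True
        state[node] = BLACK
        return False

    return any(state[node] == WHITE and visit(node) for node in graph)


consistent = [["A", "B"], ["A", "B", "C"], ["B", "C"]]
inconsistent = [["A", "B"], ["B", "A"]]

consistent_has_cycle = has_cycle(lock_order_edges(consistent))
inconsistent_has_cycle = has_cycle(lock_order_edges(inconsistent))
```

Here `consistent_has_cycle` is False and `inconsistent_has_cycle` is True; the second case is the classic two-lock deadlock pattern.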
Structure Sizes
- # of palette rows: # of allocated regions + # of static data objects
- # of colors: # of lock addresses
- # of Owned Colors Array entries: max # of active locks held by a thread
Colorama Instruction Overheads
- Per-routine:
  - Prologue & epilogue: 6 insn/routine
  - 1 colorcheck insn per pointer argument
  - Estimate 7 insn/routine
  - On avg, 1.6 routines per 100 dynamic insns: so ~11% insns
- Entry and exit handlers: low frequency of critical section entry and exit, so low overhead
- Coloring overheads ~ memory allocation calls
  - # of insns between allocations: firefox, gaim, gftp: 2-4K
  - Memory allocators can keep pools of colored memory (??)
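The ~11% figure follows directly from the per-routine numbers on the slide. The short calculation below reproduces it; reading the 7 insn/routine estimate as 6 prologue/epilogue instructions plus one colorcheck (one pointer argument per routine on average) is an assumption, since the slide does not spell out the breakdown.

```python
prologue_epilogue_insns = 6  # from the slide
colorcheck_insns = 1         # assumption: one pointer argument per routine
insns_per_routine = prologue_epilogue_insns + colorcheck_insns

routines_per_100_insns = 1.6  # from the slide (dynamic average)

# Extra instructions added per 100 original dynamic instructions,
# which is the same number as the overhead in percent.
overhead_percent = insns_per_routine * routines_per_100_insns
```

With these inputs `overhead_percent` is about 11.2, which the slide rounds to ~11%.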
Memory Overhead
- MMP: Mondrian Memory Protection
- Palette adds 1-2.5% more space over app footprint
Conclusions
- Colorama: Hardware Data-Centric Synchronization
- HW support for entry and exit points
- Evaluation suggests:
  - Exit policy is suitable
  - Low impact on critical section lengths
  - Modest additional overhead over MMP
- This paper does not even do simulation!
Related Work
- Monitors