Safe and Efficient Supervised Memory Systems
Jayaram Bobba†, Marc Lupon‡, Mark D. Hill, and David A. Wood
Department of Computer Sciences, University of Wisconsin-Madison
†Intel Corporation ‡Universitat Politècnica de Catalunya
† ‡ Work done while at University of Wisconsin-Madison
1) Out-of-band metadata per data block
2) Monitor, control (supervise) data accesses
3) Run handlers on specific metadata states
Wisconsin Multifacet Project 2
Why Supervised Memory Systems?
• SW more complex, HW more powerful: Productivity Wall
• Hardware Support to Improve Productivity
• Supervised (Memory) Systems
– Hardware TM
– Empty/full-bits
– MemTracker, SafeMem, iWatcher
– Deterministic Shared Memory
– Hardware-assisted GC
– Information Flow Tracking
2/15/2011
Executive Summary
• Many supervised memory systems
• Most assume SC, but few real systems implement SC
1. Moving to TSO (x86 & SPARC) is non-trivial
2. Supervised Memory for TSO
– TSOall: TSO for data & metadata (simple but slow)
– TSOdata: TSO for data only (fast but tricky)
3. Safe Supervision
– Metadata for X only controls data at X
– Fast & Simple
• Formal Foundation for Current/Future Supervised Systems
Outline
• Introduction
• Move To TSO non-trivial
– Case Study: Deterministic Multiprocessor (DMP)
• Supervised Memory for TSO
• Safe Supervision
• Evaluation
A TSO-compliant system
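[Figure: processors P1 and P2, each with a PC, registers r1, r2, r3 and a store buffer, above a shared memory.]
Program: ST 1, [A]; LD [B], r1; ST 2, [C]; LD [C], r3
Store buffer entries: ST 0x01, A; ST 0x10, C
Memory (Block: Data): A: 0x00, B: 0x01, C: 0x11
Highlighted pair: ST A, LD B
Reordering can be incorrect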
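The store-buffer behavior on this slide can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the simulator used in the talk: the `Core` class and its `store`, `load` and `drain_one` methods are hypothetical names. Stores enter a per-core FIFO buffer and reach memory later; loads check the core's own buffer first and otherwise read memory, so a load can complete before an older store to a different address is visible to other cores.

```python
from collections import deque

class Core:
    """One processor with a FIFO store buffer (hypothetical model)."""
    def __init__(self, memory):
        self.memory = memory          # shared dict: address -> value
        self.buffer = deque()         # pending (address, value) stores, oldest first

    def store(self, addr, value):
        # The store retires into the buffer; memory is not updated yet.
        self.buffer.append((addr, value))

    def load(self, addr):
        # Forward from the youngest matching buffered store, else read memory.
        for a, v in reversed(self.buffer):
            if a == addr:
                return v
        return self.memory[addr]

    def drain_one(self):
        # Make the oldest buffered store globally visible (FIFO keeps store order).
        addr, value = self.buffer.popleft()
        self.memory[addr] = value

memory = {"A": 0x00, "B": 0x01, "C": 0x11}
p1, p2 = Core(memory), Core(memory)

p1.store("A", 1)            # ST 1, [A]   (buffered)
r1 = p1.load("B")           # LD [B], r1  (completes before ST A is visible)
p1.store("C", 2)            # ST 2, [C]   (buffered behind ST A)
r3 = p1.load("C")           # LD [C], r3  (forwarded from the store buffer)

seen_by_p2 = p2.load("A")   # P2 still sees the old value of A
p1.drain_one()              # ST A reaches memory
p1.drain_one()              # ST C reaches memory
```

The load of B finishes while the store to A is still buffered, which is the reordering TSO permits. The FIFO drain keeps the two stores in program order.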
DMP-ShTab [Devietti et al., ASPLOS 09]
[Figure: threads T1 and T2 on processors P1 and P2, each with a PC and registers r1, r2, r3, above a shared memory whose blocks carry ownership metadata.]
Instruction sequences shown: LD [Y], r2; ST r2, [Y]; ST 2, [A]; LD [B], r3 and LD [X], r1; LD [B], r2; ST 1, [B]
Memory (Block: Data): A: 0x10, B: 0x00 / 0x01, X: 0x11, Y: 0xff
Metadata states shown: Owned@T1, Owned@T2, Shared@T1,T2
Other labels: STALL, Private, Shared
Is reordering safe? A Case Study: DMP-ShTab on TSO
[Figure: threads T1 and T2 on processors P1 and P2, each with a PC, registers r1, r2, r3 and a store buffer, above a shared memory whose blocks carry ownership metadata.]
Instruction sequences shown: LD [Y], r2; ST r2, [Y]; ST 2, [A]; LD [B], r3 and LD [X], r1; LD [B], r2; ST 1, [B]
Store buffer entry: ST 0x10, A
Memory (Block: Data): A: 0x10, B: 0x00, X: 0x11, Y: 0xff
Metadata states shown: Owned@T1, Owned@T2, Shared@T1,T2
Other labels: STALL, Private, Shared
Case 1: LD B does not pass ST A: r3 gets 0x01
Case 2: LD B passes ST A: r3 gets 0x00
Reordering can be incorrect
Explore relaxed supervised systems
Outline
• Introduction
• Move To TSO non-trivial
• Supervised Memory for TSO
– Define Supervised Memory
– TSOall: Simple but Slow
– TSOdata: Fast but tricky
• Safe Supervision
• Evaluation
Supervised Memory
• Each memory location A has
– data (A.d)
– metadata (A.m)
• New operations
– Supervised Load (sLD A)
– Supervised Store (sST A)
• Jump on reading special metadata (optional)
– Hardware exception
Supervised Operations
sLD A =>
Start:
atomic {
    curm = Val[R A.m]                          // Read metadata
    nextm = NEXT(Load, curm)                   // Check software-specified FSM
    If nextm == EXCEPTION then Jump to Handler
    If (nextm != curm) then W A.m, nextm       // Update metadata
    R A.d                                      // Read data
}
Handler:
…
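The supervised-load pseudocode can be mirrored in a short Python sketch. Everything beyond the slide is an assumption: the `NEXT` table is a made-up empty/full-bit FSM (the slide leaves the FSM software-specified), the class and method names are hypothetical, a lock stands in for the `atomic{}` block, and raising an exception stands in for the jump to the handler. The supervised store follows the access sequence the talk gives for sST: read metadata, write data, write metadata.

```python
import threading

EXCEPTION = "EXCEPTION"

# Hypothetical software-specified FSM for empty/full-bits:
# (operation, current metadata) -> next metadata or EXCEPTION.
NEXT = {
    ("Load", "Full"): "Empty",      # consuming load empties the location
    ("Load", "Empty"): EXCEPTION,   # nothing to read yet
    ("Store", "Empty"): "Full",     # producing store fills the location
    ("Store", "Full"): EXCEPTION,   # would overwrite unread data
}

class SupervisionException(Exception):
    """Stands in for the jump to the handler."""

class SupervisedMemory:
    def __init__(self):
        self.data = {}                 # A.d for each location A
        self.meta = {}                 # A.m for each location A
        self._lock = threading.Lock()  # models the atomic{} block

    def sld(self, addr):
        with self._lock:
            curm = self.meta[addr]                 # read metadata
            nextm = NEXT[("Load", curm)]           # check the FSM
            if nextm == EXCEPTION:
                raise SupervisionException(addr)   # jump to handler
            if nextm != curm:
                self.meta[addr] = nextm            # update metadata
            return self.data[addr]                 # read data

    def sst(self, addr, value):
        with self._lock:
            curm = self.meta[addr]                 # read metadata
            nextm = NEXT[("Store", curm)]          # check the FSM
            if nextm == EXCEPTION:
                raise SupervisionException(addr)   # jump to handler
            self.data[addr] = value                # write data
            if nextm != curm:
                self.meta[addr] = nextm            # update metadata

mem = SupervisedMemory()
mem.data["A"], mem.meta["A"] = 0, "Empty"
mem.sst("A", 7)          # Empty -> Full
value = mem.sld("A")     # Full -> Empty, returns the stored data
```

With this assumed FSM, a second `mem.sld("A")` would raise `SupervisionException`, because the location is Empty again.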
TSO Axioms [Hangal et al., ISCA 2004]
Axiom           Description
Order           Total order on all write accesses
Atomicity       No intervening accesses for atomic operations
Termination     All write accesses eventually complete
Value           Reads return latest value from memory or store buffer
Memory Barrier  No reordering across a barrier
ReadAny         Accesses cannot pass outstanding reads (reordering axiom)
WriteWrite      Write accesses cannot pass outstanding writes (reordering axiom)
Program-order pairs under the reordering axioms:
Rd A, Rd B: ordered
Rd A, Wr B: ordered
Wr A, Wr B: ordered
Wr A, Rd B: may reorder (allows store buffers)
TSOall: A Consistency Model for Supervised Memory
TSO axioms applied to all accesses, both data and metadata
+ (Simple) Like TSO
– (Slow) Prohibits optimizations
Thread: sST A; sLD B
-> [Rd A.m, Wr A.d, Wr A.m] -> [Rd B.m, Rd B.d]
=> Store buffers ineffective
TSOdata: Fast Yet Simple
Thread: sST A; sLD B
-> [Rd A.m, Wr A.d, Wr A.m] -> [Rd B.m, Rd B.d]
Axiom           Description
Order           Total order on all write accesses
Atomicity       No intervening accesses for atomic operations
Termination     All write accesses eventually complete
Value           Reads return latest value from memory or store buffer
Memory Barrier  No reordering across a barrier
ReadAny         Data accesses cannot pass outstanding data reads (reordering axiom)
WriteWrite      Data writes cannot pass outstanding data writes (reordering axiom)
=> Store buffers can be used
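The difference between the two models can be checked with a small Python sketch of the two reordering axioms. It is a simplification under stated assumptions: only ReadAny and WriteWrite are modeled, the names `Access`, `may_pass` and `op_may_pass` are hypothetical, and a supervised operation is treated as a unit whose accesses are all outstanding together (a rough stand-in for the Atomicity axiom).

```python
from typing import NamedTuple

class Access(NamedTuple):
    kind: str      # "Rd" or "Wr"
    target: str    # e.g. "A.d" (data) or "A.m" (metadata)

    @property
    def is_data(self):
        return self.target.endswith(".d")

def may_pass(later, earlier, model):
    """True if `later` may be reordered ahead of the outstanding `earlier`."""
    if model == "TSOall":
        constrained = True                                # axioms cover every access
    elif model == "TSOdata":
        constrained = later.is_data and earlier.is_data   # data accesses only
    else:
        raise ValueError(model)
    if not constrained:
        return True
    if earlier.kind == "Rd":                              # ReadAny
        return False
    if later.kind == "Wr":                                # WriteWrite (earlier is a write)
        return False
    return True                                           # Wr then Rd: store-buffer case

def op_may_pass(later_op, earlier_op, model):
    """True if every access of `later_op` may pass every access of `earlier_op`."""
    return all(may_pass(l, e, model) for l in later_op for e in earlier_op)

sst_a = [Access("Rd", "A.m"), Access("Wr", "A.d"), Access("Wr", "A.m")]
sld_b = [Access("Rd", "B.m"), Access("Rd", "B.d")]

blocked_under_all = not op_may_pass(sld_b, sst_a, "TSOall")
free_under_data = op_may_pass(sld_b, sst_a, "TSOdata")
```

Under TSOall the metadata read `Rd A.m` of the buffered store blocks the later load through ReadAny, which matches the slide's point that store buffers become ineffective. Under TSOdata that read is exempt and the load may proceed.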
Outline
• Introduction
• Move To TSO non-trivial
• Supervised Memory for TSO
• Safe Supervision
• Evaluation
Safe Supervision: Motivation
No Reordering, Easy to Reason (TSOall) vs. Reorder, Performance (TSOdata)
Blast from the Past [Adve and Hill, ISCA 1990]
No Reordering, Easy to Reason (SC) vs. Reorder, Performance (RC)
• Observation:
– Simple programs rely only on certain SC orders
– Ignore non-essential orders. Still appears as SC
• Challenge: Simple? Non-essential orders?
• Solution: Data-race-freedom
– For data-race-free programs, RC = SC
Safe Supervision: Motivation
No Reordering, Easy to Reason (TSOall) vs. Reorder, Performance (TSOdata)
• Observation:
– Simple supervised programs rely only on certain orders
– Ignore non-essential orders. Still appears as TSOall
• Challenge: Simple? Non-essential orders?
• Solution: Safe Supervision
– For safely-supervised programs, TSOdata = TSOall
Safe Supervision
• A location's metadata is only used to control access to that location's data
• Most uses of supervision are safely supervised. E.g.,
– Heap Checker: Initialized/Uninitialized values
– Transactional Memory: Conflict Detection information
• DMP is NOT safely-supervised
Example that is not safely supervised (A's metadata guards B's data):
Initially, A.mdata = Empty, B.data = 0
Thread 1:
B.data = 1
A.mdata = Full
Thread 2:
While (A.mdata == Empty);
Read B.data
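Why the two-thread example is a problem can be seen by enumerating the values Thread 2 may read. The sketch below is a deliberately small model built on assumptions: it only considers the order in which Thread 1's two writes reach memory, it treats `A.mdata = Full` as a plain metadata write, and the function names are hypothetical. Under TSOall the WriteWrite axiom keeps both writes in program order; under TSOdata it covers data writes only, so the metadata write may become visible first.

```python
from itertools import permutations

def visible_orders(model):
    """Orders in which Thread 1's two writes can reach memory."""
    writes = [("B.d", 1), ("A.m", "Full")]      # program order
    if model == "TSOall":
        return [tuple(writes)]                  # WriteWrite covers metadata too
    if model == "TSOdata":
        return list(permutations(writes))       # A.m is metadata: unordered vs B.d
    raise ValueError(model)

def possible_reads(model):
    """Values Thread 2 can read from B.d after it sees A.m == Full."""
    results = set()
    for order in visible_orders(model):
        memory = {"A.m": "Empty", "B.d": 0}
        for target, value in order:
            memory[target] = value
            if memory["A.m"] == "Full":
                # Thread 2 may leave its spin loop here and read B.d.
                results.add(memory["B.d"])
    return results

reads_all = possible_reads("TSOall")
reads_data = possible_reads("TSOdata")
```

In this model Thread 2 always reads 1 under TSOall, but may read the stale 0 under TSOdata. Using A's metadata as a flag for B's data is exactly the pattern that safe supervision rules out.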
Outline
• Introduction
• Move To TSO non-trivial
• Supervised Memory for TSO
• Safe Supervision
• Evaluation
– Is reordering useful?
Supervised Systems
• TokenTM [Bobba et al., ISCA 2008]
– Transactional Memory
– Metadata for tracking read/write-sets
• HARD [Zhou et al., HPCA 2007]
– Race Detection
– Metadata for tracking sharing state and locksets
• Both safely-supervised
Reordering is useful
Evaluation Setup
• Systems
– TokenTM on in-order
  • TokenTMall on TSOall, TokenTMdata on TSOdata
– HARD on OOO superscalar
  • HARDall on TSOall, HARDdata on TSOdata
• Simulation built on Multifacet GEMS
• Workloads
– TokenTM: STAMP
– HARD: Wisconsin Commercial Workload Suite
Results: TokenTM
[Chart: performance normalized to TokenTM_all (y-axis 0.8 to 1.4) for TokenTM_all and TokenTM_data on Bayes, Genome, Intruder, Kmeans, Labyrinth, Ssca2, Vacation, Yada.]
Speedups: 3% in Kmeans to 22% in Labyrinth
Results: HARD
[Chart: performance normalized to HARD_all (y-axis 0.8 to 1.4) for HARD_all and HARD_data on Apache, JBB, OLTP, ZEUS.]
Speedups: 3% in JBB to 5% in Apache
In the paper…
• Formal models
– Formal Definition of Safe Supervision
– Proofs (in thesis): http://www.cs.wisc.edu/multifacet/theses/jayaram_bobba_phd.pdf
• OpenSPARC case study
– How to handle reordering issues?
– Metadata overhead
Deterministic Shared Memory (DMP) [Devietti et al., ASPLOS 2009]
“depending upon the consistency model of the underlying hardware, threads must perform a memory fence at the edge of a quantum”
• Insert a fence after the last operation in the quantum
• Insert a fence before the first shared operation in the quantum
I3: Reordered metabit-reads
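Is reordering trivial? Empty/full-bits
[Figure: processors, each with a PC, registers r1, r2, r3 and a store buffer, above a shared memory whose blocks carry empty/full metadata.]
Program: ST 1, [A]; LD [B], r1; ST 2, [C]; LD [C], r3
Store buffer entries: ST 0x01, A; ST 0x10, C
Memory (Block: Data): A: 0x00 / 0x01, B: 0x01, C: 0x11
Metadata values shown: Full, None, Empty
Empty/full-bit FSM: states Empty, Full, None; transitions labeled LD, ST, LD/ST; Exception
I2: No load bypass
I3: Late exceptions
Explore relaxed supervised systems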
TSOdata on OpenSPARC T2
• Goal: Explore low-level issues on a real design
• Late Exceptions with deferred handlers
– Dump store buffer entries on exception
– Enhance store buffer to carry Virtual Address (VA)
– ~200 cycles to read out 4 entries
• Disable store buffer bypassing for supervised loads
• Low space overhead for adding metabits (~4%)
Existing proposals assume SC
• Assume SC or don't deal with multiprocessors
Proposal          Base Architecture   Implementation
WWT               MIPS                SC
Tapeworm          MIPS                SC
LogTM             SPARC               SC
OneTM             SPARC               SC
Informing Memory  MIPS, Alpha         SC
SafeMem           x86                 x86
MemTracker        MIPS                SC
DMP               x86                 SC
Explore relaxed memory systems
Non-TSOall Executions
TSOdata is Complex
Empty/full-bits FSM labels: sLD, sST, Full, Empty, Exception
Initial State:
A.d = 0, A.m = None
B.d = 0, B.m = Empty
T0:
dST 1, A
sLD B
T1:
sST B, 1
dLD A
Can dLD A return 0?