Transactional Memory Guest Lecture Design of Parallel and High-Performance Computing Georg Ofenbeck

Post on 23-Feb-2016


Page 1: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Transactional Memory
Guest Lecture, Design of Parallel and High-Performance Computing

Georg Ofenbeck

Page 2: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Locking Recap: The Problem

void PushLeft(DQueue *q, int val) {
    QNode *qn = malloc(sizeof(QNode));
    qn->val = val;
    QNode *LS = q->left;
    QNode *LN = LS->right;
    qn->left = LS;
    qn->right = LN;
    LS->right = qn;
    LN->left = qn;
}

[Figure: double queue with the nodes LS, 10, 20, RS and the new node qn]

How to make this parallel?
  A global lock is too coarse-grained.
  Fine-grained locking is not trivial.
  How to compose operations?
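
The coarse-grained option can be sketched as follows. This is a minimal illustration, not code from the lecture: the layouts of `QNode` and `DQueue`, the `lock` field, and the helper `InitQueue` are assumptions, since the slides do not show the type definitions. It protects the whole queue with one POSIX mutex, which is correct but serializes every operation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Assumed layouts; the slides do not show these definitions. */
typedef struct QNode {
    struct QNode *left, *right;
    int val;
} QNode;

typedef struct DQueue {
    QNode *left, *right;      /* the two end nodes (LS and RS in the figure) */
    pthread_mutex_t lock;     /* one global lock for the whole queue */
} DQueue;

/* Hypothetical helper: builds an empty queue LS <-> RS. */
void InitQueue(DQueue *q) {
    q->left = malloc(sizeof(QNode));
    q->right = malloc(sizeof(QNode));
    assert(q->left != NULL && q->right != NULL);
    q->left->left = NULL;
    q->left->right = q->right;
    q->right->left = q->left;
    q->right->right = NULL;
    q->left->val = 0;
    q->right->val = 0;
    pthread_mutex_init(&q->lock, NULL);
}

void PushLeft(DQueue *q, int val) {
    QNode *qn = malloc(sizeof(QNode));   /* allocation needs no lock */
    assert(qn != NULL);
    qn->val = val;

    pthread_mutex_lock(&q->lock);        /* everything below is serialized */
    QNode *LS = q->left;
    QNode *LN = LS->right;
    qn->left = LS;
    qn->right = LN;
    LS->right = qn;
    LN->left = qn;
    pthread_mutex_unlock(&q->lock);
}
```

With this version a push on the left end also blocks any operation on the right end, even when the two could safely run in parallel. That is the sense in which a global lock is too coarse-grained.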

Page 3: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Locking Recap: A Possible Solution

void PushLeft(DQueue *q, int val) {
    QNode *qn = malloc(sizeof(QNode));
    qn->val = val;
    atomic {
        QNode *LS = q->left;
        QNode *LN = LS->right;
        qn->left = LS;
        qn->right = LN;
        LS->right = qn;
        LN->left = qn;
    }
}

[Figure: double queue with the nodes LS, 10, 20, RS and the new node qn; the atomic block is labelled "Transaction"]

Transactional memory might be a solution.
  It makes a code block behave as if it were executed as a single atomic instruction.

Page 4: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Historical Background

Databases have been accessed concurrently for decades.
  “Transactions” serve as the abstraction tool.
  Atomic, Consistent, Isolated, Durable (ACID)

Transactions were proposed for general applications as early as 1977 by Lomet.
  Transactions on memory instead of databases

Picture: http://godatabase.net

Page 5: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Research on Transactional Memory

[Figure: publications on transactional memory per year, 1998 to 2011. Series: ACM Digital Library, Google Scholar hits, IEEE Xplore. Axis labels: number of publications; number of hits / 10. Annotation: multicore.]

Page 6: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Research on Software TM (STM)

Intel C++ STM compiler: extends C++ with support for STM language extensions
Microsoft STM.NET: an experimental extension of the .NET Framework
Sun/Oracle DSTM2: Dynamic Software Transactional Memory library

Page 7: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

ACID

[Code: PushLeft with the atomic block, as on Page 3; the atomic block is labelled "Transaction"]

[Figure: double queue with the nodes LS, 10, 20, RS]

Page 8: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

ACID: Atomicity

[Code: PushLeft with the atomic block, as on Page 3]

Failure atomicity
  A transaction only succeeds if all of its parts succeed.
  In case of failure, no evidence of its execution is left behind.
  This does not mean atomic execution.

Page 10: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

ACID: Consistency

[Code: PushLeft with the atomic block, as on Page 3]

Consistency
  A consistent state (e.g. all links are intact) is preserved when the transaction ends.
  In case of an abort, this is guaranteed by failure atomicity.

Page 11: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

ACID: Isolation

[Code: PushLeft with the atomic block, as on Page 3]

Isolation
  Transactions do not interfere with each other.

Page 13: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

ACID: Durability

[Code: PushLeft with the atomic block, as on Page 3]

Durability
  In the database world: persisting data.
  Sometimes relevant for hardware TM: persisting caches to memory.

Page 14: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

TM Design Choices

Failure atomicity: How to restore the original state?
  Version Control

Isolation: How to shield concurrent transactions from each other?
  Concurrency Control

Consistency: How to detect conflicts?
  Conflict Detection

Page 17: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Version Control: Eager Version Management

[Code: PushLeft with the atomic block, as on Page 3]

Write changes directly to memory and save the overwritten values in an undo log.
On success, simply discard the undo log.
On failure, restore the original values from the undo log.

[Figure: undo log with the entries QNode *LS, QNode *LN, NULL, NULL, next to the double queue (LS, 10) and the new node qn]
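
The undo-log idea can be sketched in a few lines of C. This is an illustration under stated assumptions, not the lecture's implementation: the names `UndoLog`, `UndoWrite`, `UndoCommit` and `UndoAbort` are hypothetical, the log has a fixed capacity, it tracks plain `int` locations instead of the pointer fields on the slide, and it ignores concurrency entirely.

```c
#include <assert.h>
#include <stddef.h>

#define UNDO_MAX 64   /* hypothetical fixed capacity, for brevity */

/* One undo-log entry: where we wrote and what was there before. */
typedef struct {
    int *addr;
    int  old;
} UndoEntry;

typedef struct {
    UndoEntry entries[UNDO_MAX];
    size_t    n;
} UndoLog;

/* Eager write: save the old value, then update memory in place. */
void UndoWrite(UndoLog *log, int *addr, int val) {
    assert(log->n < UNDO_MAX);
    log->entries[log->n].addr = addr;
    log->entries[log->n].old  = *addr;
    log->n++;
    *addr = val;
}

/* Commit: the new values are already in memory, so just drop the log. */
void UndoCommit(UndoLog *log) {
    log->n = 0;
}

/* Abort: restore the old values in reverse order of the writes. */
void UndoAbort(UndoLog *log) {
    while (log->n > 0) {
        log->n--;
        *log->entries[log->n].addr = log->entries[log->n].old;
    }
}
```

Commit is cheap and abort is expensive in this scheme. Restoring in reverse order matters when the same location was written more than once, because the oldest saved value has to win.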

Page 19: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Version Control: Lazy Version Management

[Code: PushLeft with the atomic block, as on Page 3]

Write changes into a buffer instead of memory.
Apply the buffered changes at commit if no conflict occurred.
Otherwise just discard the buffer.
Most common approach.

[Figure: Buffer next to the double queue (LS, 10) and the new node qn]

Page 20: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Version Control: Lazy Version Management

Write changes into a buffer instead of memory.
Apply the buffered changes at commit if no conflict occurred.
Otherwise just discard the buffer.
Most common approach.

void Example() {
    x = 2;
    y = 0;
    atomic {
        x = 3;
        y = x;
    }
}

[Figure: Buffer, with the values x = 2, y = 0, x = 3 and a question mark]
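
The Example above hints at a subtlety of buffering: inside the transaction, `y = x` has to observe the buffered `x = 3` and not the `x = 2` still in memory. A minimal sketch of a write buffer with that property follows. The names `WriteBuffer`, `BufWrite`, `BufRead`, `BufCommit` and `BufAbort` are hypothetical, the capacity is fixed, and the conflict check at commit is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_MAX 64   /* hypothetical fixed capacity, for brevity */

typedef struct {
    int *addr;
    int  val;
} BufEntry;

typedef struct {
    BufEntry entries[BUF_MAX];
    size_t   n;
} WriteBuffer;

/* Lazy write: memory is untouched; the new value goes into the buffer. */
void BufWrite(WriteBuffer *b, int *addr, int val) {
    for (size_t i = 0; i < b->n; i++) {
        if (b->entries[i].addr == addr) {   /* overwrite an earlier write */
            b->entries[i].val = val;
            return;
        }
    }
    assert(b->n < BUF_MAX);
    b->entries[b->n].addr = addr;
    b->entries[b->n].val  = val;
    b->n++;
}

/* Read: a transaction must see its own buffered writes first. */
int BufRead(const WriteBuffer *b, const int *addr) {
    for (size_t i = 0; i < b->n; i++) {
        if (b->entries[i].addr == addr) {
            return b->entries[i].val;
        }
    }
    return *addr;   /* not written in this transaction: read memory */
}

/* Commit: apply all buffered writes to memory (conflict check omitted). */
void BufCommit(WriteBuffer *b) {
    for (size_t i = 0; i < b->n; i++) {
        *b->entries[i].addr = b->entries[i].val;
    }
    b->n = 0;
}

/* Abort: nothing reached memory, so discarding the buffer is enough. */
void BufAbort(WriteBuffer *b) {
    b->n = 0;
}
```

Compared with an undo log the costs are reversed: abort is trivial, while every read pays for a buffer lookup and commit has to copy the values back.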

Page 21: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Version Control: Redirection

Explicit by library
Implicit by compiler
Overhead!

void PushLeft(DQueue *q, int val) {
    QNode *qn = malloc(sizeof(QNode));
    qn->val = val;
    do {
        StartTx();
        QNode *LS = ReadTx(&(q->left));
        /* left neighbour is LS->right, as in the atomic version */
        QNode *oLN = ReadTx(&(LS->right));
        WriteTx(&(qn->left), LS);
        WriteTx(&(qn->right), oLN);
        WriteTx(&(LS->right), qn);
        WriteTx(&(oLN->left), qn);
    } while (!CommitTx());   /* retry until the commit succeeds */
}

Page 24: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Concurrency Control: Pessimistic

[Code: PushLeft with the atomic block, as on Page 3]

A transaction locks all variables it works on for the duration of the transaction.
  Deadlocks!
  Usually used in combination with eager version management.

[Figure: double queue (LS, 10) with two new nodes qn being inserted concurrently]
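
The slide only flags deadlocks as the risk of pessimistic locking. One standard remedy, which is not taken from the lecture and is swapped in here purely for illustration, is to acquire all locks in a fixed global order. The sketch below uses POSIX mutexes; the type `LockedInt` and the functions `LockBoth`, `UnlockBoth` and `Transfer` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical lockable variable: the value plus its own lock. */
typedef struct {
    pthread_mutex_t lock;
    int             val;
} LockedInt;

/* Acquire both locks in a fixed global order (by address). Two
   transactions that need the same pair then always lock in the same
   order, so they cannot wait on each other in a cycle. */
void LockBoth(LockedInt *a, LockedInt *b) {
    assert(a != b);
    if ((uintptr_t)a > (uintptr_t)b) {
        LockedInt *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(&a->lock);
    pthread_mutex_lock(&b->lock);
}

void UnlockBoth(LockedInt *a, LockedInt *b) {
    pthread_mutex_unlock(&a->lock);
    pthread_mutex_unlock(&b->lock);
}

/* A pessimistic "transaction" over two variables. */
void Transfer(LockedInt *from, LockedInt *to, int amount) {
    LockBoth(from, to);
    from->val -= amount;
    to->val   += amount;
    UnlockBoth(from, to);
}
```

Without the ordering step, one thread running `Transfer(&a, &b, ...)` and another running `Transfer(&b, &a, ...)` could each hold one lock and wait forever for the other. A TM system cannot always know the full set of variables up front, which is part of why this problem is harder inside a transaction than in this two-variable sketch.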

Page 27: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Concurrency Control: Optimistic

[Code: PushLeft with the atomic block, as on Page 3]

The system maintains multiple versions and detects conflicts later.
  Livelocks!

[Figure: double queue (LS, 10) with two new nodes qn being inserted concurrently]

Page 28: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Concurrency Control: Optimistic

The system maintains multiple versions and detects conflicts later.
  Livelocks!

[Code: PushLeft with the atomic block, as on Page 3, shown next to PushLeft2]

void PushLeft2(DQueue *q, int val) {
    QNode *qn = malloc(sizeof(QNode));
    qn->val = val;
    atomic {
        QNode *LS = q->left;
        QNode *LN = LS->right;
        qn->left = LS;
        qn->right = LN;
        /* same as PushLeft, but the last two writes are swapped */
        LN->left = qn;
        LS->right = qn;
    }
}
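
The optimistic pattern (read without locking, compute privately, commit only if nothing changed, otherwise retry) can be shown on a single word using C11 atomics. This is a deliberately small stand-in and not how a full TM system works: it detects conflicts by comparing the value rather than tracking versions or read sets, and the names `OptBegin`, `OptCommit` and `OptAdd` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Begin: take a snapshot of the shared variable without locking it. */
int OptBegin(atomic_int *shared) {
    return atomic_load(shared);
}

/* Commit: install new_val only if the variable still holds the snapshot.
   Returns false if another transaction committed in between. */
bool OptCommit(atomic_int *shared, int snapshot, int new_val) {
    return atomic_compare_exchange_strong(shared, &snapshot, new_val);
}

/* Retry loop around begin/commit. Returns the number of aborted attempts. */
int OptAdd(atomic_int *shared, int delta) {
    int aborts = 0;
    for (;;) {
        int snapshot = OptBegin(shared);
        if (OptCommit(shared, snapshot, snapshot + delta)) {
            return aborts;
        }
        aborts++;   /* conflict detected at commit time: start over */
    }
}
```

If two transactions both call `OptBegin` and one commits first, the other's `OptCommit` fails and it has to start over. On a single word one attempt always wins, but with larger transactions that can abort each other the same retry loop is where livelock can arise, which is why practical systems add back-off or priorities.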

Page 30: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Conflict Detection

Granularity
  Bytes, objects, ...

How to resolve?
  Hand control to the user? (exception-like)
  Retry
  Delayed
  Priorities
  ...

For optimistic concurrency
  When to check?
  Eager conflict detection?

How to detect?

void Example(int input) {
    atomic {
        x = input;
        y = x;
    }
}

Page 31: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Nesting

Flattened Nesting
Closed Nesting
Open Nesting

void Example() {
    x = 2;
    y = 0;
    atomic {
        x = 3;
        y = x;
        atomic {
            z = 3 * x;
            w = z--;
        }
    }
}
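
Of the three variants, flattened nesting is the simplest to sketch: the inner atomic block is folded into the outer one, so only the outermost end commits, and an abort at any level aborts everything. The bookkeeping below is an illustration only. The type `FlatTx` and the functions `FlatBegin`, `FlatAbort` and `FlatEnd` are hypothetical, and the actual versioning of data is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread transaction state for flattened nesting. */
typedef struct {
    int  depth;     /* current nesting level, 0 = not in a transaction */
    bool aborted;   /* set at any level, honoured by the outermost one */
    int  commits;   /* number of real commits, kept for illustration */
} FlatTx;

void FlatBegin(FlatTx *tx) {
    if (tx->depth == 0) {
        tx->aborted = false;   /* only the outermost begin starts a transaction */
    }
    tx->depth++;
}

/* An abort at any level dooms the whole flattened transaction. */
void FlatAbort(FlatTx *tx) {
    tx->aborted = true;
}

/* Returns true only when the outermost level ends and the transaction commits. */
bool FlatEnd(FlatTx *tx) {
    assert(tx->depth > 0);
    tx->depth--;
    if (tx->depth > 0) {
        return false;          /* inner end: nothing is committed yet */
    }
    if (tx->aborted) {
        return false;          /* outermost end of an aborted transaction */
    }
    tx->commits++;
    return true;
}
```

In the Example above, the writes to `z` and `w` would therefore become visible together with `x` and `y` when the outer block commits, never earlier.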

Page 32: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Why do we bother with locks?

Source: Calin Cascaval, Colin Blundell, Maged Michael, Harold W. Cain, Peng Wu, Stefanie Chiras, Siddhartha Chatterjee. 2008. Software Transactional Memory: Why Is It Only a Research Toy? Queue 6, 5 (September 2008).

Page 34: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Why do we bother with locks?

Source: Aleksandar Dragojević, Pascal Felber, Vincent Gramoli, Rachid Guerraoui. 2011. Why STM can be more than a research toy. Communications of the ACM 54, 4 (April 2011).

Page 35: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Why do we bother with locks?

Source: Ferad Zyulkyarov, Vladimir Gajinov, Osman S. Unsal, Adrián Cristal, Eduard Ayguadé, Tim Harris, and Mateo Valero. 2009. Atomic quake: using transactional memory in an interactive multiplayer game server.

Page 37: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Why do we bother with locks?

[Figure: publications on transactional memory per year, 1998 to 2011. Series: ACM Digital Library, Google Scholar hits, IEEE Xplore. Axis labels: number of publications; number of hits / 10.]

Page 38: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Why do we bother with locks?

Intel C++ STM compiler: extends C++ with support for STM language extensions
Microsoft STM.NET: an experimental extension of the .NET Framework

Annotation on the slide: discontinued

Page 39: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

Summary

TM is a nice model to tackle parallelism, BUT
  STM does not yield good performance.
  HTM often offers only limited functionality and is hard to instrument.
  Are hybrids the future?

Issues not covered in this lecture
  Details of conflict detection
  How to handle I/O?
  How to handle accesses from outside a transaction?
  ...

Page 40: Transactional Memory Guest Lecture  Design  of Parallel and  High-Performance  Computing

More information

Transactional Memory, 2nd Edition (Synthesis Lectures on Computer Architecture)