distributed operating systems - os.inf.tu-dresden.de filedistributed operating systems 2018 till...

158
Distributed Operating Systems Synchronization in Parallel Systems Till Smejkal 2019

Upload: others

Post on 15-Oct-2019

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems

Synchronization in Parallel Systems

Till Smejkal2019

Page 2: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 2

Overview

Introduction Hardware Primitives Synchronization with Locks (Part I)

Properties Locks

Spin Lock (Test & Set Lock) Test & Test & Set Lock Ticket Lock

Synchronization without Locks Synchronization with Locks (Part II)

MCS Lock Performance

Special Issues Timeouts Reader Writer Lock Lockholder Preemption Monitor, Mwait

Page 3: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 3

Introduction

Request

head

tail

freefreefreefree

Circular Buffer

head

tail

Page 4: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 4

Introduction

head

Page 5: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 5

Introduction

head

Page 6: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 6

Introduction

B

A

1) A,B create list elementhead

Page 7: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 7

Introduction

B

A

1) A,B create list element2) A,B set next pointerhead

Page 8: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 8

Introduction

B

A

1) A,B create list element2) A,B set next pointer3) B sets prev pointer

head

Page 9: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 9

Introduction

B

A

1) A,B create list element2) A,B set next pointer3) B sets prev pointer4) A sets prev pointer

head

Page 10: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 10

Introduction

B

A

head1) A,B create list element2) A,B set next pointer3) B sets prev pointer4) A sets prev pointer5) A updates head

Page 11: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 11

Introduction

B

A

head

1) A,B create list element2) A,B set next pointer3) B sets prev pointer4) A sets prev pointer5) A updates head6) B updates head

Page 12: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 12

Mutual Exclusion – w/ Locks

coarse grained (lock entire list)

lock(list);list->insert_element(ele);unlock(list);

fine grained (lock list elements)

retry: lock(head); if (trylock(head->next)) { head->insert_element(ele); unlock(head->next); } else { unlock(head); goto retry; } unlock(head)

Page 13: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 13

Mutual Exclusion – w/o Locks

bool flag[2] = {false, false};

int turn = 0;

void entersection(int thread) {

int other = 1 – thread; /* id of other thread; thread in {0,1} */

flag[thread] = true; /* show interest */

turn = other; /* give precedence to other thread */

while (turn == other && flag[other]) {} /* wait */

}

void leavesection(int thread) {

flag[thread] = false;

}

Decker / Peterson

atomic stores, atomic loads

sequential consistent memory (or memory fences)

Page 14: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 14

Atomic Hardware Instructions

A, B are atomic if A || B = A;B or B;A

Read-Modify-Write instructions are typically not atomic

A B add &x, 1 || mov &x, 2 (x = 0)

Page 15: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 15

Atomic Hardware Instructions

A, B are atomic if A || B = A;B or B;A

Read-Modify-Write instructions are typically not atomic

A B load &x -> Reg (x = 0) add Reg, 1 || store 2 -> &x store Reg -> &x

Page 16: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 16

Atomic Hardware Instructions

A, B are atomic if A || B = A;B or B;A

Read-Modify-Write instructions are typically not atomic

A B load &x -> Reg (x = 0) add Reg, 1 store 2 -> &x

store Reg -> &x

x == 1

A;B → x == 2B;A → x == 3

Page 17: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 17

Atomic Hardware Instructions

How to make instructions atomic? Bus lock

Lock memory bus until all memory accesses of a RMW instruction have completed Intel Pentium 3 and older x86 CPUs

lock; add &x, 1; unlock

Cache Lock Delay snoop traffic until all memory accesses of a RMW instruction have completed Intel Pentium 4 and newer x86 CPUs

Page 18: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 18

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP ARM, Alpha

retry: load_linked &x -> Reg; modify Reg; if (!store_conditional(Reg -> &x)) goto retry

Hardware Transactional Memory Install watchdog for all memory used by the transaction Discard changes on write-write or write-read conflicts Intel TSX, IBM BlueGeneQ

Page 19: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 19

Atomic Hardware Instructions

How to make instructions atomic? Cache Lock

Delay snoop traffic until all memory accesses of a RMW instruction have completed Can be achieved with the M(O)ESI Cache Coherence protocol

BusRdX(&x)

add &x, 1 1. read_for_ownership(&x) 2. load &x -> Reg 3. add Reg, 1 4. store Reg -> &x

mov &x, 2 2. store 2 -> &x

CPU 0

S | x

CPU 1

S | x

Page 20: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 20

Atomic Hardware Instructions

How to make instructions atomic? Cache Lock

Delay snoop traffic until all memory accesses of a RMW instruction have completed Can be achieved with the M(O)ESI Cache Coherence protocol

add &x, 1 1. read_for_ownership(&x) 2. load &x -> Reg 3. add Reg, 1 4. store Reg -> &x

mov &x, 2 2. store 2 -> &x

Flush

CPU 0

E | x

CPU 1

I |

Page 21: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 21

Atomic Hardware Instructions

How to make instructions atomic? Cache Lock

Delay snoop traffic until all memory accesses of a RMW instruction have completed Can be achieved with the M(O)ESI Cache Coherence protocol

add &x, 1 1. read_for_ownership(&x) 2. load &x -> Reg 3. add Reg, 1 4. store Reg -> &x

mov &x, 2 2. store 2 -> &x

BusRdX(&x)

CPU 0

E | x

CPU 1

I |

Page 22: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 22

Atomic Hardware Instructions

How to make instructions atomic? Cache Lock

Delay snoop traffic until all memory accesses of a RMW instruction have completed Can be achieved with the M(O)ESI Cache Coherence protocol

add &x, 1 1. read_for_ownership(&x) 2. load &x -> Reg 3. add Reg, 1 4. store Reg -> &x

mov &x, 2 2. store 2 -> &x

BusRdX(&x)

CPU 0

E | x

CPU 1

I |

Page 23: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 23

Atomic Hardware Instructions

How to make instructions atomic? Cache Lock

Delay snoop traffic until all memory accesses of a RMW instruction have completed Can be achieved with the M(O)ESI Cache Coherence protocol

add &x, 1 1. read_for_ownership(&x) 2. load &x -> Reg 3. add Reg, 1 4. store Reg -> &x

mov &x, 2 2. store 2 -> &x

BusRdX(&x)

CPU 0

E | x

CPU 1

I |

Page 24: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 24

Atomic Hardware Instructions

How to make instructions atomic? Cache Lock

Delay snoop traffic until all memory accesses of a RMW instruction have completed Can be achieved with the M(O)ESI Cache Coherence protocol

add &x, 1 1. read_for_ownership(&x) 2. load &x -> Reg 3. add Reg, 1 4. store Reg -> &x

mov &x, 2 2. store 2 -> &x

BusRdX(&x)

CPU 0

M | x

CPU 1

I |

Page 25: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 25

Atomic Hardware Instructions

How to make instructions atomic? Cache Lock

Delay snoop traffic until all memory accesses of a RMW instruction have completed Can be achieved with the M(O)ESI Cache Coherence protocol

add &x, 1 1. read_for_ownership(&x) 2. load &x -> Reg 3. add Reg, 1 4. store Reg -> &x

mov &x, 2 2. store 2 -> &x

BusRdX(&x)

CPU 0

M | x

CPU 1

I |

Page 26: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 26

Atomic Hardware Instructions

How to make instructions atomic? Cache Lock

Delay snoop traffic until all memory accesses of a RMW instruction have completed Can be achieved with the M(O)ESI Cache Coherence protocol

add &x, 1 1. read_for_ownership(&x) 2. load &x -> Reg 3. add Reg, 1 4. store Reg -> &x

mov &x, 2 2. store 2 -> &x

Flush

CPU 0

I |

CPU 1

S | x

CPU 1

E | x

Page 27: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 27

Atomic Hardware Instructions

How to make instructions atomic? Cache Lock

Delay snoop traffic until all memory accesses of a RMW instruction have completed Can be achieved with the M(O)ESI Cache Coherence protocol

add &x, 1 1. read_for_ownership(&x) 2. load &x -> Reg 3. add Reg, 1 4. store Reg -> &x

mov &x, 2 2. store 2 -> &x

CPU 1

S | x

CPU 1

M | x

CPU 0

I |

Page 28: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 28

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP

add &x, 1 1. load_linked &x -> Reg 2. add Reg, 1 3. store_conditional Reg -> &x

CPU 1

S | x

CPU 1

S | x

CPU 0

S | x

Page 29: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 29

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP

add &x, 1 1. load_linked &x -> Reg 2. add Reg, 1 3. store_conditional Reg -> &x

BusRdX(&x)

CPU 0

S | x

CPU 1

S | x

CPU 1

S | x

Page 30: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 30

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP

add &x, 1 1. load_linked &x -> Reg 2. add Reg, 1 3. store_conditional Reg -> &x

Flush

CPU 0

E | x

CPU 1

S | x

CPU 1

I |

Page 31: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 31

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP

add &x, 1 1. load_linked &x -> Reg 2. add Reg, 1 3. store_conditional Reg -> &x

CPU 0

E | x

CPU 1

S | x

CPU 1

I |

Page 32: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 32

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP

add &x, 1 1. load_linked &x -> Reg 2. add Reg, 1 3. store_conditional Reg -> &x

CPU 0

M | x

CPU 1

S | x

CPU 1

I |

→ Ok

Page 33: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 33

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP

add &x, 1 1. load_linked &x -> Reg 2. add Reg, 1 3. store_conditional Reg -> &x

mov &x, 2 2. store 2 -> &x

BusRdX(&x)

CPU 0

S | x

CPU 1

S | x

CPU 1

S | x

Page 34: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 34

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP

add &x, 1 1. load_linked &x -> Reg 2. add Reg, 1 3. store_conditional Reg -> &x

mov &x, 2 2. store 2 -> &x

Flush

CPU 0

E | x

CPU 1

S | x

CPU 1

I |

Page 35: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 35

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP

add &x, 1 1. load_linked &x -> Reg 2. add Reg, 1 3. store_conditional Reg -> &x

mov &x, 2 2. store 2 -> &x

BusRdX(&x)

CPU 0

E | x

CPU 1

S | x

CPU 1

I |

Page 36: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 36

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP

add &x, 1 1. load_linked &x -> Reg 2. add Reg, 1 3. store_conditional Reg -> &x

mov &x, 2 2. store 2 -> &x

Flush

CPU 0

I |

CPU 1

S | x

CPU 1

E | x

Page 37: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 37

Atomic Hardware Instructions

How to make instructions atomic? Observe Cache

Install cache watchdog on load Abort store if watchdog has detected a concurrent access; retry OP

add &x, 1 1. load_linked &x -> Reg 2. add Reg, 1 3. store_conditional Reg -> &x

mov &x, 2 2. store 2 -> &x

CPU 0

I |

CPU 1

S | x

CPU 1

M | x

→ Abort

Page 38: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 38

Atomic Hardware Instructions

Bit Test and Set bit_test_and_set (bit) { if (bit clear) { set bit ; return true; } else { return false; } }

Exchange

swap (mem, Reg) { mov &mem, tmp; mov Reg, &mem; mov tmp, Reg; }

Fetch and Add

xadd (mem, Reg) { mov &mem, tmp; add &mem, Reg; return tmp; }

Page 39: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 39

Atomic Hardware Instructions

Compare and Swap cas (mem, expected, desired) { if (&mem == expected) { mov desired, &mem; return true; } else { return false; } }

Double Compare and Swap

dcas (mem1, mem2, exp1, exp2, des1, des2) { if (&mem1 == exp1 && &mem2 == exp2) { mov des1, &mem1; mov des2, &mem2; return true; } else { return false; } }

Page 40: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 40

Overview

Introduction Hardware Primitives Synchronization with Locks (Part I)

Properties Locks

Spin Lock (Test & Set Lock) Test & Test & Set Lock Ticket Lock

Synchronization without Locks Synchronization with Locks (Part II)

MCS Lock Performance

Special Issues Timeouts Reader Writer Lock Lockholder Preemption Monitor, Mwait

Page 41: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 41

Synchronization w/ Locks

Properties Overhead

Taking a lock should be cheap (<10% of CS) Minimize overhead if lock is currently free

Fairness Every thread should be able to obtain the lock after a finite amount of

time Abort lock()-operations

Abort locking after a specified timeout Stop threads which are currently waiting for a lock

Concurrent access to CS Support that multiple threads can enter the lock at the same time

Page 42: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 42

Synchronization w/ Locks

Properties Lock-holder preemption

Preemption of the thread currently executing the CS→ Just short in this lecture! (See MKC or RTS)

Priority inversion Prevent higher priority thread from executing because of lower priority

thread holding shared lock → Not covered in this lecture! (See MKC or RTS)

Spinning vs. Blocking Release CPU while waiting for the lock to be free again Latency and performance implications

Page 43: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 43

Synchronization w/ Locks

Spin Lock (Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ only one cheap atomic OP required

– high cache bus traffic while lock is held:

Page 44: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 44

Synchronization w/ Locks

Spin Lock (Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ only one cheap atomic OP required

– high cache bus traffic while lock is held

CPU 0

I |

CPU 1

I |

CPU 2

I |

CPU 3

I |

Page 45: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 45

Synchronization w/ Locks

Spin Lock (Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ only one cheap atomic OP required

– high cache bus traffic while lock is held

CPU 0

I |

CPU 1

I |

CPU 2

I |

CPU 3

I | M | l

Page 46: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 46

Synchronization w/ Locks

Spin Lock (Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ only one cheap atomic OP required

– high cache bus traffic while lock is held

CPU 0

I |

CPU 1

M | l

CPU 2

I |

CPU 3

I | I | M | l

Page 47: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 47

Synchronization w/ Locks

Spin Lock (Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ only one cheap atomic OP required

– high cache bus traffic while lock is held

CPU 0

I |

CPU 1

M | l

CPU 2

I |

CPU 3

I | I | M | l

Page 48: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 48

Synchronization w/ Locks

Spin Lock (Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ only one cheap atomic OP required

– high cache bus traffic while lock is held

CPU 0

I |

CPU 1

I |

CPU 2

M | l

CPU 3

I | M | l I |

Page 49: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 49

Synchronization w/ Locks

Spin Lock (Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ only one cheap atomic OP required

– high cache bus traffic while lock is held

CPU 0

I |

CPU 1

M | l

CPU 2

I |

CPU 3

I | I | M | l

Page 50: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 50

Synchronization w/ Locks

Spin Lock (Test & Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; do {} while (l->lock == 1); swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ spins locally while lock is held by other CPU

+ like Test & Set Lock but with fewer cache bus traffic

Page 51: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 51

Synchronization w/ Locks

Spin Lock (Test & Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; do {} while (l->lock == 1); swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ spins locally while lock is held by other CPU

+ like Test & Set Lock but with fewer cache bus traffic

CPU 0

I |

CPU 1

I |

CPU 2

I |

CPU 3

I |

Page 52: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 52

Synchronization w/ Locks

Spin Lock (Test & Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; do {} while (l->lock == 1); swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ spins locally while lock is held by other CPU

+ like Test & Set Lock but with fewer cache bus traffic

CPU 0

I |

CPU 1

I |

CPU 2

I |

CPU 3

I | M | l

Page 53: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 53

Synchronization w/ Locks

Spin Lock (Test & Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; do {} while (l->lock == 1); swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ spins locally while lock is held by other CPU

+ like Test & Set Lock but with fewer cache bus traffic

CPU 0

I |

CPU 1

M | l

CPU 2

I |

CPU 3

I | S | lS | l

Page 54: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 54

Synchronization w/ Locks

Spin Lock (Test & Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; do {} while (l->lock == 1); swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ spins locally while lock is held by other CPU

+ like Test & Set Lock but with fewer cache bus traffic

CPU 0

S | l

CPU 1

S | l

CPU 2

I |

CPU 3

I | S | l

Page 55: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 55

Synchronization w/ Locks

Spin Lock (Test & Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; do {} while (l->lock == 1); swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ spins locally while lock is held by other CPU

+ like Test & Set Lock but with fewer cache bus traffic

CPU 0

S | l

CPU 1

S | l

CPU 2

S | l

CPU 3

I |

Page 56: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 56

Synchronization w/ Locks

Spin Lock (Test & Test & Set Lock)

void lock (lock_t *l) { do { int tmp = 1; do {} while (l->lock == 1); swap (l->lock, tmp); } while (tmp == 1);}

void unlock (lock_t *l) { l->lock = 0}

+ spins locally while lock is held by other CPU

+ like Test & Set Lock but with fewer cache bus traffic

CPU 0

S | l

CPU 1

S | l

CPU 2

S | l

CPU 3

I |

Page 57: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 57

Synchronizatio w/ Locks

Fairness – Test & Set Locks

CPU 0 CPU 1 CPU 2 CPU 3

Page 58: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 58

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

CPU 0 CPU 1 CPU 2 CPU 3

Page 59: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 59

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

test

CPU 0 CPU 1 CPU 2 CPU 3

Page 60: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 60

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

test

test

CPU 0 CPU 1 CPU 2 CPU 3

Page 61: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 61

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

test

test

unlock

CPU 0 CPU 1 CPU 2 CPU 3

Page 62: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 62

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

test

test

unlock

lock

CPU 0 CPU 1 CPU 2 CPU 3

Page 63: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 63

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

test

test

unlock

lock

test

CPU 0 CPU 1 CPU 2 CPU 3

Page 64: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 64

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

test

test

unlock

lock

test

unlock

CPU 0 CPU 1 CPU 2 CPU 3

Page 65: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 65

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

test

test

unlock

lock

test

unlock

lock

CPU 0 CPU 1 CPU 2 CPU 3

Page 66: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 66

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

test

test

unlock

lock

test

unlock

lock

test

CPU 0 CPU 1 CPU 2 CPU 3

Page 67: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 67

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

test

test

unlock

lock

test

unlock

lock

test

test

CPU 0 CPU 1 CPU 2 CPU 3

Page 68: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 68

Synchronizatio w/ Locks

Fairness – Test & Set Locks

lock

test

test

unlock

lock

test

unlock

lock

test

test

CPU 0 CPU 1 CPU 2 CPU 3

free

free

Although the lock was free multiple times CPU2 did not get it.→ Test & Set Locks are not fair!

Page 69: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 69

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

+ similarly cheap as Test & Set Lock

+ ensures fairness between threads

Page 70: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 70

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

CPU 0 CPU 1 CPU 2

my_ticket

next_ticket 0 cur_ticket 0

Page 71: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 71

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

CPU 0 CPU 1 CPU 2lock

my_ticket 0

next_ticket 1 cur_ticket 0

Page 72: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 72

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

CPU 0 CPU 1 CPU 2

my_ticket 0 1

next_ticket 2 cur_ticket 0

lock

lock

Page 73: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 73

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

CPU 0 CPU 1 CPU 2

my_ticket 0 1 2

next_ticket 3 cur_ticket 0

lock

lock

lock

Page 74: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 74

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

CPU 0 CPU 1 CPU 2

my_ticket 1 2

next_ticket 3 cur_ticket 1

lock

lock

lock

unlock

Page 75: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 75

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

CPU 0 CPU 1 CPU 2

my_ticket 3 1 2

next_ticket 4 cur_ticket 1

lock

lock

lock

unlock

lock

Page 76: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 76

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

CPU 0 CPU 1 CPU 2lock

lock

lock

unlock

lock

unlock

my_ticket 3 2

next_ticket 4 cur_ticket 2

Page 77: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 77

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

CPU 0 CPU 1 CPU 2lock

lock

lock

unlock

lock

unlock

unlock

my_ticket 3

next_ticket 4 cur_ticket 3

Page 78: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 78

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

– unnecessary bus traffic on ticket increase

– abort of lock operation is difficult to implement

Page 79: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 79

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

– unnecessary bus traffic on ticket increase

– abort of lock operation is difficult to implement

CPU 0 CPU 1 CPU 2

my_ticket 0 1 2

next_ticket 3 cur_ticket 0

Page 80: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 80

Synchronization w/ Locks

Ticket Locks

struct ticket_lock_t { int next_ticket; volatile int cur_ticket;};

void lock (ticket_lock_t *l) { int my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket != my_ticket);}

void unlock (ticket_lock_t *l) { l->cur_ticket++;}

– unnecessary bus traffic on ticket increase

– abort of lock operation is difficult to implement

CPU 0 CPU 1 CPU 2unlock

my_ticket 1 2

next_ticket 3 cur_ticket 1

Page 81: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 81

Overview

Introduction Hardware Primitives Synchronization with Locks (Part I)

Properties Locks

Spin Lock (Test & Set Lock) Test & Test & Set Lock Ticket Lock

Synchronization without Locks Synchronization with Locks (Part II)

MCS Lock Performance

Special Issues Timeouts Reader Writer Lock Lockholder Preemption Monitor, Mwait

Page 82: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 82

Synchronization w/o Locks

Lock-free Data Structures Single-Linked List

void insert(ele_t *new_ele, ele_t *prev) { do { new_ele->next = prev->next; } while (!cas(&(prev->next), new_ele->next, new_ele));}

ele1

ele2

Page 83: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 83

Synchronization w/o Locks

Lock-free Data Structures Single-Linked List

void insert(ele_t *new_ele, ele_t *prev) { do { new_ele->next = prev->next; } while (!cas(&(prev->next), new_ele->next, new_ele));}

ele1

ele2

ele3

Page 84: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 84

Synchronization w/o Locks

Lock-free Data Structures Single-Linked List

void insert(ele_t *new_ele, ele_t *prev) { do { new_ele->next = prev->next; } while (!cas(&(prev->next), new_ele->next, new_ele));}

ele1

ele2

ele3

Page 85: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 85

Synchronization w/o Locks

Lock-free Data Structures Single-Linked List

void insert(ele_t *new_ele, ele_t *prev) { do { load_linked(prev->next); new_ele->next = prev->next } while (!store_conditional(&(prev->next), new_ele);}

Page 86: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 86

Synchronization w/o Locks

Lock-free Data Structures Single-Linked List Double-Linked List

void insert(ele_t *new_ele, ele_t *prev) { do { auto next = prev->next; new_ele->next = next; new_ele->prev = prev; } while (!dcas(&(prev->next), &(next->prev), new_ele->next, new_ele->prev, new_ele, new_ele);}

ele1

ele2

Page 87: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 87

Synchronization w/o Locks

Lock-free Data Structures Single-Linked List Double-Linked List

void insert(ele_t *new_ele, ele_t *prev) { do { auto next = prev->next; new_ele->next = next; new_ele->prev = prev; } while (!dcas(&(prev->next), &(next->prev), new_ele->next, new_ele->prev, new_ele, new_ele);}

ele1

ele2

ele3

Page 88: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 88

Synchronization w/o Locks

Lock-free Data Structures Single-Linked List Double-Linked List

void insert(ele_t *new_ele, ele_t *prev) { do { auto next = prev->next; new_ele->next = next; new_ele->prev = prev; } while (!dcas(&(prev->next), &(next->prev), new_ele->next, new_ele->prev, new_ele, new_ele);}

ele1

ele2

ele3

Page 89: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 89

Synchronization w/o Locks

Lock-free Data Structures Single-Linked List Double-Linked List Binary Trees ...

Not using locks does not solve all problems of locks!

e.g. Fairness → Wait-free Data Structures

Page 90: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 90

Overview

Introduction Hardware Primitives Synchronization with Locks (Part I)

Properties Locks

Spin Lock (Test & Set Lock) Test & Test & Set Lock Ticket Lock

Synchronization without Locks Synchronization with Locks (Part II)

MCS Lock Performance

Special Issues Timeouts Reader Writer Lock Lockholder Preemption Monitor, Mwait

Page 91: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 91

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

Page 92: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 92

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

qeue

? ?0 next

Page 93: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 93

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

0 next

0 next

Page 94: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 94

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

0 next

0 next

0 next

Page 95: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 95

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

0 next

0 next

0 next

Page 96: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 96

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

0 next

0 next

0 next

Page 97: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 97

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

0 next

0 next

0 next

Page 98: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 98

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

0 next

0 next

0 next

Page 99: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 99

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

0 next

0 next

1 next

Page 100: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 100

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

0 next

1 next

Page 101: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 101

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

0 next

Page 102: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 102

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

0 next

Page 103: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 103

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

1 next

Page 104: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 104

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

Page 105: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 105

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

Page 106: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 106

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

Page 107: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 107

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

0 next

Page 108: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 108

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

0 next

Page 109: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 109

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

0 next

Page 110: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 110

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

0 next

Page 111: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 111

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

0 next

Page 112: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 112

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

1 next

Page 113: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 113

1 next

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

Page 114: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 114

1 next

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

Page 115: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 115

1 next

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

Page 116: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 116

Synchronization w/ Locks

MCS-Lock – fair local spinning lock – Mellor-Crummey and Scott

void mcs_lock(mcs_lock_t* l, mcs_node_t* cur) { cur->next = NULL; cur->free = false; auto prev = fetch_and_store(&(l->queue), cur); if (prev) { prev->next = cur; do {} while (!cur->free); }}

void mcs_unlock(mcs_lock_t* l, mcs_node_t* cur) { if (!cur->next) { if (cas(&(l->queue), cur, NULL)) return; do {} while (!cur->next); } cur->next->free = true;}

struct mcs_lock_t { mcs_node_t* queue;};

struct mcs_node_t { mcs_node_t* next; bool free;};

CPU 0 CPU 1 CPU 2

queue

Page 117: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 117

Synchronization w/ Locks

Figure 1 Comparisson of diffirent lock implementations. Mellor-Crummey, Scott [1991]: “Algorithms for Scalable Synchronization on Shared Memory Multiprocessors”

MCS-Lock – Performance

Page 118: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 118

Synchronization w/ Locks

Figure 2 Comparisson of the overhead of spin-locks and MCS-locks on an 16 core AMD Opteron. Boyd-Wickizer et al. [2008]: “Corey: An Operating System for Many Cores”

MCS-Lock – Performance

Page 119: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 119

Overview

Introduction Hardware Primitives Synchronization with Locks (Part I)

Properties Locks

Spin Lock (Test & Set Lock) Test & Test & Set Lock Ticket Lock

Synchronization without Locks Synchronization with Locks (Part II)

MCS Lock Performance

Special Issues Timeouts Reader Writer Lock Lockholder Preemption Monitor, Mwait

Page 120: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 120

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Page 121: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 121

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks → stop trying to acquire lock

Page 122: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 122

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock

Page 123: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 123

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 1 2 3

next_ticket 4 cur_ticket 0

Page 124: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 124

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire lock

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 1 3

next_ticket 4 cur_ticket 0

Page 125: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 125

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire lock

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 1 3

next_ticket 4 cur_ticket 1

Page 126: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 126

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire lock

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 3

next_ticket 4 cur_ticket 2

Page 127: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 127

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire lock

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 3

next_ticket 4 cur_ticket 2

CPU3 waits for ever

Page 128: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 128

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire lock + increase cur_ticket

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 1 2 3

next_ticket 4 cur_ticket 0

Page 129: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 129

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire lock + increase cur_ticket

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 1 3

next_ticket 4 cur_ticket 1

Page 130: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 130

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire lock + increase cur_ticket

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 1 3

next_ticket 4 cur_ticket 1

CPU1 can enter CS

Page 131: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 131

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire the lock + increase cur_ticket

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 2 3

next_ticket 4 cur_ticket 1

Page 132: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 132

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire the lock + increase cur_ticket

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 2 3

next_ticket 4 cur_ticket 1

OK

Page 133: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 133

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire the lock + alter next_ticket and

my_ticket

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 1 2 3

next_ticket 4 cur_ticket 0

Page 134: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 134

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire the lock + alter next_ticket and

my_ticket

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 1 2

next_ticket 3 cur_ticket 0

Page 135: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 135

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire the lock + alter next_ticket and

my_ticket

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 1 2

next_ticket 3 cur_ticket 0

OK

Page 136: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 136

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks Ticket Lock → stop trying to acquire the lock + alter next_ticket and

my_ticket

CPU 0 CPU 1 CPU 2 CPU 3

my_ticket 0 1 2

next_ticket 3 cur_ticket 0

Very tricky to implement!

OK

Page 137: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 137

Synchronization w/ Locks

Timeouts – Abort lock()-Operation Give up locking after a specified timeout Stop threads which are currently waiting for a lock

Test & Set Locks → stop trying to acquire the lock Ticket Lock → stop trying to acquire the lock + alter next_ticket and

my_ticket

MCS-Lock → dequeue from the queue of waiters (exercise)

Page 138: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 138

Synchronization w/ Locks

Reader Writer Locks Lock differentiates two types of lock holders:

Readers Do not modify the object Multiple can use the object at the same time

Writers Modify the object Must have exclusive access to the object (no other readers or

writers) Locks can have different level of fairness

Readers and writers use the object in the order they appear → fair Later readers overtake earlier writers → unfair for writers Later writers overtake earlier readers → unfair for readers

Page 139: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 139

Synchronization w/ Locks

Fair Ticket Reader Writer Lock

struct rw_lock_t { rw_union_t cur_ticket; rw_union_t next_ticket;};

void lock_read(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket.write != my_ticket.write);}

void lock_write(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket.write), 1); do {} while (l→cur_ticket != my_ticket);}

void unlock_read(rw_lock_t *l) { xadd(&(l->cur_ticket.read), 1);}

void unlock_write(rw_lock_t *l) { l->cur_ticket.write++;}

write read

rw_union_t

63 31 0

Page 140: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 140

CPU 0

Synchronization w/ Locks

Fair Ticket Reader Writer Lock

struct rw_lock_t { rw_union_t cur_ticket; rw_union_t next_ticket;};

void lock_read(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket.write != my_ticket.write);}

void lock_write(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket.write), 1); do {} while (l→cur_ticket != my_ticket);}

void unlock_read(rw_lock_t *l) { xadd(&(l->cur_ticket.read), 1);}

void unlock_write(rw_lock_t *l) { l->cur_ticket.write++;}

write read

rw_union_t

63 31

CPU 1 CPU 2

my_ticket

next_ticket 0|0 cur_ticket 0|0

0

Page 141: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 141

read

CPU 0

Synchronization w/ Locks

Fair Ticket Reader Writer Lock

struct rw_lock_t { rw_union_t cur_ticket; rw_union_t next_ticket;};

void lock_read(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket.write != my_ticket.write);}

void lock_write(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket.write), 1); do {} while (l→cur_ticket != my_ticket);}

void unlock_read(rw_lock_t *l) { xadd(&(l->cur_ticket.read), 1);}

void unlock_write(rw_lock_t *l) { l->cur_ticket.write++;}

write read

rw_union_t

63 31

CPU 1 CPU 2

my_ticket 0|0

next_ticket 0|1 cur_ticket 0|0

0

Page 142: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 142

read

read

CPU 0

Synchronization w/ Locks

Fair Ticket Reader Writer Lock

struct rw_lock_t { rw_union_t cur_ticket; rw_union_t next_ticket;};

void lock_read(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket.write != my_ticket.write);}

void lock_write(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket.write), 1); do {} while (l→cur_ticket != my_ticket);}

void unlock_read(rw_lock_t *l) { xadd(&(l->cur_ticket.read), 1);}

void unlock_write(rw_lock_t *l) { l->cur_ticket.write++;}

write read

rw_union_t

63 31

CPU 1 CPU 2

my_ticket 0|0 0|1

next_ticket 0|2 cur_ticket 0|0

0

Page 143: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 143

read

read

write

CPU 0

Synchronization w/ Locks

Fair Ticket Reader Writer Lock

struct rw_lock_t { rw_union_t cur_ticket; rw_union_t next_ticket;};

void lock_read(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket.write != my_ticket.write);}

void lock_write(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket.write), 1); do {} while (l→cur_ticket != my_ticket);}

void unlock_read(rw_lock_t *l) { xadd(&(l->cur_ticket.read), 1);}

void unlock_write(rw_lock_t *l) { l->cur_ticket.write++;}

write read

rw_union_t

63 31

CPU 1 CPU 2

my_ticket 0|0 0|1 0|2

next_ticket 1|2 cur_ticket 0|0

0

Page 144: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 144

read

read

write

done

CPU 0

Synchronization w/ Locks

Fair Ticket Reader Writer Lock

struct rw_lock_t { rw_union_t cur_ticket; rw_union_t next_ticket;};

void lock_read(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket.write != my_ticket.write);}

void lock_write(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket.write), 1); do {} while (l→cur_ticket != my_ticket);}

void unlock_read(rw_lock_t *l) { xadd(&(l->cur_ticket.read), 1);}

void unlock_write(rw_lock_t *l) { l->cur_ticket.write++;}

write read

rw_union_t

63 31

CPU 1 CPU 2

my_ticket 0|1 0|2

next_ticket 1|2 cur_ticket 0|1

0

Page 145: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 145

read

read

write

done

done

CPU 0

Synchronization w/ Locks

Fair Ticket Reader Writer Lock

struct rw_lock_t { rw_union_t cur_ticket; rw_union_t next_ticket;};

void lock_read(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket.write != my_ticket.write);}

void lock_write(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket.write), 1); do {} while (l→cur_ticket != my_ticket);}

void unlock_read(rw_lock_t *l) { xadd(&(l->cur_ticket.read), 1);}

void unlock_write(rw_lock_t *l) { l->cur_ticket.write++;}

write read

rw_union_t

63 31

CPU 1 CPU 2

my_ticket 0|2

next_ticket 1|2 cur_ticket 0|2

0

Page 146: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 146

CPU 0

Synchronization w/ Locks

Fair Ticket Reader Writer Lock

struct rw_lock_t { rw_union_t cur_ticket; rw_union_t next_ticket;};

void lock_read(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket.write != my_ticket.write);}

void lock_write(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket.write), 1); do {} while (l→cur_ticket != my_ticket);}

void unlock_read(rw_lock_t *l) { xadd(&(l->cur_ticket.read), 1);}

void unlock_write(rw_lock_t *l) { l->cur_ticket.write++;}

write read

rw_union_t

63 31

CPU 1 CPU 2

my_ticket 1|2 0|2

next_ticket 1|3 cur_ticket 0|2

0

read

read

write

done

done

read

Page 147: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 147

CPU 0

Synchronization w/ Locks

Fair Ticket Reader Writer Lock

struct rw_lock_t { rw_union_t cur_ticket; rw_union_t next_ticket;};

void lock_read(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket.write != my_ticket.write);}

void lock_write(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket.write), 1); do {} while (l→cur_ticket != my_ticket);}

void unlock_read(rw_lock_t *l) { xadd(&(l->cur_ticket.read), 1);}

void unlock_write(rw_lock_t *l) { l->cur_ticket.write++;}

write read

rw_union_t

63 31

CPU 1 CPU 2

my_ticket 1|2

next_ticket 1|3 cur_ticket 1|2

0

read

read

write

done

done

read

done

Page 148: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 148

Synchronization w/ Locks

Fair Ticket Reader Writer Lock

struct rw_lock_t { rw_union_t cur_ticket; rw_union_t next_ticket;};

void lock_read(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket), 1); do {} while (l->cur_ticket.write != my_ticket.write);}

void lock_write(rw_lock_t *l) { auto my_ticket = xadd(&(l->next_ticket.write), 1); do {} while (l→cur_ticket != my_ticket);}

void unlock_read(rw_lock_t *l) { xadd(&(l->cur_ticket.read), 1);}

void unlock_write(rw_lock_t *l) { l->cur_ticket.write++;}

write read

rw_union_t

63 31 0

Difficult to get right! No overflows within each counter

Counters must be large enough so that all threads can be readers or writers

No overflow from one counter to another Read counter must not overflow into write counter → protection bit

Page 149: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 149

Synchronization w/ Locks

Lockholder Preemption

Figure 3 Comparison of the operation throughput using blocking or spinning primitives. Johnson et al. [2010]: “Decoupling Contention Management from Scheduling”

Page 150: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 150

Synchronization w/ Locks

Lockholder Preemption Spinning time of a CPU is increased by the time the current lockholder can not execute

Lockholder gets preempted by other spinning threads on the same CPU Especially bad for Ticket and MCS-Locks

Blocking instead of spinning reduces the load on the system and can thereby help preventing lockholder preemption Prohibit preemption of the lockholder by disabling interrupts while being in CS

cli + sti in combination with pushf + popf

Page 151: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 151

Synchronization w/ Locks

Lockholder Preemption Spinning time of a CPU is increased by the time the current lockholder can not execute

Lockholder gets preempted by other spinning threads on the same CPU Especially bad for Ticket and MCS-Locks

Blocking instead of spinning reduces the load on the system and can thereby help preventing lockholder preemption Prohibit preemption of the lockholder by disabling interrupts while being in CS

cli + sti in combination with pushf + popf

Only possible in the kernel! Very dangerous!

Page 152: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 152

Synchronization w/ Locks

Monitor, Mwait Possibility to stop the CPU/HT while waiting for a lock (only x86)

Can be used to put a CPU in a sleep state Allows better usage of the remaining resources

monitor – watches a given cache line mwait – stops CPU/HT until write to monitored cache line or interrupt

while (trigger != 1) { monitor(&trigger); if (trigger != 1) mwait(&trigger);}

Page 153: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 153

Synchronization w/ Locks

Monitor, Mwait Possibility to stop the CPU/HT while waiting for a lock (only x86)

Can be used to put a CPU in a sleep state Allows better usage of the remaining resources

monitor – watches a given cache line mwait – stops CPU/HT until write to monitored cache line or interrupt

while (trigger != 1) { monitor(&trigger); if (trigger != 1) mwait(&trigger);}

CPU 0 CPU 1

Page 154: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 154

Synchronization w/ Locks

Monitor, Mwait Possibility to stop the CPU/HT while waiting for a lock (only x86)

Can be used to put a CPU in a sleep state Allows better usage of the remaining resources

monitor – watches a given cache line mwait – stops CPU/HT until write to monitored cache line or interrupt

while (trigger != 1) { monitor(&trigger); if (trigger != 1) mwait(&trigger);}

CPU 0 CPU 1

mwait

Page 155: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 155

Synchronization w/ Locks

Monitor, Mwait Possibility to stop the CPU/HT while waiting for a lock (only x86)

Can be used to put a CPU in a sleep state Allows better usage of the remaining resources

monitor – watches a given cache line mwait – stops CPU/HT until write to monitored cache line or interrupt

while (trigger != 1) { monitor(&trigger); if (trigger != 1) mwait(&trigger);}

CPU 0 CPU 1

mwait

trigger = 1;

Page 156: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 156

Synchronization w/ Locks

Monitor, Mwait Possibility to stop the CPU/HT while waiting for a lock (only x86)

Can be used to put a CPU in a sleep state Allows better usage of the remaining resources

monitor – watches a given cache line mwait – stops CPU/HT until write to monitored cache line or interrupt

while (trigger != 1) { monitor(&trigger); if (trigger != 1) mwait(&trigger);}

Only possible in the kernel!

Page 157: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 157

References

Scheduler-Conscious SynchronizationLeonidas I. Kontothanassis, Robert W. Wisniewski, Michael L. Scott

Scalable Reader-Writer Synchronization for Shared-Memory MultiprocessorsJohn M. Mellor-Crummey, Michael L. Scott

Algorithms for Scalable Synchronization on Shared-Memory MultiprocessorsJohn M. Mellor-Crummey, Michael L. Scott

Concurrent Update on Multiprogrammed Shared Memory MultiprocessorsMaged M. Michael, Michael L. Scott

Scalable Queue-Based Spin Locks with TimeoutMichael L. Scott and William N. Scherer III

Reactive Synchronization Algorithms for MultiprocessorsB. Lim, A. Agarwal

Lock Free Data Structures

John D. Valois (PhD Thesis) Reduction: A Method for Proving Properties of Parallel Programs

R. Lipton

Page 158: Distributed Operating Systems - os.inf.tu-dresden.de fileDistributed Operating Systems 2018 Till Smejkal 2 Overview Introduction Hardware Primitives Synchronization with Locks (Part

Distributed Operating Systems 2018 Till Smejkal 158

References

Decoupling Contention Management from SchedulingF.R. Johnson, R. Stoica, A. Ailamaki, T. Mowry

Corey: An Operating System for Many CoresSilas Boyd-Wickizer, Haibo Chen, Rong Chen, Yandong Mao,Frans Kaashoek, Robert Morris, Aleksey Pesterev, Lex Stein, Ming Wu, Yuehua Dai, Yang Zhang, Zheng Zhang