

Page 1: 1 Message Logging Pessimistic & Optimistic CS717 Lecture 10/16/01-10/18/01 Kamen Yotov kyotov@cs.cornell.edu


Message Logging: Pessimistic & Optimistic

CS717 Lecture 10/16/01-10/18/01

Kamen Yotov

kyotov@cs.cornell.edu


Introduction

Context & Applications
Check-pointing
Message Logging
  Pessimistic (failure-free mode suffers)
  Optimistic (good for failure-free mode)
  Causal (to be discussed in next lectures...)
Main problems
  Consistency
    • Orphans


Fault Tolerance “Why”s

Flow of events
  Check-point
  Log messages
  Crash
  Restore
  Replay
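The flow of events can be sketched for a single deterministic process. This is a minimal illustration under stated assumptions, not the lecture's implementation: the `Process` class, its running-sum state, and the in-memory `checkpoint_state` and `stable_log` (standing in for stable storage) are all hypothetical.

```python
import copy


class Process:
    """Toy deterministic process: its state is a running sum of delivered messages."""

    def __init__(self):
        self.state = {"total": 0, "delivered": 0}
        # Hypothetical stand-ins for stable storage.
        self.checkpoint_state = copy.deepcopy(self.state)
        self.stable_log = []  # messages logged since the last check-point

    def _apply(self, value):
        self.state["total"] += value
        self.state["delivered"] += 1

    def deliver(self, value):
        self.stable_log.append(value)  # log the message before acting on it
        self._apply(value)

    def checkpoint(self):
        self.checkpoint_state = copy.deepcopy(self.state)
        self.stable_log.clear()  # messages before the check-point are not replayed

    def crash(self):
        self.state = None  # volatile state is lost

    def recover(self):
        self.state = copy.deepcopy(self.checkpoint_state)  # restore
        for value in self.stable_log:  # replay in the original delivery order
            self._apply(value)


p = Process()
p.deliver(1)
p.deliver(2)
p.checkpoint()
p.deliver(3)
p.deliver(4)
before = copy.deepcopy(p.state)
p.crash()
p.recover()
```

Because the process is deterministic, replaying the logged messages on top of the restored check-point rebuilds the pre-crash state.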


Common Assumptions

Fail-stop model
  Failure eventually detectable by all
Channels
  Asynchronous
  Reliable
  FIFO
  Unbounded message delivery
  Failures
    • Transiently dropping
    • No duplication and/or corruption
Stable storage
Spare processing capacity


Common goals

Application independence
Application transparency
  Simple
  Independent evolution
  Handles preexisting programs
High throughput
  Failure-free model with little overhead
Maximum fault-tolerance
  Any number of failures


Formal Terminology

Delivery (as opposed to receipt)
  Non-faulty processes eventually deliver all messages that they have received
  Receive sequence number
    • If p delivers m and m.rsn = l, then m is the l-th message p delivers
Run
  Sequence of system states
  Asynchronous
    • Only one process changes state at once


Formal Terminology (cont.)

Properties: Logical expressions over runs
  □ - Always
  ◊ - Eventually
Message determinant
  #m = <m.src, m.ssn, m.dest, m.rsn, m.data>
  m.data and m.dest not essential
  Logging determinants vs. actual messages
Other notation
  N – set of all processes
  C – set of failed processes
  Log(m) – set of processes possessing a copy of #m
  Depend(m) – set of processes that depend on m

\[
Depend(m) \stackrel{\text{def}}{=} \{\, j \in N \mid (j = m.dest \wedge j \text{ has delivered } m) \;\vee\; (\exists m' : deliver_{m.dest}(m) \rightarrow deliver_{j}(m')) \,\}
\]


Orphan Properties

Before failure, by definition #mLog(m)

#m is lost if $Log(m) \subseteq C$
stable(m) if #m cannot be lost
p is an orphan of C if
  p did not fail
  $p \in Depend(m)$
  #m is lost
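The orphan condition can be checked mechanically once Log, Depend and C are known. The sketch below assumes that "lost" means every process holding #m is in C; the function names `lost` and `orphans`, the dictionary encoding of Log and Depend, and the example data are hypothetical.

```python
def lost(log, failed):
    """Messages whose determinant is lost: every holder in Log(m) has failed."""
    return {m for m, holders in log.items() if holders <= failed}


def orphans(processes, log, depend, failed):
    """Surviving processes that depend on a message whose determinant is lost."""
    lost_msgs = lost(log, failed)
    return {p for p in processes - failed
            if any(p in depend[m] for m in lost_msgs)}


N = {"p", "q", "r"}                              # all processes
log = {"m1": {"p"}, "m2": {"q", "r"}}            # Log(m) for each message
depend = {"m1": {"p", "q"}, "m2": {"q", "r"}}    # Depend(m) for each message
C = {"p"}                                        # failed processes
result = orphans(N, log, depend, C)
```

Here only `p` holds #m1, so the failure of `p` loses it, and `q`, which depends on m1 and did not fail, becomes an orphan.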


Orphan Properties (cont.)

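\[
\begin{aligned}
& p \text{ orphan of } C \;\stackrel{\text{def}}{=}\; p \in N - C \,\wedge\, \exists m : \big(p \in Depend(m) \wedge Log(m) \subseteq C\big) \\
& \forall m : (Log(m) \subseteq C) \Rightarrow (Depend(m) \subseteq C) \\
& \forall m : Depend(m) \subseteq Log(m) \\
& \forall m : \Box\big(Depend(m) \subseteq Log(m)\big) \\
& \forall m : \Box\big(\neg stable(m) \Rightarrow Depend(m) \subseteq Log(m)\big) \\
& \forall m : \Box\big(|Log(m)| \le f \Rightarrow Depend(m) \subseteq Log(m)\big) \\
& \text{Pessimistic (much stronger property):} \quad \forall m : \Box\big(\neg stable(m) \Rightarrow |Depend(m)| \le 1\big) \\
& \text{Optimistic:} \quad \forall m : \Box\big(\neg stable(m) \Rightarrow \Diamond(Depend(m) \subseteq C)\big) \\
& \text{Causal:} \quad \forall m : \Box\big(|Log(m)| \le f \Rightarrow Depend(m) \subseteq Log(m)\big) \\
& \phantom{\text{Causal:}} \quad \forall m : \Box\big(|Log(m)| \le f \Rightarrow (Depend(m) \subseteq Log(m) \wedge \Diamond(Depend(m) = Log(m)))\big)
\end{aligned}
\]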


Performance Metrics

Number of forced roll-backs
Time spent blocking
Number of messages
Size of messages


Got to the real-world stuff!

No additional messages
Any number of failures (including total)
No assumptions about the logging protocol

Pessimistic doesn’t require that generality


The Model: Process states

Process states
  State interval
    • Instantiates a new one on each message received
    • State interval index (auto increment)

[Figure: timelines of processes p1, p2 and p3, each divided into indexed state intervals]


The Model: Process states (cont.)

[Figure: the same timelines of p1, p2 and p3 with their state intervals]

Dependencies between process states ($p_i$ depends on $p_j$)
  Maximum index of any interval of $p_j$ on which $p_i$ depends
  Inside a process each interval depends on the previous one
Dependence vector
  $d_i = \langle \delta_* \rangle = \langle \delta_1, \delta_2, \delta_3, \delta_4, \dots, \delta_n \rangle$, $\delta_k = \bot, 0, 1, \dots$
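A dependence vector can be maintained incrementally as messages are delivered. The sketch below assumes that every message carries the sender's current state interval index and that only this direct dependency is recorded; `BOTTOM`, `vmax`, `initial_vector` and `on_receive` are hypothetical names, and `None` encodes "no dependency".

```python
BOTTOM = None  # "no dependency", ordered below every interval index


def vmax(a, b):
    """Maximum of two entries, where BOTTOM is below every integer."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return max(a, b)


def initial_vector(i, n):
    """Process i starts in interval 0 and depends on no other process."""
    d = [BOTTOM] * n
    d[i] = 0
    return d


def on_receive(d, i, sender, sender_interval):
    """Delivering a message starts a new state interval of process i."""
    new = list(d)
    new[i] = d[i] + 1                               # auto-incremented interval index
    new[sender] = vmax(d[sender], sender_interval)  # direct dependency on the sender
    return new


d3 = initial_vector(2, 3)     # process p3 (index 2) in a system of three processes
d3 = on_receive(d3, 2, 0, 1)  # message sent by p1 from its interval 1
d3 = on_receive(d3, 2, 1, 0)  # message sent by p2 from its interval 0
d3 = on_receive(d3, 2, 0, 1)  # second message from the same interval of p1
```

After the three deliveries p3 is in its interval 3 and depends on interval 1 of p1 and interval 0 of p2.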


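The Model: System states

Process state – dependence vector
  $d_i = \langle \delta_* \rangle = \langle \delta_1, \delta_2, \delta_3, \delta_4, \dots, \delta_n \rangle$, $\delta_k = \bot, 0, 1, \dots$
System state – dependence matrix $n \times n$
  Row i – process state for $p_i$
  Diagonal – current state intervals

\[
D = [\delta_{**}] =
\begin{bmatrix}
\delta_{11} & \delta_{12} & \delta_{13} & \cdots & \delta_{1n} \\
\delta_{21} & \delta_{22} & \delta_{23} & \cdots & \delta_{2n} \\
\delta_{31} & \delta_{32} & \delta_{33} & \cdots & \delta_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\delta_{n1} & \delta_{n2} & \delta_{n3} & \cdots & \delta_{nn}
\end{bmatrix}
\]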


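The Model: System states (cont.)

S – set of all system states
  $A = [\alpha_{**}] \in S$ and $B = [\beta_{**}] \in S$
  $A \preceq B \iff \forall i = 1..n : \alpha_{ii} \le \beta_{ii}$
Partial order different from Lamport’s
  • Orders system states vs. events
  • The only events are state intervals
Lattice
  $A \sqcup B = [\varphi_{**}]$, $\varphi_{ik} = \alpha_{ii} \ge \beta_{ii} \;?\; \alpha_{ik} : \beta_{ik}$
  $A \sqcap B = [\varphi_{**}]$, $\varphi_{ik} = \alpha_{ii} \le \beta_{ii} \;?\; \alpha_{ik} : \beta_{ik}$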
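The partial order and the two lattice operations translate directly into code. The sketch assumes that states are compared on their diagonals only, that the join takes each row from the matrix with the later interval for that process, and that the meet takes it from the matrix with the earlier one; the names `leq`, `join` and `meet` are hypothetical.

```python
def leq(a, b):
    """A precedes or equals B: compare the current state intervals (the diagonals)."""
    return all(a[i][i] <= b[i][i] for i in range(len(a)))


def join(a, b):
    """Row i comes from whichever matrix has the later interval for process i."""
    return [list(a[i]) if a[i][i] >= b[i][i] else list(b[i]) for i in range(len(a))]


def meet(a, b):
    """Row i comes from whichever matrix has the earlier interval for process i."""
    return [list(a[i]) if a[i][i] <= b[i][i] else list(b[i]) for i in range(len(a))]


A = [[2, 0],
     [1, 1]]
B = [[1, 0],
     [2, 3]]
upper = join(A, B)
lower = meet(A, B)
```

A and B are incomparable here (A is ahead on the first process, B on the second), yet both lie between `lower` and `upper`.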


The Model: Consistent System states

Consistent state
  All received messages were either
    • Sent in the current state of the sender, or
    • Can be deterministically sent in the future
  Messages not yet received are not a problem
Definition: $D = [\delta_{**}] \in S$, $\forall i, k = 1..n : \delta_{ik} \le \delta_{kk}$
  • A process cannot depend on a state interval of another process that has not been reached yet
$C = \{ D \in S \mid D \text{ is consistent} \}$
  • C is a sub-lattice of S – proof straightforward!
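The consistency definition is a single pass over the matrix. In this sketch no entry of a column may exceed that column's diagonal entry; the name `is_consistent` is hypothetical, and "no dependency" is encoded as `-1` so that it compares below every real interval index.

```python
BOTTOM = -1  # encoding of "no dependency"; below every real interval index


def is_consistent(D):
    """No process depends on an interval that its owner has not reached yet."""
    n = len(D)
    return all(D[i][k] <= D[k][k] for i in range(n) for k in range(n))


ok = [[2, 1, BOTTOM],
      [1, 3, 0],
      [BOTTOM, 2, 4]]

bad = [[2, 4, BOTTOM],   # p1 depends on interval 4 of p2 ...
       [1, 3, 0],        # ... but p2 has only reached interval 3
       [BOTTOM, 2, 4]]
```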


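The Model: Logging and Stability

$logged(i, \sigma)$
  Message that started state interval $\sigma$ of process i has been logged on stable storage
$checkpoint(i, \sigma)$
  A check-point containing the state of process i in state interval $\sigma$ exists on stable storage
  $checkpoint(i, 0)$ is implicit
Effective check-point for $\sigma$ on i is $checkpoint(i, \varepsilon)$, $\varepsilon \le \sigma$, $\varepsilon$ is maximal
$stable(i, \sigma) \iff \forall \alpha : \varepsilon < \alpha \le \sigma \; [logged(i, \alpha)]$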
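Stability of an interval can be decided from the check-points and logged messages of one process. The sketch assumes that an interval is stable when every interval after its effective check-point, up to and including the interval itself, was started by a logged message; the set-based encoding and the names `effective_checkpoint` and `stable` are hypothetical.

```python
def effective_checkpoint(checkpoints, sigma):
    """Largest check-pointed interval index not above sigma; interval 0 is implicit."""
    return max(e for e in checkpoints | {0} if e <= sigma)


def stable(checkpoints, logged, sigma):
    """True if interval sigma can be rebuilt from a check-point plus logged messages."""
    eps = effective_checkpoint(checkpoints, sigma)
    return all(alpha in logged for alpha in range(eps + 1, sigma + 1))


checkpoints = {3}        # checkpoint(i, 3) exists on stable storage
logged = {1, 2, 4, 6}    # messages that started these intervals are logged
stable_intervals = [s for s in range(7) if stable(checkpoints, logged, s)]
```

Interval 6 is not stable even though its own message is logged, because the message that started interval 5 is missing and replay from the check-point at 3 cannot get past it.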


The Model: Recoverable System states

Recoverable system state
  System state is consistent
  All current process states are stable
  $D = [\delta_{**}] \in S$
    • $recoverable(D) \iff D \in C \wedge \forall i : stable(i, \delta_{ii})$
$R = \{ D \in S \mid recoverable(D) \}$
  • R is a sub-lattice of S – proof straightforward!
Theorem: A single maximum recoverable state exists!
  Proof
    • $R \subseteq S$
    • $A \sqcup B \in R$ if $A, B \in R$; $A, B \preceq A \sqcup B$
    • Therefore the maximum is $\bigsqcup_{D \in R} D$, obviously unique!


The Model: Recoverable System states (cont.)

Current recovery state
  The maximum recoverable state at any time
  Never decreases
    • $D = [\delta_{**}]$: no state interval $\sigma$ of a process i with $\sigma \le \delta_{ii}$ is ever rolled back
    • Proof:
      • D will always remain consistent; each $\delta_{ii}$ will always remain stable
      • Since R is a lattice, any new state formed after D will be greater than D
      • In any new current recovery state: $\delta_{ii} \le$ state interval index, for each process i
      • Therefore, no state interval $\sigma \le \delta_{ii}$, for each i, will ever need to be rolled back!


The Model: Wrapup…

Corollary 1: If all messages received are eventually logged, no domino effect occurs
If $D = [\delta_{**}]$ is the current recovery state
  Corollary 2: Any messages sent by process i from a state interval $\sigma \le \delta_{ii}$ may be committed
  With $\varepsilon_i$ being the effective checkpoint of $\delta_{ii}$
    • Corollary 3: All previous checkpoints of process i may be discarded
    • Corollary 4: All messages that begin state intervals prior to $\varepsilon_i$ may be discarded


The Algorithm: Overview

Keep a current recovery state
On each new interval $\sigma$ for some process k becoming stable
  Try to improve the current recovery state, such that:
    • State of process k advances to $\sigma$
    • Add more state intervals from other processes to maintain consistency
    • Succeed if all such included intervals are stable


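The Algorithm: Basic implementation

Notation
  $D = [\delta_{**}]$ – the current recovery state
  $\sigma$ – state interval of process k becoming stable
  $d_k = \langle \varphi_* \rangle = \langle \varphi_1, \varphi_2, \varphi_3, \varphi_4, \dots, \varphi_n \rangle$, $\varphi_j = \bot, 0, 1, \dots$ – state of process k (dependence vector)
Algorithm
  if ($\sigma > \delta_{kk}$) {
    $\forall i : \delta_{ki} \leftarrow \varphi_i$                      // update row of D
    while ($\exists i, j : \delta_{ij} > \delta_{jj}$)
      if ($\exists \alpha \ge \delta_{ij} : stable(j, \alpha)$)        // $\alpha$ - an interval for j
        $\forall i : \delta_{ji} \leftarrow \varphi_i$                  // update row of D with $d_j$ for $\alpha$
      else
        fail
  }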
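The basic algorithm can be written as a short runnable sketch. Several things here are assumptions rather than the paper's implementation: the function name `try_advance`, the encoding of "no dependency" as `-1`, the lookup tables `vectors` (dependence vector of each process in each interval) and `stable` (stable interval indices per process), the choice of the minimum suitable stable interval, and working on a copy so that a failed attempt leaves the current recovery state untouched.

```python
BOTTOM = -1  # "no dependency"; below every interval index


def try_advance(D, k, sigma, vectors, stable):
    """Return a recovery state with process k in interval sigma, or None on failure.

    D        -- current recovery state (n x n list of lists), left unmodified
    vectors  -- vectors[j][a] is the dependence vector of process j in interval a
    stable   -- stable[j] is the set of stable interval indices of process j
    """
    if sigma <= D[k][k]:
        return None                       # sigma would not improve the state
    new = [list(row) for row in D]        # copy: a failed attempt must not change D
    new[k] = list(vectors[k][sigma])      # update row k of D
    n = len(new)
    while True:
        violation = next(((i, j) for i in range(n) for j in range(n)
                          if new[i][j] > new[j][j]), None)
        if violation is None:
            return new                    # consistent, and every row is a stable interval
        i, j = violation
        candidates = [a for a in stable[j] if a >= new[i][j]]
        if not candidates:
            return None                   # fail: no stable interval of j is late enough
        new[j] = list(vectors[j][min(candidates)])  # row j from its minimum such interval


# Two processes; interval 1 of each was started by a message from the other.
vectors = {
    0: {0: [0, BOTTOM], 1: [1, 0]},
    1: {0: [BOTTOM, 0], 1: [1, 1]},
}
D0 = [[0, BOTTOM], [BOTTOM, 0]]
blocked = try_advance(D0, 1, 1, vectors, {0: {0}, 1: {0, 1}})
advanced = try_advance(D0, 1, 1, vectors, {0: {0, 1}, 1: {0, 1}})
```

In the first call interval 1 of process 1 depends on interval 1 of process 0, which is not yet stable, so the attempt fails. Once that interval is stable, the same call pulls in row 0 and succeeds.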


The Algorithm: Some details

The chosen $\alpha$ should be the minimum stable state interval with $\alpha \ge \delta_{ij}$

The comparisons $\delta_{ij} > \delta_{jj}$ can be made in any order without affecting the final result

When state interval $\sigma$ of process k becomes stable, the algorithm finds some recoverable D with $\delta_{kk} = \sigma$

A stable state interval that was found unsuitable need not be checked again until the current recovery state advances

Corollary: When the recovery state advances from some D to D', the only rejected $\sigma$'s that need to be rechecked are those with a direct dependency on some $\alpha$ of any process i such that $\delta_{ii} < \alpha < \delta'_{ii}$


The Algorithm: Proof of Correctness

The algorithm presented always finds the current recovery state of the system
  Only finds recoverable system states
  Any such state found is greater than the previous recovery state
  Following the observations stated before, all possible new states are considered
  Therefore, the correct one is always found!


The Algorithm: Optimizations & Implementation

Optimization considerations
  Keeping work list of rows to update D
    • Keep only the one with max index
  Keeping only the diagonal of D
Implementation
  Provided in the paper
  Follows everything said till now
  Takes advantage of some specifics


Conclusions

General Model and Algorithm
  Work for both pessimistic and optimistic protocols
  The pessimistic case does not need the generality
Optimistic logging is desirable from a performance standpoint in low-failure environments
Unifies existing approaches to fault tolerance
  Check-pointing
  Message Logging
Results
  Existence of unique maximum recoverable state
  Never decreases (progress is being made)
  Domino effect cannot occur


Future work list…

Address non-determinism
  Switch between
    • Check-pointing for the non-deterministic part
    • Check-pointing + message logging elsewhere
Output-driven optimistic message logging and check-pointing
  Pay attention to communication of the results
Application specific knowledge