@spcl eth fault tolerance for remote memory access programming models...
Post on 04-Feb-2021
5 Views
Preview:
TRANSCRIPT
-
spcl.inf.ethz.ch
@spcl_eth
MACIEJ BESTA, TORSTEN HOEFLER
Fault Tolerance for Remote Memory Access
Programming Models
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory
Process p
A
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Process p Process q
A
BB
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Cray
BlueWaters
Process p Process q
A
BB
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Cray
BlueWaters
Process p Process q
A
BB
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Cray
BlueWaters
Process p Process q
A
BB
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Cray
BlueWaters
Process p Process q
A
BB
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Cray
BlueWaters
put
Process p Process q
A
BB
AA
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Cray
BlueWaters
put
Process p Process q
A
Bget
B
A
B
A
B
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Cray
BlueWaters
put
Process p Process q
A
Bget
B
A
B
flush
A
B
-
spcl.inf.ethz.ch
@spcl_eth
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Cray
BlueWaters
put
Process p Process q
A
Bget
B
A
B
flush
A
B
-
spcl.inf.ethz.ch
@spcl_eth
One-sided communication
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Cray
BlueWaters
put
Process p Process q
A
Bget
B
A
B
flush
A
B
-
spcl.inf.ethz.ch
@spcl_eth
One-sided communication
2
REMOTE MEMORY ACCESS (RMA) PROGRAMMING
Memory Memory
Cray
BlueWaters
put
Process p Process q
A
Bget
B
A
B
flush
A
B
no active
participation
-
spcl.inf.ethz.ch
@spcl_eth
3
REMOTE MEMORY ACCESS PROGRAMMING
Implemented in hardware in NICs in the majority of HPC
networks support RDMA
-
spcl.inf.ethz.ch
@spcl_eth
3
REMOTE MEMORY ACCESS PROGRAMMING
Implemented in hardware in NICs in the majority of HPC
networks support RDMA
-
spcl.inf.ethz.ch
@spcl_eth
3
REMOTE MEMORY ACCESS PROGRAMMING
Implemented in hardware in NICs in the majority of HPC
networks support RDMA
-
spcl.inf.ethz.ch
@spcl_eth
3
REMOTE MEMORY ACCESS PROGRAMMING
Implemented in hardware in NICs in the majority of HPC
networks support RDMA
-
spcl.inf.ethz.ch
@spcl_eth
3
REMOTE MEMORY ACCESS PROGRAMMING
Implemented in hardware in NICs in the majority of HPC
networks support RDMA
-
spcl.inf.ethz.ch
@spcl_eth
4
REMOTE MEMORY ACCESS PROGRAMMING
Supported by many HPC libraries and languages
-
spcl.inf.ethz.ch
@spcl_eth
4
REMOTE MEMORY ACCESS PROGRAMMING
Supported by many HPC libraries and languages
-
spcl.inf.ethz.ch
@spcl_eth
4
REMOTE MEMORY ACCESS PROGRAMMING
Supported by many HPC libraries and languages
-
spcl.inf.ethz.ch
@spcl_eth
4
REMOTE MEMORY ACCESS PROGRAMMING
Supported by many HPC libraries and languages
-
spcl.inf.ethz.ch
@spcl_eth
5
REMOTE MEMORY ACCESS PROGRAMMING
Enables significant speedups over message passing in
many types of applications, e.g.:
[1] R. Gerstenberger, M. Besta, T. Hoefler, Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One-Sided.
ACM/IEEE Supercomputing 2013, SC13, Best Paper Award
[2] D. Petrovic et al., High-performance RMA-based broadcast on the Intel SCC. SPAA’12
-
spcl.inf.ethz.ch
@spcl_eth
5
REMOTE MEMORY ACCESS PROGRAMMING
Enables significant speedups over message passing in
many types of applications, e.g.: Speedup of ~1.5 for communication patterns in graph analytics
[1] R. Gerstenberger, M. Besta, T. Hoefler, Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One-Sided.
ACM/IEEE Supercomputing 2013, SC13, Best Paper Award
[2] D. Petrovic et al., High-performance RMA-based broadcast on the Intel SCC. SPAA’12
-
spcl.inf.ethz.ch
@spcl_eth
5
REMOTE MEMORY ACCESS PROGRAMMING
Enables significant speedups over message passing in
many types of applications, e.g.: Speedup of ~1.5 for communication patterns in graph analytics
Speedup of ~1.4-2 in physics computations
[1] R. Gerstenberger, M. Besta, T. Hoefler, Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One-Sided.
ACM/IEEE Supercomputing 2013, SC13, Best Paper Award
[2] D. Petrovic et al., High-performance RMA-based broadcast on the Intel SCC. SPAA’12
-
spcl.inf.ethz.ch
@spcl_eth
6
FAULT TOLERANCE + RMA
-
spcl.inf.ethz.ch
@spcl_eth
6
15.8h of MTBF (for nodes) for the TSUBAME2
FAULT TOLERANCE + RMA
-
spcl.inf.ethz.ch
@spcl_eth
7
FAULT TOLERANCE + RMA
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
7
FAULT TOLERANCE + RMA
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
7
FAULT TOLERANCE + RMA
Message Passing
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
7
FAULT TOLERANCE + RMA
Message Passing
Coordinated
Checkpointing (CC)
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
7
FAULT TOLERANCE + RMA
Message Passing
Coordinated
Checkpointing (CC)
uncoordinated
checkpointing
and message
logging (UC)
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
7
FAULT TOLERANCE + RMA
Message Passing
Coordinated
Checkpointing (CC)
uncoordinated
checkpointing
and message
logging (UC)
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
Scarce research exists for fault tolerance for RMA
7
FAULT TOLERANCE + RMA
Message Passing
Coordinated
Checkpointing (CC)
uncoordinated
checkpointing
and message
logging (UC)
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
Scarce research exists for fault tolerance for RMA
7
FAULT TOLERANCE + RMA
RMAMessage Passing
Coordinated
Checkpointing (CC)
uncoordinated
checkpointing
and message
logging (UC)
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
Scarce research exists for fault tolerance for RMA
7
FAULT TOLERANCE + RMA
RMAMessage Passing
Coordinated
Checkpointing (CC)
uncoordinated
checkpointing
and message
logging (UC)
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
Scarce research exists for fault tolerance for RMA
7
FAULT TOLERANCE + RMA
RMAMessage Passing
Coordinated
Checkpointing (CC)
uncoordinated
checkpointing
and message
logging (UC)
logging memory
accesses vs. messages
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
Scarce research exists for fault tolerance for RMA
7
FAULT TOLERANCE + RMA
RMAMessage Passing
Coordinated
Checkpointing (CC)
uncoordinated
checkpointing
and message
logging (UC)
logging memory
accesses vs. messages
checkpointing in RMA-
based applications
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
Scarce research exists for fault tolerance for RMA
7
FAULT TOLERANCE + RMA
RMAMessage Passing
Coordinated
Checkpointing (CC)
uncoordinated
checkpointing
and message
logging (UC)
logging memory
accesses vs. messages
checkpointing in RMA-
based applications
fault tolerance
mechanisms and
schemes
-
spcl.inf.ethz.ch
@spcl_eth
Fault tolerance is well studied for message passing
Scarce research exists for fault tolerance for RMA
7
FAULT TOLERANCE + RMA
RMAMessage Passing
Coordinated
Checkpointing (CC)
uncoordinated
checkpointing
and message
logging (UC)
logging memory
accesses vs. messages
checkpointing in RMA-
based applications
performance
fault tolerance
mechanisms and
schemes
-
spcl.inf.ethz.ch
@spcl_eth
8
OVERVIEW OF OUR RESEARCH
-
spcl.inf.ethz.ch
@spcl_eth
Generic model
8
OVERVIEW OF OUR RESEARCH
-
spcl.inf.ethz.ch
@spcl_eth
Generic model
8
OVERVIEW OF OUR RESEARCH
-
spcl.inf.ethz.ch
@spcl_eth
Generic model
8
OVERVIEW OF OUR RESEARCH
-
spcl.inf.ethz.ch
@spcl_eth
Generic model
8
OVERVIEW OF OUR RESEARCH
-
spcl.inf.ethz.ch
@spcl_eth
CC in RMAGeneric model
8
OVERVIEW OF OUR RESEARCH
-
spcl.inf.ethz.ch
@spcl_eth
CC in RMAGeneric model
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA
-
spcl.inf.ethz.ch
@spcl_eth
CC in RMAGeneric model
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA Schemes
-
spcl.inf.ethz.ch
@spcl_eth
CC in RMAGeneric model
UC in RMA
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA Schemes
-
spcl.inf.ethz.ch
@spcl_eth
CC in RMAGeneric model
UC in RMA
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA
MP vs. RMA
Schemes
-
spcl.inf.ethz.ch
@spcl_eth
CC in RMAGeneric model
UC in RMA
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA
MP vs. RMA
Checkpointing
schemesSchemes
-
spcl.inf.ethz.ch
@spcl_eth
CC in RMAGeneric model
UC in RMA
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA
MP vs. RMA
Checkpointing
schemesSchemes
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA
MP vs. RMA
Checkpointing
schemesSchemes
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA
MP vs. RMA
Checkpointing
schemesSchemes
Basic scheme
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA
MP vs. RMA
Checkpointing
schemesSchemes
Basic scheme
Extended RMA
semantics
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA
MP vs. RMA
Checkpointing
schemesSchemes
Basic scheme
Extended RMA
semantics
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
MP vs. RMA
MP vs. RMA
Model extensions
Checkpointing
schemesSchemes
Basic scheme
Extended RMA
semantics
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
Distribution of
processes
MP vs. RMA
MP vs. RMA
Model extensions
Checkpointing
schemesSchemes
Basic scheme
Extended RMA
semantics
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
Distribution of
processes
MP vs. RMA
MP vs. RMA
Model extensions
Checkpointing
schemesSchemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
Distribution of
processes
MP vs. RMA
MP vs. RMA
Model extensions
Checkpointing
schemesSchemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
Distribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Checkpointing
schemesSchemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
Distribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMA
8
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
8
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
8
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
8
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
9
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
10
COORDINATED CHECKPOINTING (MP)
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
barrier com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
barrier com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
com
pute
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
barrier com
pute
com
pute
com
pute
com
pute
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
barrier com
pute
com
pute
com
pute
com
pute
global
rollback
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
barrier com
pute
com
pute
com
pute
com
pute
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
barrier com
pute
com
pute
com
pute
com
pute
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
barrier com
pute
com
pute
com
pute
com
pute
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
barrier com
pute
com
pute
com
pute
com
pute
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
barrier com
pute
com
pute
com
pute
com
pute
barrier
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
10
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
11
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
11
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
coordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
11
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
coordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
11
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
send
coordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
11
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
send
coordinated
checkpoint
recv
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
11
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
send
coordinated
checkpoint
recv
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
11
COORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
coordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
12
COORDINATED CHECKPOINTING (RMA)
Proc k Proc 1Proc 1 ... ... Proc k...
coordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
12
COORDINATED CHECKPOINTING (RMA)
Proc k Proc 1Proc 1 ... ... Proc k...
coordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
12
COORDINATED CHECKPOINTING (RMA)
Proc k Proc 1Proc 1 ... ... Proc k...
put
coordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
12
COORDINATED CHECKPOINTING (RMA)
Proc k Proc 1Proc 1 ... ... Proc k...
put
coordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
12
COORDINATED CHECKPOINTING (RMA)
Proc k Proc 1Proc 1 ... ... Proc k...
put
coordinated
checkpoint
flush
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
12
COORDINATED CHECKPOINTING (RMA)
Proc k Proc 1Proc 1 ... ... Proc k...
put
coordinated
checkpoint
flush
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
13
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
13
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
13
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A
B
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A
B
actions are
non-blocking
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A
B
actions are
non-blocking
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A
B
actions are
non-blocking
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B
actions are
non-blocking
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B
C
actions are
non-blocking
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B
C
D
actions are
non-blocking
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B
C
D
actions are
non-blocking
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B CD
actions are
non-blocking
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B CD
E
F
actions are
non-blocking
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B CD
E
F
actions are
non-blocking
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B CD E F
actions are
non-blocking
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B CD E F
actions are
non-blockingEpoch 0
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B CD E F
actions are
non-blockingEpoch 0
Epoch 1
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
14
RMA: EPOCHS
Proc p Proc q
memorymemory
A B C D E F
A B CD E F
actions are
non-blockingEpoch 0
Epoch 1
Epoch 2
data will be valid
upon synchronizing
memories
-
spcl.inf.ethz.ch
@spcl_eth
15
RMA: EPOCHS
Proc p Proc q
-
spcl.inf.ethz.ch
@spcl_eth
15
RMA: EPOCHS
Proc p Proc q
X
Y
Z
-
spcl.inf.ethz.ch
@spcl_eth
15
RMA: EPOCHS
Proc p Proc q
Epoch 0
X
Y
Z
-
spcl.inf.ethz.ch
@spcl_eth
15
RMA: EPOCHS
Proc p Proc q
Epoch 0
Epoch 1
X
Y
Z
-
spcl.inf.ethz.ch
@spcl_eth
15
RMA: EPOCHS
Proc p Proc q
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
B
C
E
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
B
C
E
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
B
C
E
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
B
C
E
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
co
B
C
B Cput put
E
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
co
B
C
B Cput put
E
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
co
B
C
B Cput put
E
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
co
co
B
C
B Cput put
E Fget getE
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
co
co
For recovery, a
process has to
replay actions
in the correct
order! co
B
C
B Cput put
E Fget getE
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
co
co
For recovery, a
process has to
replay actions
in the correct
order! co
For this,
we use epoch
counters (EC)
B
C
B Cput put
E Fget getE
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
co
co
For recovery, a
process has to
replay actions
in the correct
order! co
For this,
we use epoch
counters (EC)
memory
EC
B
C
B Cput put
E Fget getE
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
co
co
For recovery, a
process has to
replay actions
in the correct
order! co
For this,
we use epoch
counters (EC)
memory
EC 0
B
C
B Cput put
E Fget getE
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
co
co
For recovery, a
process has to
replay actions
in the correct
order! co
For this,
we use epoch
counters (EC)
memory
EC 1
B
C
B Cput put
E Fget getE
F
-
spcl.inf.ethz.ch
@spcl_eth
16
RMA: THE CONSISTENCY ORDER
Proc p Proc q
A
D
Epoch 0
Epoch 1
Epoch 2
co
co
co
For recovery, a
process has to
replay actions
in the correct
order! co
For this,
we use epoch
counters (EC)
memory
EC 2
B
C
B Cput put
E Fget getE
F
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
17
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
17
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
17
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
uncoordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
uncoordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
uncoordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
uncoordinated
checkpoint
rollback
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
uncoordinated
checkpoint
dependency
rollback
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
uncoordinated
checkpoint
dependency
rollback
rollback
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
uncoordinated
checkpoint
dependency
dependency
rollback
rollback
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
uncoordinated
checkpoint
dependency
dependency
rollback
rollback rollback
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
uncoordinated
checkpoint
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
uncoordinated
checkpoint
logging a
message
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
local
rollback
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
get the log and
replay the message
-
spcl.inf.ethz.ch
@spcl_eth
Node 1 Node N
18
UNCOORDINATED CHECKPOINTING (MP)
Proc k Proc 1Proc 1 ... ... Proc k...
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
memoryCC
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
memoryCC
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
memoryCC
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
memory
Logs of puts
CC
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
record the put
memory
Logs of puts
CC
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
record the put
memory
Logs of puts
CC
Data is valid
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
record the put
memory
Logs of puts
C
C
Data is valid
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
record the put
memory
Logs of puts
C + #put
C
Data is valid
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
record the put
memory
Logs of puts
C + #put
C
Data is valid
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
EC 1
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
record the put
memory
Logs of puts
C + #put
+ EC( )
C
Data is valid
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
EC 1
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
record the put
memory
Logs of puts
C + #put
+ EC( )
C
Data is valid
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
EC 1
1
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
record the put
memory
Logs of puts
C + #put
+ EC( )
C
Data is valid
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
EC 1
1
-
spcl.inf.ethz.ch
@spcl_eth
19
RMA: LOGGING PUTS
Proc p Proc q
C
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
1
-
spcl.inf.ethz.ch
@spcl_eth
20
RMA: LOGGING GETS
Proc p Proc q
C
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
20
RMA: LOGGING GETS
Proc p Proc q
A
B
C
D
E
F
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
20
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
memory memoryD
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
memory
Logs of #gets
memoryD
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
memory
Logs of #gets
memoryD
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
memory
Logs of #gets
memoryD
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
memory
Logs of #gets
memoryD
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
memory
Logs of #gets
Data is not
yet valid
memoryD
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
Data is not
yet valid
memoryD
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get
Data is not
yet valid
memoryD
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get + EC( )
Data is not
yet valid
EC 1memory
D1
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get + EC( )
Data is not
yet valid
EC 1memory
D
1
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get + EC( )
Data is not
yet valid
EC 1memory
D
1...
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get + EC( )
Data is not
yet valid
EC 1memory
D
1...
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get + EC( )
Data is not
yet valid
EC 1memory
DD
1...
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get + EC( )
EC 1memory
DD
Data is valid
1...
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get + EC( )
EC 1memory
D
Now this data
may be invalid
Data is valid
1...
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get + EC( )
EC 1memory
D
Now this data
may be invalid
Data is valid
1...
Logs of gets
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get + EC( )
EC 1memory
+ #get
+ EC( )
D
Now this data
may be invalid
Data is valid
1...
Logs of gets
D
1
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
#get + EC( )
EC 1memory
+ #get
+ EC( )
D
Now this data
may be invalid
Data is valid
delete #get
and EC
1...
Logs of gets
D
1
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
D
𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔,… , 𝑑𝑎𝑡𝑎
#𝑎 = 𝑠𝑟𝑐, 𝑡𝑟𝑔, …
log(#get + EC)
memory
Logs of #gets
EC 1memory
+ #get
+ EC( )
D
Now this data
may be invalid
Data is valid
delete #get
and EC
...Logs of gets
D
1
-
spcl.inf.ethz.ch
@spcl_eth
21
RMA: LOGGING GETS
Proc p Proc q
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
22
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
22
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
22
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
App. dataChpt. data
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
App. data
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
Stage 2: replay actions beyond the checkpoint
App. data
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
Stage 2: replay actions beyond the checkpoint
X
Y
Z
App. data
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
Stage 2: replay actions beyond the checkpoint
X
Y
Z
App. data
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
Stage 2: replay actions beyond the checkpoint
X
Y
App. data
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
Stage 2: replay actions beyond the checkpoint
X
Y
App. data
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
memory
Logs of puts
Stage 2: replay actions beyond the checkpoint
X
Y
App. data
DataPX + #put + EC( )0
Y + #put + EC( )0
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
memory
Logs of puts
Stage 2: replay actions beyond the checkpoint
X
Y
App. data
DataP
Epoch 0
X + #put + EC( )0
Y + #put + EC( )0
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
memory
Logs of puts
Stage 2: replay actions beyond the checkpoint
X
Y
X + #put + EC( )0
Y + #put + EC( )0
App. data
DataP
Epoch 0
X + #put + EC( )0
Y + #put + EC( )0
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
memory
Logs of puts
Stage 2: replay actions beyond the checkpoint
X
Y
X + #put + EC( )0
Y + #put + EC( )0
App. data
DataP
Epoch 0
X + #put + EC( )0
Y + #put + EC( )0
-
spcl.inf.ethz.ch
@spcl_eth
memorymemory
23
RMA: RECOVERY
Proc p Proc q
memory
Logs of puts
Stage 2: replay actions beyond the checkpoint
X + #put + EC( )0
Y + #put + EC( )0
App. data
DataPX + #put + EC( )0
Y + #put + EC( )0
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
Stage 2: replay actions beyond the checkpoint
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
A
B
C
D
E
F
Stage 2: replay actions beyond the checkpoint
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
A
B
C
D
E
F
Stage 2: replay actions beyond the checkpoint
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
D
E
F
Stage 2: replay actions beyond the checkpoint
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
D
E
F
Stage 2: replay actions beyond the checkpoint
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
D
E
F
Logs of gets
D + #get + EC( )1
E + #get + EC( )2
F + #get + EC( )2
Stage 2: replay actions beyond the checkpoint
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
D
E
F
Logs of gets
D + #get + EC( )1
E + #get + EC( )2
F + #get + EC( )2
Stage 2: replay actions beyond the checkpoint
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
D
E
F
Logs of gets
D + #get + EC( )1
E + #get + EC( )2
F + #get + EC( )2
Stage 2: replay actions beyond the checkpoint
DataP
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
D
E
F
Logs of gets
E + #get + EC( )2
F + #get + EC( )2
D + #get + EC( )1
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
D
E
F
Logs of gets
E + #get + EC( )2
F + #get + EC( )2
D + #get + EC( )1
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
D
E
F
Logs of gets
D + #get + EC( )1
E + #get + EC( )
F + #get + EC( )2
2
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
E + #get + EC( )2
F + #get + EC( )2
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
updating
state...
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
D
E
F
Logs of gets
D + #get + EC( )1
E + #get + EC( )
F + #get + EC( )2
2
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
E + #get + EC( )2
F + #get + EC( )2
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
updating
state...
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
X + #put + EC( )0
Y + #put + EC( )0
D
E
F
Logs of gets
D + #get + EC( )1
E + #get + EC( )
F + #get + EC( )2
2
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
E + #get + EC( )2
F + #get + EC( )2
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
updating
state...
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
D
E
F
Logs of gets
D + #get + EC( )1
E + #get + EC( )
F + #get + EC( )2
2
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
E + #get + EC( )2
F + #get + EC( )2
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
updating
state...
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
D
E
F
Logs of gets
D + #get + EC( )1
E + #get + EC( )
F + #get + EC( )2
2
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
E + #get + EC( )2
F + #get + EC( )2
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
updating
state...
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
D
E
F
Logs of gets
E + #get + EC( )
F + #get + EC( )2
2
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
E + #get + EC( )2
F + #get + EC( )2
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
updating
state...
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
D
E
F
Logs of gets
E + #get + EC( )
F + #get + EC( )2
2
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
E + #get + EC( )2
F + #get + EC( )2
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
updating
state...
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
D
E
F
Logs of gets
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
E + #get + EC( )2
F + #get + EC( )2
-
spcl.inf.ethz.ch
@spcl_eth
memory memory
24
RMA: RECOVERY
Proc p Proc q
Logs of puts
X + #put + EC( )0
Y + #put + EC( )0
App. data
D
E
F
Logs of gets
state
restored!
Stage 2: replay actions beyond the checkpoint
DataP
D + #get + EC( )1D
E + #get + EC( )2
F + #get + EC( )2
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
25
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
25
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
Holistic fault-tolerance library
Topology-awareness
CC in RMAGeneric model
UC in RMA
Recovery in RMAProofs
25
OVERVIEW OF OUR RESEARCH
PerformanceDistribution of
processesDesign
MP vs. RMA
MP vs. RMA
Model extensions
Deadlock freedom
Correct recovery
Checkpointing
schemes
Optimizations
Checkpoints
on demand
Schemes
Basic scheme
Extended RMA
semantics
Decreasing
failure prob.
Logging accesses
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Today’s supercomputers have a hierarchical layout
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Today’s supercomputers have a hierarchical layout
A Cray XE/XT
supercomputer
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Today’s supercomputers have a hierarchical layout
...4 cabinets:
A Cray XE/XT
supercomputer
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Today’s supercomputers have a hierarchical layout
...4 cabinets:
3 chassis: ...
A Cray XE/XT
supercomputer
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Today’s supercomputers have a hierarchical layout
...4 cabinets:
3 chassis: ...
A Cray XE/XT
supercomputer
...8 blades:
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Today’s supercomputers have a hierarchical layout
...4 cabinets:
3 chassis:
4 nodes:
...
...
A Cray XE/XT
supercomputer
...8 blades:
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Today’s supercomputers have a hierarchical layout
...4 cabinets:
3 chassis:
4 nodes:
...
...
...
A Cray XE/XT
supercomputer
...8 blades:
32 cores:
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Today’s supercomputers have a hierarchical layout
A single hardware crash may kill multiple processes
...4 cabinets:
3 chassis:
4 nodes:
...
...
...
A Cray XE/XT
supercomputer
...8 blades:
32 cores:
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Today’s supercomputers have a hierarchical layout
A single hardware crash may kill multiple processes
...4 cabinets:
3 chassis:
4 nodes:
...
...
...
A Cray XE/XT
supercomputer
...8 blades:
32 cores:
Up to 128 process
failures (assuming
1 process per core)
-
spcl.inf.ethz.ch
@spcl_eth
26
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Today’s supercomputers have a hierarchical layout
A single hardware crash may kill multiple processes
Introduced protocols usually cannot handle > 1 process crash
...4 cabinets:
3 chassis:
4 nodes:
...
...
...
A Cray XE/XT
supercomputer
...8 blades:
32 cores:
Up to 128 process
failures (assuming
1 process per core)
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
G
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
G
A
B
C
Application data
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
G
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
Application data
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
G
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
Y
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
Y
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
Y
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
Parity data
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
Y
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
Y
D
E
F
S
T
G
H
I
U
W
J
K
L
Z
V
M
N
O
1
2
P
Q
R
3
4
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
X
Y
D
E
F
S
T
G
H
I
U
W
J
K
L
Z
V
M
N
O
1
2
P
Q
R
3
4
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
X
Y
D
E
F
S
T
G
H
I
U
W
J
K
L
Z
V
M
N
O
1
2
P
Q
R
3
4
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
X
Y
D
E
F
S
T
G
H
I
U
W
J
K
L
Z
V
M
N
O
1
2
P
Q
R
3
4
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
X
Y
D
E
F
S
T
G
H
I
U
W
J
K
L
Z
V
M
N
O
1
2
P
Q
R
3
4
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
Y
D
E
F
S
T
G
H
I
U
W
J
K
L
Z
V
M
N
O
1
2
P
Q
R
3
4
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
Y
D
E
F
S
T
G
H
I
U
W
J
K
L
Z
V
M
N
O
1
2
P
Q
R
3
4
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
Y
D
E
F
S
T
G
H
I
U
W
J
K
L
Z
V
M
N
O
1
2
P
Q
R
3
4
-
spcl.inf.ethz.ch
@spcl_eth
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
27
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 1: groups of processes: Divide processes into groups of size G each
Add m parity processes to each group to store the parity data
m
G
A
B
C
X
Y
D
E
F
S
T
G
H
I
U
W
J
K
L
Z
V
M
N
O
1
2
P
Q
R
3
4
-
spcl.inf.ethz.ch
@spcl_eth
28
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
A D G J M P
Y T W V 2 4
Q
R
3
B
X
E
F
S
H
I
U
K
L
Z
N
O
1
C
-
spcl.inf.ethz.ch
@spcl_eth
28
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 2: topology-aware distribution of groups:
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
A D G J M P
Y T W V 2 4
Q
R
3
B
X
E
F
S
H
I
U
K
L
Z
N
O
1
C
-
spcl.inf.ethz.ch
@spcl_eth
28
EXTENDING THE PROTOCOLS FOR MORE RESILIENCE
Step 2: topology-aware distribution of groups: For example, apply topology-awareness at the level of blades...
Proc 5 Proc 6
Proc 7 Proc 8 Proc 11 Proc 12
Proc 13 Proc 14 Proc 17 Proc 18
Proc 19 Proc 20 Proc 23 Proc 24
Proc 19 Proc 20 Proc 21 Proc 22 Proc 23 Proc 24
Proc 1 Proc 2 Proc 3 Proc 4
Proc 9 Proc 10
Proc 15 Proc 16
Proc 21 Proc 22
A D G J M P
Y T W V 2 4
Q
R
3
B
X
E
F
S
H
I
U
K
L
Z
top related