A Programming Model for Failure-Prone, Collaborative Robots
Nels Eric Beckman, Jonathan Aldrich
School of Computer Science, Carnegie Mellon University
SDIR 2007, April 14th, 2007
SDIR07: A Programming Model for Failure-Prone Collaborative Robots
Failure Blocks: Increasing Application Liveness
• In the Claytronics domain, failure will be commonplace.
• Certain applications:
  • Failure of one catom causes others to be useless.
• Our model:
  • An extension to remote procedure calls.
  • Helps developers preserve liveness.
  • Developers
    • Signify where liveness is a concern.
    • Specify liveness-preserving actions.
  • When failure is automatically detected, those actions are taken.
Outline
• In our domain, the rate of failure will be high
• ‘Hole Motion’ (An example failure scenario)
• Existing RPC systems do not help us to preserve liveness
• Our model has two key pieces
  • The failure block
  • The compensating action
Rate of Failure in Catoms will be High
• Due to the large numbers involved:
  • Per-unit cost must be low, which implies
    • A lack of hardware error detection features.
    • Rate of mechanical imperfections will be high.
  • Probability of some catom failing becomes high.
• Interaction with the physical world:
  • Dust particles?
  • Other unintended interactions?
The Hole Motion* Algorithm
• A Motion-Planning Technique
• The Idea:
  • Randomly send holes through the mass of catoms.
  • Holes ‘stick’ to areas that should shrink.
  • They are more likely to be created from areas that should grow.
*De Rosa, Goldstein, Lee, Campbell, Pillai. Scalable Shape Sculpting Via Hole Motion: Motion Planning in Lattice-Constrained Modular Robots. IEEE ICRA 2006. May 2006.
In Detail...
• At each ‘hole time-step,’ catoms around the hole have a leader.
• They only accept commands from this leader.
• This protects the hole’s integrity.
[Figure: a hole ringed by catoms, with the leader labeled L.]
In Detail...
• In order to become the ‘leader,’ this catom calls ‘setLeader’ on its neighbors.
• The same method is called recursively on other would-be group members.
nextCatom->setLeader(me);
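The recursive call above can be sketched as follows. This is a minimal illustration in Python rather than the slides' own notation; the `Catom` class, the `next_catom` link, the ring layout, and the catom names are assumptions made for the example, not part of the original algorithm.

```python
class Catom:
    """Hypothetical catom on the ring around a hole."""

    def __init__(self, name):
        self.name = name
        self.leader = None       # catom this one accepts commands from
        self.next_catom = None   # next would-be group member on the ring

    def set_leader(self, leader):
        # Adopt the leader, then pass the same call on around the ring.
        self.leader = leader
        nxt = self.next_catom
        if nxt is not None and nxt is not leader:
            nxt.set_leader(leader)


def make_ring(names):
    """Link catoms into a closed ring and return them in order."""
    ring = [Catom(n) for n in names]
    for catom, nxt in zip(ring, ring[1:] + ring[:1]):
        catom.next_catom = nxt
    return ring


ring = make_ring(["L", "g", "h", "i", "j"])
leader = ring[0]
leader.next_catom.set_leader(leader)   # nextCatom->setLeader(me)
followers = [c.name for c in ring if c.leader is leader]
print(followers)
```

Each catom's call returns only after the rest of the ring has been visited, so every catom on the chain is on the call stack while the recursion is in progress.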
In Detail...
• Catom on the stack fails:
  • Catoms i and j may have already set L as their leader!
  • But the only communication path to L is gone.
[Figure: leader L and catoms g, i, and j, with the call nextCatom->setLeader(me) in progress.]
In Detail...
• Instead, suppose operation returned normally...
In Detail...
• Now L fails:
  • Catoms g-j (and all the rest) expect commands from L!
  • For all practical purposes, 12 catoms have failed.
[Figure: leader L and group members g, h, i, and j around the hole.]
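The effect of the leader's failure can be sketched as follows, assuming each catom simply drops any command that does not come from its leader. The class, the `usable` helper, and the four-member group are hypothetical and chosen only to keep the Python example small.

```python
class Catom:
    """Hypothetical catom that obeys only its current leader."""

    def __init__(self, name):
        self.name = name
        self.leader = None
        self.alive = True
        self.moves = 0

    def command(self, sender):
        """Act only on commands from the current leader."""
        if not self.alive or sender is not self.leader:
            return False
        self.moves += 1
        return True


def usable(catoms):
    """Catoms that are alive and whose leader (if any) is alive."""
    return [c.name for c in catoms
            if c.alive and (c.leader is None or c.leader.alive)]


leader = Catom("L")
group = [Catom(n) for n in "ghij"]
for c in group:
    c.leader = leader

outsider = Catom("x")
refused = [c.command(outsider) for c in group]   # only L may command them
leader.alive = False                             # L fails
still_usable = usable([leader] + group)
print(refused, still_usable)
```

Nothing in this sketch ever resets `leader`, so the healthy group members stay unusable indefinitely. That is the liveness problem the rest of the talk addresses.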
Existing RPC Systems, Not a Perfect Fit
• Weak Failure Detection
  • Usually a timeout mechanism.
  • Our model uses active failure detection.
• No Callee-Side Failure Handling
  • Caller can catch timeout exception; not callee.
  • But the callee could be left in an invalid state.
  • Our model provides callee with compensating actions.
Existing RPC Systems, Not a Perfect Fit
• Only detect failure on the stack of RPC calls.
  • Our model designates catoms as being a part of the group for a lexical ‘amount of time.’
  • They are still a part of this group when the thread moves to a different location.
  • Failures on the stack and off are dealt with in the same manner.
The Model: Two Key Pieces
• fail_block, which specifies
  • The logical ‘time period’ during which liveness concerns exist
  • The members of the group (implicitly)
  • Where control should return in the event of a failure
• push_comp, which allows
  • The specification of code to be executed in the event of catom failure
The fail_block Primitive
• fail_block b
• Evaluates the code in block b.
• In the event of a detected failure
  • The entire block throws an exception.
  • Execution continues from the catom where the failure block is evaluated.
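The control-flow rule above can be sketched as follows, in Python rather than the slides' notation. `OpFailure` is the exception name used later in the slides. `CatomFailed`, the function-style `fail_block(body)`, and the logged strings are assumptions for illustration, and real failure detection is replaced here by an explicit `raise`.

```python
class CatomFailed(Exception):
    """Hypothetical signal that the runtime detected a failed catom."""


class OpFailure(Exception):
    """Raised by the failure block as a whole."""


def fail_block(body):
    """Run body; any detected failure aborts the entire block."""
    try:
        return body()
    except CatomFailed as err:
        # Control returns to the catom that evaluated the block.
        raise OpFailure(str(err)) from err


log = []


def set_leaders():
    log.append("lnode.setBoss")
    raise CatomFailed("rnode stopped responding")


try:
    fail_block(set_leaders)
    log.append("block finished")
except OpFailure as err:
    log.append(f"recovered at origin: {err}")
print(log)
```

However deep inside the block the failure is noticed, the caller sees a single exception at the point where the block was evaluated.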
The fail_block Primitive
• At runtime, the entire operation is given a unique ‘operation ID.’
• When an RPC is called from within the block
  • Callee becomes ‘part’ of the operation.
  • Callee and caller add one another as collaborators.
  • They ‘ping’ each other regularly to detect failure.
  • Applies recursively.
• In the event a failure is detected, they share the information about the demise of that operation.
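The bookkeeping described above might look roughly like this. The sketch is in Python and simulates pinging with a direct liveness check. The `Host` class, its method names, the counter-based operation ID, and the four-host topology are all assumptions for illustration, not the authors' implementation.

```python
import itertools

_next_oid = itertools.count(1)   # assumed source of unique operation IDs


class Host:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.collaborators = {}   # oid -> set of hosts to ping
        self.failed_ops = set()   # oids this host knows have failed

    def call(self, oid, callee):
        """An RPC inside a failure block: both sides become collaborators."""
        self.collaborators.setdefault(oid, set()).add(callee)
        callee.collaborators.setdefault(oid, set()).add(self)

    def ping_round(self, oid):
        """Ping collaborators; a missed ping marks the operation failed."""
        if not self.alive or oid in self.failed_ops:
            return
        if any(not peer.alive for peer in self.collaborators.get(oid, ())):
            self.notify_failed(oid)

    def notify_failed(self, oid):
        """Record the failure and pass the news to live collaborators."""
        if not self.alive or oid in self.failed_ops:
            return
        self.failed_ops.add(oid)
        for peer in self.collaborators.get(oid, ()):
            peer.notify_failed(oid)


h1, h2, h3, h4 = (Host(n) for n in "1234")
oid = next(_next_oid)
h1.call(oid, h2)      # caller 1 -> callee 2
h2.call(oid, h3)      # applies recursively: 2 -> 3
h1.call(oid, h4)      # caller 1 -> callee 4

h3.alive = False      # host 3 fails
for h in (h1, h2, h4):
    h.ping_round(oid)
informed = sorted(h.name for h in (h1, h2, h4) if oid in h.failed_ops)
print(informed)
```

Only host 2 pings host 3 directly, yet hosts 1 and 4 also learn that the operation is dead, because the news travels along the collaborator links.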
The fail_block Primitive
• If b is successfully executed
  • An ‘end’ message is sent out.
  • Collaborators stop detecting failure for that OID.
‘Demo’
fail_block { // catom 1
  lnode->setBoss(this);
  rnode->setBoss(this);
}
...
setBoss(catom h) {
  myLeader = h;
}
[Figure: catoms 1, 2, 3, and 4, joined by ‘Failure Detect’ links. Successive frames show the operation's state:]
• Before the block runs: Group Members: {}, Op ID: (none)
• fail_block entered on catom 1: Group Members: {1}, Op ID: 23423123
• After the first setBoss call: Group Members: {1,2}
• After the second setBoss call: Group Members: {1,2,4}
• Block completes: ‘end 23423123’ is sent out.
The push_comp Primitive
• push_comp b
• On whichever catom it is called:
  • Suspend code in block b.
  • This code will be evaluated (purely for its side-effects) in the event that a failure is detected.
  • Called ‘compensating actions*’ or ‘compensations.’
*Westley Weimer and George C. Necula. Finding and preventing runtime error handling mistakes. In OOPSLA ’04: Proceedings of the 19th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, pages 419–431, New York, NY, USA, 2004. ACM Press.
The push_comp Primitive
• Each catom has several stacks of compensations, one for each OID, and compensating actions are executed from top to bottom.
[Figure: the compensation stacks on one catom, labeled OID: i, OID: j, ..., OID: n. Successive frames show:]
• push_comp e_1, push_comp e_2, and push_comp e_3 place e_1, e_2, and e_3 on the stack for OID i, with e_3 on top.
• FAILURE OID i!!
• e_3.run(), then e_2.run(), then e_1.run().
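The per-OID stacks can be sketched as follows, in Python. The `CompensationStacks` class and its method names are hypothetical. Only the behaviour (one stack per operation ID, run from top to bottom on failure) comes from the slides.

```python
from collections import defaultdict


class CompensationStacks:
    """Per-catom store: one stack of compensations per operation ID."""

    def __init__(self):
        self._stacks = defaultdict(list)

    def push_comp(self, oid, action):
        """Suspend `action` until operation `oid` fails (if it ever does)."""
        self._stacks[oid].append(action)

    def on_failure(self, oid):
        """Run the compensations for `oid` from top to bottom."""
        stack = self._stacks.pop(oid, [])
        while stack:
            stack.pop()()   # most recently pushed runs first


ran = []
stacks = CompensationStacks()
for name in ("e_1", "e_2", "e_3"):
    stacks.push_comp("i", lambda name=name: ran.append(name))
stacks.push_comp("j", lambda: ran.append("other operation"))

stacks.on_failure("i")   # FAILURE OID i
print(ran)
```

Running in reverse push order means the most recent state change is undone first. The stack for OID j is untouched by a failure of OID i.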
‘Demo,’ Continued
fail_block { // catom 1
  lnode->recurse(this, LEFT);
  rnode->recurse(this, RIGHT);
}
recurse(catom ldr, Dir d) {
  myLeader = ldr;
  push_comp(myLeader = -1);
  ...
  nnode->recurse(ldr, d);
}
[Figure: catoms 1, 2, 3, and 4. In successive frames a catom fails and FAIL messages spread through the group.]
‘Demo,’ Continued
try {
  fail_block { // catom 1
    lnode->recurse(this, LEFT);
    rnode->recurse(this, RIGHT);
  }
} catch (OpFailure) { ... }
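Putting the two primitives together, the demo might behave roughly as in the Python sketch below. It makes several simplifying assumptions: a failed catom is noticed when it is called rather than by pinging, `None` stands in for the slides' `-1`, and the `Catom` class, the `members` list, and the wiring of catoms 1 to 4 are hypothetical.

```python
class OpFailure(Exception):
    """Raised to the catom that evaluated the failure block."""


class Catom:
    def __init__(self, name, log):
        self.name = name
        self.alive = True
        self.my_leader = None
        self.next_node = None
        self.comps = []      # compensations pushed during the operation
        self.log = log

    def reset_leader(self):
        self.my_leader = None            # the slides use -1 for "no leader"
        self.log.append(f"{self.name} reset")

    def recurse(self, members, leader):
        if not self.alive:
            # Assumption: a dead callee is noticed at call time.
            raise OpFailure(f"catom {self.name} failed")
        members.append(self)             # callee joins the operation
        self.my_leader = leader
        self.comps.append(self.reset_leader)   # push_comp(myLeader = -1)
        if self.next_node is not None:
            self.next_node.recurse(members, leader)


def fail_block(origin, body):
    members = [origin]
    try:
        body(members)
    except OpFailure:
        for member in members:
            while member.comps:
                member.comps.pop()()     # top of the stack first
        raise
    for member in members:               # assumed: success discards them
        member.comps.clear()


log = []
c1, c2, c3, c4 = (Catom(n, log) for n in "1234")
c4.next_node = c3
c3.alive = False                         # catom 3 has failed


def body(members):
    c2.recurse(members, c1)    # lnode->recurse(this, LEFT)
    c4.recurse(members, c1)    # rnode->recurse(this, RIGHT)


try:
    fail_block(c1, body)
    outcome = "completed"
except OpFailure:
    outcome = "operation failed"
print(outcome, log)
```

Catoms 2 and 4 had already accepted catom 1 as leader when the failure surfaced. Their compensations free them again, and catom 1 is told that the operation did not complete.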
Conclusion
• Failure Blocks
  • An extension to RPC for recovering from node failures.
  • Within a failure block
    • RPC calls add the callee to the current operation.
    • Callee and caller detect failure in one another.
    • Compensating actions can be stored, executed in the event of failure.
  • Targeted at modular robotic systems where failure is high but availability is important.
References
N. Beckman and J. Aldrich. A Programming Model for Failure-Prone, Collaborative Robots. To appear in the 2nd International Workshop on Software Development and Integration in Robotics (SDIR). Rome, Italy. April 14, 2007.
De Rosa, Goldstein, Lee, Campbell, Pillai. Scalable Shape Sculpting Via Hole Motion: Motion Planning in Lattice-Constrained Modular Robots. IEEE ICRA 2006. May 2006.
Achour Mostefaoui, Eric Mourgaya, and Michel Raynal. Asynchronous implementation of failure detectors. In 2003 International Conference on Dependable Systems and Networks (DSN’03), page 351, 2003.
Westley Weimer and George C. Necula. Finding and preventing runtime error handling mistakes. In OOPSLA ’04: Proceedings of the 19th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, pages 419–431, New York, NY, USA, 2004. ACM Press.
Michel Raynal. A short introduction to failure detectors for asynchronous distributed systems. SIGACT News, 36(1):53–70, 2005.
The end
Scenario One
[Figure: hosts 1, 2, and 3 in a row. In successive frames the calls host2->foo() and host3->bar() appear in turn.]
‘Demo’
fail_block {(* host 1 *)host2->foo(); host4->bar();
}...foo() {host3->doWork(h1);
}
1
4
2 3
Group Members: {}
Op ID:
Regular Ping
![Page 53: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/53.jpg)
SDIR07: A Programming Model for Failure-Prone Collaborative Robots
53
‘Demo’
fail_block {(* host 1 *)host2->foo(); host4->bar();
}...foo() {host3->doWork(h1);
}
1
4
2 3
Group Members: {1}
Op ID: 3435435
Regular Ping
![Page 54: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/54.jpg)
SDIR07: A Programming Model for Failure-Prone Collaborative Robots
54
‘Demo’
fail_block {(* host 1 *)host2->foo(); host4->bar();
}...foo() {host3->doWork(h1);
}
1
4
2 3
Group Members: {1,2}
Op ID: 3435435
Regular Ping
![Page 55: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/55.jpg)
SDIR07: A Programming Model for Failure-Prone Collaborative Robots
55
‘Demo’
fail_block {(* host 1 *)host2->foo(); host4->bar();
}...foo() {host3->doWork(h1);
}
1
4
2 3
Group Members: {1,2,3}
Op ID: 3435435
Regular Ping
![Page 56: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/56.jpg)
‘Demo’
fail_block { (* host 1 *)
    host2->foo();
    host4->bar();
}
...
foo() {
    host3->doWork(h1);
}
[Diagram: hosts 1, 2, 3, and 4]
Group Members: {1,2,3,4}
Op ID: 3435435
Regular Ping
![Page 57: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/57.jpg)
‘Demo’
fail_block { (* host 1 *)
    host2->foo();
    host4->bar();
}
...
foo() {
    host3->doWork(h1);
}
[Diagram: hosts 1, 2, 3, and 4, with two messages labeled "end 3435435!"]
Group Members: {}
Op ID:
Regular Ping
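The frames above can be replayed as a small simulation. This is only a sketch under stated assumptions: `run_demo`, `join`, `members`, and `active` are hypothetical names, the rule that a host joins the group when the thread of control reaches it is inferred from the growing "Group Members" display, and the slides do not say how the "end" messages are routed, so the sketch simply sends one to every member other than the initiator.

```python
def run_demo(op_id=3435435):
    """Replay the demo: host 1 opens a fail_block and calls reach hosts 2, 3, 4."""
    members = []   # "Group Members" as displayed on the slides
    active = {}    # host -> op ID that host currently takes part in
    trace = []     # snapshot of the group after each step

    def join(host):
        # Assumption: a host joins when the thread of control reaches it.
        members.append(host)
        active[host] = op_id
        trace.append(list(members))

    join(1)  # fail_block entered on host 1, which picks the op ID
    join(2)  # host2->foo()
    join(3)  # foo() calls host3->doWork(h1)
    join(4)  # host4->bar()

    # Leaving the fail_block: announce the end of the operation.
    # Assumption: one message per member other than the initiator.
    end_messages = [(host, f"end {op_id}!") for host in members if host != 1]
    for host, _ in end_messages:
        del active[host]   # receivers forget the operation
    del active[1]          # so does the initiator
    members.clear()
    trace.append(list(members))
    return trace, end_messages, active


if __name__ == "__main__":
    trace, end_messages, active = run_demo()
    for snapshot in trace:
        print("Group Members:", snapshot)
    for host, message in end_messages:
        print(f"to host {host}: {message}")
```

Once the block ends, no host holds any state for the operation, which matches the empty "Group Members" and "Op ID" fields on the final frame.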
![Page 58: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/58.jpg)
At a Macroscopic Level... (Video)
![Page 59: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/59.jpg)
‘Demo,’ Continued
...
doWork(HostAddr a) {
    myLeader = a;
    push_comp {
        if (myLeader == a)
            myLeader = null;
    }
}
...
[Diagram: hosts 1, 2, 3, and 4, annotated "Failure!"]
![Page 60: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/60.jpg)
‘Demo,’ Continued
...
doWork(HostAddr a) {
    myLeader = a;
    push_comp {
        if (myLeader == a)
            myLeader = null;
    }
}
...
[Diagram: hosts 1, 2, 3, and 4, annotated "... myLeader = null; ..."]
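One way to picture what push_comp does is as a per-operation list of suspended actions that run only if a failure is reported. The sketch below is a hypothetical Python rendering of the doWork example, not the authors' implementation: the `Catom` class, the `on_failure` and `on_end` hooks, keying the actions by op ID, and running them most recent first are all assumptions.

```python
class Catom:
    """Hypothetical host that records compensating actions per operation."""

    def __init__(self):
        self.my_leader = None
        self.compensations = {}  # op ID -> list of suspended actions

    def push_comp(self, op_id, action):
        # Suspend the action; it runs only if a failure is detected.
        self.compensations.setdefault(op_id, []).append(action)

    def do_work(self, op_id, leader):
        # Mirrors doWork(HostAddr a) from the slide.
        self.my_leader = leader

        def undo():
            # Clear the field only if it still names the same leader.
            if self.my_leader == leader:
                self.my_leader = None

        self.push_comp(op_id, undo)

    def on_failure(self, op_id):
        # Assumption: actions run most recent first.
        for action in reversed(self.compensations.pop(op_id, [])):
            action()

    def on_end(self, op_id):
        # Normal completion: drop the actions without running them.
        self.compensations.pop(op_id, None)


if __name__ == "__main__":
    catom = Catom()
    catom.do_work(9, "h1")
    catom.on_failure(9)
    print(catom.my_leader)
```

The guard inside `undo` matters: if another operation has already assigned a different leader, the compensating action leaves that newer value alone.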
![Page 61: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/61.jpg)
Outline
• The Rate of Failure Will be High
• Two Failure Scenarios We Would Like to Handle
• Existing RPC Systems Do Not Meet Our Needs
• Our Model Has Two Key Pieces
  • fail_block
  • push_comp
• Our Model Does Not Require Consistency
![Page 62: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/62.jpg)
Our System Does Not Require Consistency
• Our model has a nice feature:
  • We do not require consistency in failure detection!
  • Consistent failure detection has been proven to be impossible in ‘time-free’ systems.
![Page 63: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/63.jpg)
What is Consistency?
fail_block { (* host 1 *)
    host2->foo();
    host4->bar();
}
...
foo() {
    host3->doWork(h1);
}
[Diagram: hosts 1, 2, 3, and 4]
![Page 64: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/64.jpg)
What is Consistency?
fail_block { (* host 1 *)
    host2->foo();
    host4->bar();
}
...
foo() {
    host3->doWork(h1);
}
[Diagram: hosts 1, 2, 3, and 4, with messages "OID: 9, failure!", "OID: 9, end!", and "OID: 9, end!"]
![Page 65: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/65.jpg)
Our System Does Not Require Consistency
• Domain Assumption:
  • The ultimate goal of any application is to perform actuator movements.
• Additionally,
  • The thread of control must migrate to a catom in order to issue an actuator command.
  • If a thread migrates to a catom that has detected or knows about a failure, that thread will not continue normally.
![Page 66: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/66.jpg)
Our System Does Not Require Consistency
• Therefore, if inconsistency occurs, we know:
  • In between detection and fail_block completion, no actuator movements were necessary on any hosts that knew about the failure.
  • In the sense that actuator movements are the ultimate goal in the domain, their work was already done.
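The argument above rests on one rule: a thread that arrives at a host which knows about the failure does not continue normally. A minimal sketch of that rule follows, assuming a hypothetical `Host` class, a `learn_failure` hook, and an exception as the way the thread is stopped; none of these names come from the slides.

```python
class OperationFailed(Exception):
    """Raised when a thread arrives at a host that knows the operation failed."""


class Host:
    def __init__(self, name):
        self.name = name
        self.failed_ops = set()  # op IDs this host has detected or heard failed
        self.moves = []          # actuator commands actually carried out

    def learn_failure(self, op_id):
        self.failed_ops.add(op_id)

    def actuate(self, op_id, command):
        # The thread of control must migrate here to issue an actuator command.
        if op_id in self.failed_ops:
            raise OperationFailed(f"{self.name} knows operation {op_id} failed")
        self.moves.append(command)


if __name__ == "__main__":
    h2, h3 = Host("host2"), Host("host3")
    h2.learn_failure(9)         # only host 2 has heard about the failure
    h3.actuate(9, "rotate")     # host 3 is inconsistent with host 2, still moves
    try:
        h2.actuate(9, "rotate")
    except OperationFailed as err:
        print(err)
```

Hosts may disagree about whether operation 9 failed, but no actuator command for it can run on a host that already knows, which is all the argument needs.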
![Page 67: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/67.jpg)
Our System Does Not Require Consistency
• What if we won’t make an actuator movement on a host, but we need to know it performed its duty?
  • E.g., structural catoms
• This is a question of liveness versus other goals.
  • fail_block should be used precisely when liveness is a chief concern.
![Page 68: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/68.jpg)
Assumptions
• Movement:
  • When a movement occurs, a catom is required to talk with its surrounding hosts, and they will be able to figure out the new location to ping.
• Goals:
  • Actuator movements are the ultimate goal of most applications in this domain.
![Page 69: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/69.jpg)
What about Transactions?
• Semantics of roll-back suggest a transactional model.
• Similarly, it seems that two-phase commit (2PC) could give us consistency.
• But
  • 2PC has one or two extra rounds of communication
    • Application doesn’t make progress!
    • Our model has no extra blocking rounds.
  • 2PC can block indefinitely if the coordinator fails
    • In non-blocking protocols the number of failures is bounded.
    • It is not clear how error detection and 2PC could be combined.
![Page 70: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/70.jpg)
In Detail...
• But, after this field has been set, failure of the leader leaves the catoms in a dead state.
[Diagram: leader catom labeled L]
![Page 71: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/71.jpg)
The push_comp Primitive
• We call this suspended code ‘compensating actions.’
  • Borrowed terminology from Weimer and Necula.
  • Originally used to ensure proper clean-up for file handles, etc. in exceptional circumstances.
  • (However, our compensating actions are only executed when a failure is detected.)
![Page 72: A Programming model for failure-prone, Collaborative robots · SDIR07: A Programming Model for Failure-Prone Collaborative Robots 2 Failure Blocks: Increasing Application Liveness](https://reader034.vdocuments.us/reader034/viewer/2022052018/6031d51a7e4d545fb62d1586/html5/thumbnails/72.jpg)
Why Server-Side Failure Handling?
• The client may think the server has failed, when it hasn’t.
  • Allow server to return to a stable state.
  • Failure detectors unreliable in ‘time-free’ systems.
• The client may have failed.
• A catom on the return route may have failed.
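All three cases above leave the server unable to rely on the client to clean up, so the server has to decide on its own when to fall back to a stable state. The sketch below illustrates one possible shape for that decision. It assumes failure is suspected when regular pings stop arriving for longer than a fixed timeout; the `Server` class, the `timeout` parameter, and the explicit `now` arguments are all hypothetical, and the slides only indicate that hosts ping each other regularly.

```python
class Server:
    """Hypothetical server that compensates when pings stop arriving."""

    def __init__(self, timeout, now):
        self.timeout = timeout   # assumed threshold, in arbitrary time units
        self.last_ping = now
        self.my_leader = None    # state set on behalf of a client

    def begin_work(self, leader):
        self.my_leader = leader

    def ping(self, now):
        # Called whenever a regular ping for the operation arrives.
        self.last_ping = now

    def check(self, now):
        """Return True if a failure was suspected and state was restored."""
        if self.my_leader is not None and now - self.last_ping > self.timeout:
            self.my_leader = None   # compensating action: back to a stable state
            return True
        return False


if __name__ == "__main__":
    server = Server(timeout=3, now=0)
    server.begin_work("h1")
    server.ping(now=2)
    print(server.check(now=4))   # pings are recent enough, nothing happens
    print(server.check(now=6))   # silence exceeded the timeout, state is reset
```

Because the check runs locally, the server recovers whether the client crashed, a catom on the return route crashed, or the client wrongly gave up on the server.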