Recovery Oriented Programming
Olga Brukman and Shlomi Dolev Ben-Gurion University
Beer-Sheva, Israel
Towards Correct Software
• Software should respect its specifications
– Safety, Liveness
• Example: atomic power station
– Safety: the atomic station shouldn't explode
– Liveness: the atomic station should produce some electricity
Recovery Oriented Design
• Software performs substantially in accordance with specifications for a period of 90 days... (IEEE Computer, October 2006)
• How to cope with such software?!
– Recovery Oriented Computing [PBB'02]!
• Recovery actions
– Reboot, wait, reschedule
– Non-intrusive: avoid rewriting the program (which may introduce new bugs)
Recovery Oriented Programming
• Specifications Composer (Project Manager)
– Invariants and predicates: important properties on program IO
– Recovery actions
• Programmer
– Best-effort implementation
– Using same IO variables as specifier
– Still: bugs and unexpected states
Recovery Oriented Programming: Assumptions
• Self-stabilizing processor
• Self-stabilizing OS
• Infrastructure for robust monitoring and recovery
• Processes exist and execute their code
Recovery Oriented Programming: Assumptions
• Not immediately Byzantine
– Eventually Byzantine program
– Runs long enough to do a sufficient job
Our Framework
[Diagram: the pre-compiler combines the code, the recovery tuples, and the subsystems hierarchy; each resulting process carries event-driven monitoring and has an External Monitor, and the subsystem has a Subsystem External Monitor.]
System is able to recover from any state
Generated Code: One Process
[Diagram: code and recovery tuples yield one process with event-driven monitoring and an External Monitor.]
Generated Code: Subsystem
[Diagram: code of three processes, recovery tuples, and the subsystems hierarchy; each process has event-driven monitoring and an External Monitor, and the subsystem has a Subsystem External Monitor.]
Our Framework: Transforming Recovery Tuples into Code
[Diagram: the pre-compiler transforms the code, the recovery tuples, and the subsystems hierarchy into processes with event-driven monitoring, per-process External Monitors, and a Subsystem External Monitor.]
Safety Recovery Tuple
Original code (1 process):
  ...
  x=a;
  ...
Recovery tuple:
  PRED: x!=7
  RA: this.restart()
Pre-compiler output:
  temp_x=a;
  if temp_x!=7 x=temp_x;
  else this.restart();
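The transformation on this slide can be sketched in Python. This is only an illustration, not the authors' pre-compiler output: the `Process` class, the `assign_x` and `pred` helpers, and the choice that `restart()` resets `x` to 0 are all assumptions made for the example.

```python
class Process:
    """Hypothetical process with one monitored variable x."""

    def __init__(self):
        self.x = 0
        self.restarts = 0

    def restart(self):
        # Recovery action (RA). Assumed here to reset x to its initial value.
        self.x = 0
        self.restarts += 1

    def pred(self, candidate):
        # PRED from the recovery tuple: x != 7
        return candidate != 7

    def assign_x(self, a):
        # Generated form of `x = a`: evaluate the predicate on a temporary
        # and commit the assignment only if the predicate holds.
        temp_x = a
        if self.pred(temp_x):
            self.x = temp_x
        else:
            self.restart()


p = Process()
p.assign_x(3)  # predicate holds, so the assignment goes through
p.assign_x(7)  # predicate would be violated, so the recovery action runs
print(p.x, p.restarts)
```

Checking the temporary before committing means the unsafe value never becomes visible in `x`.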
Safety Recovery Tuple in the Scope of Stabilization: External Monitoring
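Original code (1 process):
  ...
  x=a;
  ...
Recovery tuple:
  PRED: x!=7
  RA: this.restart();
Pre-compiler output (event-driven monitoring):
  temp_x=a;
  if temp_x!=7 x=temp_x;
  else this.restart();
  ...
  (No more x=...)
Pre-compiler output (external monitor):
  if !(ps.x!=7) ps.restart();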
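A minimal Python sketch of the external check, under the assumption that the monitor can read the process state directly. The names `ExternalMonitor`, `check`, and the resetting behaviour of `restart()` are hypothetical; only the tested condition `!(ps.x!=7)` comes from the slide.

```python
class Process:
    """Hypothetical monitored process."""

    def __init__(self):
        self.x = 0
        self.restarts = 0

    def restart(self):
        # Recovery action. Assumed to reset x to its initial value.
        self.x = 0
        self.restarts += 1


class ExternalMonitor:
    """Checks the predicate from outside the process on each scheduled run."""

    def __init__(self, ps, pred):
        self.ps = ps
        self.pred = pred

    def check(self):
        # Mirrors `if !(ps.x!=7) ps.restart();`
        if not self.pred(self.ps):
            self.ps.restart()
            return True
        return False


ps = Process()
monitor = ExternalMonitor(ps, lambda p: p.x != 7)

# Simulate a state that violates the predicate without passing through the
# guarded assignment, e.g. a corrupted variable with no further `x = ...`.
ps.x = 7
recovered = monitor.check()
```

The external check does not depend on the process ever reaching another assignment to `x`, which is the case the event-driven check alone cannot cover.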
Liveness Recovery Tuple
Original code (1 process):
  x=x+2;
  ...
  y=y+5;
  ...
Recovery tuple:
  INV: eventually x+y=15
  RA: this.restart()
  HTR: history={}
Pre-compiler output (event-driven monitoring):
  x=x+2;
  if (x+y==15) this.history={};
  ...
  y=y+5;
  if (x+y==15) this.history={};
Pre-compiler output (external monitor):
  History= [ ... {.., x=1,y=2,..}, {.., x=3,y=7,..},...]
  history=history▪this.state();
  if loop in history and CPU(this) ps.restart();
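The history-based liveness check can be sketched as follows. Several details are assumptions, not taken from the slides: a "loop in history" is modelled as the current state already appearing in the history, `CPU(this)` is modelled as a boolean `had_cpu` argument, and the process body is a deliberately buggy variant (with wrap-around arithmetic) that never reaches `x+y==15`.

```python
class Process:
    """Hypothetical process whose liveness goal is: eventually x + y == 15."""

    def __init__(self):
        self.x, self.y = 0, 0
        self.history = []
        self.restarts = 0

    def state(self):
        return (self.x, self.y)

    def goal(self):
        return self.x + self.y == 15

    def restart(self):
        # Recovery action. Assumed to reset the variables and the history.
        self.x, self.y = 0, 0
        self.history = []
        self.restarts += 1

    def step(self):
        # Buggy body: the values wrap around, so the goal is never reached.
        self.x = (self.x + 2) % 4
        if self.goal():
            self.history = []  # goal reached: clear the history
        self.y = (self.y + 5) % 10
        if self.goal():
            self.history = []


def monitor_check(ps, had_cpu=True):
    """Scheduled external check: record the state, restart on a repeated state."""
    snapshot = ps.state()
    loop = snapshot in ps.history
    ps.history.append(snapshot)
    if loop and had_cpu:
        ps.restart()
        return True
    return False


ps = Process()
fired = [False, False, False]
for i in range(3):
    ps.step()
    fired[i] = monitor_check(ps)
```

In this run the states after each step are (2, 5), (0, 0), and (2, 5) again, so the third check finds a repeated state and triggers the restart.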
Generated Monitoring Code for Subsystem
Pre-compiler inputs: Code for p1, Code for p2
Recovery Tuples: sub: p1, p2
External monitor for sub: History= [ ... distributed snapshot(sub),...]
[Diagram: p1 and p2 each have event-driven monitoring and an External Monitor.]
Generic Correctness Theorem
• In the program produced by the pre-compiler, every rsf (restart supporting fair) execution E has a suffix in which the program respects its specification function
– An rsf-execution is an execution in which the system is trusted to behave according to its specifications after a restart.
Generic Correctness Proof
• Assumption: processes and external monitors are scheduled fairly due to the presence of a self-stabilizing software platform
• Safety: a process either reaches the monitoring section in its code or its external monitor makes a scheduled check
– Subsystem: the external monitor makes a scheduled check
Generic Correctness Proof Cont.
• Liveness: the external monitor of the process (subsystem) makes a scheduled check of the history log
• Corrupted history:
– If it causes an (unnecessary) recovery, it is trimmed
– New correct records are eventually accumulated and reflect the real state of the system
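The corrupted-history argument can be illustrated with a small Python sketch. It assumes, as a simplification not stated in the slides, that a loop is detected when the current state already appears in the history and that a restart empties the history. `MonitoredProcess` and `scheduled_check` are hypothetical names.

```python
class MonitoredProcess:
    """Hypothetical process that always makes progress (no real loop)."""

    def __init__(self):
        self.counter = 0
        self.history = []
        self.restarts = 0

    def state(self):
        return self.counter

    def step(self):
        self.counter += 1

    def restart(self):
        # Recovery action. Assumed to reset the state and trim the history.
        self.counter = 0
        self.history = []
        self.restarts += 1


def scheduled_check(ps):
    """Restart on a repeated state, otherwise record the current state."""
    snapshot = ps.state()
    if snapshot in ps.history:
        ps.restart()
    else:
        ps.history.append(snapshot)


ps = MonitoredProcess()
ps.history = [1, 2, 3]  # corrupted records that were never actually observed
for _ in range(5):
    ps.step()
    scheduled_check(ps)
```

The bogus records cause exactly one unnecessary restart, which discards them. The history then fills with records of states the process really visited.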
Related Work: Perfect Software
• Formal specification languages
– ASM [GRS'04], IO Automata [L'96], Nuprl [CKB'84]
– Gradually and manually translated into a fully verified program
• Model checking
– Doesn't scale
• Specification-embedding programming languages
– SCR (Software Cost Reduction) language [RLHL'06]
– Programmer bugs
Related Work: Programming Tools
• Design By Contract
– Eiffel, iContract for Java
– Checking invariants on an object state, pre-/post-conditions on object methods, recovery by predefined recovery action
– Partial monitoring of liveness, based on timeout
– Monitoring of safety outside of stabilization scope
• Exceptions
– Suitable for single process only
– Impractical for changing the program flow
Related Work: Online Recovery
• Recovery blocks (N-version programming) [RX94]
• ROC [PBB02], Java MOP [CR'05], Kinesthetics eXtreme [KPGV'03], "On Modeling and Tolerating Incorrect Software" [AT'03]
• Monitoring/correcting layer that alters the failed component's behaviour
Related Work: Online Recovery
• Assumption of monitoring/correcting layer stability
– ROC [PBB02], Java MOP [CR'05], Kinesthetics eXtreme [KPGV'03]
• Intrusive correcting actions
– Empty program: correcting actions define the program
• "On Modeling and Tolerating Incorrect Software" [AT'03]
Conclusions
• Recovery Oriented Programming paradigm for a programming language
• Full monitoring of safety and liveness properties in the scope of stabilization
• Formal correctness proof scheme for the resulting code