Keynote SSS‘08
Distributed Algorithms and VLSI
Ulrich Schmid, Vienna University of Technology
Keynote SSS'08 U. Schmid 2
Content
Short introduction to Very Large Scale Integration (VLSI): A photo gallery …
– Great perspectives
– But …
VLSI Circuits ↔ Distributed Algorithms
– DAs and VLSI: Do's and Don'ts
Do's – an Example: DARTS Fault-tolerant Clocks
– Starting point: A simple distributed algorithm
– How to implement it in VLSI?
– Proofs
– [Under the rug: Metastability …]
Short introduction to VLSI: A photo gallery …
VLSI Circuits
Major Ingredients
Transistors (nMOS): [Figure: cross-section of an nMOS transistor – polysilicon gate over an SiO2 insulator, n-type source and drain diffusions in a p substrate, channel of length L and width W]
Interconnect (wires): form & connect gates (e.g., an inverter)
Miniaturization: Moore's Law
Intel 4004 (1971): 2,250 transistors; 12 mm² / 10 µm; 0.74 MHz, 1 W
Intel P4 (2001): 42,000,000 transistors; 217 mm² / 0.180 µm = 180 nm; 2 GHz, 50 W
Multicore Processors
IBM POWER4 (dual-core)
IBM Cell (8-core)
Tilera TILE64
Today: < 45 nm
Systems-on-Chip (SoC)
Assemble whole SoC from suitable components
Market for "IP cores", from different vendors
Sync/async interfaces
Nvidia Tegra
Great perspectives for VLSI circuits.
But …
Manufacturing Limitations
VLSI Lab, Politecnico di Torino
Optical Proximity Correction, Intel Corp.
Defects (Electromigration)
P. Gutman, IBM T.J.Watson Research Center
M. Ohring, Reliability and Failure of Electronic Materials and Devices, 1998; ASM Corp. Shanghai
Whiskers, Hillock, Void
Defects (Gate Oxide BD)
K.-L. Pey, C.-H. Tung, Physical characterization of breakdown in metal-oxide-semiconductor transistors
Breakdown−induced thermochemical reactions in (a) poly−Si gate and (b) p−Si substrate of n−channel MOSFETs.
Semitracks, Inc.
ESD-induced gate oxide breakdown (www.siliconfareast.com)
Power Dissipation Problems
A. Choudhary, UMass; small transistor dissipating 5 mW in an SOI wafer (University of Bolton)
→ Reduce supply voltage!
Radiation-induced Soft Errors
SLAC National Accelerator Lab, Stanford
[Figure: SET and SEU mechanisms; relative soft error rate vs. altitude, 0–10 km, spanning three orders of magnitude (1 down to 10⁻³); Powell, 1959]
Soft error rates dominate in VLSI!
Slow Signal Propagation
Transistors switch faster, BUT:
– Wires get thinner
– Less transistor driving strength
– RC signal propagation along wires dominates circuit speed
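Why wires dominate can be made concrete with the classic Elmore delay model. The sketch below is an illustration under assumed unit parameters (not from the talk): a wire is modeled as a ladder of N identical RC segments, and its total delay grows roughly with the square of its length.

```python
# Illustrative sketch: Elmore delay of a wire modeled as a chain of N
# identical RC segments. Delay grows roughly quadratically with wire
# length, which is why thin (high-resistance) wires, not transistors,
# dominate deep-submicron circuit speed.

def elmore_delay(r_seg, c_seg, n_segments):
    """Elmore delay of an RC ladder: sum over nodes of
    (total upstream resistance) * (node capacitance)."""
    return sum(r_seg * i * c_seg for i in range(1, n_segments + 1))

# Doubling the wire length (number of segments) roughly quadruples the delay:
d1 = elmore_delay(r_seg=1.0, c_seg=1.0, n_segments=100)
d2 = elmore_delay(r_seg=1.0, c_seg=1.0, n_segments=200)
print(d2 / d1)  # ≈ 3.98
```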
Clock Distribution Problem
Circuit & physical design of the POWER4 microprocessor, IBM J. Res. Dev.
Cell processor
Synchronous design paradigm: a common clock signal CLK, distributed with propagation delay tPD,CLK, drives flip-flops FF1, FF2, …, FFk, …, FFm; the combinational logic (gates) between the flip-flops must settle within the data path delays tdly,DATA,1m, tdly,DATA,2m, …, tdly,DATA,km.
→ Synchronous abstraction increasingly difficult to maintain!
Hence, deep submicron VLSI circuits …
… are in fact FT Distributed Systems
Spatial distribution
Message-passing communication
Massive concurrency
Asynchrony
Failures
Security issues (IP cores!)
Worthwhile undertaking: explore the applicability of DA results & approaches to VLSI circuits …
Applying DA Research in VLSI ?
2008 Dagstuhl-Seminar Distributed Algorithms in VLSI Chips (B. Charron-Bost, J. Ebergen, S. Dolev, U. Schmid, http://www.dagstuhl.de/08371)
[Great place for such undertakings …]
DA and VLSI – Don'ts
Apply standard DAs in the VLSI context – too heavy-weight in terms of computation & communication
Apply standard replication-based FT (for coping with "classic" VLSI faults) – too heavy-weight in terms of power & area penalties
BUT …
DA and VLSI – Do's (I)
Apply "light-weight" DAs for decentralized handling of [nowadays centralized] functions, e.g. in large multicores
– Memory access scheduling (Moscibroda & Mutlu, PODC'08)
– Apply self-stabilizing algorithms for handling transient failures (S. Dolev & Haviv, IEEE ToC, 2006)
– Fault-tolerant clock generation in SoCs (Függer, Schmid, Fuchs, Kempf, EDCC'06)
Apply replication-based FT to cope with malicious failures in VLSI
– IP core security threats in SoCs
– Inconsistently propagated errors in high-dependability applications
Tilera TILE64
DA and VLSI – Do's (II)
Apply VLSI results & approaches in DA research
– Error-correcting codes and asynchronous consensus (Friedman, Mostefaoui, Rajsbaum & Raynal, IEEE ToC, 2007)
– Corruption-resilient codes (S. Dolev & Tzachar, DISC'08)
Extend DA approaches, to contribute to a (still lacking!) "Theory of Dependable VLSI Circuits"
– Early example: Arbiter problem (Lamport, ~1980)
– Handle massive concurrency (continuously computing gates!)
– Handle computation and communication resource restrictions
– Handle "non-closed" specifications
– Define suitable failure models
Do’s – an Example: DARTS Fault-tolerant Clocks
DARTS – Distributed Algorithms for Robust Tick Synchronization
Joint work with A. Steininger, M. Függer, G. Fuchs [and many others]
http://ti.tuwien.ac.at/ecs/research/projects/darts
Clocking in SoCs (I)
Classic synchronous paradigm
Concept: common notion of time for entire chip
Method: single quartz oscillator; global, phase-accurate clock tree
Disadvantages:
- Cumbersome clock tree design
- High power consumption
- Clock is single point of failure!
[Figure: SoC with DSP, WLAN, Video, GPRS, GPS blocks under one clock tree]
Clocking in SoCs (II)
Alternative: DARTS clocks
Concept: multiple synchronized tick generators
Method: distributed FT tick generation algorithms (TG algs), interacting via a dedicated clock network (TG net)
Advantages:
- No quartz oscillator(s)
- No critical clock tree
- Clock is no single point of failure!
- Reasonable synchrony
[Figure: the same SoC blocks (DSP, WLAN, Video, GPRS, GPS), each with its own tick generator]
Reasonable Synchrony ?
Phase synchronization
Clock synchronization
- max precision, - min/max frequency
Tick synchronization
Starting point: A Distributed Algorithm
On booting do:
  send tick(0) to all; C := 0; /* C is last tick number sent */
Continuously do:
  If received tick(C) from all n processes:
    send tick(C+1) to all; C := C+1;
Failure-free case (f = 0): simple barrier synchronization.
Replacing "all n" by "n − f different processes" gives the (modified) Srikanth & Toueg algorithm. Failure case f > 0?
A Distributed Algorithm (I)
On booting do:
  send tick(0) to all; C := 0; /* C is last tick number sent */
Continuously do:
  If received tick(X) from f + 1 different processes and X > C:
    send tick(C+1), …, tick(X) to all [once]; C := X;
  If received tick(C) from n − f different processes:
    send tick(C+1) to all [once]; C := C+1;
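The two rules above can be exercised in a minimal simulation. This sketch makes simplifying assumptions not in the slides: lock-step delivery rounds, a single silent faulty process, and cumulative ticks (having sent tick(X) means all lower ticks were sent), so "received tick(X) from f+1 processes" becomes "f+1 processes have reached X".

```python
# Minimal sketch of the tick generation rules above, n = 4, f = 1,
# with one faulty process that stays silent. Assumes lock-step
# message delivery (one round per step) for simplicity.

n, f = 4, 1
C = [0] * n            # last tick number sent by each process
byzantine = {3}        # process 3 sends nothing

def step():
    """One delivery round: everyone sees the ticks sent so far."""
    seen = [C[q] for q in range(n) if q not in byzantine]
    for p in range(n):
        if p in byzantine:
            continue
        # f+1 rule: f+1 processes already reached some X > C[p] -> catch up.
        X = sorted(seen, reverse=True)[f]   # the (f+1)-largest tick seen
        if X > C[p]:
            C[p] = X
        # n-f rule: n-f processes reached C[p] -> advance by one.
        if sum(1 for x in seen if x >= C[p]) >= n - f:
            C[p] += 1

for _ in range(10):
    step()

# Correct processes keep ticking despite the faulty one, and stay
# within one tick of each other:
correct = [C[p] for p in range(n) if p not in byzantine]
print(correct, max(correct) - min(correct) <= 1)
```

With fully synchronous delivery the correct processes advance in lock step; the f+1 rule only fires when delays let some processes run ahead.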
A Distributed Algorithm (III)
For n ≥ 3f + 1 and up to f Byzantine failures, with end-to-end delays ∈ [d, d+ε]:
Suppose process p sends tick(C+1) at time t. Then process q also sends tick(C+1) by time t + d + 2ε.
⇒ Clock ticks occur approximately synchronously
[Figure: timing argument – p sent at t because n − f ≥ 2f + 1 processes, hence at least f + 1 correct ones, sent tick(C); any q' sends by t + ε (≤ ε), and any q sends tick(C+1) by t + d + 2ε (≤ d + ε later)]
How to implement this DA in VLSI ?
Mind: We don’t have any clock available for a synchronous implementation …
Asynchronous Basic Circuits
[Figure: gate symbols with inputs a, b and output y; the Muller C-gate's internal feedback loop (delay tloop) and output stage (delay tprop)]
Muller C-gate truth table:
  a b | y
  0 0 | 0
  0 1 | yold
  1 0 | yold
  1 1 | 1
AND, OR, …; Muller C-Gate:
- Continuously computes y = y(a,b) [with delay tprop]
- AND gate for signal transitions (⇒ barrier synchronization)
- Note: Inevitably involves feedback loop [tloop]
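The C-gate's behavior can be sketched in a few lines; the class below is an illustrative functional model (delays tprop and tloop are abstracted away): the output follows the inputs only when they agree and otherwise holds its previous value, which is exactly what makes it an AND for signal transitions.

```python
# Functional sketch of the Muller C-element truth table above:
# output follows the inputs when they agree, otherwise holds y_old
# (the feedback loop of the slide). Delays are abstracted away.

class MullerC:
    def __init__(self, y=0):
        self.y = y          # stored output (the feedback loop)

    def eval(self, a, b):
        if a == b:          # inputs agree -> output follows them
            self.y = a
        return self.y       # otherwise keep y_old

c = MullerC()
outs = [c.eval(a, b) for a, b in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]]
print(outs)  # [0, 0, 1, 1, 0]
```

The output only rises once both inputs have risen, and only falls once both have fallen: barrier synchronization on transitions.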
Asynchronous Communication
Convey alternating up/down signal transitions only ⇒ FIFO "zero-bit message" channels [with delay]
k-bit data transmission costly: additional circuitry + performance penalty (serial data transmission) or additional wires (parallel data transmission)
[Figure: sender and receiver connected by k-bit signal wires]
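The "zero-bit message" idea can be sketched as follows; this is an illustrative model (class and method names are hypothetical): each message is just one up- or down-transition on a single wire, propagated FIFO, and the receiver simply counts edges.

```python
# Sketch of a zero-bit message channel: only alternating up/down
# transitions travel on the wire (FIFO, with delay); the receiver
# counts arriving edges instead of decoding k-bit tick numbers.

class TransitionChannel:
    def __init__(self):
        self.level = 0       # current sender-side wire level
        self.fifo = []       # in-flight transitions (FIFO delay)
        self.received = 0    # ticks counted at the receiver

    def send_tick(self):
        self.level ^= 1      # one tick = one transition, zero data bits
        self.fifo.append(self.level)

    def deliver_one(self):
        self.fifo.pop(0)     # a transition arrives ...
        self.received += 1   # ... and the receiver counts it

ch = TransitionChannel()
for _ in range(3):
    ch.send_tick()
ch.deliver_one(); ch.deliver_one()
print(ch.received, len(ch.fifo))  # 2 1
```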
Major Challenges
If received tick(X) from f + 1 processes and X > C:
  send tick(C+1), …, tick(X) to all [once]; C := X
If received tick(C) from n − f processes:
  send tick(C+1) to all [once]; C := C+1
Challenges and how they are met:
– k-bit message, k unbounded → to be replaced by zero-bit messages; k kept at receiver
– Atomicity of actions → to be ensured by architecture + path delay constraints
– Threshold comparison → build suitable threshold circuits
k-bit → Zero-bit Messages
[Figure: Counter Module – Remote Pipe and Local Pipe of Muller C-elements fed by Rremote,in and LocalClk, Diff-Gate, and Pipe Compare Signal Generation producing GEQe, GRe, GEQo, GRo; six TG algs 1…6 connected by the TG net]
TG net feeds every clock signal to every TG alg (bus of width n)
At every TG alg, n − 1 Counter Modules [one per remote TG alg] maintain tick numbers
Anonymous ticks ⇒ rules only distinguish
– r_rem > r_loc (f + 1, GR rule)
– r_rem ≥ r_loc (n − f, GEQ rule)
Asynchronous up/down-counter
Move tick number maintenance from sender to receiver
Asynchron. Up/Down Counter
[Figure: the Counter Module again – Remote Pipe, Local Pipe, Diff-Gate, and Pipe Compare Signal Generation with status outputs GEQe, GRe, GEQo, GRo]
Ingredients:
– Two elastic pipelines (= FIFO buffers for signal transitions) count remote and local clock ticks
– Common transitions removed by Diff-Gate
– GR and GEQ status signals derived from last stages
Metastability-free by construction [well, almost …]
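Functionally, a Counter Module tracks the difference between remote and local tick counts. The sketch below is an illustrative abstraction (class and signal names are hypothetical): an integer stands in for the two elastic pipelines, and the Diff-Gate's removal of common transitions is what keeps this difference, and hence the pipelines, bounded.

```python
# Sketch of one Counter Module: an integer tracks
# (remote ticks seen) - (local ticks seen) after Diff-Gate
# cancellation; GR and GEQ are the status signals consumed
# by the f+1 and n-f rules, respectively.

class CounterModule:
    def __init__(self):
        self.diff = 0            # r_rem - r_loc

    def remote_tick(self):       # transition on the remote clock input
        self.diff += 1

    def local_tick(self):        # transition on the local clock input
        self.diff -= 1

    @property
    def GR(self):                # r_rem >  r_loc  (feeds the f+1 rule)
        return self.diff > 0

    @property
    def GEQ(self):               # r_rem >= r_loc  (feeds the n-f rule)
        return self.diff >= 0

cm = CounterModule()
cm.remote_tick(); cm.remote_tick(); cm.local_tick()
print(cm.GR, cm.GEQ, cm.diff)  # True True 1
```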
Atomicity of Actions
The gates making up the f + 1 and the n − f rule compute continuously and concurrently, hence
– may both produce tick(k), for the same k
– this must be circumvented by all means ["once"]
How to ensure this atomicity?
– Use separate circuitry for generating up-transitions (odd k) and down-transitions (even k) → tick(k−1) and tick(k) never mixed up
– Ensure that the ratio of the maximum and minimum delay along certain paths is bounded (cp. Θ-Model [WLS05], ABC Model [RS08]) → tick(k−2) and tick(k) never mixed up
Threshold Modules
[Figure: GR and GEQ outputs of the n − 1 Counter Modules feeding the threshold gates]
GR and GEQ status signals of the n − 1 Counter Modules fed into f + 1 and n − f threshold gates
Back-transition from status signals to transition signalling for generating tick(k)
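The threshold logic can be sketched functionally as follows. This is an illustrative model, not the circuit: `threshold_fire` is a hypothetical name, and the assumption that the local clock itself counts toward the n−f rule (so n−f−1 remote GEQ signals suffice) is mine.

```python
# Sketch of the Threshold Module: combine the GR / GEQ status signals
# of the n-1 Counter Modules and fire a new local tick when either
# rule of the algorithm holds.

def threshold_fire(gr_signals, geq_signals, n, f):
    """True iff the f+1 (GR) or n-f (GEQ) threshold is met.
    Each list holds n-1 booleans, one per remote Counter Module;
    the local clock is assumed to count toward the n-f rule."""
    if sum(gr_signals) >= f + 1:
        return True                  # f+1 rule: catch up
    if sum(geq_signals) >= n - f - 1:
        return True                  # n-f rule: advance by one
    return False

# n = 4, f = 1: two remote modules being ahead already force a tick.
print(threshold_fire([True, True, False], [True, True, False], n=4, f=1))
```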
Proofs
Proofs & Implementations (SW)
[Diagram: specification ←proof← model (alg + sys) ←abstraction← implementation (SW)]
– Specification: tick sync, n TG algs, f Byz. → max precision, min/max frequency (tick-synced FT clocks)
– Model: distributed state machine, Byzantine failures (the tick generation algorithm above)
– Implementation (SW): executable machine code, real system (e.g., TTP implementation)
Proof goals:
– Prove that the model meets the specification
– Minimize "proof gap" between model and implementation
Proofs & Implementations (HW)
[Diagram: the same layering for hardware – specification ←proof← model (alg + sys) ←abstraction← implementation, now partitioned into SW and HW; partitioning & constraints and HW capabilities link model and implementation]
Hierarchical Proof
Specification of low-level building blocks:
– Up/down ticks correctly simulate tick(k)
– Synchronization properties
– Bounded precision & frequency
– Bounded space (pipeline)
[Proof structure: tick-up/down –interlocking proof→ tick(k), tick(k+1), … –(P), (U), (S)→ Precision & Frequency, Bounded space]
Interlocking Proof – "[once]"
On booting:
  send tick(0) to all; C := 0;
If got tick(X) from f + 1 procs and X > C:
  send tick(C+1), …, tick(X) to all [once]; C := X;
If got tick(C) from n − f processes:
  send tick(C+1) to all [once]; C := C+1;
[Figure: tick-up/down traces around ticks k−2, k, k+1, showing how the interlocking proof rules out a second, spurious tick(k)]
Higher-Level Properties
(P) Progress. If all correct nodes send tick(k) by time t, then every correct node sends at least tick(k+1) by t + T+.
(U) Unforgeability. If no correct node sends tick(k) by time t, then no correct node sends tick(k+1) by t + T-first.
(S) Simultaneity. If some correct node sends tick(k) by time t, then every correct process sends at least tick(k) by t + T-first.
and, on top of those:
– Precision & Frequency
– Bounded pipeline size
Prove elementary synchronization properties
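As a flavor of what such properties mean operationally, the sketch below checks a simplified, time-bound-free version of Unforgeability on a recorded trace of send events (the function name and trace format are hypothetical, and the T-first bound is deliberately ignored): no tick(k+1) may ever be sent before some correct node sent tick(k).

```python
# Sketch: checking a simplified Unforgeability (U) on a trace of
# (time, node, k) send events of correct nodes -- tick(k+1) must not
# appear before tick(k) has been sent by someone.

def unforgeable(trace):
    first = {}                       # earliest send time of each tick k
    for t, node, k in trace:
        first[k] = min(first.get(k, t), t)
    return all(k - 1 in first and first[k - 1] <= first[k]
               for k in first if k > 0)

good = [(0.0, "p", 0), (0.1, "q", 0), (1.0, "p", 1)]
bad = [(0.0, "p", 0), (0.5, "q", 2)]   # tick(2) without any tick(1)
print(unforgeable(good), unforgeable(bad))  # True False
```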
Complete Suite of Proofs
[EDCC’06]
[Figure: complete implementation of node p – pipelines 1 … 3f+1, each with a Remote/External Pipe, a Local Pipe, a Diff-Gate, and Pipe Compare Signal Generators built from Muller C-elements; their GEQe, GRe, GEQo, GRo status signals feed threshold logic (gates with thresholds 2f+1 and f+1) that generates clk_out, which is fed back via clk_in and req_ext/req_int, ack_ext/ack_int handshake signals]
Complete Implementation
Implementation of the model only needs to
– implement the low-level building blocks as specified
– ensure the additional delay ratio bounds for the interlocking proof (place & route constraints)
[DFT'06]
DARTS - Lessons Learned
Fault-tolerant distributed algorithms are indeed applicable in the VLSI context, but need "down-sizing"
Distributed computing models with bounded delay ratio (Θ-Model, ABC model) are well-suited for the VLSI context (technology migration, re-use of models, etc.)
A sole transition-logic approach is not sufficient for fault-tolerance ⇒ need a model that integrates event and state representation
Time-free models suffer from a large "proof gap" ⇒ need a model incorporating (continuous) time
Failures raise new metastability concerns ⇒ MS needs further investigation
Under the rug: Metastability …
[Stolen from Dagstuhl presentation of A. Steininger …]
Metastability
[Figure: transfer characteristics of two cross-coupled inverters Inv 1 and Inv 2, plotted as ui,2 = uo,1 against ui,1 = uo,2; the curves intersect in two stable points – stable (HI), stable (LO) – and one metastable point]
Bistable element (memory cell) with positive feedback
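The metastable point can be illustrated numerically. The sketch below uses an assumed tanh-shaped inverter characteristic (a modeling choice, not from the talk) and iterates the cross-coupled pair as the map u → inv(inv(u)): starting exactly on the metastable point, the state never resolves; starting slightly off, it diverges exponentially toward a stable rail.

```python
# Illustrative sketch (assumed inverter model): two cross-coupled
# inverters iterated as u -> inv(inv(u)). The midpoint is an unstable
# fixed point -- exactly on it the state stays metastable forever,
# slightly off it resolves to a stable rail.
import math

def inv(u, gain=4.0):
    """Smooth inverter transfer characteristic, output in [0, 1]."""
    return 0.5 * (1.0 - math.tanh(gain * (u - 0.5)))

def settle(u, steps=40):
    for _ in range(steps):
        u = inv(inv(u))
    return u

print(settle(0.5))    # 0.5 -- stuck at the metastable point
print(settle(0.501))  # well above 0.9 -- resolved toward the HI rail
```

No finite number of iterations helps for inputs arbitrarily close to 0.5, which is the circuit-level core of the arbiter problem.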
Revisit Muller C-Element
[Figure: Muller C-gate with inputs a, b and output y; waveforms of a, x, y under (i) a pure delay at gate and interconnect – normal operation – and (ii) a limited output slope – oscillation and creeping]
Error Containment
[Figure: three nodes p, q, r, each with a TG and a ThM, exchanging counter signals count pq, count pr, count qp, count qr, count rp, count rq across an error-containment wall]
According to our proofs the wall holds – but we ignored metastability!
The Counter Module
[Figure: the Counter Module within the three-node picture – Remote/Local Pipe of Muller C-elements, Diff-Gate, Pipe Compare Signal Generation]
Purely combinational logic: won't hurt – BUT won't help
Muller C-Element: a metastable input may pass through!
The Threshold Module
[Figure: the Threshold Module within the three-node picture]
Threshold Module: purely combinational logic ⇒ will not create a metastability problem
BUT: will propagate metastability while being near the threshold
NO masking, NO protection
Metastability Containment ?
[Figure: nodes p, q, r once more – where can metastability be contained?]
The End … © 2007, WDR
Some References
[Bau05] R. Baumann. Radiation-induced soft errors in advanced semiconductor technologies. IEEE Transactions on Device and Materials Reliability, 5(3):305--316, Sept. 2005.
[BJ83] J. C. Barros and B. W. Johnson. Equivalence of the arbiter, the synchronizer, the latch, and the inertial delay. IEEE Trans. Comput., 32(7):603--614, 1983.
[BZMLCLD02] R. Bhamidipati, A. Zaidi, S. Makineni, K. Low, R. Chen, K.-Y. Liu, and J. Dalgrehn. Challenges and methodologies for implementing high-performance network processors. Intel Technology Journal, 6(3):83--92, Aug. 2002.
[BY07] A. Bink and R. York. ARM996HS, the first licensable, clockless 32-bit processor core. IEEE Micro, 25(2):58--68, Feb. 2007.
[Bor05] S. Borkar. Designing reliable systems from unreliable components: the challenges of transistor variability and degradation. IEEE Micro, 25(6):10--16, Nov. 2005.
[Cha84] D. M. Chapiro. Globally-Asynchronous Locally-Synchronous Systems. PhD thesis, Stanford University, Oct. 1984.
[Con03] C. Constantinescu. Trends and challenges in VLSI circuit reliability. IEEE Micro, 23(4):14--19, July 2003.
[DH06a] S. Dolev and Y. Haviv. Self-stabilizing microprocessors, analyzing and overcoming soft-errors. IEEE Transactions on Computers, 55(4):385--399, Apr. 2006.
[Dol00] S. Dolev. Self-Stabilization. MIT Press, 2000.
[DR98] C. Dyer and D. Rodgers. Effects on spacecraft & aircraft electronics. In Proceedings ESA Workshop on Space Weather, ESA WPP-155, pages 17--27, Noordwijk, The Netherlands, Nov. 1998. ESA.
[DT08] S. Dolev and N. Tzachar. Brief announcement: Corruption resilient fountain codes. In DISC, pages 502--503, 2008.
[FFSK06:DFT] M. Ferringer, G. Fuchs, A. Steininger, and G. Kempf. VLSI implementation of a fault-tolerant distributed clock generation. In IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT 2006), pages 563--571, Oct. 2006.
[FMRR07] R. Friedman, A. Mostefaoui, S. Rajsbaum, and M. Raynal. Asynchronous agreement and its relation with error-correcting codes. IEEE Trans. Comput., 56(7):865--875, 2007.
[Fri01] E. G. Friedman. Clock distribution networks in synchronous digital integrated circuits. Proceedings of the IEEE, 89(5):665--692, May 2001.
[FSFK06] M. Fuegger, U. Schmid, G. Fuchs, and G. Kempf. Fault-tolerant distributed clock generation in VLSI systems-on-chip. In Proceedings of the Sixth European Dependable Computing Conference (EDCC-6), pages 87--96. IEEE Computer Society Press, Oct. 2006.
[ITRS05] International technology roadmap for semiconductors, 2005.
[KHP04] T. Karnik, P. Hazucha, and J. Patel. Characterization of soft errors caused by single-event upsets in CMOS processes. IEEE Transactions on Dependable and Secure Computing, 1(2):128--143, April--June 2004.
[KK98] I. Koren and Z. Koren. Defect tolerance in VLSI circuits: Techniques and yield analysis. Proceedings of the IEEE, 86(9):1819--1838, Sept. 1998.
[Lam84] L. Lamport. Buridan's principle. SRI Technical Report, 1984.
[Lam03] L. Lamport. Arbitration-free synchronization. Distributed Computing, 16(2/3):219--237, Sept. 2003.
[LP76] L. Lamport and R. Palais. On the glitch phenomenon. SRI Technical Report, 1976.
[LS03] G. Le Lann and U. Schmid. How to implement a timer-free perfect failure detector in partially synchronous systems. Technical Report 183/1-127, Department of Automation, Technische Universität Wien, Jan. 2003.
[Mar81] L. Marino. General theory of metastable operation. IEEE Transactions on Computers, C-30(2):107--115, Feb. 1981.
[MA01] M. S. Maza and M. L. Aranda. Analysis of clock distribution networks in the presence of crosstalk and ground bounce. In Proceedings International IEEE Conference on Electronics, Circuits, and Systems (ICECS), pages 773--776, 2001.
[Nic05] M. Nicolaidis. Design for soft error mitigation. IEEE Transactions on Device and Materials Reliability, 5(3):405--418, Sept. 2005.
[Nor96] E. Normand. Single-event effects in avionics. IEEE Transactions on Nuclear Science, 43(2):461--474, Apr. 1996.
[PB93] M. Peercy and P. Banerjee. Fault tolerant VLSI systems. Proceedings of the IEEE, 81(5):745--758, May 1993.
[Res01] P. J. Restle et al. A clock distribution network for microprocessors. IEEE Journal of Solid-State Circuits, 36(5):792--799, May 2001.
[RDS90] L. M. Reyneri, D. Del Corso, and B. Sacco. Oscillatory metastability in homogeneous and inhomogeneous flip-flops. IEEE Journal of Solid-State Circuits, SC-25(1):254--264, Feb. 1990.
[RS08] P. Robinson and U. Schmid. The Asynchronous Bounded-Cycle model. In Proceedings SSS'08, 2008.
[SE02] I. E. Sutherland and J. Ebergen. Computers without clocks. Scientific American, 287(2):62--69, Aug. 2002.
[Sut89] I. E. Sutherland. Micropipelines. Communications of the ACM (Turing Award lecture), 32(6):720--738, June 1989.
[WLS05] J. Widder, G. Le Lann, and U. Schmid. Failure detection with booting in partially synchronous systems. In Proceedings of the 5th European Dependable Computing Conference (EDCC-5), volume 3463 of LNCS, pages 20--37, Budapest, Hungary, Apr. 2005. Springer Verlag.
[WS05] J. Widder and U. Schmid. Achieving synchrony without clocks. Research Report 49/2005, Technische Universität Wien, Institut für Technische Informatik, 2005.