D_160 / MAPLD - 2004 Burke 1
Fault Tolerant State Machines
Gary Burke, Stephanie Taft
Jet Propulsion Laboratory, California Institute of Technology
Reasons for Fault Tolerant State Machines
• Reliable designs are essential for Flight systems
• The state machine needs to be tolerant of single event upsets
State Machines
• A state machine is a sequential machine that, when built into an FPGA or ASIC, controls the sequencing of actions in the digital logic
• The current state of a machine is held in a state register which is updated on a clock
• The next value of the state register (next state) is derived from the current state and the inputs
• Outputs from the state machine are decoded from the state register and can also be combined with the inputs
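The description above can be modelled in a few lines of software. The following is a minimal sketch, not the authors' design: the class name `StateMachine`, the two-state "11" detector, and the state names `IDLE` and `GOT1` are all hypothetical, chosen only to show a state register updated once per clock, a next-state function of the current state and input, and an output decoded from the state combined with the input.

```python
class StateMachine:
    """Software model of a clocked state machine (illustrative only)."""

    def __init__(self, initial, next_state, output):
        self.state = initial            # plays the role of the state register
        self._next_state = next_state   # next state = f(current state, input)
        self._output = output           # output = g(current state, input)

    def clock(self, inp):
        """Advance one clock: decode the output, then update the register."""
        out = self._output(self.state, inp)
        self.state = self._next_state(self.state, inp)
        return out


def detect_11_next(state, bit):
    # Hypothetical example machine: remember whether the last bit was a 1.
    return "GOT1" if bit == 1 else "IDLE"


def detect_11_out(state, bit):
    # Output is decoded from the state register and combined with the input.
    return int(state == "GOT1" and bit == 1)


def run(bits):
    fsm = StateMachine("IDLE", detect_11_next, detect_11_out)
    return [fsm.clock(b) for b in bits]


print(run([0, 1, 1, 1, 0, 1, 1]))
```

In hardware the two functions would be combinational logic and `clock` would correspond to the active clock edge; the sketch only mirrors that structure.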
State Machine Encoding
• Each distinct state of a state machine is represented by a unique binary code
• Encoding is the assignment of binary codes to states
Different Methods of Encoding States
• Binary
– The simplest encoding method, in which each state is given the next available binary number in sequence
• One Hot
– The number of bits in the code is equal to the number of states
– Each encoded state has just 1 bit in the encoded word set to 1 (the rest are 0)
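As a rough illustration of these two schemes, the sketch below generates the code words for a given number of states. The function names `binary_codes` and `one_hot_codes` are hypothetical, and the order in which states are numbered is an assumption; a synthesis tool may assign codes differently.

```python
def binary_codes(num_states):
    """Give each state the next binary number, using the fewest bits that fit."""
    width = max(1, (num_states - 1).bit_length())
    return [format(i, f"0{width}b") for i in range(num_states)]


def one_hot_codes(num_states):
    """One bit per state; exactly one bit is set in each code word."""
    return [format(1 << i, f"0{num_states}b") for i in range(num_states)]


print(binary_codes(4))
print(one_hot_codes(4))
```

The code word width is the number of state-register flip-flops each scheme needs: 4 states take 2 bits in binary and 4 bits in one hot.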
Different Methods of Encoding States Continued
• Hamming Distance of 2 (H2)
– Compared to Binary encoding, Hamming 2 uses one extra bit to ensure all codes are separated by a Hamming distance of at least 2
– It will take at least 2 bit changes in the state register to reach another known state
• Hamming Distance of 3 (H3)
– This extension of Hamming distance of 2 encoding uses additional bits to ensure all codes are separated by a Hamming distance of at least 3
– It will take at least 3 bit changes in the state register to reach another known state
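The slides do not say how the H2 and H3 code words were chosen. One possible construction, shown here purely as an assumption, is a greedy search: walk through all words of a given width in increasing order and keep each word that is far enough from every word kept so far, widening the register until enough codes are found. The names `greedy_codes`, `encode_states`, and `min_pairwise_distance` are hypothetical.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two code words differ."""
    return bin(a ^ b).count("1")


def greedy_codes(width, min_distance):
    """Keep every word that is at least min_distance from all words kept so far."""
    chosen = []
    for candidate in range(1 << width):
        if all(hamming_distance(candidate, c) >= min_distance for c in chosen):
            chosen.append(candidate)
    return chosen


def encode_states(num_states, min_distance):
    """Smallest width (under this greedy search) that yields enough codes."""
    width = 1
    while True:
        codes = greedy_codes(width, min_distance)
        if len(codes) >= num_states:
            return width, codes[:num_states]
        width += 1


def min_pairwise_distance(codes):
    return min(
        hamming_distance(a, b)
        for i, a in enumerate(codes)
        for b in codes[i + 1:]
    )


print(encode_states(4, 2))   # H2-style codes for 4 states
print(encode_states(4, 3))   # H3-style codes for 4 states
```

For 4 states this search gives a 3-bit code for distance 2 (one bit more than binary) and a 5-bit code for distance 3. Other constructions, such as appending a parity bit or using a standard Hamming code, would satisfy the same distance requirement.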
Synthesis
• To check the overhead of each of the state machines, they were individually synthesized
• Finite state machine optimization is turned off
• A clock frequency of 50 MHz is used
• Target device is a Xilinx Spartan 2, speed grade 6
• Error injection circuitry is not included
Synthesis Results
Encoding   State Machine Size  # Slice Flip Flops  # of 4 input LUTs  Clock Period (ns)  Max Synthesized Frequency (MHz)  Minimum Period (ns)
Hamming 2  4                   3                   8                  20                 226.6                            4.4
Hamming 2  8                   4                   22                 20                 133.5                            7.5
Hamming 2  12                  5                   41                 20                 124.5                            8.0
Hamming 2  16                  5                   49                 20                 117.8                            8.5
Hamming 2  24                  6                   84                 20                 91.5                             10.9
Hamming 2  32                  6                   107                20                 87.3                             11.5
Hamming 3  4                   5                   15                 20                 162.8                            6.1
Hamming 3  8                   6                   42                 20                 117.4                            8.5
Hamming 3  12                  7                   55                 20                 105.0                            9.5
Hamming 3  16                  7                   71                 20                 102.6                            9.8
Hamming 3  24                  9                   91                 20                 88.7                             11.3
Hamming 3  32                  9                   137                20                 83.5                             12.0
Binary     4                   2                   7                  20                 272.1                            3.7
Binary     8                   3                   15                 20                 178.8                            5.6
Binary     12                  4                   25                 20                 129.6                            7.7
Binary     16                  4                   38                 20                 122.1                            8.2
Binary     24                  5                   50                 20                 109.6                            9.1
Binary     32                  5                   96                 20                 94.5                             10.6
One Hot    4                   4                   10                 20                 238.2                            4.2
One Hot    8                   8                   20                 20                 194.8                            5.1
One Hot    12                  12                  31                 20                 173.0                            5.8
One Hot    16                  16                  41                 20                 148.9                            6.7
One Hot    24                  24                  63                 20                 148.9                            6.7
One Hot    32                  32                  237                20                 68.6                             14.6
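Two pieces of arithmetic help when reading the synthesis results: the minimum period is the reciprocal of the maximum synthesized frequency, and the LUT overhead of an encoding can be expressed as a ratio to binary for the same state machine size. The sketch below works both out from the reported LUT counts. The helper names are hypothetical, and it assumes the One Hot row with 12 flip-flops and 31 LUTs is the 12-state design.

```python
SIZES = [4, 8, 12, 16, 24, 32]

# 4-input LUT counts from the synthesis results, in SIZES order.
LUTS = {
    "Binary":    [7, 15, 25, 38, 50, 96],
    "One Hot":   [10, 20, 31, 41, 63, 237],
    "Hamming 2": [8, 22, 41, 49, 84, 107],
    "Hamming 3": [15, 42, 55, 71, 91, 137],
}


def lut_overhead(encoding):
    """LUT count of an encoding divided by the binary LUT count, per size."""
    return [round(x / b, 2) for x, b in zip(LUTS[encoding], LUTS["Binary"])]


def period_ns(freq_mhz):
    """Clock period in ns for a frequency in MHz."""
    return 1000.0 / freq_mhz


for name in LUTS:
    print(name, dict(zip(SIZES, lut_overhead(name))))
print(round(period_ns(272.1), 1))   # binary, 4 states
```

For Hamming 3 the LUT ratios come out as 2.14, 2.8, 2.2, 1.87, 1.82, and 1.43 for 4 through 32 states, so the overhead relative to binary shrinks as the state machine grows.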
Four Bit State Encoding
[Bar chart "4 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]
Eight Bit State Encoding
[Bar chart "8 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]
Twelve Bit State Encoding
[Bar chart "12 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]
Sixteen Bit State Encoding
[Bar chart "16 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]
Twenty-Four Bit State Encoding
[Bar chart "24 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]
Thirty-Two Bit State Encoding
[Bar chart "32 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]
Fault Injection Test
• A test circuit is generated with an example of each state machine executing the same task, plus a reference state machine
• The task chosen requires a 16-state state machine, to detect a 16-bit pattern in a serial input stream
• An error generator injects faults into all state machines except the reference state machine
Error Injection Test Continued
• The outputs of each state machine are compared to the reference output
• A set of counters tallies the comparison outputs
• 2 types of failure are logged for each state machine:
– Failure to detect pattern
– False detection of pattern (false-positive)
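The test arrangement can be approximated in software. The sketch below is an assumption-laden model, not the test circuit itself: it builds a 16-state detector for a 16-bit pattern (state = number of pattern bits matched so far), runs a fault-free reference copy and a copy whose binary-encoded state register has bits flipped at chosen clocks, and tallies the two failure types. The pattern value, the fault position, and all function names are hypothetical.

```python
def longest_prefix_suffix(seen, pattern, limit):
    """Longest k <= limit such that the last k bits seen equal pattern[:k]."""
    for k in range(min(limit, len(seen)), 0, -1):
        if seen[-k:] == pattern[:k]:
            return k
    return 0


def build_detector(pattern):
    """Transition table: (state, bit) -> (next state, detect output)."""
    n = len(pattern)
    table = {}
    for s in range(n):
        for bit in (0, 1):
            seen = pattern[:s] + [bit]
            detect = int(seen == pattern)
            table[(s, bit)] = (longest_prefix_suffix(seen, pattern, n - 1), detect)
    return table


def run_detector(table, bits, faults=None):
    """faults maps a clock index to an XOR mask applied to the 4-bit state."""
    faults = faults or {}
    state, outputs = 0, []
    for t, bit in enumerate(bits):
        state ^= faults.get(t, 0)   # binary encoding: every flip is a valid state
        state, detect = table[(state, bit)]
        outputs.append(detect)
    return outputs


def tally(reference, observed):
    """Count (failures to detect, false detections) against the reference."""
    missed = sum(1 for r, o in zip(reference, observed) if r and not o)
    false_pos = sum(1 for r, o in zip(reference, observed) if o and not r)
    return missed, false_pos


PATTERN = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]   # hypothetical key
stream = [0, 0, 0] + PATTERN + [0, 0]
table = build_detector(PATTERN)
reference = run_detector(table, stream)
faulty = run_detector(table, stream, faults={10: 0b0100})
print(tally(reference, faulty))
```

In this run the single bit flip moves the binary-encoded machine to a different valid state partway through the pattern, so it fails to report a match that the reference reports.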
Error Injection Test Continued
• Non-key patterns are 1-bit different from the key pattern, to increase the likelihood of a false match
• Error rate can vary; it is set to 1:199 clocks in the example
• Errors are weighted by distributing them pseudo-randomly over 16 bits. A state machine with a word size of n receives n/16 of the total faults
• Synchronous fault injection is before the state register
• Asynchronous fault injection is after the state register
• All results are from actual implementation of the test circuits in a Spartan 2 FPGA
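The n/16 weighting can be illustrated with a short simulation. This is a sketch under assumptions the slides do not confirm: one fault event exactly every 199 clocks, a uniformly chosen bit position in a 16-bit word, and a state machine of width n affected only when the chosen position falls inside its n-bit register. All names are hypothetical.

```python
import random

FAULT_INTERVAL = 199   # one fault event per 199 clocks, as in the example
FAULT_BITS = 16        # faults are spread over a 16-bit word


def fault_masks(num_clocks, seed=0):
    """Map clock index -> one-bit XOR mask, one event per FAULT_INTERVAL clocks."""
    rng = random.Random(seed)   # seeded so the sequence is repeatable
    return {
        t: 1 << rng.randrange(FAULT_BITS)
        for t in range(FAULT_INTERVAL - 1, num_clocks, FAULT_INTERVAL)
    }


def mask_for_width(mask, width):
    """Part of the fault that lands inside a state register of this width."""
    return mask & ((1 << width) - 1)


def hit_fraction(masks, width):
    """Fraction of fault events that actually disturb a width-bit register."""
    hits = sum(1 for m in masks.values() if mask_for_width(m, width))
    return hits / len(masks)


masks = fault_masks(FAULT_INTERVAL * 1600)
for width in (4, 5, 7, 16):   # binary, H2, H3, one hot for 16 states
    print(width, round(hit_fraction(masks, width), 3))
```

Under these assumptions a 16-bit one-hot register is hit by every fault event, while a 4-bit binary register is hit by roughly a quarter of them, which is the weighting the slide describes.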
Error Rate – Synchronous Faults
[Bar chart "Synchronous (rate=199)": errors per pattern for Binary, 1-Hot, H2, and H3; series: single, false-pos single, double, false-pos double]
Error Rate – Asynchronous Faults
[Bar chart "Asynchronous (rate=199)": errors per pattern for Binary, 1-Hot, H2, and H3; series: single, false-pos single, double, false-pos double]
Error Rate – Asynchronous Pulse Faults
[Bar chart "Pulse (rate=199)": errors per pattern for Binary, 1-Hot, H2, and H3; series: single, false-pos single, double, false-pos double]
Results: Binary Encoding
• Lowest resources used
• Second fastest speed after One Hot
– Fastest for small number of states
• Second-most sensitive to errors
• Generates false-positive errors, i.e. reports false pattern matches
Results: One Hot Encoding
• No false-positive errors (single faults)
• Fastest speed except for small number of states and large number of states
• Uses more resources than Binary
• Inefficient for large number of states
• Worst fault tolerance of all encodings tested
• Has 2x the error rate of binary encoding
Results: Hamming Distance of 2 (H2) Encoding
• No false-positive errors (single faults)
• Better Fault Tolerance than Binary
• More resources needed than One Hot, except for large number of states
Results: Hamming Distance of 3 (H3) Encoding
• Zero single-fault errors
– Immune to synchronous and asynchronous errors
• Lowest double-fault errors
• Most resources used (*)
– ~2x binary encoding
• Slowest speed (*)
(*) Except for large number of states
Summary
• Binary encoding will give unpredictable results when faults are injected, generating false-positive errors in the pattern matching example
• One Hot encoding provides false-positive protection, but at the cost of considerably more errors
• Hamming 2 encoded state machines will provide significantly better fault tolerance at a cost of about 25% more resources than binary
• Hamming 3 encoded state machines give excellent fault tolerance but at a ~2x increase in resources
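One way to see why the encodings behave so differently is to enumerate every single-bit flip of every code word and check where it lands. The sketch below does this for illustrative 4-state code sets (the H2 and H3 code words are assumptions, not the ones used in the test circuits). It counts flips that land on another valid state versus an invalid word, and checks whether the original state is the only valid code within one bit of the corrupted word. How each tested machine reacts to an invalid word is not stated in the slides, so this shows only what the codes make possible.

```python
BINARY = [0b00, 0b01, 0b10, 0b11]              # 2 bits
ONE_HOT = [0b0001, 0b0010, 0b0100, 0b1000]     # 4 bits
H2 = [0b000, 0b011, 0b101, 0b110]              # 3 bits, assumed code set
H3 = [0b00000, 0b00111, 0b11001, 0b11110]      # 5 bits, assumed code set


def classify_single_flips(codes, width):
    """Return (flips landing on a valid state, flips landing on an invalid word)."""
    valid = set(codes)
    to_valid = to_invalid = 0
    for code in codes:
        for bit in range(width):
            if code ^ (1 << bit) in valid:
                to_valid += 1
            else:
                to_invalid += 1
    return to_valid, to_invalid


def single_flips_correctable(codes, width):
    """True if, after any single flip, the original is the only code within 1 bit."""
    for code in codes:
        for bit in range(width):
            corrupted = code ^ (1 << bit)
            near = [c for c in codes if bin(c ^ corrupted).count("1") <= 1]
            if near != [code]:
                return False
    return True


for name, codes, width in [("Binary", BINARY, 2), ("One Hot", ONE_HOT, 4),
                           ("H2", H2, 3), ("H3", H3, 5)]:
    print(name, classify_single_flips(codes, width),
          single_flips_correctable(codes, width))
```

With these code sets every single flip of a binary state lands on another valid state, which is consistent with the false-positive matches reported for binary. One hot and H2 always land on an invalid word, so the fault is detectable but the original state is ambiguous. Only the distance-3 code leaves the original state as the unique nearest code.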