a reconfigurable stochastic architecture for highly reliable computing
DESCRIPTION
Xin Li, Weikang Qian, Marc Riedel, Kia Bazargan & David Lilja. A Reconfigurable Stochastic Architecture for Highly Reliable Computing. Electrical & Computer Engineering, University of Minnesota. GLSVLSI, Boston – May 12, 2009.
Xin Li, Weikang Qian, Marc Riedel, Kia Bazargan & David Lilja
A Reconfigurable Stochastic Architecture for Highly Reliable Computing
Electrical & Computer Engineering, University of Minnesota
GLSVLSI, Boston – May 12, 2009
Opportunities & Challenges
Novel materials, devices, technologies:
• High density of bits/logic/interconnects.
Challenges for logic synthesis:
• Topological constraints.
• Inherent structural randomness.
• High defect rates.
[Figure: array of N wires by M wires.]
Opportunities & Challenges
Strategy:
• Cast synthesis in terms of arithmetic operations on real values.
• Synthesize circuits that compute logical values with probability corresponding to the real-valued inputs and outputs.
[Figure: array of N wires by M wires.]
Probabilistic Signals
Claude E. Shannon (1916–2001)
“A Mathematical Theory of Communication,” Bell System Technical Journal, 1948.
[Figure labels: deterministic, random, deterministic.]
Probabilistic Analysis
• Circuit Reliability
  – Probabilistic fault models.
  – Random test pattern generation.
• Statistical Timing Power (circuit level).
• Statistical Performance Measures (architectural level).
Probabilistic Analysis
“There are known knowns; and there are unknown unknowns; but today I’ll speak of the known unknowns.”
– Donald Rumsfeld, 2004
[Figure: probabilistic inputs → digital circuit → probabilistic outputs. Labels: independent, known, unknown.]
Synthesis of Probabilistic Circuits
[Figure labels: unknown (for us to design); specified; independent; known; unknown.]
Synthesis of Probabilistic Logic
• Shannon and von Neumann:
  – “Probabilistic Logic,”
  – “Reliable Circuits Using Less Reliable Relays”.
• K. Nepal, R. Bahar, J. Mundy, W. Patterson, and A. Zaslavsky, “Designing Logic Circuits for Probabilistic Computation in the Presence of Noise.”
• L. Chakrapani, P. Korkmaz, B. Akgul, and K. Palem, “Probabilistic System-on-a-chip Architecture.”
Stochastic Logic
Probability values are the input and output signals.
[Figure: combinational circuit with input probability 0.7 and output probabilities 0.616 and 0.468; in the general case the input probability is t.]
Stochastic Logic
Probability values are the input and output signals.
Functions of a probability value t: $0.6t + 0.4t^2$ and $0.3 + 0.8t - 0.8t^2$.
[Figure: gate-level circuits with inputs X, Y, Z.]
$\Pr(X=1) = \Pr(Z=1) = t$ (independently), $\Pr(Y=1) = 0.3$
Outputs: $0.6t + 0.4t^2$ and $0.3 + 0.8t - 0.8t^2$
Stochastic Bit Streams
A real value x in [0, 1] is encoded as a stream of bits X. For each bit, the probability that it is one is P(X=1) = x.
Example: x = 2/5 is encoded as X = 0,1,0,1,0.
Probabilistic Bundles
A real value x in [0, 1] is encoded as a stream of bits X. For each bit, the probability that it is one is P(X=1) = x.
Example: x = 2/5 is carried on a bundle as X = 0,1,0,0,1.
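The encoding above can be sketched in a few lines of Python. This is only an illustration: the helper names `encode` and `decode` are hypothetical, and a software pseudo-random generator stands in for whatever physical source of randomness the hardware would use.

```python
import random

def encode(x, length, rng):
    """Return `length` bits, each independently 1 with probability x."""
    return [1 if rng.random() < x else 0 for _ in range(length)]

def decode(bits):
    """Estimate the encoded value as the fraction of 1s in the stream."""
    return sum(bits) / len(bits)

# The five-bit stream from the slide carries exactly 2/5.
slide_stream = [0, 1, 0, 1, 0]
slide_value = decode(slide_stream)

# A longer random stream carries 2/5 only approximately.
rng = random.Random(0)  # fixed seed so the run is repeatable
long_stream = encode(0.4, 10_000, rng)
long_value = decode(long_stream)
```

A short stream gives a coarse value; the estimate from a random stream tightens as the stream gets longer.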
Stochastic Logic
Probability values are the input and output signals.
[Figure: combinational circuit whose signals carry the values 5/8, 3/8, 4/8, 3/8, 4/8, 8/8.]
Stochastic Logic
Probability values are the input and output signals.
[Figure: combinational circuit with serial bit streams.]
1,1,0,1,0,1,1,0,… (5/8)
1,0,0,0,1,1,0,0,… (3/8)
0,1,1,0,1,0,1,0,… (4/8)
0,1,1,0,1,0,0,0,… (3/8)
1,0,1,0,1,0,1,0,… (4/8)
1,1,1,1,1,1,1,1,… (8/8)
Stochastic Logic
Probability values are the input and output signals.
[Figure: combinational circuit with parallel bit streams carrying 4/8, 3/8, 4/8, 8/8, 5/8, 3/8.]
Randomness
Analog interface with fractional weighting of 1’s.
[Figure: combinational circuit with parallel bit streams and A/D converters at the interface.]
Randomness
Analog interface with fractional weighting of 1’s.
[Figure: combinational circuit with parallel bit streams, with LFSRs and accumulators at the interface.]
Nanowire Crossbar (idealized)
Randomized connections, yet nearly one-to-one.
[Figure: crossbar of N wires by M wires; labels A, VDD.]
Fault Tolerance
Conventional approach: binary radix encoding.
Inputs: 0.111 (7/8), 0.010 (2/8)
Output: 0.001 (1/8)
Fault Tolerance
Conventional approach: binary radix encoding.
Inputs: 0.111 (7/8), 0.110 (6/8)
Output: 0.101 (5/8)
Bit flips can result in large error.
Fault Tolerance
Stochastic Logic
• Highly redundant.
• Complex operations can be performed with simple logic.
AND gate inputs: 0111111… (7/8), 1100000… (2/8)
AND gate output: 01000000… (1/8)
Fault Tolerance
Stochastic Logic
• Highly redundant.
• Complex operations can be performed with simple logic.
AND gate inputs: 0111111… (7/8), 1100100… (3/8)
AND gate output: 01000100… (2/8)
Bit flips never result in large errors.
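The contrast between the two encodings can be checked numerically. The sketch below flips the most significant bit of the radix value 0.010 (2/8), giving 0.110 (6/8) as on the slide, and then tries every possible single-bit flip on an eight-bit stochastic stream carrying 2/8. The helper names are hypothetical.

```python
def radix_value(bits):
    """Value of the binary fraction 0.b1 b2 b3 ... given as a list of bits."""
    return sum(b / 2 ** (i + 1) for i, b in enumerate(bits))

def stream_value(bits):
    """Value of a stochastic stream: the fraction of 1s."""
    return sum(bits) / len(bits)

def flip(bits, index):
    """Return a copy of `bits` with the bit at `index` inverted."""
    out = list(bits)
    out[index] ^= 1
    return out

# Binary radix: flipping the most significant bit moves 2/8 to 6/8.
radix = [0, 1, 0]
radix_error = abs(radix_value(flip(radix, 0)) - radix_value(radix))

# Stochastic: every bit has the same weight, so any single flip moves the value by 1/8.
stream = [1, 1, 0, 0, 0, 0, 0, 0]
worst_stream_error = max(
    abs(stream_value(flip(stream, i)) - stream_value(stream))
    for i in range(len(stream))
)
```

With a stream of N bits, a single flip changes the value by 1/N, so longer streams make each individual fault less significant.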
Arithmetic Operations
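Multiplication: AND gate with inputs A, B and output C.
$c = P(C) = P(A)P(B) = ab$
(Scaled) Addition: MUX with data inputs A, B (ports labeled 0 and 1), select input S, and output C.
$c = P(C) = P(S)P(A) + [1 - P(S)]P(B) = sa + (1 - s)b$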
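Both operations can be simulated on software bit streams. The following is a rough sketch, assuming independent input streams; the function names and the probabilities 0.7, 0.3, and 0.25 are illustrative choices, not values from the talk.

```python
import random

def bitstream(p, n, rng):
    """n independent bits, each 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def and_gate(a, b):
    """Bitwise AND of two streams: multiplies the encoded values."""
    return [x & y for x, y in zip(a, b)]

def mux(s, a, b):
    """Pass the bit of `a` when the select bit is 1, otherwise the bit of `b`."""
    return [x if sel else y for sel, x, y in zip(s, a, b)]

def value(bits):
    """Fraction of 1s in the stream."""
    return sum(bits) / len(bits)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20_000
pa, pb, ps = 0.7, 0.3, 0.25  # illustrative input probabilities
A, B, S = bitstream(pa, n, rng), bitstream(pb, n, rng), bitstream(ps, n, rng)

product = value(and_gate(A, B))  # close to pa * pb = 0.21
scaled_sum = value(mux(S, A, B))  # close to ps * pa + (1 - ps) * pb = 0.4
```

The scaling by s and 1 - s is what keeps the sum inside the unit interval, which is why the addition is described as scaled.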
Synthesizing Stochastic Logic
[Figure: combinational circuit mapping input t to output g(t).]
Only polynomials…
Questions:
• What kinds of functions can be implemented in the probabilistic domain?
• How can we synthesize the logic to implement these?
Synthesizing Polynomials
[Figure: combinational circuit mapping input t to output g(t).]
Only polynomials…
• Implement polynomials using AND (multiplication) and MUX (scaled addition).
• Must consider polynomials with coefficients less than 0 or larger than 1…
A little math…
Bernstein basis polynomial of degree n:
$$B_i^n(t) = \binom{n}{i} t^i (1 - t)^{n - i}, \quad i = 0, 1, \ldots, n$$
Bernstein polynomial of degree n:
$$B^n(t) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$
where $b_i^n$ is a Bernstein coefficient.
A little math…
Obtain Bernstein coefficients from power-form coefficients:
Given $g(t) = \sum_{i=0}^{n} a_i^n t^i = \sum_{i=0}^{n} b_i^n B_i^n(t)$, we have
$$b_i^n = \sum_{j=0}^{i} \frac{\binom{i}{j}}{\binom{n}{j}} a_j^n, \quad i = 0, 1, \ldots, n$$
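The conversion formula translates directly into code. Here is a minimal sketch using exact rational arithmetic; the function name `power_to_bernstein` is hypothetical, and the input is a cubic with power-form coefficients 0, 3, -8, 6 (lowest degree first).

```python
from fractions import Fraction
from math import comb

def power_to_bernstein(a):
    """Convert power-form coefficients a[0..n] into Bernstein coefficients b[0..n].

    Implements b_i = sum over j <= i of C(i, j) / C(n, j) * a_j.
    """
    n = len(a) - 1
    return [
        sum(Fraction(comb(i, j), comb(n, j)) * a[j] for j in range(i + 1))
        for i in range(n + 1)
    ]

# g(t) = 3t - 8t^2 + 6t^3
coeffs = power_to_bernstein([0, 3, -8, 6])
# coeffs holds 0, 1, -2/3, 1: one coefficient lies outside the unit interval.
```

A coefficient outside [0, 1] cannot be used directly as a probability, which is what motivates the degree-elevation step in the synthesis flow.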
Example: Converting a Polynomial
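Power-Form Polynomial:
$$g(t) = 3t - 8t^2 + 6t^3$$
Bernstein Polynomial:
$$g(t) = B_1^3(t) - \frac{2}{3} B_2^3(t) + B_3^3(t)$$
$$g(t) = \frac{3}{4} B_1^4(t) + \frac{1}{6} B_2^4(t) - \frac{1}{4} B_3^4(t) + B_4^4(t)$$
$$g(t) = \frac{3}{5} B_1^5(t) + \frac{2}{5} B_2^5(t) + B_5^5(t)$$
At degree 5 the coefficients are in the unit interval.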
Synthesizing Polynomials
[Figure: combinational circuit mapping input t to output g(t).]
Synthesis steps:
1. Convert the polynomial into a Bernstein form.
2. Elevate it until all coefficients are in the unit interval.
3. Implement this with “generalized multiplexing”.
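Step 2 can be sketched with the standard Bernstein degree-elevation identity, $b_i^{n+1} = \frac{i}{n+1} b_{i-1}^n + \left(1 - \frac{i}{n+1}\right) b_i^n$. The slides do not spell this identity out, so treat it, the helper names, and the `max_degree` guard as assumptions of this sketch.

```python
from fractions import Fraction

def elevate(b):
    """Raise a degree-n Bernstein coefficient list to degree n + 1."""
    n = len(b) - 1
    m = n + 1
    out = []
    for i in range(m + 1):
        left = Fraction(i, m) * b[i - 1] if i > 0 else 0
        right = Fraction(m - i, m) * b[i] if i <= n else 0
        out.append(left + right)
    return out

def elevate_into_unit_interval(b, max_degree=32):
    """Elevate until every coefficient lies in [0, 1]; give up past max_degree."""
    while any(c < 0 or c > 1 for c in b):
        if len(b) - 1 >= max_degree:
            raise ValueError("coefficients still outside [0, 1] at max_degree")
        b = elevate(b)
    return b

# Degree-3 Bernstein coefficients of g(t) = 3t - 8t^2 + 6t^3.
degree3 = [0, 1, Fraction(-2, 3), 1]
degree4 = elevate(degree3)  # 0, 3/4, 1/6, -1/4, 1
final = elevate_into_unit_interval(degree3)  # 0, 3/5, 2/5, 0, 0, 1 (degree 5)
```

The guard exists because this sketch does not check in advance whether elevation will succeed for a given polynomial.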
Probabilistic Multiplexing
[Figure: MUX with data inputs A, B, select input T, and output C.]
$c = P(C) = P(T)P(A) + [1 - P(T)]P(B) = ta + (1 - t)b$
Probabilistic Multiplexing
Bernstein polynomial:
X1, …, Xn are independent Boolean random variables with Pr(Xi=1) = t, for 1 ≤ i ≤ n.
Z0, …, Zn are independent Boolean random variables with Pr(Zi=1) = $b_i^n$, for 0 ≤ i ≤ n.
The MUX passes Zi to the output Y when exactly i of the X's are 1, so
$$\Pr(Y = 1) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$
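A Monte Carlo sketch of this generalized multiplexing follows. It draws the X and Z bits with a software generator, uses the count of 1s among the X's as the MUX select, and compares the observed frequency of Y = 1 with the Bernstein polynomial value. The coefficients 0.2, 0.9, 0.1, 0.7 and the input t = 0.3 are arbitrary illustrative values, and the function names are hypothetical.

```python
import random
from math import comb

def bernstein_value(b, t):
    """Evaluate the Bernstein polynomial with coefficients b at t."""
    n = len(b) - 1
    return sum(b[i] * comb(n, i) * t ** i * (1 - t) ** (n - i) for i in range(n + 1))

def resc_estimate(b, t, cycles, rng):
    """Estimate Pr(Y=1) by simulating the counter-plus-MUX structure."""
    n = len(b) - 1
    ones = 0
    for _ in range(cycles):
        count = sum(rng.random() < t for _ in range(n))  # X1 + ... + Xn
        z = [rng.random() < bi for bi in b]  # Z0 ... Zn
        ones += z[count]  # the MUX selects Z[count]
    return ones / cycles

b = [0.2, 0.9, 0.1, 0.7]  # illustrative coefficients in the unit interval
t = 0.3
exact = bernstein_value(b, t)  # about 0.5033
estimate = resc_estimate(b, t, 50_000, random.Random(2))
```

The count of 1s among n independent bits is binomially distributed, which is exactly where the Bernstein basis polynomials come from.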
A Reconfigurable Architecture
Implement different functions by setting the coefficients:
$$\Pr(Y = 1) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$
Example
Implement
$$f(t) = \frac{1}{4} + \frac{9}{8} t - \frac{15}{8} t^2 + \frac{5}{4} t^3$$
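Example
Convert to
$$f(t) = \frac{2}{8} B_0^3(t) + \frac{5}{8} B_1^3(t) + \frac{3}{8} B_2^3(t) + \frac{6}{8} B_3^3(t)$$
[Figure: an adder sums the input streams x1, x2, x3; the sum drives the select input of a MUX with data inputs z0 to z3 (ports 0 to 3) and output y.]
x1 = 0,0,0,1,1,0,1,1 (4/8)
x2 = 0,1,1,1,0,0,1,0 (4/8)
x3 = 1,1,0,1,1,0,0,0 (4/8)
x1 + x2 + x3 = 1,2,1,3,2,0,2,1
z0 = 0,0,0,1,0,1,0,0 (2/8)
z1 = 0,1,0,1,0,1,1,1 (5/8)
z2 = 0,1,1,0,1,0,0,0 (3/8)
z3 = 1,1,1,0,1,1,0,1 (6/8)
y = 0,1,0,0,1,1,0,1 (4/8)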
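The bit streams in this example can be replayed cycle by cycle. The sketch below assumes the streams belong to x1 to x3 and z0 to z3 in the order listed: it adds the three x bits in each cycle and uses the sum to pick which z stream supplies the output bit.

```python
# Input streams, each carrying t = 4/8.
x = [
    [0, 0, 0, 1, 1, 0, 1, 1],  # x1
    [0, 1, 1, 1, 0, 0, 1, 0],  # x2
    [1, 1, 0, 1, 1, 0, 0, 0],  # x3
]
# Coefficient streams carrying 2/8, 5/8, 3/8, 6/8.
z = [
    [0, 0, 0, 1, 0, 1, 0, 0],  # z0
    [0, 1, 0, 1, 0, 1, 1, 1],  # z1
    [0, 1, 1, 0, 1, 0, 0, 0],  # z2
    [1, 1, 1, 0, 1, 1, 0, 1],  # z3
]

# Adder: number of 1s among x1, x2, x3 in each cycle.
select = [sum(bits) for bits in zip(*x)]

# MUX: in each cycle, output the bit of the z stream chosen by the adder.
y = [z[k][cycle] for cycle, k in enumerate(select)]

y_value = sum(y) / len(y)  # 4/8, which equals f(1/2) for the example polynomial
```

Changing only the z streams changes the function computed, which is the sense in which the architecture is reconfigurable.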
Non-Polynomial Functions
Find a Bernstein polynomial to approximate the function:
$$B^n(t) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$
with $0 \le b_i^n \le 1$, such that
$$\int_0^1 \left( f(t) - B^n(t) \right)^2 \, dt$$
is minimized.
Non-Polynomial Functions
Example: Gamma correction function, $f(t) = t^{0.45}$.
Degree 6 Bernstein coefficients are:
b0 = 0.0955, b1 = 0.7207, b2 = 0.3476, b3 = 0.9988, b4 = 0.7017, b5 = 0.9695, b6 = 0.9939
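As a quick sanity check, the degree 6 coefficients listed on the slide can be plugged into the Bernstein form and compared against t^0.45 at a few sample points. The function names and the sample grid are choices made for this sketch; the coefficients are copied from the slide.

```python
from math import comb

# Degree 6 Bernstein coefficients from the slide.
GAMMA_B = [0.0955, 0.7207, 0.3476, 0.9988, 0.7017, 0.9695, 0.9939]

def bernstein_value(b, t):
    """Evaluate the Bernstein polynomial with coefficients b at t."""
    n = len(b) - 1
    return sum(b[i] * comb(n, i) * t ** i * (1 - t) ** (n - i) for i in range(n + 1))

def gamma_correction(t):
    """The target function f(t) = t^0.45."""
    return t ** 0.45

samples = [i / 10 for i in range(11)]
errors = [abs(bernstein_value(GAMMA_B, t) - gamma_correction(t)) for t in samples]
mid_error = abs(bernstein_value(GAMMA_B, 0.5) - gamma_correction(0.5))
```

At t = 0 the polynomial returns b0 rather than 0, so the fit is weakest near the origin, where t^0.45 rises steeply.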
Deterministic vs. Stochastic Implementation of Gamma correction function with 10% noise injection.
[Figure: conventional implementation vs. stochastic implementation at 1%, 2%, and 10% noise.]
Deterministic implementation: 37% of pixels with errors > 20%.
Stochastic implementation: no pixels with errors > 20%!
Comparison with Conventional Hardware Implementation of Image Processing Functions
[Table: number of LUTs in FPGA mapping.]
* The entire ReSC architecture, including Randomizers and De-Randomizers.
** The ReSC Unit by itself.
Comparison with Conventional Software Implementation of Image Processing Functions
[Table: speedup (1024 cycles needed).]
* Software using math function from ‘Math.h’.
** Software using direct function table lookup.
Comparison of Fault Tolerance for Image Processing Functions
[Table: percentage of output pixels with errors greater than 25%.]
Noise is injected in the form of a percentage of bit flips.
The stochastic implementation never produces such errors!
Comparison of Fault Tolerance for Mathematical Functions
Sixth-order Maclaurin polynomial approx., 10 bits: sin(x), cos(x), tan(x), arcsin(x), arctan(x), sinh(x), cosh(x), tanh(x), arcsinh(x), exp(x), ln(x+1)
[Figure: relative error vs. error ratio of input data (0 to 0.1), stochastic vs. deterministic.]
Conclusions
• The hardware cost is comparable.
• Stochastic computation is much more error tolerant.
• The advantage is dramatic for applications where large errors are critical but small fluctuations can be tolerated.
• (Also some pretty interesting math…)
Future Directions
• Apply the method at the processor level.
• Apply the method at the circuit level (e.g., with PCMOS).
[computational] Synthetic Biology
[Figure: quantities of different types → biological process → probability distribution on outcomes.]
[Figure: biological process with quantities X and Y and outcome Z; labels: Pr, X, Y, fixed.]