Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design
Ajay K. Verma, Philip Brisk and Paolo Ienne. Processor Architecture Laboratory (LAP) & Centre for Advanced Digital Systems (CSDA), Ecole Polytechnique Fédérale de Lausanne (EPFL).
![Page 1: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/1.jpg)
Ajay K. Verma, Philip Brisk and Paolo Ienne
Processor Architecture Laboratory (LAP) & Centre for Advanced Digital Systems (CSDA)
Ecole Polytechnique Fédérale de Lausanne (EPFL)
![Page 2: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/2.jpg)
Do We Always Need 100% Accuracy?
- Ariane 5 explosion (1996): √
- Patriot missile failure (1991): √
- Cryptography attacks: X
![Page 3: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/3.jpg)
Ciphertext-Only Attacks (1 of 2)
[Flowchart: Ciphertext → Guess a key → Decryption → Frequency analysis → Yes / No]
![Page 4: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/4.jpg)
Ciphertext-Only Attacks (2 of 2)
- Speeding up the decryption process allows
  - A larger amount of ciphertext to be deciphered
  - More key guesses
- Errors in the decryption of a few blocks will NOT
  - Affect the frequencies of characters significantly
  - Reduce the efficacy of the attack
- Use of extremely fast, almost correct arithmetic components is desirable
![Page 5: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/5.jpg)
Our Contribution
- Almost Correct Adder (ACA)
  - Exponentially faster than the fastest reliable adder
  - Produces the correct result in 99.99% of cases
  - Trade-off between delay and error probability
- Variable Latency Speculative Adder (VLSA)
  - For a processor that allows variable-latency instructions
  - Uses the ACA as a component
  - Always produces the correct result
  - Extremely fast in more than 99.99% of cases
![Page 6: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/6.jpg)
Outline
- Related work
- Main idea
  - Limited carry propagation occurs in most cases
- Design of the ACA
  - Delay-optimal design with minimal area
- Design of the VLSA
  - Error detection and recovery of the ACA
- Results
- Extension to other arithmetic components
  - Parallel counters, multipliers, etc.
- Conclusions
![Page 7: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/7.jpg)
Related Work
- Design of optimal adders with respect to different metrics
  - Delay and area: ripple-carry adder, carry-lookahead adder, prefix adder, etc.
  - Maximum fanout, wire tracks: Kogge-Stone adder, Brent-Kung adder, Knowles adders
  - Generation of all Pareto-optimal prefix adders [Liu07]
- Probabilistic arithmetic components
  - Probabilistic arithmetic component to save energy [George06]
  - Razor: circuit-level correction for low-power operation [Ernst05]
  - Error detection and correction due to reduction in power supply voltage [Hegde01]
  - Asynchronous speculative adder [Nowick96, Nowick97]
![Page 8: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/8.jpg)
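Recurrence for a Typical Adder
[Figure: 16-bit addition of operands a15 … a0 and b15 … b0 producing sum bits s15 … s0; carry cells between c_{i-1} and c_i are labelled gen, kill and prop]
$g_i = a_i b_i$
$p_i = a_i \oplus b_i$
$k_i = \overline{a_i + b_i}$
$s_i = a_i \oplus b_i \oplus c_{i-1}$
$c_i = \begin{cases} 0 & \text{if } k_i = 1 \\ 1 & \text{if } g_i = 1 \\ c_{i-1} & \text{if } p_i = 1 \end{cases}$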
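The recurrence can be exercised directly in software. The following is a minimal Python sketch, not the circuit from the slides: the function name `ripple_add` and the bit-serial loop are illustrative assumptions. It classifies each bit position as generate, propagate or kill and updates the carry accordingly.

```python
def ripple_add(a: int, b: int, n: int) -> int:
    """Add two n-bit integers bit by bit using the generate/propagate/kill recurrence."""
    carry = 0  # c_{-1}: no carry into the least significant bit
    total = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        g = ai & bi          # generate: both bits are 1
        p = ai ^ bi          # propagate: exactly one bit is 1
        k = 1 - (ai | bi)    # kill: both bits are 0
        total |= (p ^ carry) << i  # s_i = a_i xor b_i xor c_{i-1}
        if g:
            carry = 1
        elif k:
            carry = 0
        # if p is 1 the incoming carry passes through unchanged
    return total
```

For n-bit operands the result equals `(a + b) % 2**n`. The carry changes only at generate or kill positions, so its value at any bit is decided by the nearest non-propagate position below it.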
![Page 9: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/9.jpg)
Main Idea: Limited Carry Propagation
[Figure: carry chain with cells labelled gen, gen, prop, prop, prop, kill]
![Page 10: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/10.jpg)
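Longest Sequence of Propagates
- The longest sequence of propagates is
  - The longest run of 1's in the XOR of the input integers ($A \oplus B$)
  - Equivalent to the longest run of heads in tossing a coin $n$ times
- Let $T_k$ be the expected number of tosses needed to obtain $k$ consecutive heads
$T_k = T_{k-1} + {}$ average number of steps to advance from $k-1$ to $k$
$T_k = T_{k-1} + \frac{1}{2} \cdot 1 + \frac{1}{2}\,(1 + T_k)$
$T_k = 2^{k+1} - 2$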
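The closed form follows from the recurrence in a few algebraic steps. The derivation below is a worked sketch that assumes the base case $T_0 = 0$ (no tosses are needed for a run of length zero), which the slide does not state explicitly.

```latex
\begin{align*}
T_k &= T_{k-1} + \tfrac{1}{2}\cdot 1 + \tfrac{1}{2}\,(1 + T_k) \\
\tfrac{1}{2}\,T_k &= T_{k-1} + 1 \\
T_k + 2 &= 2\,(T_{k-1} + 2) \\
T_k + 2 &= 2^{k}\,(T_0 + 2) = 2^{k+1} \\
T_k &= 2^{k+1} - 2
\end{align*}
```

Read the other way round, a run of $k$ propagates is expected only after roughly $2^{k+1}$ bit positions, so an $n$-bit addition typically has a longest propagate chain of about $\log_2 n$ bits.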
![Page 11: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/11.jpg)
Probabilistic Bounds on the Longest Sequence of Propagates
$A_n(x)$ = number of instances of $n$-bit addition in which the longest sequence of propagates is bounded by $x$
$A_n(x) = \begin{cases} 2^{2n} & \text{if } n \le x \\ 2\,A_{n-1}(x) + 2^2 A_{n-2}(x) + \dots + 2^{x+1} A_{n-x-1}(x) & \text{otherwise} \end{cases}$

| Bitwidth | Longest sequence of propagates with 99% probability | Longest sequence of propagates with 99.99% probability |
|---|---|---|
| 64 | 11 | 17 |
| 128 | 12 | 18 |
| 256 | 13 | 20 |
| 512 | 14 | 21 |
| 1024 | 15 | 22 |
| 2048 | 16 | 23 |
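The counting argument can be checked numerically. The sketch below assumes the recurrence is read as $A_n(x) = \sum_{i=0}^{x} 2^{i+1} A_{n-i-1}(x)$, where the factor $2^{i+1}$ counts the input-bit choices for $i$ leading propagates followed by one generate or kill; the function names `bounded_additions` and `smallest_bound` are hypothetical. Exact integer arithmetic is used so that thresholds such as 99.99% are compared without rounding.

```python
def bounded_additions(n: int, x: int) -> int:
    """Count pairs of n-bit operands whose longest propagate run is at most x."""
    counts = [0] * (n + 1)
    for m in range(n + 1):
        if m <= x:
            counts[m] = 4 ** m  # every operand pair qualifies
        else:
            # i propagates (2 choices each), then one generate/kill (2 choices)
            counts[m] = sum(2 ** (i + 1) * counts[m - 1 - i] for i in range(x + 1))
    return counts[n]


def smallest_bound(n: int, num: int, den: int) -> int:
    """Smallest x such that the longest propagate run is <= x with probability >= num/den."""
    for x in range(n + 1):
        if bounded_additions(n, x) * den >= num * 4 ** n:
            return x
    return n
```

For 64-bit operands, `smallest_bound(64, 99, 100)` and `smallest_bound(64, 9999, 10000)` give 11 and 17, the values in the first row of the table.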
![Page 12: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/12.jpg)
A Primitive Design of ACA (1 of 2)
![Page 13: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/13.jpg)
A Primitive Design of ACA (2 of 2)
[Figure: an array of small adders, one per window of operand bits]
- ADD A[5, 0], B[5, 0] → S[0] … S[5]
- ADD A[6, 1], B[6, 1] → S[6]
- ADD A[7, 2], B[7, 2] → S[7]
- …
- ADD A[19, 14], B[19, 14] → S[19]
Large area overhead due to the multitude of small adders
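The primitive design can be modelled in software as one small addition per output bit. This is a behavioural sketch under assumptions: the name `aca_add` and the window parameter `w` are illustrative (the slide's example corresponds to `w = 6`), and the loop mirrors the array of small adders rather than any area-efficient implementation.

```python
def aca_add(a: int, b: int, n: int, w: int) -> int:
    """Speculative n-bit sum: bit i comes from adding only bits [i-w+1, i] of a and b."""
    result = 0
    for i in range(n):
        lo = max(0, i - w + 1)       # lowest bit position seen by this small adder
        width = i - lo + 1
        mask = (1 << width) - 1
        partial = ((a >> lo) & mask) + ((b >> lo) & mask)
        bit = (partial >> (width - 1)) & 1  # top sum bit of the window is bit i
        result |= bit << i
    return result
```

With `w = 3`, `aca_add(0b0111, 0b0001, 4, 3)` returns 0 where the exact sum is `0b1000`: the carry generated at bit 0 has to cross two propagate positions, and the window for bit 3 starts at bit 1, so it never sees that carry. When `w` equals `n` every window reaches bit 0 and the result is exact.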
![Page 14: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/14.jpg)
Area Overhead in ACA (1 of 2)
[Figure: network over bit positions a15 … a0 and b15 … b0 computing p, g (15, 0), p, g (14, 0), …]
![Page 15: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/15.jpg)
Area Overhead in ACA (2 of 2)
- Step 1: compute the (p, g) for any group of two consecutive bit positions
- Step 2: compute the (p, g) for any group of four consecutive bit positions
- Final step: combine the computed (p, g)'s to compute the (p, g) for any group of k consecutive bit positions
A slightly more complicated design can be used to further reduce the hardware area
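The doubling steps can be sketched with Python integers used as bit vectors, so that one bitwise operation handles all bit positions at once. This is an illustration under assumptions, not the slides' circuit: the name `group_gp` is hypothetical, groups are combined with the usual prefix operator (g, p) ∘ (g', p') = (g + p·g', p·p'), and the final step follows the binary expansion of k. Positions below bit 0 are treated as neither generating nor propagating.

```python
def group_gp(a: int, b: int, n: int, k: int) -> tuple[int, int]:
    """Return (G, P); bit i holds the generate/propagate of the k-bit group ending at bit i."""
    mask = (1 << n) - 1
    g = a & b & mask          # (p, g) of groups of width 1
    p = (a ^ b) & mask
    acc_g, acc_p = 0, mask    # identity element: generates nothing, propagates everything
    covered = 0               # number of bit positions already merged into acc
    width = 1
    while width <= k:
        if k & width:
            # append the width-bit group lying just below the bits covered so far
            acc_g |= acc_p & (g << covered) & mask
            acc_p &= (p << covered) & mask
            covered += width
        # doubling step: groups of width 2*width from two adjacent groups of width
        g = (g | (p & (g << width))) & mask
        p = p & (p << width) & mask
        width *= 2
    return acc_g, acc_p
```

Under these conventions the carry into bit i of a window-w speculative adder is bit i-1 of `group_gp(a, b, n, w - 1)[0]`, so the speculative sum is `(a ^ b) ^ (G << 1)` truncated to n bits. The number of levels grows with log k rather than log n, which is where the delay saving comes from.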
![Page 16: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/16.jpg)
![Page 17: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/17.jpg)
Error Detection
- Error occurs if there is a long chain of propagates
$ER = \sum_{i} p_i \, p_{i+1} \cdots p_{i+k}$
- Delay of error detection
  - Higher than the delay of an ACA
  - Smaller than the delay of a traditional adder
  - Experimentally, 2/3 of the delay of a traditional adder
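The error signal can be sketched in a few lines if the sum in ER is read as a logical OR and each product as a logical AND of consecutive propagate bits. The function name `long_propagate_chain` and the parameter `run` (the chain length that raises the flag, i.e. the k + 1 terms of the slide's product taken at face value) are assumptions made for this example.

```python
def long_propagate_chain(a: int, b: int, n: int, run: int) -> bool:
    """Return True if the n-bit operands contain `run` consecutive propagate positions."""
    chain = (a ^ b) & ((1 << n) - 1)   # bit i is set when position i propagates
    for _ in range(run - 1):
        # after j iterations, bit i is set iff positions i .. i+j all propagate
        chain &= chain >> 1
    return chain != 0
```

The flag depends only on the propagate bits, not on any carry, so it can be evaluated in parallel with the speculative sum. It is conservative: a long chain raises the flag even when no carry actually enters it.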
![Page 18: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/18.jpg)
Error Recovery
A significant part of the ACA computation can be reused to compute the correct sum during error recovery
![Page 19: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/19.jpg)
Variable Latency Speculative Adder
![Page 20: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/20.jpg)
Example of VLSA Computation
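A behavioural sketch of the VLSA flow: speculate with the ACA, check for a long propagate chain, and fall back to an exact addition when the check fires. Everything beyond that outline is an assumption made for illustration: the name `vlsa_add`, the window parameter `w`, the detection threshold of `w - 1` consecutive propagates (which follows from the window convention used here), and the latencies of 1 and 2 cycles. Recovery simply redoes the addition here, whereas the slides state that much of the ACA computation can be reused.

```python
def vlsa_add(a: int, b: int, n: int, w: int) -> tuple[int, int]:
    """Return (sum mod 2**n, latency in cycles) for a window-w speculative adder."""
    if w < 2:
        raise ValueError("window must cover at least two bit positions")
    mask = (1 << n) - 1
    a &= mask
    b &= mask

    # Speculative sum: bit i only sees operand bits [i-w+1, i].
    speculative = 0
    for i in range(n):
        lo = max(0, i - w + 1)
        width = i - lo + 1
        m = (1 << width) - 1
        partial = ((a >> lo) & m) + ((b >> lo) & m)
        speculative |= ((partial >> (width - 1)) & 1) << i

    # Error detection: flag any run of w-1 consecutive propagate positions.
    chain = a ^ b
    for _ in range(w - 2):
        chain &= chain >> 1

    if chain == 0:
        return speculative, 1       # common case: speculation is known to be correct
    return (a + b) & mask, 2        # rare case: take an extra cycle to recover
```

For example, `vlsa_add(1, 1, 6, 3)` returns `(2, 1)`, while `vlsa_add(0b0111, 0b0001, 4, 3)` returns `(0b1000, 2)` because bits 1 and 2 both propagate and the speculative result cannot be trusted.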
![Page 21: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/21.jpg)
Experimental Setup
[Flow diagram: Input N (bitwidth) → generated designs → logic synthesis]
- Generated designs
  - Traditional fast adder (prefix adder)
  - Almost correct adder (ACA)
  - Error detection
  - ACA + error recovery (VLSA)
- Logic synthesis
  - Synopsys Design Compiler, compile_ultra, minimize delay
  - Artisan standard cells, UMC (0.18 µm)
![Page 22: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/22.jpg)
Results
- Average delay of VLSA = 0.70 × delay of traditional adder
- Delay of ACA = 0.52 × delay of traditional adder
![Page 23: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/23.jpg)
Conclusions
- We have presented an exponentially fast adder that works correctly in more than 99.99% of cases
- We have also presented a reliable version of this adder that
  - Works correctly in all cases
  - Is extremely fast in more than 99.99% of cases
  - Has almost the same delay as a traditional adder in the other cases
- Extending this approach to other arithmetic components is desirable
![Page 24: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/24.jpg)
Future Work: Can We Have a Fast Almost Correct Counter/Multiplier?
[Figure: counter example in which the input bits select a path; bit strings shown: 00011001110111011101010110011001 and 1001000]
- Ex[path number] = sum of bits
- Output = path number = 1001
- Var[path number] = high
Since each output bit depends equally on every input bit, one cannot discard some input bits in the computation of an output bit
![Page 25: Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design](https://reader033.vdocuments.us/reader033/viewer/2022042719/56814edf550346895dbc740f/html5/thumbnails/25.jpg)
Future Work: Few Most Significant Bits in Multiplier
1001 0110 × 1101 1001 = 0111 1111 0010 0110
1001 × 1101 = 0111 0101
Even if we ignore the lower half of the bits of the two inputs, the most significant (log n) bits of the output will remain the same with high probability
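The slide's example can be replayed in a few lines. This sketch only checks that one example; the helper names `truncated_product` and `top_bits` are hypothetical, and nothing here estimates how often the property holds for other inputs.

```python
def truncated_product(a: int, b: int, n: int) -> int:
    """Multiply only the upper halves of two n-bit inputs, aligned to the 2n-bit result."""
    half = n // 2
    return ((a >> half) * (b >> half)) << (2 * half)


def top_bits(value: int, total_bits: int, count: int) -> int:
    """Return the `count` most significant bits of a `total_bits`-wide value."""
    return value >> (total_bits - count)


a, b, n = 0b10010110, 0b11011001, 8
exact = a * b                            # 0111 1111 0010 0110
approx = truncated_product(a, b, n)      # 0111 0101 followed by eight zeros
same_top = top_bits(exact, 2 * n, 3) == top_bits(approx, 2 * n, 3)  # log2(8) = 3 bits
```

Here both products start with the bits 011, so `same_top` is true even though the lower halves of the inputs were ignored.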