Floating-point to fixed-point code conversion with variable trade-off between computational complexity and accuracy loss

Alexandru Bârleanu, Vadim Băitoiu and Andrei Stan
Technical University “Gh. Asachi”, Iaşi, Romania

15th International Conference on System Theory, Control and Computing (Joint conference of SINTES15, SACCS11, SIMSIS15)
October 14-16, 2011, Sinaia, Romania


Page 2: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Motivation

• Embedded microprocessors:
  – No hardware dedicated to floating-point
  – Limited processing capabilities
• Emulated floating-point arithmetic:
  – Unnecessarily high accuracy
  – Long execution time
• Fixed-point code written manually:
  – Error-prone
  – Significant accuracy loss

Page 3: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Existing work

• For FPGA
  – The main problem is fractional word-length optimization
  – The search space grows exponentially with the number of fixed-point variables
  – Search techniques (often sophisticated) are necessary:
    • Greedy algorithms
    • Genetic algorithms
    • Simulated annealing
  – Optimization objectives: accuracy loss, area
• For microcontrollers, C language
  – Existing solutions:
    • Fixed-point format is supplied by the user (in annotations, for example)
    • Fixed-point format is determined through simulations, taking into consideration, for example, some accuracy constraints
  – Available integer types in C: only 16/32/64-bit signed/unsigned
  – Optimization objectives: accuracy loss, number of (scaling) operations

Page 4: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Problem formulation

The problem is constructed from practical considerations:
• Input – a digital filter:
  – Filter structure: Direct-Form I
  – Constant floating-point coefficients
  – Known input bounds (low/high values)
• Output – ANSI-C integer code:
  – Ideally, the result must be the same as if floating-point code had been used

Filter equation:

$$y = \sum_{i=0}^{n} a_i x_i$$
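The formulation above can be sketched in C as a floating-point reference next to an integer counterpart. This is a minimal illustration, not the code the tool generates: the function names, the single shared fractional word-length, and the 64-bit accumulator (used here only to sidestep overflow) are assumptions, and the sketch uses C99 features for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Floating-point reference: y = sum_{i=0}^{n} a_i * x_i. */
double filter_float(const double *a, const int32_t *x, size_t taps)
{
    double y = 0.0;
    for (size_t i = 0; i < taps; ++i)
        y += a[i] * (double)x[i];
    return y;
}

/* Integer counterpart (assumed layout): every coefficient is pre-quantized
 * as a_q[i] = round(a[i] * 2^fwl) with one shared fractional word-length.
 * The result carries the same fwl fractional bits. The 64-bit accumulator
 * is an assumption of this sketch, not something the slides prescribe. */
int64_t filter_fixed(const int32_t *a_q, const int32_t *x, size_t taps)
{
    int64_t acc = 0;
    for (size_t i = 0; i < taps; ++i)
        acc += (int64_t)a_q[i] * x[i];
    return acc;
}
```

With coefficients 0.5 and 0.25 quantized to 8 fractional bits (128 and 64), dividing the integer result by 2^8 recovers the floating-point result exactly, because both coefficients are representable without rounding.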

Page 5: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Building the dataflow

• Initial state – very long fractional parts
  – Multiply operators overflow
  – Add operators have unaligned terms
• Changing the dataflow – making nodes representable in C
  – Resolving overflows in any operator
  – Aligning summation terms

Recursive method calls – bottom-up action

Example: making a node's run-time integer interval smaller (scaling)

Before scaling:
  Run-time integer interval: [0; 4 400 000 000]
  Fractional word-length: 27
  Datatype: none (using only 16/32-bit integers)
  Floating-point interval: [0; 32.782...]

After scaling:
  Run-time integer interval: [0; 2 200 000 000]
  Fractional word-length: 26
  Datatype: unsigned long
  Floating-point interval: [0; 32.782...]
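The scaling example above can be reproduced with a few lines of C. This is a sketch under assumptions: the `node_range` type and both function names are hypothetical, and `uint32_t` stands in for `unsigned long`, which is 32 bits wide on the microcontroller targets named in this presentation.

```c
#include <stdint.h>

/* Hypothetical node descriptor: largest run-time integer value and
 * fractional word-length (FWL). Real value = integer / 2^fwl. */
typedef struct {
    uint64_t max_int;
    int fwl;
} node_range;

/* Does the interval [0; max_int] fit in an unsigned 32-bit integer? */
int fits_u32(node_range n)
{
    return n.max_int <= UINT32_MAX;
}

/* Drop fractional bits one at a time until the node fits in 32 bits.
 * Each step halves the integer interval and decrements the FWL, so the
 * represented floating-point interval stays the same. */
node_range scale_to_u32(node_range n)
{
    while (!fits_u32(n) && n.fwl > 0) {
        n.max_int >>= 1;
        n.fwl -= 1;
    }
    return n;
}
```

Starting from an upper bound of 4 400 000 000 at FWL 27, one step is enough: the bound becomes 2 200 000 000 at FWL 26, which fits in 32 bits.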

Page 6: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Dataflow transformation philosophy

Scaling at design-time (scaling coefficients):
  – Loss of accuracy: large, because scaling occurs at dataflow sources
  – Run-time operations: 0

Scaling at run-time (scaling operators):
  – Loss of accuracy: small, because scaling occurs close to dataflow root
  – Run-time operations: > 0

• Overflow avoidance (not optional!)
• Run-time integer interval reduction (together with FWL)
• Discarding of least significant bits (multiple ways)
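The two scaling options can be contrasted on a two-term sum whose full-precision result would overflow 32 bits. The sketch below is illustrative only: the coefficients 0.7 and 0.6, the input range [0; 4095], and the function names are assumptions chosen so that each product fits in 32 bits at 20 fractional bits while their sum does not.

```c
#include <stdint.h>

/* Illustrative coefficients 0.7 and 0.6 (assumed, not from the slides). */
#define A1_Q20 734003u /* round(0.7 * 2^20) */
#define A2_Q20 629146u /* round(0.6 * 2^20) */
#define A1_Q19 367002u /* round(0.7 * 2^19) */
#define A2_Q19 314573u /* round(0.6 * 2^19) */

/* Design-time scaling: the coefficients themselves are stored with one
 * fractional bit less, so no extra run-time operation is needed.
 * Inputs are assumed to lie in [0; 4095]; result has 19 fractional bits. */
uint32_t sum_design_time(uint32_t x1, uint32_t x2)
{
    return A1_Q19 * x1 + A2_Q19 * x2;
}

/* Run-time scaling: the products are computed with 20 fractional bits and
 * each is shifted right by one bit just before the addition.
 * Inputs are assumed to lie in [0; 4095]; result has 19 fractional bits. */
uint32_t sum_run_time(uint32_t x1, uint32_t x2)
{
    return ((A1_Q20 * x1) >> 1) + ((A2_Q20 * x2) >> 1);
}
```

For x1 = x2 = 4095 the exact result is 5323.5, or 2 791 047 168 in units of 2^-19. The design-time variant returns 2 791 049 625 (2457 units off) with no extra operations, while the run-time variant returns 2 791 047 577 (409 units off) at the price of two shifts. This matches the trade-off listed above for this particular input; it is not a general bound.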

Page 7: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Selecting the optimal dataflow transformation

• Increase or decrease node run-time integer interval
• Construct multiple dataflow transformation variants (alternative dataflow fragments)
• Compare candidate dataflow transformation variants using a linear cost function:

$$\text{cost} = k_1 \cdot \text{complexity} + k_2 \cdot \text{error}$$

  – complexity: number of operators (analytically computed value); the ideal value would be the number of cycles
  – error: size of error interval (analytically computed value); the ideal value would be SQNR loss, error distribution...
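The comparison step above amounts to picking the candidate with the lowest linear cost. A minimal sketch, assuming a hypothetical `variant` record that holds the two analytic estimates; the names and the example numbers are not from the presentation.

```c
#include <stddef.h>

/* Hypothetical candidate: analytic estimates for one dataflow variant. */
typedef struct {
    double complexity; /* e.g. number of operators */
    double error;      /* e.g. size of the error interval */
} variant;

/* Linear cost: k1 * complexity + k2 * error. */
double variant_cost(variant v, double k1, double k2)
{
    return k1 * v.complexity + k2 * v.error;
}

/* Return the index of the lowest-cost variant; count must be at least 1.
 * Ties are resolved in favor of the earliest candidate. */
size_t select_variant(const variant *v, size_t count, double k1, double k2)
{
    size_t best = 0;
    for (size_t i = 1; i < count; ++i)
        if (variant_cost(v[i], k1, k2) < variant_cost(v[best], k1, k2))
            best = i;
    return best;
}
```

With candidates (2, 8), (3, 4) and (5, 1) as (complexity, error) pairs, the weights k1 = 1, k2 = 0 select the cheapest variant, k1 = 0, k2 = 1 select the most accurate one, and k1 = 2, k2 = 1 select the middle one.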

Page 8: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Varying the cost function coefficients (example)

Filter
  Response type: bandpass
  Type: FIR
  Order: 40

Target/Compilation
  Processor: ARM Cortex-M3
  Compiler: IAR C/C++ 5.41 for ARM (Kickstart)
  Optimizations: medium

SQNR loss      Time (cycles)
0.010513 dB    198
0.000243 dB    220
0.000025 dB    239
0.000004 dB    260

For comparison, the floating-point code takes 3984-4078 cycles.

4 dataflows shown out of 18 found in total.
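The cycle counts above translate into a speed factor by simple division. The helper below is a hypothetical illustration of that arithmetic, not part of the tool.

```c
/* Speed factor = floating-point cycles / fixed-point cycles. */
double speed_factor(unsigned float_cycles, unsigned fixed_cycles)
{
    return (double)float_cycles / (double)fixed_cycles;
}
```

For the figures on this slide, the most accurate dataflow shown (260 cycles) against the fastest floating-point run (3984 cycles) gives a factor of about 15.3, and the fastest dataflow shown (198 cycles) against the slowest floating-point run (4078 cycles) gives about 20.6.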

Page 9: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Implementation insights

• Language: Java SE 1.6
• Techniques: OOP, polymorphism
• Analytical estimation of run-time integer intervals, dataflow complexity, and node error intervals
• Dataflows are transformed using Change instances (not by copying large dataflow portions and modifying them).
  – Change instances are invertible (apply/undo)
  – Change instances can be combined with logical AND and OR
• Dataflow visualization: dot (graph description language)
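The apply/undo idea can be sketched as follows. The authors' implementation is in Java and its class design is not shown in the slides, so this C version is only an analogy: the `node` and `change` types, the function names, and the reading of an AND-combination as "apply every change, undo in reverse order" are all assumptions. The OR-combination (choosing among alternatives) is left out.

```c
#include <stddef.h>

/* Hypothetical node state touched by a change. */
typedef struct {
    int fwl; /* fractional word-length */
} node;

/* Hypothetical change: lower the target node's FWL by `bits`. */
typedef struct {
    node *target;
    int bits;
} change;

/* Apply modifies the node in place; undo restores the previous state,
 * so no copy of the dataflow is needed to try a transformation. */
void change_apply(const change *c) { c->target->fwl -= c->bits; }
void change_undo(const change *c)  { c->target->fwl += c->bits; }

/* AND-combination (assumed semantics): apply every change in order. */
void change_all_apply(const change *cs, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        change_apply(&cs[i]);
}

/* Undo an AND-combination in reverse order of application. */
void change_all_undo(const change *cs, size_t n)
{
    while (n-- > 0)
        change_undo(&cs[n]);
}
```

A search routine could apply a combined change, evaluate the resulting dataflow, and undo it before trying the next candidate.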

Page 10: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Usage example

Filter properties
  Response type: highpass
  Type: FIR
  Order: 30
  Designed with: Matlab FDATool

Conversion information
  Number of dataflows produced by varying the cost function coefficients: 158 (18 different)
  Total transformation time: 2.44 s

Performance of fixed-point function #7
  Distortion (SQNR loss): 3.1e-05 dB
  Speed test:
    Device: MSP430F149
    Compiler: IAR 5.10 (Kickstart)
    Compiler opt.: High speed
    Factor: 11.5

Page 11: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Testing

• Compiler:
  – Accuracy tests: Microsoft C++ (Visual Studio 2010)
  – Speed tests: IAR, gcc
• Compiler settings: optimizations disabled / enabled (low, high, ...)
• Processor variant:
  – 8-bit (AVR)
  – 16-bit (MSP430)
  – 32-bit (ARM7 Cortex-M3)
• Filter properties:
  – Type: FIR, IIR (work in progress)
  – Order: 4-80 (FIR)
  – Input interval: [0; 4095], [-4096; 4095], and others
  – Design method: random coefficients, Matlab FDATool
• Cost function: from "low-complexity-low-accuracy" to "high-complexity-high-accuracy"
• Code generation: from "everything in one expression" (inline) to "every operator variable declared"
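The two ends of the code generation range can be pictured on a tiny example. The functions below are hand-written for illustration and are not actual tool output: the 2-tap filter, its coefficients 0.5 and 0.25 (stored with 8 fractional bits as 128 and 64), and all names are assumptions.

```c
#include <stdint.h>

/* Style 1: everything in one expression (inline). */
uint32_t fir2_inline(uint16_t x0, uint16_t x1)
{
    return (uint32_t)128 * x0 + (uint32_t)64 * x1;
}

/* Style 2: every operator result stored in its own declared variable. */
uint32_t fir2_verbose(uint16_t x0, uint16_t x1)
{
    uint32_t m0 = (uint32_t)128 * x0; /* first multiply */
    uint32_t m1 = (uint32_t)64 * x1;  /* second multiply */
    uint32_t s0 = m0 + m1;            /* sum of products */
    return s0;
}
```

Both functions compute the same value; testing both styles checks that the generated code does not depend on how a particular compiler handles intermediate expressions.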

Page 12: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Results

• Number of cycles: speed factor of 3...15 relative to floating-point code (or more if compiler optimizations are applied)
• Accuracy loss: SQNR loss of 1e-5...1e-1 dB
• Variable trade-off between complexity and accuracy
• Constant execution time (no jitter – more determinism)

Page 13: Alexandru  Bârleanu, Vadim Băitoiu and Andrei  Stan

Conclusions

An innovative floating-point to fixed-point conversion method for the C language is proposed:

– A very good speed factor is obtained (3...15, integer code compared with floating-point code).
– Very good accuracy is obtained for FIR filters (SQNR loss of 1e-5...1e-1 dB).
– The conversion algorithm is designed to use variable cost functions. It is possible to specify, for example, that complexity is important and accuracy loss is unimportant when building the integer dataflow.
– The conversion time is very short. This happens because:
  • Dataflow metrics are estimated analytically
  • Dataflow nodes cache information (run-time integer interval, error interval)
  • The automatic dataflow search algorithm uses a heuristic to generate as few identical dataflows as possible