Getting serious about “going fast” on the TigerSHARC

What are the characteristics of most DSP algorithms?

Calculating, and then removing a DC offset from an input stream of data

03-Feb-07 DC removal Lecture 1, M. Smith, ECE, University of Calgary, Canada

Tackled today

What are the basic characteristics of a DSP algorithm?
A near perfect “starting” example
  DCRemoval( ) has many of the features of the FIR filters used in all the Labs in 2007
Testing the performance of the CPP version
First assembly version – using I-ALU operations – testing and timing
Code will be examined in more detail in the next lecture


IEEE Micro Magazine Article

Smith, M.R., “How RISCy is DSP?”, IEEE Micro, Volume 12, Issue 6, Dec. 1992, pages 10-23

Available on line via the library “Electronic web links”


Characteristics of an FIR algorithm

Involves one of the three basic types of DSP algorithms
  FIR (Type 1), IIR (Type 2) and FFT (Type 3)
Representative of DSP equations found in filtering, convolution and modeling
Multiplication / addition intensive
Simple format within a (long) loop
Many memory fetches of fixed and changing data
Handle “infinite amount of input data” – need FIFO buffer when handling ON-LINE data
All calculations “MUST” be completed in the time interval between samples
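The characteristics listed above show up in even a very small FIR routine. The following is a minimal C++ sketch, not the Lab code: the name fir_sample, the length FIR_LENGTH and the coefficient values are assumptions made only for illustration, and the FIFO is updated by shifting values in memory.

```cpp
#include <cstddef>

// Hypothetical filter length, chosen only for illustration.
const std::size_t FIR_LENGTH = 4;

// Fixed data: the filter coefficients (illustrative values).
static const int coeffs[FIR_LENGTH] = {1, 2, 2, 1};

// Changing data: FIFO of the most recent input samples, newest at index 0.
static int fifo[FIR_LENGTH] = {0};

// Process one new input sample and return one output sample.
int fir_sample(int input) {
    // Shift the FIFO so that the oldest value is discarded.
    for (std::size_t i = FIR_LENGTH - 1; i > 0; --i) {
        fifo[i] = fifo[i - 1];
    }
    fifo[0] = input;

    // Multiply / add loop: one coefficient fetch and one data fetch per tap.
    int sum = 0;
    for (std::size_t i = 0; i < FIR_LENGTH; ++i) {
        sum += coeffs[i] * fifo[i];
    }
    return sum;
}
```

Feeding a single 1 followed by zeros returns the coefficients in order, which is a quick check that the FIFO and the loop agree. On a real-time system the whole function has to finish before the next sample arrives.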


Comparing IIR and FIR filters

Infinite Impulse Response filters – few operations to produce output from input for each IIR stage

Finite Impulse Response filters – many operations to produce output from input. Long FIFO buffer, which may require as many operations as the FIR calculation itself.

Easy to optimize
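As a rough illustration of why one IIR stage needs few operations, here is a sketch of a first-order stage. The function name iir_stage and the coefficients B0 and A1 are hypothetical, not taken from the lecture.

```cpp
// One first-order IIR stage: y[n] = B0 * x[n] + A1 * y[n-1].
// Coefficient values are illustrative assumptions.
const float B0 = 1.0f;
const float A1 = 0.5f;

// The only state this stage needs is the previous output.
static float previous_output = 0.0f;

float iir_stage(float input) {
    // Two multiplications and one addition per sample, no long FIFO.
    float output = B0 * input + A1 * previous_output;
    previous_output = output;
    return output;
}
```

An FIR filter of length N needs N multiplications, N additions and an N-entry FIFO update for the same single output sample.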

DCRemoval( ) part of SDR

My version

Memory intensive
Addition intensive
Loops for main code
FIFO implemented as circular buffer
Lab. 1 “shuffle” memory approach
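A minimal C++ sketch of a routine with these features is given below. It is an assumption about the general shape of such a function, not the version shown on the slide: the name DCremoval_sketch, the helper remove_dc, the FIFO length of 8 and the use of one FIFO per channel are all hypothetical. It uses the memory-shuffling FIFO update rather than a circular buffer.

```cpp
#include <cstddef>

// Hypothetical FIFO length; a power of two so the divide could become a shift.
const std::size_t DC_FIFO_LENGTH = 8;

// Static arrays keep their contents between calls (one FIFO per channel).
static int left_fifo[DC_FIFO_LENGTH] = {0};
static int right_fifo[DC_FIFO_LENGTH] = {0};

// Hypothetical helper: remove the running average from one channel value.
static void remove_dc(int *value, int *fifo) {
    // Move the incoming value into the newest FIFO slot (last element).
    fifo[DC_FIFO_LENGTH - 1] = *value;

    // Sum the FIFO values (memory and addition intensive loop).
    int sum = 0;
    for (std::size_t i = 0; i < DC_FIFO_LENGTH; ++i) {
        sum += fifo[i];
    }

    // The average estimates the DC offset; subtract it from the parameter.
    *value -= sum / static_cast<int>(DC_FIFO_LENGTH);

    // Shuffle the FIFO: discard the oldest value and free the newest slot.
    for (std::size_t i = 0; i + 1 < DC_FIFO_LENGTH; ++i) {
        fifo[i] = fifo[i + 1];
    }
    fifo[DC_FIFO_LENGTH - 1] = 0;
}

void DCremoval_sketch(int *left, int *right) {
    remove_dc(left, left_fifo);
    remove_dc(right, right_fifo);
}
```

With cleared FIFOs, a first call with the values 80 and -16 leaves 70 and -14 in the parameters. After eight calls with the same inputs the outputs settle at 0, since the average then equals the input.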


DCRemoval( )

Not as complex as FIR, but many of the same requirements
Easier to handle
You use the same ideas in optimizing FIR over Labs 2 and 3
Two issues – speed and accuracy. Develop suitable tests for CPP code and check that various assembly language versions satisfy the same tests

Memory intensive
Addition intensive
Loops for main code
FIFO implemented as circular buffer


E-TDD format of DCRemoval( ) – perhaps a little unsophisticated

Clear the internal buffer
Put in one known value with known result (based on MY implementation)
If the algorithm runs for long enough, then it gives the correct answer


First attempt – “ENCM415 approach”

Use the integer ALU operations (I-ALU)

Why?
  Looks less complex than other options
  Learn one thing at a time
  Can be done using “direct translation” of C++ (working code)

Tests
1) Can we call and return from the assembly code routine? – understanding the C++ calling conventions
2) Does the assembly code routine give the same result as the C++ version?
3) How does the assembly code routine’s performance compare to the C++ version?
and hidden (IMPLICIT) test – are there any errors in jumping backwards and forwards between C++ and many assembly code routines – detailed understanding of the C++ calling conventions


Call and return test

Basically – if the code gets here, it is probable that we did not crash the system

I use a cut-and-paste approach to develop code variants. This test is (embarrassingly) useful.


Initially we expect the code to fail to work correctly

If the code works initially, then it is doing so by accident

Use XF_CHECK_EQUAL( )

Expected to fail

NOTE: This test is just a “cut-and-paste” version of the C++ test shown in an earlier slide, with three changes of function name


Timing test

Initially – since all we do is “call and return” -- we expect (trivially) fast code

Issues
  Some algorithms may optimize better when called many times – cache and coding issues
  Does it matter whether the function is called once, 10 or 100 times?
  Call the function to test within a loop – make sure that the loop overhead for calling the function does not compromise the timing of the test – may be important if we develop very optimized code, and every last cycle (between interrupts) counts


Timing test

Normalized the timing tests to “process the function once”
Need to develop various other routines to make tests work -- DoNothing loop, run C++ and assembly code routines in a loop
May not be correctly performing timing – but gives initial concepts

Once
10 times
100 times
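One way to normalize timings is sketched below. std::chrono is used here only as a host-side stand-in for whatever cycle counter the target provides, and the names time_loop and time_per_call are hypothetical; only DoNothing comes from the slides.

```cpp
#include <chrono>

typedef void (*TestFunction)();

// Volatile counter so the compiler cannot optimize the empty routine away.
static volatile unsigned long do_nothing_calls = 0;

void DoNothing() {
    do_nothing_calls = do_nothing_calls + 1;
}

// Time `count` calls of `function` in nanoseconds, loop overhead included.
long long time_loop(TestFunction function, int count) {
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (int i = 0; i < count; ++i) {
        function();
    }
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Normalize to "process the function once" after removing the DoNothing baseline.
// `count` is assumed to be greater than zero.
double time_per_call(TestFunction function, int count) {
    long long baseline = time_loop(DoNothing, count);
    long long measured = time_loop(function, count);
    return static_cast<double>(measured - baseline) / count;
}
```

Calling time_per_call with counts of 1, 10 and 100 gives the three normalized figures. For very fast routines the result can be noisy, or even slightly negative, which is one reason the timing may not be fully correct.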


Other functions needed to run the test

Do Nothing

Careful – may be optimized to “nothing”

C++ function loop

J-ALU function loop


Steps -- Manual technique

Add tests to project
Build connect file so that tests will be activated

Note file name and directory name


Steps -- E-TDD GUI technique

Add tests to project
Build connect file so that tests will be activated


Build and run the code

Build and run manually

BUILD PROJECT and DEBUG | RUN

Or build and run using E-TDD GUI


Use build failure information to determine assembly code function name

Required name for void DCremovalASM_JALU(int *, int *)

_DCremoval_JALU__FPiT1


Write ASM “call-and-return”, then run the test

GHOST BREAKPOINT – a breakpoint that is set in the code “somehow” – completely random, but seems to occur after making big changes in a project – there are a number of ways of handling them

Simple ASM stub


Proper test run and exit – lib_prog_term

Yellow indicates that there are NO failures but some expected failures

All successes and failures shown in console window


Quick look at the code

Will examine in more detail in next class

void DCremovalASM(int *, int *)

Setting up the static arrays
Defining and then setting pointers
Moving incoming parameters into FIFO
Summing the FIFO values
Performing (FAST) division
Returning the correct values
Updating the FIFO in preparation for next time this function is called – discarding oldest value, and “rippling” the FIFO to make the “newest” FIFO slot empty


Developing the assembly code

static arrays – “section data1”


Define the (250) register names for code maintainability (and marking) ease

Actual static array declaration

DEFINE pointers into arrays

DEFINE temps

DEFINE Inpars

SET pointers into arrays

Key and common error

Same as in C++

There is a difference between

Defining / declaring the pointer register

and

Placing (setting) a value in the pointer register so it actually points somewhere

Register names – what they are and where they are stored. In an exam, use of a register in this format is “required”, but it is preferred, rather than required, that you include the define statements – watch for question wording
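The difference between defining and setting a pointer can be seen in this short C++ sketch. The names fifo_demo and read_first_value are made up for illustration and are not part of the lecture code.

```cpp
// Illustrative static array standing in for a FIFO buffer.
static int fifo_demo[4] = {10, 20, 30, 40};

int read_first_value() {
    int *pointer = nullptr;   // defined, but not yet pointing at any data
    pointer = fifo_demo;      // set: it now holds the address of the array
    return *pointer;          // safe only because the pointer has been set
}
```

In assembly code the same two steps are naming a register as the pointer and then loading the array address into that register. Skipping the second step leaves the register holding whatever value it had before.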


Value into FIFO buffer

RISC processor LOAD and STORE architecture
MIPS-like (ENCM369) rather than CISC (ENCM415)
Read from memory into a register, then store from the register to memory


Perform sum

Hardware loops, some 64-bit and some 32-bit instructions

Sum

Division


Correcting INPARS and then updating the FIFO buffer

Adjust the INPARS – remember int *

Update FIFO memory using load / store approach

SLOW
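For comparison, a circular buffer avoids moving every FIFO value on each call. The sketch below is a generic C++ illustration, not the code on the slide; the names ring, ring_put and ring_sum and the length of 4 are assumptions.

```cpp
#include <cstddef>

// Hypothetical buffer length, chosen only for illustration.
const std::size_t RING_LENGTH = 4;

static int ring[RING_LENGTH] = {0};
static std::size_t next_slot = 0;  // slot holding the oldest value

// Store a value by overwriting the oldest slot; no other data moves.
void ring_put(int value) {
    ring[next_slot] = value;
    next_slot = (next_slot + 1) % RING_LENGTH;  // wrap around at the end
}

// Sum all stored values; the order does not matter for a sum.
int ring_sum() {
    int sum = 0;
    for (std::size_t i = 0; i < RING_LENGTH; ++i) {
        sum += ring[i];
    }
    return sum;
}
```

Each update costs one store and one index adjustment, instead of a load and a store for every FIFO entry.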


Adjust tests for expected success


Tackled today

What are the basic characteristics of a DSP algorithm?
A near perfect “starting” example
  DCRemoval( ) has many of the features of the FIR filters used in all the Labs
Testing the performance of the CPP version
First assembly version – using I-ALU operations – testing and timing
Code will be examined in more detail in the next lecture