Getting serious about “going fast” on the TigerSHARC
What are the characteristics of most DSP algorithms?
Calculating, and then removing a DC offset from an input stream of data
03-Feb-07 DC removal Lecture 1, M. Smith, ECE, University of Calgary,
Canada
Tackled today
What are the basic characteristics of a DSP algorithm?
A near perfect “starting” example
DCRemoval( ) has many of the features of the FIR filters used in all the Labs in 2007
Testing the performance of the CPP version
First assembly version – using I-ALU operations – testing and timing
Code will be examined in more detail in the next lecture
IEEE Micro Magazine Article
How RISCy is DSP?
Smith, M.R.; IEEE Micro, Volume 12, Issue 6, Dec. 1992, Pages 10-23
Available on line via the library “Electronic web links”
Characteristics of an FIR algorithm
Involves one of the three basic types of DSP algorithms
FIR (Type 1), IIR (Type 2) and FFT (Type 3)
Representative of DSP equations found in filtering, convolution and modeling
Multiplication / addition intensive
Simple format within a (long) loop
Many memory fetches of fixed and changing data
Handle “infinite amount of input data” – need FIFO buffer when handling ON-LINE data
All calculations “MUST” be completed in the time interval between samples
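The characteristics listed above can be seen in a minimal C++ sketch of an FIR filter that produces one output per input sample. The names `FirFilter` and `kTaps`, the tap count and the coefficient values are all illustrative assumptions, not code from the lecture or the labs.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-tap FIR filter; the tap count and coefficients are
// illustrative assumptions, not values from the lecture or labs.
constexpr std::size_t kTaps = 4;

struct FirFilter {
    std::array<int, kTaps> coeffs{1, 2, 2, 1};  // fixed data
    std::array<int, kTaps> fifo{};              // changing data, newest at [0]

    // Process one new sample. In an on-line system this must finish
    // before the next sample arrives.
    int process(int sample) {
        // Ripple the FIFO: discard the oldest value, make room at [0].
        for (std::size_t i = kTaps - 1; i > 0; --i) {
            fifo[i] = fifo[i - 1];
        }
        fifo[0] = sample;

        // Multiply / accumulate loop: one coefficient fetch and one
        // data fetch per tap.
        int sum = 0;
        for (std::size_t i = 0; i < kTaps; ++i) {
            sum += coeffs[i] * fifo[i];
        }
        return sum;
    }
};
```

Feeding a single 1 followed by zeros into `process( )` returns the coefficients in order, which is a quick sanity check on both the FIFO update and the loop.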
Comparing IIR and FIR filters
Infinite Impulse Response filters – few operations to produce output from input for each IIR stage
Finite Impulse Response filters – many operations to produce output from input. Long FIFO buffer which may require as many operations as the FIR calculation itself.
Easy to optimize
DCRemoval( ) part of SDR
My version
Memory intensive
Addition intensive
Loops for main code
FIFO implemented as circular buffer
Lab. 1 “shuffle” memory approach
DCRemoval( )
Not as complex as FIR, but many of the same requirements
Easier to handle
You use the same ideas in optimizing FIR over Labs 2 and 3
Two issues – speed and accuracy. Develop suitable tests for the CPP code and check that the various assembly language versions satisfy the same tests
Memory intensive
Addition intensive
Loops for main code
FIFO implemented as circular buffer
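A minimal sketch of the circular-buffer idea, assuming that DC removal means subtracting the average of the most recent samples from the current sample. The single-channel function `DCRemovalCircular( )`, the FIFO length of 8 and the variable names are hypothetical; the lecture's own code is not reproduced here.

```cpp
// Assumed FIFO length; a power of two so the division could later be
// replaced by a shift in an optimized version.
constexpr int kFifoLen = 8;

static int fifo[kFifoLen] = {0};
static int newest = 0;  // slot that receives the next sample

// Hypothetical single-channel variant: replaces *sample with the sample
// minus the average of the last kFifoLen samples.
void DCRemovalCircular(int *sample) {
    fifo[newest] = *sample;             // overwrite the oldest value
    newest = (newest + 1) % kFifoLen;   // wrap the index; no data is moved

    int sum = 0;                        // addition-intensive loop
    for (int i = 0; i < kFifoLen; ++i) {
        sum += fifo[i];                 // memory-intensive: one fetch per slot
    }
    *sample -= sum / kFifoLen;          // remove the estimated DC offset
}
```

With a constant input of 80 the first output is 70 (only one of eight slots is filled), and the output reaches 0 once the FIFO is full.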
E-TDD format of DCRemoval( ) – perhaps a little unsophisticated
Clear the internal buffer
Put in one known value with known result (based on MY implementation)
If the algorithm runs for long enough, it gives the correct answer
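The three test ideas above can be sketched with plain `assert` in place of the E-TDD macros. Everything here is an assumption made for illustration: the function names, the FIFO length of 8, and the reading of the two `int *` parameters as two channels that are corrected in place.

```cpp
#include <cassert>

constexpr int kFifoLen = 8;          // assumed FIFO length
static int leftFifo[kFifoLen];
static int rightFifo[kFifoLen];

// Test support: clear the internal buffers.
void ClearBuffers() {
    for (int i = 0; i < kFifoLen; ++i) {
        leftFifo[i] = 0;
        rightFifo[i] = 0;
    }
}

// One channel: ripple the FIFO, insert the sample, subtract the average.
static int RemoveDC(int *fifo, int sample) {
    for (int i = kFifoLen - 1; i > 0; --i) fifo[i] = fifo[i - 1];
    fifo[0] = sample;
    int sum = 0;
    for (int i = 0; i < kFifoLen; ++i) sum += fifo[i];
    return sample - sum / kFifoLen;
}

// Hypothetical stand-in for the function under test.
void DCRemovalSketch(int *left, int *right) {
    *left = RemoveDC(leftFifo, *left);
    *right = RemoveDC(rightFifo, *right);
}

// One known value with a known result for THIS implementation.
void TestOneKnownValue() {
    ClearBuffers();
    int left = 16, right = -16;
    DCRemovalSketch(&left, &right);
    assert(left == 14);    // 16 - (16 / 8)
    assert(right == -14);  // -16 - (-16 / 8)
}

// Run long enough to fill the FIFO: a constant input is all DC,
// so the output should settle to zero.
void TestSettlesOnConstantInput() {
    ClearBuffers();
    int left = 0, right = 0;
    for (int n = 0; n < kFifoLen; ++n) {
        left = 100;
        right = -40;
        DCRemovalSketch(&left, &right);
    }
    assert(left == 0);
    assert(right == 0);
}
```

The expected value in the first test depends on the implementation details (FIFO length, rounding of the division), which is why the slide says the known result is based on one particular implementation.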
First attempt – “ENCM415 approach”
Use the integer ALU operations (I-ALU)
Why?
Looks less complex than other options
Learn one thing at a time
Can be done using “direct translation” of C++ (working code)
Tests
1) Can we call and return from the assembly code routine? – understanding the C++ calling conventions
2) Does the assembly code routine give the same result as the C++ version?
3) How does the assembly code routine’s performance compare to the C++ version?
and hidden (IMPLICIT) test – are there any errors in jumping backwards and forwards between C++ and many assembly code routines – detailed understanding of the C++ calling conventions
Call and return test
Basically – if the code gets here, it is probable that we did not crash the system
I use a cut-and-paste approach to develop code variants. This test is (embarrassingly) useful.
Initially we expect the code to fail to work correctly
If the code works initially, then it is doing so by accident
Use XF_CHECK_EQUAL( )
Expected to fail
NOTE: This test is just a “cut-and-paste” version of the C++ test shown in an earlier slide, with three changes of function name
Timing test
Initially – since all we do is “call and return” – we expect (trivially) fast code
Issues
Some algorithms may optimize better when called many times – cache and coding issues
Does it matter whether the function is called once, 10 or 100 times?
Call the function to test within a loop – make sure that the loop overhead for calling the function does not compromise the timing of the test – may be important if we develop very optimized code, and every last cycle (between interrupts) counts
Timing test
Normalized the timing tests to “process the function once”
Need to develop various other routines to make tests work – DoNothing loop, run C++ and assembly code routines in a loop
May not be correctly performing timing – but gives initial concepts
Once
10 times
100 times
Other functions needed to run the test
Do Nothing
Careful – may be optimized to “nothing”
C++ function loop
J-ALU function loop
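A rough host-side sketch of the timing idea: run the function in a loop, run a do-nothing function in the same loop to estimate the overhead, then normalize to “process the function once”. It uses `std::chrono` as a stand-in for whatever cycle counter the real tests use, and `FunctionUnderTest( )`, `DoNothing( )`, `TimeLoop( )` and `NormalizedTime( )` are hypothetical names.

```cpp
#include <chrono>

using Fn = void (*)(int *, int *);

// Writing to a volatile keeps the compiler from optimizing the calls,
// and the loops around them, down to "nothing".
volatile int sink = 0;

// Do-nothing stand-in: same signature and call cost, minimal work.
void DoNothing(int *left, int *right) { sink = *left + *right; }

// Hypothetical stand-in for the C++ or assembly routine being timed.
void FunctionUnderTest(int *left, int *right) {
    for (int i = 0; i < 8; ++i) {
        *left += i;
        *right -= i;
    }
    sink = *left + *right;
}

// Time `calls` back-to-back calls of fn, in nanoseconds.
long long TimeLoop(Fn fn, int calls) {
    int left = 0, right = 0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < calls; ++i) {
        fn(&left, &right);
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}

// Per-call time with the loop and call overhead subtracted. The result
// can be noisy, or even negative, for very short functions.
double NormalizedTime(Fn fn, int calls) {
    long long overhead = TimeLoop(DoNothing, calls);
    long long total = TimeLoop(fn, calls);
    return static_cast<double>(total - overhead) / calls;
}
```

Calling `NormalizedTime(FunctionUnderTest, 1)`, then with 10 and 100, gives the three normalized figures; a large difference between them suggests cache or loop-overhead effects rather than a change in the function itself.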
Steps – Manual technique
Add tests to project
Build connect file so that tests will be activated
Note file name and directory name
Steps – E-TDD GUI technique
Add tests to project
Build connect file so that tests will be activated
Build and run the code
Build and run manually
BUILD PROJECT and DEBUG | RUN
Or build and run using E-TDD GUI
Use build failure information to determine assembly code function name
Required name for void DCremovalASM_JALU(int *, int *)
_DCremoval_JALU__FPiT1
Write ASM “call-and-return”, then run the test
GHOST BREAKPOINT – A breakpoint that gets set in the code “somehow”. Appears completely random, but seems to occur after making big changes in a project. There are a number of ways of handling them.
Simple ASM stub
Proper test run and exit – lib_prog_term
Yellow indicates that there are NO failures but some expected failures
All successes and failures shown in console window
Quick look at the code
Will examine in more detail in next class
void DCremovalASM(int *, int *)
Setting up the static arrays
Defining and then setting pointers
Moving incoming parameters into the FIFO
Summing the FIFO values
Performing (FAST) division
Returning the correct values
Updating the FIFO in preparation for the next time this function is called – discarding the oldest value, and “rippling” the FIFO to make the “newest” FIFO slot empty
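The steps above can be sketched in C++ as a reference to hold next to the assembly version. This is a sketch under stated assumptions, not the lecture's code: the name `DCRemovalSteps( )`, the FIFO length of 8, and the treatment of the two `int *` parameters as two channels corrected in place are all hypothetical.

```cpp
// Step 1: static arrays (a FIFO length of 8 is an assumption).
constexpr int kFifoLen = 8;
static int leftFifo[kFifoLen] = {0};
static int rightFifo[kFifoLen] = {0};

void DCRemovalSteps(int *left, int *right) {
    // Step 2: define, and then set, pointers into the arrays.
    int *leftPt = leftFifo;
    int *rightPt = rightFifo;

    // Step 3: move the incoming parameters into the newest FIFO slot.
    leftPt[kFifoLen - 1] = *left;
    rightPt[kFifoLen - 1] = *right;

    // Step 4: sum the FIFO values.
    int leftSum = 0;
    int rightSum = 0;
    for (int i = 0; i < kFifoLen; ++i) {
        leftSum += leftPt[i];
        rightSum += rightPt[i];
    }

    // Steps 5 and 6: divide, then return the corrected values through
    // the int * parameters. With a power-of-two length the division
    // can become a shift in the assembly version.
    *left -= leftSum / kFifoLen;
    *right -= rightSum / kFifoLen;

    // Step 7: ripple the FIFO. The oldest value in slot [0] is
    // discarded and the last slot is free to be overwritten next call.
    for (int i = 0; i < kFifoLen - 1; ++i) {
        leftPt[i] = leftPt[i + 1];
        rightPt[i] = rightPt[i + 1];
    }
}
```

The ripple in step 7 costs one load and one store per FIFO entry per channel on every call, which is the part a circular buffer avoids.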
Developing the assembly code
static arrays – “section data1”
Define the (250) register names for code maintainability (and marking) ease
Actual static array declaration
DEFINE pointers into arrays
DEFINE temps
DEFINE Inpars
SET pointers into arrays
Key and common error
Same as in C++
There is a difference between
Defining / declaring the pointer register
and
Placing (setting) a value in the pointer register so it actually points somewhere
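The same distinction in C++, as a minimal sketch with hypothetical names. The pointer is initialized to `nullptr` here so the example stays well defined; the common error is to dereference it before the “set” step.

```cpp
static int fifo[4] = {10, 20, 30, 40};

int ReadFirst() {
    int *pt = nullptr;  // defined / declared: the name exists, but it
                        // does not yet point at the array
    pt = fifo;          // set: now it actually points somewhere
    return *pt;         // safe only after the "set" step
}
```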
Register names – what they are and where they are stored. In an exam, use of a register in this format is “required” but it is preferred, rather than required, that you include the define statements – watch for question wording
Value into FIFO buffer
RISC processor LOAD and STORE architecture, MIPS-like (ENCM369) rather than CISC (ENCM415)
Read from memory into a register, then store from the register to memory
Perform sum
Hardware loops, some 64-bit and some 32-bit instructions
Sum
Division
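One common way to make the division fast is to choose a power-of-two FIFO length and replace the divide with an arithmetic shift. Whether the lecture's code does exactly this is an assumption; the sketch below (hypothetical names, assumed length of 8) shows the idea and the accuracy issue it raises for negative sums.

```cpp
constexpr int kShift = 3;  // log2 of an assumed FIFO length of 8

// Shift version: fast, but rounds toward negative infinity when the
// compiler uses an arithmetic shift (guaranteed from C++20 onward,
// implementation-defined for negative values before that).
int DivideByShift(int sum) { return sum >> kShift; }

// Reference version: integer division rounds toward zero.
int DivideExact(int sum) { return sum / (1 << kShift); }
```

The two agree for non-negative sums, but `DivideExact(-7)` is 0 while an arithmetic shift gives -1, so the C++ tests and the assembly version need to agree on which rounding is expected.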
Correcting INPARS and then updating the FIFO buffer
Adjust the INPARS
remember int *
Update FIFO memory using load / store approach
SLOW
Adjust tests for expected success
Tackled today
What are the basic characteristics of a DSP algorithm?
A near perfect “starting” example
DCRemoval( ) has many of the features of the FIR filters used in all the Labs
Testing the performance of the CPP version
First assembly version – using I-ALU operations – testing and timing
Code will be examined in more detail in the next lecture