5/19/2006 1
ד ''בס
Introduction to DSP processors
Presented by Alexander Reizenson

Contents:
The modern processor's architecture;
The digital signal processing methods & algorithms;
The D(igital) S(ignal) P(rocessing) algorithms implementation;
The SHARC processor architecture;
Data types & formats;
C & Assembler;
Getting started.

The modern processor's architecture.
Today's chips fall into three groups:
ASICs (Application-Specific Integrated Circuits);
FPGA & EPLD;
Chips with hardware realization of data processing algorithms (microprocessors & microcontrollers).

The modern processor's architecture.
Microprocessors & microcontrollers
DSP microprocessors. These processors are intended for real-time digital signal processing systems.
General-purpose microprocessors. This kind of processor is intended for computer systems: PCs, workstations & parallel supercomputers.
Microcontrollers. Specialized processors intended for embedded systems and various household devices.
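
The digital signal processing methods & algorithms.
The analog signal processing example:
$\frac{y(t)}{x(t)} = -\frac{R_f}{R_i} \cdot \frac{1}{1 + j\omega R_f C_f}$
[Figure: inverting op-amp low-pass filter with input resistor R_i, feedback resistor R_f and feedback capacitor C_f; input x(t), output y(t); frequency response with cutoff frequency f_c.]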
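The transfer function of this example, -(R_f/R_i) * 1/(1 + jωR_fC_f), can be turned into numbers with a few lines of C. This is a minimal sketch: the function names are hypothetical, and the cutoff formula f_c = 1/(2πR_fC_f) is the standard point where ωR_fC_f = 1, which the slide implies but does not state.

```c
/* DC gain of the inverting low-pass stage: -R_f / R_i. */
double lowpass_dc_gain(double r_i, double r_f)
{
    return -r_f / r_i;
}

/* Cutoff frequency in Hz, taken where omega * R_f * C_f = 1,
   i.e. f_c = 1 / (2 * pi * R_f * C_f). */
double lowpass_cutoff_hz(double r_f, double c_f)
{
    const double pi = 3.14159265358979323846;
    return 1.0 / (2.0 * pi * r_f * c_f);
}
```

For example, R_i = 1 kΩ, R_f = 10 kΩ and C_f = 10 nF give a DC gain of -10 and a cutoff near 1.59 kHz.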

The digital signal processing methods & algorithms.
The digital signal processing system:
x(t) → Anti-aliasing filter → x'(t) → A/D → x'(n) → Digital filter or digital transform → y'(n) → D/A → y'(t) → Smoothing filter → y(t)
$y(n) = \sum_{k=0}^{N} C_k\, x(n-k)$
[Figure: magnitude response |H(f)| with cutoff frequency f_c.]

The digital signal processing methods & algorithms.
Analog signal processing versus digital signal processing
Analog signal processing:
Cheaper;
More compact;
Lower power dissipation.
Digital signal processing:
More accurate;
More stable across different environments.

The digital signal processing methods & algorithms.
The basic concepts of DSP:
Time sampling:
x(nT) - one sample at time nT;
T - sampling period (time);
f_D - sampling frequency: $f_D = \frac{1}{T}$ or $T = \frac{1}{f_D}$;
Nyquist criterion: $f_D \ge 2F$ or $T \le \frac{1}{2F}$;
F - the highest frequency of the signal.
Amplitude quantization:
x(nT) ~ x(n);
Resolution: $R = 2^N$;
ε_Q - quantization error: $\varepsilon_{Q\max} = 2^{-(N+1)}$;
N - number of bits.
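The sampling and quantization relations (T = 1/f_D, f_D ≥ 2F, R = 2^N, ε_Qmax = 2^-(N+1)) can be checked numerically. The following is only a sketch: all function names are hypothetical, and the quantization error is assumed to be expressed relative to a full-scale range of 1.

```c
#include <stdint.h>

/* Sampling period T = 1 / f_D (seconds when f_D is in Hz). */
double sampling_period(double f_d)
{
    return 1.0 / f_d;
}

/* Lowest sampling frequency allowed by f_D >= 2F. */
double min_sampling_freq(double f_max)
{
    return 2.0 * f_max;
}

/* Resolution R = 2^N levels for an N-bit word (N < 64). */
uint64_t resolution_levels(unsigned n_bits)
{
    return (uint64_t)1 << n_bits;
}

/* Maximum quantization error 2^-(N+1), relative to full scale. */
double max_quant_error(unsigned n_bits)
{
    double e = 1.0;
    for (unsigned i = 0; i < n_bits + 1; ++i)
        e *= 0.5;
    return e;
}
```

With these helpers, a signal limited to F = 4 kHz needs f_D of at least 8 kHz, and a 12-bit converter offers 4096 levels.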

The digital signal processing methods & algorithms.
The basic concepts of DSP:
$T$ - sample time; $\tau_a$ - algorithm time.
$\tau_{del} = T - \tau_a$;
$\tau_a = N_{op} \cdot \tau_{op}$;
$\tau_{op} = \frac{1}{f_{CPU\,CLK}}$;
$\tau_{op}$ - operation time;
$f_{CPU\,CLK}$ - processor clock frequency;
$N_{op}$ - number of operations in the algorithm.
[Figure: timing diagram of one sample period T divided into τ_a and τ_del.]
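The timing relations τ_a = N_op · τ_op, τ_op = 1/f_CPU CLK and τ_del = T - τ_a give a quick real-time feasibility check. The sketch below uses hypothetical function names and assumes every operation takes exactly one clock cycle, which real code rarely achieves.

```c
/* Algorithm time tau_a = N_op * tau_op = N_op / f_clk (seconds). */
double algorithm_time(unsigned long n_ops, double f_clk_hz)
{
    return (double)n_ops / f_clk_hz;
}

/* Time left in the sample period: tau_del = T - tau_a.
   A negative value means the algorithm does not fit. */
double spare_time(double t_sample, unsigned long n_ops, double f_clk_hz)
{
    return t_sample - algorithm_time(n_ops, f_clk_hz);
}

/* Returns 1 when the algorithm finishes within one sample period. */
int meets_real_time(double t_sample, unsigned long n_ops, double f_clk_hz)
{
    return spare_time(t_sample, n_ops, f_clk_hz) >= 0.0;
}
```

At a 48 kHz sampling rate (T ≈ 20.8 µs) and a 100 MHz clock, 1000 operations fit into one sample period while 3000 do not.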

The digital signal processing methods & algorithms.
The methods for computer performance measurement
Peak (technical) performance of a microprocessor: the maximum theoretical speed of the microprocessor under ideal conditions. It is defined by the number of computing operations completed per unit of time.
Real (sustained) performance of a microprocessor: the speed of the microprocessor under real conditions. It is measured by executing common benchmark programs (such as FIR, IIR or FFT).

The D(igital) S(ignal) P(rocessing) algorithms implementation
The major tasks in DSP:
Linear Filtering (Filter design);
Spectral Analysis (Speech detection, image recognition);
Time-Frequency Analysis (Image & Speech compression);
Adaptive Filtering (Image & signal processing);
Non-Linear Processing (Coding, median filters);
Multirate Processing (Interpolation & decimation).

The D(igital) S(ignal) P(rocessing) algorithms implementation
The most widely used DSP algorithms:
FIR filter
IIR filter
FFT
Solving polynomial equations

The D(igital) S(ignal) P(rocessing) algorithms implementation
FIR filter (Finite Impulse Response)
$y(n) = \sum_{i=0}^{N-1} b_i\, x(n-i)$
[Figure: FIR filter structure: x(n) feeds a chain of unit delays Z^-1; the taps are weighted by b0, b1, b2 and summed (Σ) to give y(n).]
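The FIR sum y(n) = Σ b_i x(n-i), i = 0 ... N-1, maps directly onto a multiply-accumulate loop. A minimal sketch in C, with a hypothetical function name and the assumption that samples before x[0] are zero:

```c
#include <stddef.h>

/* One output sample of an FIR filter:
   y(n) = sum over i = 0 .. n_taps-1 of b[i] * x[n - i].
   Samples before x[0] are treated as zero. */
float fir_sample(const float *b, size_t n_taps, const float *x, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n_taps && i <= n; ++i)
        acc += b[i] * x[n - i];   /* one multiply-accumulate per tap */
    return acc;
}
```

Feeding a unit impulse through the filter returns the coefficients b0, b1, b2 in turn, which is the finite impulse response the name refers to.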

The D(igital) S(ignal) P(rocessing) algorithms implementation
IIR filter (Infinite Impulse Response)
$y(n) = \sum_{i=0}^{N-1} b_i\, x(n-i) - \sum_{k=1}^{M-1} a_k\, y(n-k)$
[Figure: IIR filter structure: a feed-forward path with unit delays Z^-1 and coefficients b0, b1, b2, and a feedback path from y(n) with coefficients labeled a0, a1 entering the summing node (Σ) with a minus sign.]
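The IIR difference equation y(n) = Σ b_i x(n-i) - Σ a_k y(n-k) (i from 0 to N-1, k from 1 to M-1) differs from the FIR case only by the feedback sum. A sketch in C, with a hypothetical function name, a[0] unused, and zero initial conditions assumed:

```c
#include <stddef.h>

/* Direct-form IIR filter:
   y(n) = sum_{i=0}^{n_b-1} b[i]*x(n-i) - sum_{k=1}^{n_a-1} a[k]*y(n-k).
   a[0] is not used; x and y before index 0 are treated as zero. */
void iir_filter(const float *b, size_t n_b,
                const float *a, size_t n_a,
                const float *x, float *y, size_t len)
{
    for (size_t n = 0; n < len; ++n) {
        float acc = 0.0f;
        for (size_t i = 0; i < n_b && i <= n; ++i)
            acc += b[i] * x[n - i];      /* feed-forward part */
        for (size_t k = 1; k < n_a && k <= n; ++k)
            acc -= a[k] * y[n - k];      /* feedback part */
        y[n] = acc;
    }
}
```

With b = {1} and a = {1, -0.5}, an impulse input produces 1, 0.5, 0.25, 0.125, ...: the response never reaches exactly zero, hence "infinite" impulse response.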

The D(igital) S(ignal) P(rocessing) algorithms implementation
Discrete Fourier Transform
$X(k) = \frac{1}{N} \sum_{n=0}^{N-1} x(n) \cdot e^{-i\frac{2\pi}{N}kn}, \quad k = 0, \ldots, N-1$
$W_N^{kn} = e^{-i\frac{2\pi}{N}kn}$

The D(igital) S(ignal) P(rocessing) algorithms implementation
THE COMPLEX DFT
Frequency Domain ⇐ DFT ⇐ Time Domain
$X(k) = \frac{1}{N} \sum_{n=0}^{N-1} x(n) \cdot W_N^{nk}$
Frequency Domain ⇒ INVERSE DFT ⇒ Time Domain
$x(n) = \sum_{k=0}^{N-1} X(k) \cdot W_N^{-nk}$

The D(igital) S(ignal) P(rocessing) algorithms implementation
THE 8-POINT DFT:
$X(k) = \frac{1}{N} \sum_{n=0}^{N-1} x(n) \cdot W_N^{nk}$
Written out for N = 8 (the common factor 1/N is omitted):
$X(0) = x(0)W_8^{0} + x(1)W_8^{0} + x(2)W_8^{0} + x(3)W_8^{0} + x(4)W_8^{0} + x(5)W_8^{0} + x(6)W_8^{0} + x(7)W_8^{0}$
$X(1) = x(0)W_8^{0} + x(1)W_8^{1} + x(2)W_8^{2} + x(3)W_8^{3} + x(4)W_8^{4} + x(5)W_8^{5} + x(6)W_8^{6} + x(7)W_8^{7}$
$X(2) = x(0)W_8^{0} + x(1)W_8^{2} + x(2)W_8^{4} + x(3)W_8^{6} + x(4)W_8^{8} + x(5)W_8^{10} + x(6)W_8^{12} + x(7)W_8^{14}$
$X(3) = x(0)W_8^{0} + x(1)W_8^{3} + x(2)W_8^{6} + x(3)W_8^{9} + x(4)W_8^{12} + x(5)W_8^{15} + x(6)W_8^{18} + x(7)W_8^{21}$
$X(4) = x(0)W_8^{0} + x(1)W_8^{4} + x(2)W_8^{8} + x(3)W_8^{12} + x(4)W_8^{16} + x(5)W_8^{20} + x(6)W_8^{24} + x(7)W_8^{28}$
$X(5) = x(0)W_8^{0} + x(1)W_8^{5} + x(2)W_8^{10} + x(3)W_8^{15} + x(4)W_8^{20} + x(5)W_8^{25} + x(6)W_8^{30} + x(7)W_8^{35}$
$X(6) = x(0)W_8^{0} + x(1)W_8^{6} + x(2)W_8^{12} + x(3)W_8^{18} + x(4)W_8^{24} + x(5)W_8^{30} + x(6)W_8^{36} + x(7)W_8^{42}$
$X(7) = x(0)W_8^{0} + x(1)W_8^{7} + x(2)W_8^{14} + x(3)W_8^{21} + x(4)W_8^{28} + x(5)W_8^{35} + x(6)W_8^{42} + x(7)W_8^{49}$

The D(igital) S(ignal) P(rocessing) algorithms implementation
Direct computation of the DFT is basically inefficient because it does not exploit the symmetry and periodicity properties of the phase factor $W_N$. In particular, these two properties are:
Symmetry property: $W_N^{k+N/2} = -W_N^{k}$
Periodicity property: $W_N^{k+N} = W_N^{k}$
$X(7) = x(0)W_8^{8} + x(1)W_8^{7} + x(2)W_8^{14} + x(3)W_8^{21} + x(4)W_8^{28} + x(5)W_8^{35} + x(6)W_8^{42} + x(7)W_8^{49}$
$X(7) = x(0)W_8^{8} + x(1)W_8^{7} + x(2)W_8^{6} + x(3)W_8^{5} + x(4)W_8^{4} + x(5)W_8^{3} + x(6)W_8^{2} + x(7)W_8^{1}$
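The reduction of the X(7) exponents (49 → 1, 42 → 2, 35 → 3, ...) is the periodicity property applied term by term; the symmetry property then halves the set of distinct factors again. A small sketch in C, with a hypothetical function name and N assumed even:

```c
/* Reduce the phase factor W_N^e to sign * W_N^r with 0 <= r < N/2.
   Periodicity: W_N^{k+N}   =  W_N^k  (take e modulo N).
   Symmetry:    W_N^{k+N/2} = -W_N^k  (fold the upper half, flip the sign).
   N is assumed to be even. Returns r and stores +1 or -1 in *sign. */
unsigned reduce_twiddle(unsigned e, unsigned N, int *sign)
{
    unsigned r = e % N;      /* periodicity */
    *sign = 1;
    if (r >= N / 2) {        /* symmetry */
        r -= N / 2;
        *sign = -1;
    }
    return r;
}
```

For N = 8 this leaves only W_8^0 ... W_8^3 to be computed or stored; for instance W_8^21 = W_8^5 = -W_8^1.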

The D(igital) S(ignal) P(rocessing) algorithms implementation
[Figure: signal-flow (butterfly) graph of the 8-point FFT. Inputs in the order x(0), x(4), x(2), x(6), x(1), x(5), x(3), x(7); outputs X(0) ... X(7); branches weighted by the phase factors W_N^0 ... W_N^7.]
$X(7) = x(0)W_8^{0} + x(1)W_8^{7} + x(2)W_8^{6} + x(3)W_8^{13} + x(4)W_8^{4} + x(5)W_8^{11} + x(6)W_8^{10} + x(7)W_8^{17}$
$X(7) = x(0)W_8^{8} + x(1)W_8^{7} + x(2)W_8^{6} + x(3)W_8^{5} + x(4)W_8^{4} + x(5)W_8^{3} + x(6)W_8^{2} + x(7)W_8^{1}$
($W_8^{8} = W_8^{0}$, $W_8^{13} = W_8^{5}$, $W_8^{11} = W_8^{3}$, $W_8^{10} = W_8^{2}$, $W_8^{17} = W_8^{1}$)
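The inputs of the flow graph appear in the order x(0), x(4), x(2), x(6), x(1), x(5), x(3), x(7), which is the natural order with the 3-bit index reversed. A sketch of that index mapping in C (the function name is hypothetical):

```c
/* Reverse the lowest `bits` bits of `index`.
   For an 8-point FFT, bits = 3: 1 (001) -> 4 (100), 3 (011) -> 6 (110). */
unsigned bit_reverse(unsigned index, unsigned bits)
{
    unsigned r = 0;
    for (unsigned i = 0; i < bits; ++i) {
        r = (r << 1) | (index & 1u);  /* move the lowest bit into the result */
        index >>= 1;
    }
    return r;
}
```

Position p of the flow-graph input therefore takes sample x(bit_reverse(p, 3)).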

The DSP processor's architecture
Requirements for DSP processors:
1. High-speed data input, various interface devices;
2. Wide dynamic range of input data;
3. Hardware implementation of ADD, MULT & SHIFT. Parallel processing;
4. Flexible processing (ability to "jump" from one process to another);
5. Regularity of algorithms ("come back" operations).
DSP processor features:
1. Various high-speed interface ports and timers;
2. Parallel-access memory architecture;
3. Three computation units: ALU, barrel shifter and multiplier with fast MAC operation (MBR = MBR + Rx * Ry);
4. Fast handling of loops, branches & interrupts. Special addressing modes;
5. Circular buffer.

The DSP processor's architecture.
"Traditional" von Neumann architecture
[Figure: CPU connected to a single memory (data & instructions) through one address bus and one data bus.]
Harvard architecture
[Figure: CPU connected to Program Memory (instructions only) through the PM address bus and PM data bus, and to Data Memory (data only) through the DM address bus and DM data bus.]

The DSP processor's architecture.
Super Harvard architecture
[Figure: CPU with an instruction cache, connected to Program Memory (instructions only) through the PM address bus and PM data bus, and to Data Memory (data only) through the DM address bus and DM data bus; an I/O controller with its own data path.]

The DSP processor's architecture.
SHARC DSP processor structure

The DSP processor's architecture.
The ADSP-21160 hardware structure.
[Figure: ADSP-21160 block diagram.
Core Processor: program sequencer; timer; instruction cache (32 x 48-bit); DAG1 and DAG2 (8 x 4 x 32 each); data register file (16 x 40-bit); multiplier, barrel shifter and ALU; bus connect (PX).
Dual-Ported SRAM: two independent, dual-ported memory blocks (Block 0, Block 1), each with a processor port and an I/O port.
Buses: PM address bus (32); DM address bus (32); PM data bus (16/32/40/48/64); DM data bus (32/40/64); IOD (64); IOA (18).
External Port: address bus mux (32); data bus mux (64); multiprocessor interface; host port.
I/O Processor: IOP registers; DMA controller; serial ports (2); link ports (6).
JTAG test & emulation.]

The DSP processor's architecture.
ADSP-21160 Features
100 MHz - 600 MFLOPS - SIMD core;
1024-point complex FFT benchmark: 90 µs;
4 Mbit on-chip SRAM;
14 zero-overhead DMA channels;
Sustained 700 Mbyte/sec over the IOP bus;
Two 50 Mbit/sec synchronous serial ports;
Six 100 Mbyte/sec link ports;
64-bit synchronous external port;
Cluster multiprocessing support.

The DSP processor's architecture.
Pipelined command execution:
Instruction fetching (a);
Decoding (b);
Execution (c).
operation n-1:  a b c
operation n:      a b c
operation n+1:      a b c
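The staggered a/b/c rows show why pipelining helps: once the pipeline is full, one operation completes every cycle. A sketch of the cycle count in C, using hypothetical function names and assuming an ideal pipeline with no stalls or branches:

```c
/* Cycles needed to run n_instr instructions through a pipeline of
   n_stages stages when a new instruction enters every cycle:
   the first one takes n_stages cycles, each further one adds one cycle. */
unsigned long pipeline_cycles(unsigned long n_instr, unsigned n_stages)
{
    if (n_instr == 0)
        return 0;
    return n_stages + (n_instr - 1);
}

/* Cycles for the same work when each instruction must finish
   all stages before the next one starts. */
unsigned long sequential_cycles(unsigned long n_instr, unsigned n_stages)
{
    return n_instr * n_stages;
}
```

The three operations in the diagram finish in 5 cycles instead of 9.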

The DSP processor's architecture.
DSP processors with fixed and floating point.
Fixed versus floating:
Fixed-point arithmetic operations are simpler;
A floating-point DSP processor has more data types and commands.
Floating-point advantages:
Increased accuracy;
Wide dynamic range;
Largely free of data overflow problems;
Friendly to the C compiler.
Fixed-point advantages:
Cheaper;
Compact.

Data types & formats.
Data types in DSP processor algorithms:
Integer (loop counters, coefficients and array indices);
Real (input & output data);
Complex (applications in the frequency domain);
Logic (bitwise operations).
Data formats in DSP processors:
Byte - 8 bit;
Short word - 16 bit;
Normal word - 32 bit;
Instruction word - 48 bit;
Extended normal word - 40 bit;
Long word - 64 bit.

Data types & formats.
Dynamic range:
$DynR = \frac{\text{max value}}{\text{min value} \neq 0}$
or in dB:
$DynR_{dB} = 20 \log_{10}\left(\frac{\text{max value}}{\text{min value} \neq 0}\right)$
maximum linearity error (b - data width):
$2^{-b}$

SHARC instruction set
SHARC programming model.
SHARC assembly language.
SHARC data operations.
SHARC flow of control.

SHARC programming model
Register files:
R0-R15 (aliased as F0-F15 for floating point)
Status registers.
Loop registers.
Data address generator registers.

SHARC assembly language
Algebraic notation terminated by semicolon:
R1=DM(M0,I0), R2=PM(M8,I8); // comment
label: R3=R1+R2;
In the first instruction, R1=DM(M0,I0) is a data memory access and R2=PM(M8,I8) is a program memory access.

SHARC instruction set
Hardware realization of program functions:
ALU (32 bits);
Multiplier (32 bits);
MAC (80 bits);
Shifter (32 bits);
Register file.

Flag operations
ALU operations set:
AZ (zero), AN (negative), AV (overflow), AC (fixed-point carry), AI (floating-point invalid), AF (last ALU operation was floating-point).
Fixed-point: -1 + 1 = 0: AZ = 1, AF = 0, AN = 0, AV = 0, AC = 1, AI = 0.
Multiplier operations set:
MN (negative), MV (overflow), MU (underflow), MI (floating-point invalid).
Fixed-point: -2 * 3 = -6: MN = 1, MV = 0, MU = 1, MI = 0.
Shifter operations set:
SV (overflow), SZ (zero), SS (sign).
LSHIFT 0x7fffffff BY 3: SZ = 0, SV = 1, SS = 0.
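The fixed-point ALU example (-1 + 1 = 0 giving AZ = 1, AN = 0, AV = 0, AC = 1) can be reproduced with a generic two's-complement model. The sketch below is not a model of the SHARC itself: the type and function names are hypothetical, saturation mode is ignored, and only the four fixed-point flags are computed.

```c
#include <stdint.h>

/* Fixed-point ALU flags of a 32-bit addition (generic model). */
typedef struct {
    int az;  /* result is zero */
    int an;  /* result is negative (bit 31 set) */
    int av;  /* signed overflow */
    int ac;  /* carry out of bit 31 */
} alu_flags_t;

alu_flags_t add32_flags(int32_t x, int32_t y)
{
    uint32_t ux = (uint32_t)x, uy = (uint32_t)y;
    uint32_t sum = ux + uy;              /* wraps modulo 2^32 */
    alu_flags_t f;
    f.az = (sum == 0);
    f.an = (int)((sum >> 31) & 1u);
    f.ac = (sum < ux);                   /* unsigned wrap-around means carry */
    /* Overflow: operands share a sign that differs from the result's sign. */
    f.av = (int)(((~(ux ^ uy) & (ux ^ sum)) >> 31) & 1u);
    return f;
}
```

Adding 1 to 0x7fffffff in this model sets AV and AN instead, since the result wraps to the most negative value.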

Multifunction computations
Can issue some computations in parallel:
- dual add-subtract;
- fixed-point multiply/accumulate and add, subtract;
- floating-point multiply and ALU operation.

Example Multi-Function Instruction
In a single cycle the SHARC performs:
1 (2) Multiply
1 (2) Addition
1 (2) Subtraction
1 (2) Memory Read
1 (2) Memory Write
2 Address Pointer Updates
Plus the I/O Processor performs:
Active Serial Port Channels (2 Transmit, 2 Receive)
Active Link Ports (6)
Memory DMA
2 DMA Pointer Updates
f11=f1*f7, f3=f9+f14, f9=f9-f14, dm(i2,m0)=f13, f7=pm(i8,m8);

SHARC load/store
Load/store architecture: no memory-direct operations.
Two data address generators (DAGs):
data memory;
program memory.
Must set up DAG registers to control loads/stores.
Provide indexed, modulo, bit-reverse indexing.

BASIC addressing
Immediate value:
r0 = DM(0x20000000);
Direct load:
r0 = DM(_a); // Loads contents of _a
Direct store:
DM(_a) = r0; // Stores R0 at _a

The DSP processor's architecture.
Circular buffer

DAG registers
I0 I1 I2 I3 I4 I5 I6 I7
M0 M1 M2 M3 M4 M5 M6 M7
L0 L1 L2 L3 L4 L5 L6 L7
B0 B1 B2 B3 B4 B5 B6 B7

Post-modify with update
I register holds start address.
M register/immediate holds modifier value.
r0 = DM(I3,M3); // Load
DM(I2,1) = r1; // Store
Circular buffer: I register is the buffer start index, B is the buffer base address.
Can put data in program memory to read two values per cycle:
f0 = DM(I0,M0), f1 = PM(I9,M8);
Compiler allows the programmer to control which memory values are stored in.

Post-modify with update
M6 = 1;
R0 = dm(I4, M6); // post-modify
// means: R0 = dm(I4), and then I4 = I4 + M6
// However:
R0 = dm(M6, I4); // offset index only
// means: R0 = dm(M6 + I4), and still keeps I4 = I4

SHARC assembly language
Post-incrementing and Offset
B4 = 4000;
L4 = 0; // set to 0
I4 = 4002;
M6 = 1;
R0 = dm(M6, I4); // offset index only
R1 = dm(M6, I4); // offset index only
// means R0 = dm(4002 + 1) and R1 = dm(4002 + 1)
// with I4 = 4002 still unchanged at the end of the code
R0 = dm(I4, M6); // post-modify
R1 = dm(I4, M6); // post-modify
// means R0 = dm(4002) and R1 = dm(4003)
// with I4 = 4004 at the end of the code

Circular buffer implementation
B4 = 4000;
L4 = 3;
I4 = 4002;
M6 = 1;
R0 = dm(M6, I4); // offset index only
R1 = dm(M6, I4); // offset index only
// means R0 = dm(4002 + 1) and R1 = dm(4002 + 1)
// with I4 = 4002 still unchanged
R0 = dm(I4, M6); // post-increment
R1 = dm(I4, M6); // post-increment
// means R0 = dm(4002) with I4 = 4003,
// however R1 = dm(4000) {4003 - 3} with I4 = 4001
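The behaviour of the I, M, B and L registers in the two listings (L4 = 0 and L4 = 3) can be mimicked in plain C. This is a simplified model with hypothetical names, not the hardware algorithm: it assumes the modifier is no larger in magnitude than the buffer length and that L = 0 disables wrapping.

```c
/* Simplified model of one DAG register set. */
typedef struct {
    int i;  /* index register (current address) */
    int b;  /* base register (buffer start)     */
    int l;  /* length register (0 = linear)     */
} dag_t;

/* Post-modify access, like dm(I, M): use the current address,
   then I = I + M, wrapped into [B, B + L) when L is non-zero. */
int dag_post_modify(dag_t *d, int m)
{
    int addr = d->i;
    d->i += m;
    if (d->l != 0) {
        if (d->i >= d->b + d->l)
            d->i -= d->l;
        else if (d->i < d->b)
            d->i += d->l;
    }
    return addr;
}

/* Offset (pre-modify) access, like dm(M, I): address is I + M,
   and I itself is left unchanged. */
int dag_pre_modify(const dag_t *d, int m)
{
    return d->i + m;
}
```

Starting from I = 4002, B = 4000, L = 3, two post-modify accesses with M = 1 touch addresses 4002 and 4000 and leave I at 4001, matching the comments in the listing.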

Example: C assignments
C:
x = (a + b) - c;
Assembler:
r0 = DM(_a); // Load a
r1 = DM(_b); // Load b
r3 = r0+r1;
r2 = DM(_c); // Load c
r3 = r3-r2;
DM(_x) = r3; // Store result in x

Example: C assignments
C:
y = a*(b+c);
Assembler:
r1 = DM(_b); // Load b
r2 = DM(_c); // Load c
r2 = r1 + r2;
r0 = DM(_a); // Load a
r2 = r2*r0;
DM(_y) = r2; // Store result in y

Example: C assignments
Shorter version using pointers:
// Load b, c
r2 = DM(I1,M5), r1 = PM(I8,M13);
r0 = r2+r1, r12 = DM(I0,M5);
r8 = r12*r0;
DM(I0,M5) = r8; // Store in y

Example: C assignments
C:
z = (a << 2) | (b & 15);
Assembler:
r0 = DM(_a); // Load a
r0 = LSHIFT r0 by 2; // Left shift
r1 = DM(_b), r3 = 15; // Load immediate
r1 = r1 AND r3;
r0 = r1 OR r0;
DM(_z) = r0;

SHARC jump
Unconditional flow of control change:
JUMP label;
Three addressing modes:
- direct;
- indirect;
- PC-relative.

Example: C if statement
C:
if (a > b) y = c + d;
else y = c - d;
Assembler:
// if condition
r0 = DM(_a);
r1 = DM(_b);
COMP(r0,r1); // Compare
IF GT JUMP label;
// False block
r0 = DM(_c);
r1 = DM(_d);
r1 = r0 - r1;
DM(_y) = r1;
JUMP other; // Skip true block
// True block
label: r0 = DM(_c);
r1 = DM(_d);
r1 = r0 + r1;
DM(_y) = r1;
other: // Code after if

The best if implementation
C:
if (a > b)
y = c + d;
else y = c - d;
Assembler:
// Load values
r1 = DM(_a), r2 = PM(_b);
r3 = DM(_c), r4 = PM(_d);
// Compute both sum and difference
r12 = r3 + r4, r0 = r3 - r4;
// Choose which one to save
comp(r1,r2); // a compared with b, as in the previous example
if GT r0 = r12;
dm(_y) = r0; // Write to y

DO UNTIL loops
DO UNTIL instruction provides efficient looping:
LCNTR = 30, DO label UNTIL LCE;
r0 = DM(I0,M0), f2 = PM(I8,M8);
r1 = r0 - r15;
label: f4 = f2 + f3;
LCNTR = 30: loop length (16 bit); label: last instruction in loop; LCE: termination condition.

Example: FIR filter
C:
for (i=0, y=0; i<N; i++)
y = y + a[i]*x[i];

FIR filter assembler
// setup
I0 = _a; I8 = _x; // a[0] (DAG0), x[0] (DAG1)
r12 = 0; // y = 0;
M0 = 1; M8 = 1; // Set up increments
// Loop body
LCNTR = N, DO loopend UNTIL LCE;
// Use post-increment mode
r1 = DM(I0,M0), r2 = PM(I8,M8);
r8 = r1 * r2;
loopend: r12 = r12 + r8;

Example: C main + ASM function
C:
extern float fir(float dm *, float dm *); // returns the result, so not void
// main
void main()
{
y = fir(a, x);
}

Example: C main + ASM function
Assembler:
#include <asm_sprt.h>
.SEGMENT/PM seg_pmco;
.global _fir;
.extern _a, _x, _y;
_fir: entry;
// setup
I0=_a; I8=_x; // a[0](DAG0),x[0](DAG1) // or I0 = r4, I8 = r8
r12 = 0; // y = 0;
M0=1; M8=1; // Set up increments
// Loop body
LCNTR = N, DO loopend UNTIL LCE;
r1 = DM(I0,M0), r2 = PM(I8,M8);
r3 = r1 * r2;
loopend: r12 = r12 + r3;
r0 = r12; // or dm(_y)=r12;
exit;
_fir.end:
.endseg;

Example: C main + ASM function (work with STACK)
int a,b,c,d,e,f;
extern int asm_proc( int a, int b, int c, int d, int e );
void main()
{
a = 0xAAAAAA;
b = 0xBBBBBB;
c = 0xCCCCCC;
d = 0xDDDDDD;
e = 0xEEEEEE;
f = asm_proc(a,b,c,d,e);
}

Example: C main + ASM function (work with STACK)
#include "asm_sprt.h"
.SEGMENT/PM seg_pmco;
.GLOBAL _asm_proc;
_asm_proc:
start:
// m7 = -1 (compiler definition)
// m6 = 1 (compiler definition)
r15 = i6; // i6 - save C sp (stack pointer)
// i7 - asm sp (stack pointer)
i2 = r15;
modify(i2,m6);
r0 = r4;
r1 = r8;
r2 = r12;
r3 = dm(i2,m6); // C sp + 2 (fourth argument place)
r4 = dm(i2,m6); // C sp + 3 (fifth argument place)
r5 = 0x555555;
r0 = r0 + r5; // r0 = return()
exit;
_asm_proc.end:
.endseg;

Important programming reminders
Registers for parameter passing: r4, r8, r12; return value: r0;
An interrupt does not occur until 2 instructions after a delayed branch (needs 2 NOPs);
Some DAG register transfers are disallowed in assembler routines;
It is preferable not to use the following pairs in any combination: (M7,I6), (M14,I12), (M6,I5), (M5,I6).
Getting started
Paths to examples
C:\Program Files\Analog Devices\VisualDSP\211xx\examples
C:\Program Files\Analog Devices\VisualDSP\21k\examples
or
D:\Program Files\Analog Devices\VisualDSP\211xx\examples
D:\Program Files\Analog Devices\VisualDSP\21k\examples

Introduction to DSP processors
The END