EE 445S Real-Time Digital Signal Processing Lab
Fall 2011
Lab #3.1 Digital Filters
Debarati Kundu (with the help of Mr. Eric Wilbur, TI)
Outline
• Discrete-Time Convolution
• FIR Filter Design
• Convolution Using Circular Buffer
• FIR Filter Implementation
• FIR Filter Block Processing
• Code Optimization
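Discrete-Time Convolution
• Represented by the following equation:
$$y[n] = x[n] * h[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[n-k] = \sum_{k=-\infty}^{\infty} h[k]\, x[n-k]$$
• Filter implementations will use the second version (hold h[n] in place and flip-and-slide x[n] about h[n])
• Z-transform of convolution:
$$Y[z] = \sum_{n=-\infty}^{\infty} y[n]\, z^{-n} = X[z]\, H[z]$$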
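The second form of the sum can be sketched in C for finite-length sequences. This is only an illustration: the names `convolve_direct`, `x_len` and `h_len` are hypothetical and not part of the lab code, and samples outside the stored range of x are assumed to be zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the lab code):
   y[n] = sum_k h[k] * x[n-k] for finite-length x and h.
   y must have room for x_len + h_len - 1 samples. */
void convolve_direct(const float *x, int x_len,
                     const float *h, int h_len, float *y)
{
    int n, k;

    assert(x != NULL && h != NULL && y != NULL);
    assert(x_len > 0 && h_len > 0);

    for (n = 0; n < x_len + h_len - 1; n++) {
        float acc = 0.0f;
        for (k = 0; k < h_len; k++) {
            int idx = n - k;                 /* index into x[] */
            if (idx >= 0 && idx < x_len)     /* x is taken as zero elsewhere */
                acc += h[k] * x[idx];
        }
        y[n] = acc;
    }
}
```

For example, with x = {1, 2, 3} and h = {1, 1}, the call `convolve_direct(x, 3, h, 2, y)` fills y with {1, 3, 5, 3}.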
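Discrete-Time Sinusoidal Response
• Input two-sided complex sinusoid: $x[n] = e^{j \omega n}$
• LTI system has impulse response h[n]
• Output y[n] = x[n] * h[n]:
$$y[k] = \sum_{m=-\infty}^{\infty} h[m]\, e^{j \omega (k-m)} = e^{j \omega k} \underbrace{\sum_{m=-\infty}^{\infty} h[m]\, e^{-j \omega m}}_{H(\omega)} = e^{j \omega k}\, H(\omega)$$
• $H(\omega)$ is the frequency response of the LTI system
• Filters are stable, so $H(\omega) = H[z] \big|_{z = \exp(j \omega)}$
• Multiplying by $H(\omega) = A(\omega)\, e^{j \theta(\omega)}$ causes a change in magnitude by $A(\omega)$ and a change in phase by $\theta(\omega)$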
FIR Filters Design & Implementation
• An FIR filter does discrete-time convolution:
$$y[n] = \sum_{k} h[k]\, x[n-k]$$
• $z^{-1}$ indicates delay elements, and hence we need a buffer
• We shall implement FIR filters using circular buffers
FIR Filters Design & Implementation: Implementation
• Use the Filter Design & Analysis Tool (fdatool) to get the coefficients.
• Specifications are given in the task list.
• Use the convolve function (explained in subsequent slides) to implement the FIR filter, given the coefficients from fdatool.
Convolution Using Circular Buffer
$$y[n] = \sum_{k=0}^{N-1} h[k]\, xcirc[(newest - k) \bmod N]$$
• Always choose the size of the circular buffer to be at least the number of filter coefficients.
• Make sure that the size of the circular buffer is a power of 2.
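One reason a power-of-2 buffer size is convenient is that the modulo in the sum reduces to a bit mask. The sketch below illustrates this under assumed names: `BUF_SIZE`, `CircBuf` and `fir_circ` are hypothetical and not taken from the lab code.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 8u                 /* assumed buffer size; must be a power of 2 */
#define BUF_MASK (BUF_SIZE - 1u)    /* i & BUF_MASK equals i mod BUF_SIZE */

/* Hypothetical circular delay line. */
typedef struct {
    float    xcirc[BUF_SIZE];
    unsigned newest;                /* index of the most recent sample */
} CircBuf;

/* Store one new sample, then compute
   y[n] = sum_{k=0}^{ncoef-1} h[k] * xcirc[(newest - k) mod BUF_SIZE]. */
float fir_circ(CircBuf *cb, const float *h, unsigned ncoef, float newsample)
{
    float y = 0.0f;
    unsigned k;

    assert(cb != NULL && h != NULL);
    assert(ncoef <= BUF_SIZE);      /* buffer must hold every tap */

    cb->newest = (cb->newest + 1u) & BUF_MASK;   /* circular increment */
    cb->xcirc[cb->newest] = newsample;

    for (k = 0; k < ncoef; k++) {
        /* Unsigned subtraction wraps around; the mask maps the result
           back into 0..BUF_SIZE-1 because BUF_SIZE is a power of 2. */
        y += h[k] * cb->xcirc[(cb->newest - k) & BUF_MASK];
    }
    return y;
}
```

Feeding a unit impulse into a zero-initialized `CircBuf` with h = {1, 2, 3} returns 1, 2, 3 and then zeros, that is, the impulse response itself.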
Convolution Using Circular Buffer

    main()
    {
        int x_index = 0;
        float y, xcirc[N];
        ------
        /*--------------------------------------------*/
        /* circularly increment newest (No %) */
        ++newest;
        if (newest == N) newest = 0;
        /*--------------------------------------------*/
        /* Put new sample in delay line. */
        xcirc[newest] = newsample;
        /*--------------------------------------------*/
        /* Do convolution sum */
        y = 0;
        x_index = newest;
        for (k = 0; k < No_of_coeff; k++)
        {
            y += h[k]*xcirc[x_index];
            /*-------------------------------------*/
            /* circularly decrement x_index */
            --x_index;
            if (x_index == -1) x_index = N-1;
            /*-------------------------------------*/
        }
        ...
    }
Block Processing using Ping-Pong Buffer
• This lab uses a double-buffered (PING/PONG), channel-sorted (L/R) buffering scheme.
• A FIR algorithm requires “history” to be preserved over calls to the algorithm.
• FIR_process() must first copy the history, then process the data.
[Figure: PING and PONG receive buffers, each consisting of a hist region followed by a data region (rcvPingL.hist, rcvPingL.data, rcvPongL.hist, rcvPongL.data)]
• Processing of the last data block (PONG) starts from the top of hist and runs down through data for DATA_SIZE items.
• This leaves the last ORDER-1 data items NOT processed.
• Therefore, the user must copy the history of the last processed buffer (PONG) to the new buffer (PING), then filter.
• Repeat the process…
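The "copy history, then filter" step can be sketched as follows. This is not the lab's FIR_process(), whose signature is not shown here: the `RcvBuf` layout, the function name `fir_process_block`, and the values chosen for ORDER and DATA_SIZE are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ORDER      4                /* assumed number of FIR coefficients */
#define DATA_SIZE  8                /* assumed number of new samples per block */
#define HIST_SIZE  (ORDER - 1)      /* samples carried over between blocks */

/* Hypothetical buffer: s[0..HIST_SIZE-1] is the history region,
   s[HIST_SIZE..] holds the DATA_SIZE new samples. */
typedef struct {
    float s[HIST_SIZE + DATA_SIZE];
} RcvBuf;

/* Copy the history from the previously processed buffer, then filter
   the DATA_SIZE new samples of the current buffer into out[]. */
void fir_process_block(const RcvBuf *prev, RcvBuf *cur,
                       const float *h, float *out)
{
    int n, k;

    assert(prev != NULL && cur != NULL && h != NULL && out != NULL);

    /* 1. The last ORDER-1 input samples of the previous block become
          the history of the current block. */
    memcpy(cur->s, &prev->s[DATA_SIZE], HIST_SIZE * sizeof(float));

    /* 2. Filter: out[n] = sum_k h[k] * x[n-k], where x[0] is the first
          new sample and negative indices fall in the history region. */
    for (n = 0; n < DATA_SIZE; n++) {
        float acc = 0.0f;
        for (k = 0; k < ORDER; k++)
            acc += h[k] * cur->s[HIST_SIZE + n - k];
        out[n] = acc;
    }
}
```

Calling the function first with (zeroed buffer, PING) and then with (PING, PONG) gives the same output as filtering the two blocks as one continuous stream, which is the point of preserving the history.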
[Figure: system block diagram. Labels: AIC3106 Audio Codec (ADC, DAC), McASP (SR11, SR12), HWI isrAudio, rcvBufs, xmtBufs, SEM_post(), TSK (FIR or COPY), SW8, LED, CLK, PRD1, PRD2 (100 ms, 500 ms)]
Code Optimization
$$Y = \sum_{i=1}^{count} coeff_i \cdot x_i$$
Goals:
• A typical goal of any system’s algorithm is to meet real-time.
• You might also want to approach or achieve “CPU Min” in order to maximize the number of channels processed.
CPU Min (the “limit”):
• The minimum number of cycles the algorithm takes based on architectural limits (e.g. data size, number of loads, math operations required).
Real-time vs. CPU Min:
• Often, meeting real-time only requires setting a few compiler options.
• However, achieving “CPU Min” often requires extensive knowledge of the architecture (harder, requires more time).
“Debug” vs “Optimized” Benchmarks

    for (j = 0; j < nr; j++) {
        sum = 0;
        for (i = 0; i < nh; i++)
            sum += x[i + j] * h[i];
        r[j] = sum >> 15;
    }

• Debug: get your code LOGICALLY correct first (no optimization)
• “Opt”: increase performance using compiler options (easier)
• “CPU Min”: it depends. Could require extensive time

Optimization              Machine Cycles
Debug (no opt, -g)        817K
“Release” (-o3, no -g)    18K
CPU Min                   6650
“Debug” (-g, NO opt): Get Code Logically Correct
• Provides the best “debug” environment with full symbolic support, no “code motion”, easy to single step
• Code is NOT optimized, i.e. very poor performance
• Create test vectors on FUNCTION boundaries (use the same vectors as the Opt environment)

“Release” (-o3, -g): Increase Performance
• Higher levels of optimization result in code motion; functions become “black boxes” (hence the use of function (FXN) vectors)
• Optimizer can find “errors” in your code (use volatile)
• Highly optimized code (can reach “CPU Min” with some algorithms)
• Each level of optimization increases the optimizer’s “scope”…
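The remark about volatile refers to variables that change outside the normal program flow, for example a flag set by an interrupt routine. The sketch below is an assumed illustration, not lab code: `buffer_ready`, `isr_stub` and `poll_and_consume` are hypothetical names, and the lab itself signals the task with SEM_post() rather than a polled flag.

```c
/* Hypothetical flag shared between an interrupt routine and a polling loop.
   Without 'volatile', an optimizing compiler is allowed to read the flag
   once and reuse the stale value, so code that works in the "Debug"
   build can appear to hang in the optimized build. */
static volatile int buffer_ready = 0;

/* Stand-in for an ISR: signals that a new block is available. */
void isr_stub(void)
{
    buffer_ready = 1;
}

/* Polls the flag: returns 1 if a block was pending (and acknowledges it),
   0 otherwise. Every call re-reads the flag from memory. */
int poll_and_consume(void)
{
    if (buffer_ready) {
        buffer_ready = 0;
        return 1;
    }
    return 0;
}
```

Declaring the flag volatile forces each access to go to memory, so the behaviour stays the same at every optimization level.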
Levels of Optimization
[Figure: nested scopes in FILE1.C and FILE2.C: blocks inside functions, functions inside files]

Option       Level       Scope
-o0, -o1     LOCAL       single block
-o2          FUNCTION    across blocks
-o3          FILE        across functions
-pm -o3      PROGRAM     across files
DSPLIB
• Optimized DSP function library for C programmers using C62x/C67x and C64x devices
• These routines are typically used in computationally intensive real-time applications where optimal execution speed is critical.
• By using these routines, you can achieve execution speeds considerably faster than equivalent code written in standard ANSI C language. And these ready-to-use functions can significantly shorten your development time.

The DSP library features:
• C-callable
• Hand-coded assembly-optimized
• Tested against C model and existing run-time-support functions

Adaptive filtering
• DSP_firlms2

Correlation
• DSP_autocor

FFT
• DSP_bitrev_cplx, DSP_radix2, DSP_r4fft, DSP_fft, DSP_fft16x16r, DSP_fft16x16t, DSP_fft16x32, DSP_fft32x32, DSP_fft32x32s, DSP_ifft16x32, DSP_ifft32x32

Filters & convolution
• DSP_fir_cplx, DSP_fir_gen, DSP_fir_r4, DSP_fir_r8, DSP_fir_sym, DSP_iir

Math
• DSP_dotp_sqr, DSP_dotprod, DSP_maxval, DSP_maxidx, DSP_minval, DSP_mul32, DSP_neg32, DSP_recip16, DSP_vecsumsq, DSP_w_vec

Matrix
• DSP_mat_mul, DSP_mat_trans

Miscellaneous
• DSP_bexp, DSP_blk_eswap16, DSP_blk_eswap32, DSP_blk_eswap64, DSP_blk_move, DSP_fltoq15, DSP_minerror, DSP_q15tofl