Ch. 11 Digital Signal Processing Using General-Purpose Processors Kathy Grimes

Upload: torgny

Post on 23-Feb-2016


Page 1: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Ch. 11 Digital Signal Processing Using General-Purpose Processors
Kathy Grimes

Page 2: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Signals
• Signals
  • Electrical
  • Mechanical
  • Acoustic
• Most real-world signals are analog: they vary continuously over time
• Many limitations with analog
  • Repeatability
  • Tolerances
  • Difficulty storing information or implementing certain operations
• These limitations lead us to DSP…

Page 3: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Digital Signal Processing (DSP)
• Represent signals by sequences of numbers
• Pros
  • Repeatable
  • Accuracy can be controlled
  • Time-varying operations are easier to implement
• Cons
  • Sampling causes loss of information
  • Round-off errors
  • Requires A/D and D/A mixed-signal hardware

Page 4: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Digital Signal Processing (DSP)
• Analog-to-digital converter
  • Converts a continuous-time signal to a discrete-time signal
  • Figure 11.1 shows the sampling of a signal
• Common signals
  • Step discontinuity (Figure 11.2)
  • Impulse (Figure 11.3)

FIGURE 11.1 Discrete Time Signals.
FIGURE 11.2 Step Function.
FIGURE 11.3 Impulse Function.
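The two common signals can be written down directly as sequences. The following is a minimal sketch, assuming unit amplitude and a discontinuity at n = 0 (the figures themselves are not reproduced in this transcript); the function names are illustrative only.

```c
#include <stddef.h>

/* Unit step u[n]: 0 for n < 0, 1 for n >= 0 (assumed convention). */
int unit_step(int n) { return n >= 0 ? 1 : 0; }

/* Unit impulse d[n]: 1 at n == 0, 0 elsewhere (assumed convention). */
int unit_impulse(int n) { return n == 0 ? 1 : 0; }

/* Fill out[0..len-1] with f(start), f(start + 1), ... */
void sample_sequence(int (*f)(int), int start, int *out, size_t len) {
    for (size_t i = 0; i < len; i++)
        out[i] = f(start + (int)i);
}
```

Under these conventions the impulse is the first difference of the step: d[n] = u[n] - u[n-1].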

Page 5: Ch. 11  Digital Signal Processing Using General-Purpose Processors

DSP Building Blocks
• Based on three basic functions:
  • Delay
  • Add
  • Multiply
• Raw performance of a DSP algorithm is usually measured by the number of operations needed to execute it

FIGURE 11.4 Add Function.
FIGURE 11.5 Multiply Function.
FIGURE 11.6 Delay Function.

Page 6: Ch. 11  Digital Signal Processing Using General-Purpose Processors

DSP Building Blocks
• The feedforward and feedback systems (Figures 11.7 and 11.8) can be combined to implement any linear discrete difference equation

FIGURE 11.7 Feedforward System.

FIGURE 11.8 Feedback System.
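As an illustration of combining the two structures, here is a sketch of a first-order section built only from delay, add, and multiply. The equation y[n] = b0·x[n] + b1·x[n-1] + a1·y[n-1], its sign convention, and all names are assumptions made for this example, not taken from the figures.

```c
#include <stddef.h>

/* Hypothetical first-order section:
   y[n] = b0*x[n] + b1*x[n-1] + a1*y[n-1] */
typedef struct {
    float b0, b1;   /* feedforward gains (multiply blocks) */
    float a1;       /* feedback gain (multiply block) */
    float x_prev;   /* delay block holding x[n-1] */
    float y_prev;   /* delay block holding y[n-1] */
} first_order_t;

/* Process one input sample and update both delay elements. */
float first_order_step(first_order_t *s, float x) {
    float y = s->b0 * x + s->b1 * s->x_prev + s->a1 * s->y_prev;
    s->x_prev = x;
    s->y_prev = y;
    return y;
}

/* Process a block of n samples. */
void first_order_run(first_order_t *s, const float *x, float *y, size_t n) {
    for (size_t i = 0; i < n; i++)
        y[i] = first_order_step(s, x[i]);
}
```

With b1 = 0 and a1 = 0.5, an impulse input produces the decaying sequence 1, 0.5, 0.25, ..., which shows the feedback path at work.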

Page 7: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Fixed-Point and Floating-Point Implementations
• Floating-point DSPs perform integer and floating-point operations
  • Dynamic operating range
• Fixed-point DSPs perform integer operations
  • Fixed range: 16 bits give 65,536 distinct values
• Real-world analog signals have effectively infinite precision
• Floating-point mimics this “infinite” range better
  • Easier to implement; far less prone to scaling and overflow errors
• Why not always use floating-point?
  • Cost, availability, price, and performance
  • Precision: for the same number of bits, floating-point resolves small values finely but large values more coarsely
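To make the fixed-range point concrete, the sketch below shows 16-bit fixed-point arithmetic in the common Q15 format, where the programmer must handle scaling and saturation explicitly. The Q15 choice and the helper names are assumptions for illustration; the slides do not name a specific format.

```c
#include <stdint.h>

/* Q15: 16-bit signed fixed-point, value = raw / 32768,
   representable range [-1, 1 - 2^-15]. */
typedef int16_t q15_t;

/* Clamp a 32-bit intermediate result to the 16-bit range. */
q15_t q15_saturate(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (q15_t)v;
}

/* Saturating add: overflow clamps instead of wrapping around. */
q15_t q15_add(q15_t a, q15_t b) {
    return q15_saturate((int32_t)a + (int32_t)b);
}

/* Multiply with rounding. The 32-bit product has 30 fractional bits,
   so it is shifted right by 15 to return to Q15. Assumes an arithmetic
   right shift for negative values, as on common compilers. */
q15_t q15_mul(q15_t a, q15_t b) {
    int32_t p = (int32_t)a * (int32_t)b;
    return q15_saturate((p + (1 << 14)) >> 15);
}
```

For example, `q15_mul(16384, 16384)` (0.5 × 0.5) yields 8192 (0.25), while `q15_add(30000, 10000)` clamps to 32767. A floating-point implementation needs none of this bookkeeping.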

Page 8: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Single Instruction Multiple Data
• SIMD microarchitecture and instructions
  • One instruction operates on four data values in a single clock cycle, instead of one instruction per value
  • Increases performance for low-level DSP functions such as multiply-accumulate (MAC)

FIGURE 11.10 SIMD Instruction.

Page 9: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Microarchitecture Considerations
• Processor clock speed
• Cache size
  • DSP architectures usually partition the memory space manually to reduce the number of accesses to external memory
  • Latency is costly in terms of time and resources
  • Intel architectures have large caches that can hide the gap between fast and slow memory; however, all data starts out in the “far” caches
• Output data should be generated sequentially
• Accessing memory in a scattered pattern (while using threads) should be avoided
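The difference between sequential and scattered access can be seen in how a loop nest walks a two-dimensional buffer. This is a sketch assuming a row-major image of floats; the function names are hypothetical, and no timing figures are claimed.

```c
#include <stddef.h>

/* Apply a gain to a rows x cols image stored row-major.
   The inner loop walks along a row, so reads and writes are sequential. */
void gain_sequential(const float *in, float *out,
                     size_t rows, size_t cols, float gain) {
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            out[r * cols + c] = gain * in[r * cols + c];
}

/* Same arithmetic, but the inner loop jumps `cols` elements per step,
   so successive accesses land far apart in memory. */
void gain_strided(const float *in, float *out,
                  size_t rows, size_t cols, float gain) {
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            out[r * cols + c] = gain * in[r * cols + c];
}
```

Both functions produce identical output. On large buffers the sequential version typically makes better use of each cache line it fetches.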

Page 10: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Implementation Options for Intel
• Intrinsics
• Vectorization
• Intel Performance Primitives

Page 11: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Intrinsics and Data Types
• C code that calls special built-in compiler capabilities that map closely to the underlying SSE instruction set
• Added data types
  • __m64, __m128, __m128d, __m128i
• Intrinsic operation types
  • Arithmetic (fixed- and floating-point)
  • Shift
  • Logical
  • Compare
  • Set
  • Shuffle
  • Concatenation

Example: a packed add takes the four FP values packed in a and the four packed in b and performs four additions in one instruction
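The packed addition described here corresponds to the SSE intrinsic `_mm_add_ps`. The sketch below wraps it in a hypothetical helper, `add4`, and uses unaligned loads and stores so that the caller's arrays need no special alignment.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */

/* out[i] = a[i] + b[i] for i = 0..3, using a single packed add. */
void add4(const float *a, const float *b, float *out) {
    __m128 va = _mm_loadu_ps(a);             /* load a[0..3] */
    __m128 vb = _mm_loadu_ps(b);             /* load b[0..3] */
    _mm_storeu_ps(out, _mm_add_ps(va, vb));  /* four additions at once */
}
```

If the data is known to be 16-byte aligned, `_mm_load_ps` and `_mm_store_ps` can replace the unaligned variants.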

Page 12: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Vectorization
• Use the compiler to apply vectorization techniques to loops within data processing
  • The compiler looks for opportunities to convert loops from a single-operand (scalar) implementation to a vector-based one, so that multiple operands can be operated on at the same time
  • Compilers such as GCC do this, generating code aligned with the SIMD instruction set
• Use #pragma directives to guide the compiler and avoid overheads such as assumed data dependences

Listing 11.4 Explicitly Don’t Vectorize Loop.

Listing 11.7 Memory Alignment Property and Discarding Assumed Data Dependences.
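The listings themselves are not part of this transcript, so the following is only a sketch of the kind of hints involved. It assumes the Intel compiler pragmas `ivdep` and `vector aligned` (guarded so other compilers skip them) plus the standard C99 `restrict` qualifier; the function name is hypothetical. A loop that must stay scalar would instead be preceded by `#pragma novector` on the Intel compiler.

```c
#include <stddef.h>

/* y[i] += gain * x[i].
   `restrict` promises the compiler that x and y never overlap.
   Callers are assumed to pass 16-byte-aligned buffers, because the
   alignment pragma below asserts exactly that. */
void scale_accumulate(float *restrict y, const float *restrict x,
                      float gain, size_t n) {
#if defined(__INTEL_COMPILER)
#pragma ivdep           /* discard assumed data dependences */
#pragma vector aligned  /* treat all memory references as aligned */
#endif
    for (size_t i = 0; i < n; i++)
        y[i] += gain * x[i];
}
```

Without such hints the compiler has to assume that x and y might alias, which can prevent vectorization or force a runtime overlap check.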

Page 13: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Vectorization
• Comparisons on performance
  • The results would be vastly different if the memory were not already aligned

Page 14: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Performance Primitives
• Intel libraries: highly optimized implementations for many different applications (including audio codecs, image processing, data compression, etc.)
• The libraries take full advantage of the CPU and SIMD (most are written for performance)
• The libraries are threaded and can obtain performance gains by parallelizing the algorithm
• Library domains include:
  • Signal processing: convolution and correlation, finite impulse response (FIR) filter, FIR coefficient generation functions, infinite impulse response (IIR) filter, transforms
  • Image processing
  • Small matrices and realistic rendering
  • Cryptography

Page 15: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Finite Impulse Response Filter
• FIR filter equation
  • y[n] = a·x[n] + b·x[n-1] + c·x[n-2]

Listing 11.8 FIR Filter C Code Example

Listing 11.9 FIR Using Intel Performance Primitives.
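Listing 11.8 is not reproduced in this transcript. A straightforward scalar version of the three-tap equation might look like the sketch below; the name `fir3` and the zero-padding of samples before the start of the input are assumptions.

```c
#include <stddef.h>

/* 3-tap FIR: y[n] = a*x[n] + b*x[n-1] + c*x[n-2].
   Samples before the start of x are assumed to be zero. */
void fir3(const float *x, float *y, size_t n, float a, float b, float c) {
    for (size_t i = 0; i < n; i++) {
        float x1 = (i >= 1) ? x[i - 1] : 0.0f;  /* x[n-1] */
        float x2 = (i >= 2) ? x[i - 2] : 0.0f;  /* x[n-2] */
        y[i] = a * x[i] + b * x1 + c * x2;
    }
}
```

Feeding an impulse through the filter returns the coefficients a, b, c in order, which is a quick sanity check for any FIR implementation.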

Page 16: Ch. 11  Digital Signal Processing Using General-Purpose Processors

FIR Example: Intel SSE

• Loop unrolling removes data dependences between iterations

• By rearranging the data elements, we can reduce the number of times we need to read data
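The slide's own SSE code is not in this transcript. The sketch below unrolls the three-tap FIR by four so that each iteration produces four outputs with packed multiplies and adds. For simplicity it uses three overlapping unaligned loads rather than the register-level data reuse the slide alludes to, so it shows the unrolling but not the reduction in reads. The name `fir3_sse` is hypothetical.

```c
#include <stddef.h>
#include <xmmintrin.h>

/* y[n] = a*x[n] + b*x[n-1] + c*x[n-2] for n = 2..len-1.
   y[0] and y[1] are left untouched because they would need samples
   from before the start of x. */
void fir3_sse(const float *x, float *y, size_t len,
              float a, float b, float c) {
    const __m128 va = _mm_set1_ps(a);   /* coefficient in all four lanes */
    const __m128 vb = _mm_set1_ps(b);
    const __m128 vc = _mm_set1_ps(c);
    size_t n = 2;
    for (; n + 4 <= len; n += 4) {
        __m128 x0 = _mm_loadu_ps(&x[n]);      /* x[n]   .. x[n+3] */
        __m128 x1 = _mm_loadu_ps(&x[n - 1]);  /* x[n-1] .. x[n+2] */
        __m128 x2 = _mm_loadu_ps(&x[n - 2]);  /* x[n-2] .. x[n+1] */
        __m128 acc = _mm_mul_ps(va, x0);
        acc = _mm_add_ps(acc, _mm_mul_ps(vb, x1));
        acc = _mm_add_ps(acc, _mm_mul_ps(vc, x2));
        _mm_storeu_ps(&y[n], acc);            /* four outputs at once */
    }
    for (; n < len; n++)                      /* scalar tail */
        y[n] = a * x[n] + b * x[n - 1] + c * x[n - 2];
}
```

Because the four outputs in each iteration depend only on inputs, never on each other, there is no loop-carried dependence to block the packed operations.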

Page 17: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Medical Ultrasound Imaging
• Computation intensive
  • Needs a significant amount of embedded computational performance
• Same basic algorithmic pattern even though physical configurations, parameters, and functionality differ
  • Beam forming
  • Envelope extraction
  • Polar-to-Cartesian coordinate translation

Page 18: Ch. 11  Digital Signal Processing Using General-Purpose Processors

FIGURE 11.12 Block Diagram of a Typical Ultrasound Imaging Application.

Page 19: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Envelope Detector

FIGURE 11.15 Block Diagram of the Envelope Detector.

Page 20: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Envelope Detector

FIGURE 11.16 Polar-to-Cartesian Conversion of a Hypothetically Scanned Rectangular Object.

Listing 11.11 Code Sample for Envelope Detector.
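Listing 11.11 is not included in this transcript. One common form of envelope detection takes the magnitude of in-phase (I) and quadrature (Q) components, sqrt(I² + Q²). The sketch below assumes that form and assumes the I/Q samples have already been produced by an earlier stage; the function and parameter names are hypothetical.

```c
#include <stddef.h>
#include <xmmintrin.h>

/* env[k] = sqrt(i_in[k]^2 + q_in[k]^2), four samples per iteration. */
void envelope_iq(const float *i_in, const float *q_in,
                 float *env, size_t len) {
    size_t k = 0;
    for (; k + 4 <= len; k += 4) {
        __m128 vi = _mm_loadu_ps(&i_in[k]);
        __m128 vq = _mm_loadu_ps(&q_in[k]);
        __m128 sum = _mm_add_ps(_mm_mul_ps(vi, vi), _mm_mul_ps(vq, vq));
        _mm_storeu_ps(&env[k], _mm_sqrt_ps(sum));  /* packed square root */
    }
    for (; k < len; k++) {  /* tail: fewer than four samples left */
        __m128 s = _mm_set_ss(i_in[k] * i_in[k] + q_in[k] * q_in[k]);
        _mm_store_ss(&env[k], _mm_sqrt_ss(s));
    }
}
```

For instance, an I/Q pair of (3, 4) yields an envelope value of 5.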

Page 21: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Performance Results

• Why such a large difference?

Page 22: Ch. 11  Digital Signal Processing Using General-Purpose Processors

Summary
• Digital signal processing on general-purpose processors
  • Extends processing capabilities
  • Simplifies the overall application when platforms require control, communications, and general-purpose processing along with DSP
• There are many ways to improve DSP on an Intel system: special C code (intrinsics), vectorization, and specific libraries
• Performance is greatly enhanced when DSP is implemented properly