weekly group meeting 20080710 title: programmable baseband processors (chapters 5) by assad saleem
Weekly Group Meeting 20080710
Title: Programmable Baseband Processors (Chapter 5)
By
Assad Saleem
2
Radio System Overview
• A typical wireless communication system
3
Digital Baseband Processor
• Transmitter performs:
– Channel coding
– Modulation
– Symbol shaping
• Receiver performs:
– Filtering, synchronization, gain control
– Demodulation, channel estimation, and compensation
– Forward error correction
4
Baseband Processing Challenges
• Multipath propagation (fading)
– Data transported through the air is affected by the surrounding environment
– Multiple propagation paths
– Delayed multipath signal components add at the receiver
– Some frequencies add constructively, others destructively
– Thus distorting the original signal
– Causing inter-symbol interference
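The constructive and destructive addition described above can be illustrated with a minimal two-path sketch. The echo gain, the delay of two samples, and the eight frequency bins are assumptions chosen for illustration, not values from the slides.

```python
import cmath
import math


def two_path_response(delay_samples, echo_gain, num_bins):
    """Frequency response of a direct path plus one delayed echo.

    H[k] = 1 + echo_gain * exp(-j*2*pi*k*delay_samples/num_bins)
    """
    return [
        1 + echo_gain * cmath.exp(-2j * math.pi * k * delay_samples / num_bins)
        for k in range(num_bins)
    ]


# Hypothetical channel: an equally strong echo arriving two samples late.
magnitudes = [abs(h) for h in two_path_response(delay_samples=2, echo_gain=1.0, num_bins=8)]

for k, m in enumerate(magnitudes):
    print(f"bin {k}: |H| = {m:.3f}")
```

In this sketch the two paths reinforce each other in bin 0 and cancel in bin 2, which is the frequency-selective behavior the slide refers to.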
5
Baseband Processing Challenges
• Timing and Frequency Offset
– Different reference oscillators in the transmitter and receiver
– Causing a slight discrepancy between the transmitter and receiver in:
• Carrier frequency
• Sample rate
– Uncorrected, this limits the useful data rate of a system
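As a rough worked example of how oscillator mismatch turns into a carrier offset, the sketch below uses assumed numbers (a 2.4 GHz carrier, oscillators off by +20 ppm and -20 ppm, and a 4 µs observation window); none of these figures come from the slides.

```python
def carrier_offset_hz(carrier_hz, tx_error_ppm, rx_error_ppm):
    """Carrier frequency offset seen by the receiver, in Hz."""
    return carrier_hz * (tx_error_ppm - rx_error_ppm) * 1e-6


def phase_drift_deg(offset_hz, duration_s):
    """Phase rotation accumulated over duration_s if the offset is left uncorrected."""
    return 360.0 * offset_hz * duration_s


# Assumed worst case: the two oscillators err in opposite directions.
offset = carrier_offset_hz(2.4e9, +20.0, -20.0)   # about 96 kHz
drift = phase_drift_deg(offset, 4e-6)             # about 138 degrees

print(f"offset = {offset / 1e3:.1f} kHz, drift = {drift:.1f} degrees")
```

A rotation of that size within a few microseconds would scramble a dense constellation, which is why the offset has to be estimated and removed.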
6
Baseband Processing Challenges
• Mobility
– Fast fading: rapid changes of the channel
– Doppler spread: further increases the frequency offset
7
Baseband Processing Challenges
• Noise and Burst Interference
– Signal degradation
– Bit errors
– FEC techniques are used to increase the reliability of the wireless link
– Popular FEC codes and algorithms include convolutional codes (decoded with the Viterbi algorithm), Turbo codes, and Reed-Solomon codes
– Interleaving is used to spread out bit errors caused by burst interference or frequency-selective fading
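The effect of interleaving on a burst can be sketched with a simple block interleaver (written row by row, read column by column). The 3x4 block size and the three-symbol burst are assumptions for illustration; real standards define their own interleaver patterns.

```python
def interleave(symbols, rows, cols):
    """Write row by row, read column by column."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave()."""
    assert len(symbols) == rows * cols
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]


ROWS, COLS = 3, 4

# Mark a burst of three consecutive errors on the channel (1 = corrupted).
channel_errors = [0] * (ROWS * COLS)
channel_errors[3:6] = [1, 1, 1]

# After de-interleaving, the errors are no longer adjacent.
decoder_errors = deinterleave(channel_errors, ROWS, COLS)
error_positions = [i for i, e in enumerate(decoder_errors) if e]
print(error_positions)
```

The decoder then sees isolated errors separated by one row length, which a convolutional or block code handles far better than a contiguous burst.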
8
Baseband Processing Challenges
• Dynamic Range
– Fading and interference from other nearby equipment increase the dynamic range
– A dynamic range of 60-100 dB is not uncommon
– It is not practical to design systems with such a large dynamic range
– Instead, AGC (automatic gain control) circuits are used
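A minimal sketch of a feedback AGC loop working in the dB domain is shown below. The target level, the loop step, and the input level are assumptions for illustration; practical AGC loops also deal with measurement noise, attack and decay times, and gain limits.

```python
def agc_gains(input_levels_db, target_db=-12.0, step=0.5):
    """Return the gain (in dB) after each update of a first-order AGC loop."""
    gain_db = 0.0
    history = []
    for level_db in input_levels_db:
        output_db = level_db + gain_db               # level after the variable-gain stage
        gain_db += step * (target_db - output_db)    # move the gain toward the target
        history.append(gain_db)
    return history


# Hypothetical weak signal at -72 dB that should end up at -12 dB.
gains = agc_gains([-72.0] * 40)
print(f"final gain = {gains[-1]:.2f} dB")
```

With these assumed values the loop settles at a gain of about 60 dB, so the later stages only need to handle signals near the target level.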
9
Baseband Processing Challenges
• Processing Latency
– Baseband processing is a strict hard real-time procedure
– Heavy peak workload for the processor during computationally demanding tasks, such as:
• Channel decoding
• Channel estimation
• Gain control calculations
– Hardware must be able to handle the peak workload
• Even though it occurs less than 1 percent of the time
10
Programmable Baseband Processors
• Traditionally, fixed-function hardware has been used, since baseband processing is computationally very heavy
• Two disadvantages of fixed-function hardware:
– Low flexibility
– Short product lifetime
• Programmable solutions, by contrast, need only a software update
11
Programmable Baseband Processors
• Multimode systems
– A high-end cellular telephone will support a number of standards, such as:
• GSM/GPRS, EDGE, UMTS, WLAN, WiMAX, UWB, Bluetooth, GPS, DVB-H
• One approach is to integrate many separate baseband processing modules
• Its drawbacks:
– Large silicon area
– Lack of hardware reuse
12
Programmable Baseband Processors
• Dynamic MIPS allocation
– Redistribute the resources dynamically
– Focus on either mobility management or high data rate
• During severe fading, advanced channel tracking and compensation algorithms are run for reliable communication
• In good channels, more resources can be allocated to symbol processing for high throughput
13
Programmable Baseband Processors
14
Programmable Baseband Processors
• Hardware multiplexing through programmability
– Most wireless communication schemes fall into one of three classes:
• OFDM, CDMA, single carrier modulation
– By carefully selecting the functional blocks, hardware reuse between different standards can be achieved
15
Programmable Baseband Processors
• Hardware multiplexing on LeoCore DSPs
16
Spectrum of an OFDM (orthogonal frequency division multiplexing) communication system
[16] National Instruments, Orthogonal Frequency Division Multiplexing available at www.ni.com, 2004.
17
Cyclic Prefix Insertion
[16] National Instruments, Orthogonal Frequency Division Multiplexing available at www.ni.com, 2004.
18
OFDM symbol time structure showing insertion of Cyclic Prefix
[4] WiMAX Forum, Mobile WiMAX – Part 1 : A Technical Overview and Performance Evaluation, 2006.
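One way to see why the cyclic prefix is inserted: if the prefix is at least as long as the channel memory, the channel's linear convolution acts like a circular convolution on the useful part of the symbol, which is what lets an FFT-based receiver equalize each subcarrier independently. The sketch below checks this on toy values; the symbol, the three-tap channel, and the prefix length are assumptions for illustration.

```python
def add_cyclic_prefix(symbol, cp_len):
    """Copy the last cp_len samples of the symbol to its front."""
    return symbol[len(symbol) - cp_len:] + symbol


def linear_convolve(x, h):
    """Plain linear convolution, as performed by a multipath channel."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def circular_convolve(x, h):
    """Circular convolution of x with a channel h no longer than x."""
    n = len(x)
    return [sum(h[j] * x[(i - j) % n] for j in range(len(h))) for i in range(n)]


symbol = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # toy time-domain symbol
channel = [1.0, 0.5, 0.25]                          # hypothetical 3-tap channel
cp_len = 2                                          # >= len(channel) - 1

received = linear_convolve(add_cyclic_prefix(symbol, cp_len), channel)
useful = received[cp_len:cp_len + len(symbol)]      # receiver discards the prefix

expected = circular_convolve(symbol, channel)
print(all(abs(u - e) < 1e-12 for u, e in zip(useful, expected)))
```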
19
OFDM Processing Flow
20
Job overview
FFT Computation complexity for different communication standards
21
Hardware Considerations for Programmable OFDM Processing
• Flexibility
– Multiple FFT sizes must be supported for different standards
– As a bonus, other transforms such as cosine and Walsh transforms can also be supported
• Hardware Reuse
– In many cases, it results in a smaller total silicon area than a corresponding fixed-function solution
22
Hardware Considerations for Programmable OFDM Processing
23
Code Division Multiple Access (CDMA)
• Concurrent transmission in the same spectrum using orthogonal spreading codes
• In a CDMA transmitter
– Binary data is mapped onto complex-valued symbols, which are then multiplied (spread) with a code from a set of orthogonal codes
– The length of the code is called the spreading factor
24
Code Division Multiple Access (CDMA)
• In a CDMA receiver
– Data is recovered by calculating a dot product (de-spread) between the received data and the assigned code
– Because the spreading codes are selected from a set of orthogonal codes, the dot product is zero for every code except the assigned one
• WCDMA
– Can scale the bandwidth of a user by assigning multiple spreading codes to that user
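Spreading and de-spreading as described above can be sketched with length-4 Walsh codes. The code set, the two users, and their symbols are assumptions for illustration; an ideal, noise-free, chip-aligned channel is assumed.

```python
# Length-4 Walsh codes: any two different rows have a zero dot product.
WALSH_4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def spread(symbol, code):
    """Multiply one complex symbol by every chip of the code."""
    return [symbol * chip for chip in code]


def despread(chips, code):
    """Dot product with the assigned code, normalized by the spreading factor."""
    return sum(rx * chip for rx, chip in zip(chips, code)) / len(code)


# Two hypothetical users transmit at the same time in the same band.
user_a = spread(1 + 1j, WALSH_4[1])
user_b = spread(-1 + 1j, WALSH_4[2])
on_air = [a + b for a, b in zip(user_a, user_b)]

print(despread(on_air, WALSH_4[1]))   # user A's symbol
print(despread(on_air, WALSH_4[2]))   # user B's symbol
print(despread(on_air, WALSH_4[3]))   # unused code: nothing recovered
```

Here the spreading factor is 4; assigning a second code to the same user would double that user's symbol throughput, which is the bandwidth scaling mentioned for WCDMA.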
25
Job Overview
• Signal processing in WCDMA and HSDPA can be divided into:
– Chip-rate processing
– Symbol-rate processing
• A chip is one complex element of the spreading code
– Synchronization, channel estimation, and channel equalization are performed at chip rate
– Additional channel equalization is performed at symbol rate
26
Job Overview
• Synchronization
– Responsible for finding the start of the data frame and identifying the base station parameters
– This is done by correlating the received data with a 256-chip-long synchronization code
– The chip rate of WCDMA/HSDPA is 3.84 Mchips/s
– The main operation in this step is complex multiplication and accumulation (complex dot product)
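The correlation step can be sketched as a sliding complex dot product followed by a peak search. The 256-chip code below is a pseudo-random stand-in, not the actual WCDMA synchronization code, and the noise-free received stream with a start offset of 37 chips is an assumption for illustration.

```python
import random


def correlate_at(received, code, start):
    """Complex dot product between the code and the received window at `start`."""
    return sum(received[start + i] * code[i].conjugate() for i in range(len(code)))


def find_frame_start(received, code):
    """Return the offset whose correlation magnitude is largest."""
    candidates = range(len(received) - len(code) + 1)
    return max(candidates, key=lambda s: abs(correlate_at(received, code, s)))


rng = random.Random(0)
# Stand-in code: 256 chips, each with real and imaginary part in {+1, -1}.
sync_code = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(256)]

# Hypothetical received stream: silence, then the code, then silence.
received = [0j] * 37 + sync_code + [0j] * 50

frame_start = find_frame_start(received, sync_code)
print(frame_start)
```

Each candidate offset costs 256 complex multiply-accumulates, so a hardware CMAC unit that can sustain this dot product at chip rate is central to the synchronizer.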
27
Job Overview
• Channel Equalization
– Two-step procedure in WCDMA
1. The strongest multipath components are identified (using the data from the synchronizer)
2. Components are aligned in time and added constructively (using maximum ratio combining). This is known as a Rake receiver.
– In HSDPA (which uses up to 16-QAM), additional equalization is necessary
• The resulting complex-valued symbols (after de-spreading) are equalized by a second linear equalizer
• It uses training symbols inserted in the middle of the data slot (mid-amble)
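The combining step of the Rake receiver can be sketched as maximum ratio combining over the finger outputs. The sketch assumes that the fingers have already been time-aligned and de-spread, that the channel gain of each path is known exactly, and that there is no noise; the three gains are made-up values.

```python
def max_ratio_combine(finger_outputs, path_gains):
    """Weight each finger by the conjugate of its path gain and sum.

    The result is normalized by the total path energy so that it
    estimates the transmitted symbol directly.
    """
    weighted = sum(g.conjugate() * y for y, g in zip(finger_outputs, path_gains))
    energy = sum(abs(g) ** 2 for g in path_gains)
    return weighted / energy


# Hypothetical complex gains of the three strongest multipath components.
path_gains = [0.8 + 0.2j, -0.3 + 0.4j, 0.1 - 0.2j]
symbol = 1 - 1j

# Each finger sees the same symbol rotated and scaled by its own path gain.
finger_outputs = [g * symbol for g in path_gains]

estimate = max_ratio_combine(finger_outputs, path_gains)
print(estimate)
```

Weighting by the conjugate gain removes each path's phase rotation and gives stronger paths more influence, so the components add constructively.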
28
Job Overview
29
Hardware considerations for a WCDMA Processor
• All chip- and symbol-related operations are performed on complex-valued data (true for OFDM also)
– A programmable baseband processor needs to perform complex-valued computing efficiently
• Fairly short symbols with high data rate
– Loop overhead must be minimal
– Which means wider execution units for processing efficiency
• In WCDMA, HSDPA, and other CDMA systems
– Complex spreading codes have a constant envelope
– Therefore, the de-spread operation can be performed in the complex ALU instead of entirely in a complex MAC unit
• Addressing support for Rake addressing
– Implemented as function-level accelerators in memory blocks
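The remark about constant-envelope codes can be made concrete: if every chip has real and imaginary parts in {+1, -1} (an assumption made here for illustration), multiplying a sample by the conjugate chip needs only sign changes and additions, which is ALU work rather than multiplier work. The function name below is hypothetical.

```python
def despread_chip_add_only(re, im, code_re, code_im):
    """Compute (re + j*im) * conj(code_re + j*code_im) without a multiplier.

    code_re and code_im must each be +1 or -1, so every product below
    reduces to keeping or flipping a sign.
    """
    re_times_cre = re if code_re > 0 else -re
    im_times_cim = im if code_im > 0 else -im
    im_times_cre = im if code_re > 0 else -im
    re_times_cim = re if code_im > 0 else -re
    return re_times_cre + im_times_cim, im_times_cre - re_times_cim


# Compare against an ordinary complex multiplication for all four chip values.
sample = complex(3, -5)
matches = []
for code_re in (-1, 1):
    for code_im in (-1, 1):
        expected = sample * complex(code_re, code_im).conjugate()
        got = despread_chip_add_only(3, -5, code_re, code_im)
        matches.append(complex(*got) == expected)
print(all(matches))
```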
30
Multi-standard Processor Design
• A processor architecture suitable for OFDM, CDMA, and single-carrier based standards is as follows:
• Requirements for such a processor
– Efficient instruction set suited for baseband processing. Use of both natively complex computing and integer computing.
– Efficient hardware reuse through instruction-level acceleration.
– Wide execution units to increase processing parallelism.
– High memory bandwidth to support parallel execution.
– Low overhead in processing.
– Balance between configurable accelerators and execution units.
31
Complex Computing
• A very large part of the processing (FFTs, frequency/timing offset estimation, synchronization, and channel estimation) employs convolution-based functions.
– Such operations can be performed efficiently in DSPs using a CMAC unit, optimized memory and bus architecture, and addressing modes.
• In baseband processing all operations are complex-valued
– Therefore, complex computing should be supported throughout the architecture
32
Complex Computing
33
LeoCore Processor Architecture
• Two main parts:
– Natively complex part, which operates on vectors of complex numbers
– Natively integer part, which operates on integers and single bits
34
LeoCore Processor Architecture
• Purpose of the two main parts:
– The complex part is used to extract soft data symbols that can be de-mapped into bits
– The integer part is used for FEC and bit manipulation
35
LeoCore Processor Architecture
• Execution units
– To do complex tasks in an efficient manner
– DSP controller core, multi-lane complex MAC, and ALU SIMD data-paths
– Execution units range from CMAC units capable of executing a radix-4 FFT butterfly in one clock cycle to complex ALUs used by CDMA-based standards
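For reference, a radix-4 butterfly without twiddle factors is simply a 4-point DFT in which every multiplication is by 1, -1, j, or -j. The sketch below shows the arithmetic and checks it against a direct DFT; it says nothing about how the LeoCore CMAC actually schedules these operations, and the input values are arbitrary.

```python
import cmath
import math


def radix4_butterfly(x0, x1, x2, x3):
    """4-point forward DFT using only additions and multiplications by +/-j."""
    s02, d02 = x0 + x2, x0 - x2
    s13, d13 = x1 + x3, x1 - x3
    return (s02 + s13, d02 - 1j * d13, s02 - s13, d02 + 1j * d13)


def dft4_direct(x):
    """Direct 4-point DFT from the definition, used only as a cross-check."""
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * n * k / 4) for n in range(4))
        for k in range(4)
    ]


inputs = (1 + 2j, -3 + 0.5j, 0.25 - 1j, 2 - 2j)
fast = radix4_butterfly(*inputs)
slow = dft4_direct(inputs)
print(all(abs(f - s) < 1e-12 for f, s in zip(fast, slow)))
```

In a full radix-4 FFT, three of the four inputs are first multiplied by twiddle factors, which is where the complex multipliers of a CMAC unit come in.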
36
LeoCore Processor Architecture
• Memory subsystems:
– Memory is connected through an on-chip network
– The on-chip network allows any memory to be connected to any execution unit
– The amount of memory needed is small, but the required memory bandwidth is very large (several hundred Msamples/s)
– Each sample consists of two parts (real and imaginary)
37
LeoCore Processor Architecture
• It uses vector instructions
– E.g., a single instruction that triggers a complete vector operation, such as a complex 128-sample dot product
– This means that the execution unit must be able to process large data chunks without any intervention from the processor core
– Which in turn means that the execution unit and memory subsystem must have
• Automatic address generation
• Efficient load/store subsystems
• Therefore, the base architecture uses decentralized memories and memory addressing together with vector execution units
38
LeoCore Processor Architecture
• HW Acceleration
– Accelerators are also attached to the network
– To improve efficiency, function-level accelerators can be used
• A function-level accelerator is a configurable piece of hardware that performs a specific task without support from the processor core.
– How to decide which functions to accelerate? Consider the following:
• MIPS cost, reuse, circuit area (considerable reduction of clock frequency and power)
39
LeoCore Processor Architecture
• Typical Accelerators
– Front-end acceleration (filtering/decimation)
– Re-sampling
– Rotor (an NCO and a complex multiplier)
– Packet Detector
– Shaping filter
– Forward Error Correction (FEC)
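Of the accelerators listed above, the rotor is the easiest to sketch in software: a numerically controlled oscillator (modeled here as a phase accumulator) drives a complex multiplier that rotates each sample to cancel a frequency offset. The sample rate, the offset, and the test tone are assumptions for illustration, and the offset is taken as already estimated.

```python
import cmath
import math


def rotor(samples, freq_offset_hz, sample_rate_hz):
    """Remove a known frequency offset from a stream of complex samples."""
    phase_step = -2.0 * math.pi * freq_offset_hz / sample_rate_hz
    phase = 0.0
    corrected = []
    for sample in samples:
        corrected.append(sample * cmath.exp(1j * phase))   # complex multiplier
        phase = (phase + phase_step) % (2.0 * math.pi)     # NCO phase accumulator
    return corrected


sample_rate_hz = 1_000_000.0
offset_hz = 25_000.0

# Hypothetical received signal: a constant symbol spinning at the offset frequency.
received = [cmath.exp(2j * math.pi * offset_hz * n / sample_rate_hz) for n in range(16)]
corrected = rotor(received, offset_hz, sample_rate_hz)

print(all(abs(c - 1) < 1e-9 for c in corrected))
```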
40
Conclusion
• Multi-standard baseband processing can be implemented in programmable hardware
• Its main features should be:
– Support for complex-valued computing
– Instruction-level acceleration of FFT, convolution, and similar kernel functions
– Small total memory but with an optimized architecture meeting
• High bandwidth and real-time requirements
– Function-level accelerators for channel coding and general tasks close to the ADC/DAC interface
41
Reference
1. M. Ismail, D. Gonzalez, "Radio Design in Nanometer Technologies", Springer, 2006.
Thank You.