
518 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 54, NO. 3, MARCH 2007

Design and Implementation of a Baseband WCDMA Dual-Antenna Mobile Terminal

Jean-François Frigon, Member, IEEE, Ahmed M. Eltawil, Eugene Grayver, Alireza Tarighat, Member, IEEE, and Hanli Zou

Abstract—The design and implementation of a baseband wide-band code-division multiple access (WCDMA) dual-antenna mobile terminal system-on-a-chip (SoC) is presented in this paper. Spatial diversity processing mitigates wireless channel impairments and is a key enabling technology for WCDMA to support high quality of service at high data rates and capacity. The SoC integrates the baseband transceiver, coding and decoding functions, microcontrollers to implement the radio access protocols, and external interfaces to communicate with the application layer. The receiver design, which takes advantage of diversity benefits in several blocks, is described in detail. The SoC was fabricated in a 0.18-µm 1.8-V CMOS technology and requires a total area of 72 mm² consuming 532 mW at the maximum data rates. The application-specific integrated circuit was used in lab testing where a gain of up to 9 dB was observed for the dual-antenna receiver, which demonstrates the tremendous improvement provided by spatial diversity. The results presented in this paper provide a base architecture and a performance benchmark for commercial implementations of WCDMA mobile terminals.

Index Terms—Baseband modem, diversity, smart antenna processing, system-on-a-chip (SoC), wide-band code-division multiple access (WCDMA).

I. INTRODUCTION

Third-generation (3G) cellular networks [1] offer voice- and data-based services to wireless mobile users in large areas. Data rates and quality-of-service (QoS) expected by users will be comparable to those offered by conventional wired networks. However, wireless channel impairments such as fading and multipath propagation make it difficult to meet the user expectations in the entire service area. Furthermore, these wireless systems have to meet a tight power budget, maximize the number of users, and provide small and low cost handsets, which present additional design challenges. Spatial diversity is an effective tool to solve the problem of providing the desired QoS with a small cost and complexity overhead for the mobile terminal [2]. Recent advances in RF circuit techniques and integration [3] have made implementation of a smart antenna mobile receiver feasible. This paper presents the system and VLSI design of a wide-band code-division multiple access (WCDMA) dual-antenna mobile terminal, and its implementation on a system-on-a-chip (SoC).

Manuscript received June 6, 2005; revised May 9, 2006. This paper was recommended by Associate Editor W. Namgoong.

J.-F. Frigon is with the Department of Electrical Engineering, École Polytechnique de Montréal, Montréal, QC H3C 3A7, Canada (e-mail: [email protected]).

A. M. Eltawil is with the Electrical Engineering and Computer Science Department, University of California, Irvine, CA 92697 USA (e-mail: [email protected]).

E. Grayver is with the Aerospace Corporation, El Segundo, CA 90245 USA (e-mail: [email protected]).

A. Tarighat was with the Electrical Engineering Department, University of California, Los Angeles, CA 90095 USA. He is now with Wilinx, Los Angeles, CA 90025 USA (e-mail: [email protected]).

H. Zou is with Broadcom Corporation, Irvine, CA 92612 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TCSI.2006.887620

In the early development stages of wireless receivers, the design focus is on specific blocks, such as the RAKE receiver [4], [5], which leads to a complete transceiver design [6]–[8]. However, these designs are far from a final implementation for a mobile terminal and lack support for several functionalities, such as the physical layer management logic, coding and decoding functions, radio link control (RLC) layer, medium access control (MAC) layer, radio resource control (RRC) protocol, and external interfaces. Furthermore, several are implemented using a mixture of a field-programmable gate array (FPGA), a digital signal processor (DSP), and microcontrollers [9], [10], which is not optimal for the low-power, small-area, low-cost implementation required for handsets. An alternative approach to the application-specific integrated circuit (ASIC) is to use a flexible software-defined radio (SDR) architecture [11]. However, the SDR architecture cannot meet the stringent requirements of mass-market handsets.

We have developed an ASIC which implements a single-mode WCDMA dual-antenna baseband modem. The SoC provides the physical layer, coding layer, RLC, MAC, RRC, and other functionalities related to the radio aspects of a user equipment (UE) [12]. Several interfaces are also integrated to communicate with the application layer. A dual-antenna diversity receiver provides several theoretical benefits, as described in Section II. The baseband transceiver design and architecture, which takes advantage of diversity in several receiver blocks to improve the system performance, is described in Section III. The baseband transceiver was integrated, as discussed in Section IV, with customized coding blocks, microcontrollers for control and monitoring of the transceiver and higher layer functionalities, and external interfaces. The system was fabricated in a 0.18-µm, 1.8-V CMOS technology and tested in the lab using a channel emulator. ASIC statistics and test results for dual-antenna reception are given in Section V. Implementation tradeoffs of a dual-antenna architecture are also highlighted in Section V.

Similar systems implementing a baseband WCDMA mobile terminal have only been reported by the industry (e.g., MSM6280 by Qualcomm [13], BCM2152 by Broadcom [14], MXC300-30 by Freescale Semiconductor [15], OMAPV2230 by Texas Instruments [16], and SoftFone-W by Analog Devices [17]). All of these systems integrate on a single chip:

1549-8328/$25.00 © 2007 IEEE

FRIGON et al.: WCDMA DUAL-ANTENNA MOBILE TERMINAL 519

Fig. 1. Required node-B Tx power as a function of the number of users.

a specific hardware accelerator for the WCDMA baseband transceiver (some chips are also multimode and include GSM, GPRS, EDGE, HSDPA, and GPS accelerators), one or two microcontrollers (the ARM9 and ARM11 microcontrollers are mostly used), multimedia accelerators, and various external interfaces. Several chips also integrate an intermediate-frequency interface, the analog data converters, and one or more DSPs for modem and/or multimedia functions. Only the Qualcomm chip offers a two-antenna receive diversity architecture. However, it is important to note that since these chips are commercial, little information is available apart from top-level block diagrams and marketing brochures. This paper aims at filling this information void. The transceiver and SoC design presented in this paper offer a base architecture for future commercial implementation of WCDMA UE ASICs, and the test results can be used as a benchmark for available and future implementations.

II. DIVERSITY

The WCDMA link quality from the base station to a mobile user can be greatly improved by using appropriate diversity techniques. In typical indoor and urban outdoor environments, the signal strength displays large fluctuations over short distances on the order of a few wavelengths. In such an environment, two antennas separated by at least half a wavelength (6.9 to 7.8 cm for WCDMA downlink frequency bands) receive uncorrelated signals [2]. If the antenna spacing is not practical, polarization diversity can also be used to implement a two-branch diversity receiver. Algorithms using diversity techniques exploit the random nature of the propagated signal and the independence between the received signals to both increase the average power of the effective waveform and reduce its variance.
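The half-wavelength spacing quoted above follows directly from the carrier frequency. The short Python sketch below is only an illustration: the two carrier frequencies (1920 and 2170 MHz) are assumptions chosen because they reproduce the 6.9 to 7.8 cm range, and they are not values stated in this paper.

```python
# Half-wavelength antenna spacing as a function of carrier frequency.
# The example frequencies are illustrative assumptions, not taken from the paper.
SPEED_OF_LIGHT_M_S = 299_792_458.0


def half_wavelength_cm(freq_hz: float) -> float:
    """Return half of the free-space wavelength, in centimetres."""
    wavelength_m = SPEED_OF_LIGHT_M_S / freq_hz
    return wavelength_m / 2.0 * 100.0


for freq_mhz in (1920.0, 2170.0):
    spacing = half_wavelength_cm(freq_mhz * 1e6)
    print(f"{freq_mhz:.0f} MHz -> {spacing:.1f} cm")
```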

For a fixed number of users, using diversity techniques results in lower transmit power at the base station. For example, assume that we want to serve 99% of the users trying to access the network in a Rayleigh fading environment. Fig. 1 shows the required transmit power from the base station for this scenario if all users have either one or two antennas. It can be seen that, for a fixed capacity, the base station needs to transmit approximately 7 dB less power in the case where two antenna receivers are used. Equivalently, for a constant transmit power, the capacity increases by a factor of 5.

Fig. 2. Number of voice users as a function of the number of data users.
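The 7-dB and factor-of-5 figures are two views of the same ratio, as the following sketch shows. It assumes, as a simplification that is not stated in the paper, that the number of users served scales linearly with the available transmit power.

```python
# Convert a power ratio in dB to a linear factor.
# Assumption (not from the paper): capacity scales linearly with transmit power.
def db_to_linear(ratio_db: float) -> float:
    """Return the linear power ratio corresponding to a value in dB."""
    return 10.0 ** (ratio_db / 10.0)


power_saving_db = 7.0
capacity_factor = db_to_linear(power_saving_db)
print(f"{power_saving_db:.0f} dB corresponds to a factor of {capacity_factor:.2f}")
```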

The impact of diversity on the performance of a WCDMA cellular system is further illustrated in Fig. 2, which shows the number of 12.2-kbps voice terminals that can be supported by the system as a function of the number of 144-kbps data users for the WCDMA standard channel case 1 [12]. In these theoretical simulations, all the data users were assumed to use either a single or a dual-antenna receiver and all the voice terminals were using a single-antenna receiver. We can observe the significant improvement in capacity provided by using a diversity enabled data receiver. For example, if we have five data terminals with single-antenna receivers, only a single voice user can be supported by the system, while if the data terminals are using dual-antenna receivers the network can still support 12 voice users.

III. TRANSCEIVER DESCRIPTION

A. Receiver Overview

Fig. 3 depicts the top-level block diagram of the diversity enabled WCDMA modem. The receiver processes the incoming analog signals from the two antennas to compute the soft decision allowing the decoding of the bits transmitted from the base station with the lowest probability of error. Dual-antenna processing is inherent in most algorithms to improve their performance and enhance immunity to signal fading. All dedicated signal processing functions are configured and monitored by a global controller.

Fig. 3. Top-level modem block diagram.

The received RF signal from each antenna is first downconverted to baseband, scaled using a variable gain amplifier (VGA) to appropriately load the analog-to-digital converter (ADC), and sampled at four times the chip rate with a precision of 6 bits. Note that the RF and analog sections are not treated in this paper and are not included in the baseband SoC described in this work. The digital processing begins with the 29-tap matched square-root raised cosine filters (SRRCFs) that were implemented using canonic signed digit (CSD) format numeric representation to conserve power [18]. Following the independent filtering of both received digital signals, the frequency tracking unit output signal is used to derotate the signals in order to remove the frequency offset between the received signal and the local reference. The data streams are then linearly interpolated by a factor of 2 and compensated for the sampling frequency error. It is critical to mitigate the effects of sampling frequency error as early as possible in the datapath since it can represent a multipath position drift of up to 0.384 chips per frame, or 3.072 samples per frame, for a 10-ppm error. This can cause major problems for the operation of the cell acquisition algorithm and the multipath searcher. Note that the carrier frequency and sampling frequency control loops are entirely digital to decrease the component cost and improve the system performance (lower loop delay, greater controllability, and precision improvement). The drawback of a digital loop is the reduction in dynamic range. However, the digital loops were designed to handle an offset of up to 15 ppm, which is sufficient for commercial use.
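The drift figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes the standard WCDMA numerology of 38400 chips per 10-ms frame (a 3.84-Mcps chip rate) and the eight samples per chip available at the interpolator output; the function name is illustrative only.

```python
# Multipath position drift caused by a sampling frequency error.
# Assumes 3.84 Mcps, 10-ms frames, and 8 samples per chip after interpolation.
CHIP_RATE_HZ = 3.84e6
FRAME_DURATION_S = 10e-3
SAMPLES_PER_CHIP = 8


def drift_per_frame(error_ppm: float) -> tuple[float, float]:
    """Return the drift over one frame as (chips, samples)."""
    chips_per_frame = CHIP_RATE_HZ * FRAME_DURATION_S
    drift_chips = chips_per_frame * error_ppm * 1e-6
    return drift_chips, drift_chips * SAMPLES_PER_CHIP


chips, samples = drift_per_frame(10.0)
print(f"10 ppm: {chips:.3f} chips, {samples:.3f} samples per frame")
```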

The eight times oversampled signal at the interpolator output, after acquisition of the first base station, is compensated for the carrier and sampling frequency errors. The synchronized signal is then presented, for chip-level processing, to the cell acquisition processor, multipath searcher, and correlation engine. The cell acquisition processor and multipath searcher are discussed in Sections III-C and III-D, respectively. The correlation engine is a programmable bank of correlators and is described in Section III-E. Once a symbol is descrambled and despread in the correlation engine, its value and the appropriate tagging information (identifying which multipath it belongs to, symbol number, etc.) are transferred to the RAKE and antenna combiner unit described in Section III-E. This unit computes the soft decision of the transmitted bits that are used for symbol-level processing and decoding. The combiner unit is also responsible for computing the channel estimate used for various measurements and provides feedback to the frequency tracking loop and the individual multipath timing tracking loops. The RAKE receiver hardware configuration is designed to be scalable in resources. As processing requirements increase (e.g., more multipaths within an environment), more correlation engines can be activated. The particular implementation described in this paper provides a total of four correlation engines with 25 correlation resources each.

The operation of the receiver can be briefly described as follows. Once initial synchronization with a base station to within one chip period is achieved, fine frequency and timing tracking loops are initiated to lock onto the detected multipath. The multipath searcher is then configured to scan for new multipaths and updates the controller. RAKE fingers are assigned to the appropriate detected multipaths for each antenna by the controller and the combiner unit is configured to process the correlator outputs. The cell acquisition processor and multipath searcher are then configured to search and monitor new base stations and multipaths.

B. Control Loops

1) Automatic Gain Control (AGC): The purpose of the AGC algorithm is to assure a proper loading of the ADCs. This is accomplished by monitoring the two digital data streams after the ADC and controlling the VGA in each analog receive chain. It is reasonable to assume that for multiple-antenna communication systems the noise and interference power at each receive antenna is approximately the same. The same gain is thus applied to each receive chain in order to obtain a similar noise power level at both ADCs. This is done to simplify the design of the combiner algorithm (see Section III-E). The AGC operates on the digital complex samples output at each sampling instant by the ADCs (I and Q) associated with each antenna. To avoid signal saturation on either of the two receive paths, the AGC algorithm uses the following metric:

where

which is compared to a threshold, low-pass filtered, and used to control the analog VGA stages in both receive chains. Note that since the goal is to obtain an equal overall gain in both analog chains, the VGA control signals must be corrected by the appropriate calibration factor for each receive chain.

2) Frequency Tracking Unit: The local reference signal used to downconvert the RF signal received from each antenna to baseband is derived from a common crystal. The carrier frequency offsets for both complex baseband signals are thus the same. The filtered signals are derotated by multiplying with the direct digital frequency synthesizer (DDFS) complex output signal from the frequency tracking unit. The goal of the frequency recovery loop is to remove the carrier frequency offset, while the phase error is compensated in the combiner since it is multipath and antenna dependent, unlike the frequency offset. Diversity is incorporated in the frequency recovery loop by using the strongest multipath from each receive antenna to compute the frequency error. The phase difference between successive pilot symbols is employed to detect, for each antenna, the rotation direction due to the frequency offset. Equal gain combining is used to combine the rotation direction from each antenna and a hard decision is made. Using the pilot symbols received on each antenna, the dual-antenna receiver frequency-loop discriminator is then given by

The discriminator output is low-pass filtered and used as the input to the DDFS.

3) Timing Tracking: Similar to the RF mixers, the sampling clock of the ADCs digitizing both received analog signals is derived from a common crystal. The sampling frequency offset experienced by both antennas is the same and is directly related to the carrier frequency offset. The controller uses the frequency offset information available from the frequency tracking loop to control the sampling frequency compensation circuit in the interpolator, which uses sample swallowing or repetition to remove the sampling frequency offset error. On the other hand, the sampling phase offset (i.e., optimal downsampling time) is different for each multipath. Also, it cannot be assumed that a multipath has the same sampling phase offset on each antenna. Independent conventional early-late tracking loops are therefore used for each multipath and antenna to track the optimal downsampling time. The early-late signal is then integrated and applied to a Schmitt quantizer to determine which sample out of the 8 input samples per chip will be used for correlation. The Schmitt quantizer was found to be necessary to minimize toggling between adjacent samples due to noise and improves the system performance by up to 5 dB.

Fig. 4. P-SCH matched filter hardware architecture.
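The benefit of a quantizer with hysteresis in the timing loop can be illustrated with a small sketch. The code below is a hypothetical model rather than the circuit used on the chip: the hysteresis width, the input sequence, and the function names are all assumptions, and wrap-around of the sample index across chip boundaries is ignored.

```python
import math


def plain_quantize(x: float) -> int:
    """Round the integrated early-late output to the nearest sample index."""
    return math.floor(x + 0.5)


def schmitt_quantize(x: float, current: int, hysteresis: float = 0.25) -> int:
    """Move to a neighbouring index only when x passes the usual decision
    boundary (current +/- 0.5) by an extra hysteresis margin."""
    while x > current + 0.5 + hysteresis:
        current += 1
    while x < current - 0.5 - hysteresis:
        current -= 1
    return current


def count_toggles(indices: list[int]) -> int:
    """Count how many times the selected sample index changes."""
    return sum(1 for a, b in zip(indices, indices[1:]) if a != b)


# Loop output hovering around the 2/3 boundary because of noise, then a real shift.
loop_output = [2.45, 2.55] * 4 + [3.3, 3.2, 3.3]

plain = [plain_quantize(x) for x in loop_output]

schmitt = []
state = 2
for x in loop_output:
    state = schmitt_quantize(x, state)
    schmitt.append(state)

print("plain toggles:  ", count_toggles(plain))
print("schmitt toggles:", count_toggles(schmitt))
```

In this example the plain quantizer switches on every noisy sample, whereas the quantizer with hysteresis changes index only once, when the input has genuinely moved.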

C. Cell Acquisition

Cell search is required for initial synchronization with the WCDMA network and to search for candidate cells to perform handoffs. In the first case, referred to as initial cell search, the mobile is “free-running” and needs to find an initial cell without a priori information and to synchronize its carrier and sampling frequency with the base station. In the latter case, a priori information is available at the mobile (e.g., scrambling code and timing offset of neighbor cells) and the tracking loops are synchronized with the network. The mobile needs to detect the presence of these cells and find their exact timing offset through the target cell search. Cell search in WCDMA is performed using a three-stage pipelined approach, successively identifying slot boundary, frame boundary, and the code group [19], [20]. The cell acquisition processing blocks use as their input the interpolator output downsampled by a factor of 8 to the chip rate.

1) Primary Synchronization: The goal of the primary synchronization stage is to detect slot boundaries. This is achieved by detecting the 256-chip primary synchronization channel (P-SCH) sequence transmitted at the beginning of every downlink slot. This sequence is common to all base stations. Due to the low operating signal-to-noise ratio (SNR), the matched filter output has to be averaged over several slots to get a reliable result. Furthermore, during the initial cell search, the frequency offset at the receiver can be relatively large, resulting in a significant loss if the correlation is performed coherently. To compensate for the frequency offset and achieve robustness, noncoherent combining of four successive shorter correlations of length 64 chips is performed. Since no reference signal is available during primary synchronization, noncoherent combining of the output of the P-SCH matched filter for both antennas is also performed. During a target cell search, the frequency offset is known; therefore the 256-chip P-SCH correlation does not suffer from any coherence loss. However, noncoherent combining of the two antenna outputs is still required. The output of the P-SCH matched filter is stored for the 2560 possible slot offsets and accumulated over the 15 slots in a frame. The offset with the maximum averaged value is declared as the slot boundary candidate for the second synchronization stage.
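The reason for splitting the 256-chip correlation into four 64-chip segments can be seen in a small numerical sketch. The code below is an illustration under stated assumptions: the code sequence is a random placeholder rather than the actual P-SCH sequence, the channel is noiseless, and the frequency offset is deliberately set to one full rotation over 256 chips (15 kHz at 3.84 Mcps, roughly 7 ppm for a carrier near 2.1 GHz).

```python
import numpy as np


def coherent_energy(rx: np.ndarray, code: np.ndarray) -> float:
    """Energy of a single coherent correlation over the full code length."""
    return float(abs(np.vdot(code, rx)) ** 2)


def partial_noncoherent_energy(rx: np.ndarray, code: np.ndarray,
                               n_segments: int = 4) -> float:
    """Correlate coherently over each segment, then sum the segment energies."""
    seg_rx = rx.reshape(n_segments, -1)
    seg_code = code.reshape(n_segments, -1)
    partial = np.sum(seg_rx * np.conj(seg_code), axis=1)
    return float(np.sum(np.abs(partial) ** 2))


def dual_antenna_metric(rx_ant0: np.ndarray, rx_ant1: np.ndarray,
                        code: np.ndarray) -> float:
    """Noncoherent combining of both antennas: add the two energies."""
    return (partial_noncoherent_energy(rx_ant0, code)
            + partial_noncoherent_energy(rx_ant1, code))


rng = np.random.default_rng(0)
code = rng.choice([-1.0, 1.0], size=256)       # placeholder, not the real P-SCH code
n = np.arange(256)
rx = code * np.exp(2j * np.pi * n / 256)       # one full phase rotation per 256 chips

print("coherent 256-chip energy:   ", coherent_energy(rx, code))
print("4 x 64-chip noncoherent sum:", partial_noncoherent_energy(rx, code))
print("no-offset reference (4x64^2):", 4 * 64 ** 2)
```

With this offset the 256-chip coherent correlation collapses to essentially zero, while the segmented metric keeps roughly 80% of its no-offset value.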

A characteristic of the 256 chips that form the P-SCH channel is that they are the Kronecker product of two codes of length 16 (denoted by PN1 and PN2) [12]. This observation leads to the hardware architecture shown in Fig. 4, where a 16-tap filter is matched to code PN1 while a memory block of size 256 is used to perform the filtering against the code PN2. In this approach, only a 16-tap shift register is used, therefore reducing the power consumed for clocking the flip-flop registers and the area used for implementing it. Furthermore, for every output of the first stage matched filter, only 16 memory locations (partial correlation results) have to be updated, making it feasible to use a single memory block (to save area) and a clock rate of 16 times the chip rate (61.44 MHz) to update the memory locations properly. This structure leads to a 60% reduction in power consumption compared to a direct 256-tap matched filter where each flip-flop is clocked. This is an important consideration since the unit is running at the chip rate. To further reduce power, a differential matched filter was used in which the codes take ternary values, which allows correlation to be performed with half the number of multiply-add operations [20].
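The Kronecker structure described above can be checked numerically. The following sketch is a software model under assumptions, not the hardware of Fig. 4: the two length-16 codes are random placeholders standing in for PN1 and PN2, and the function names are hypothetical. It verifies that a 16-tap filter followed by a combination of outputs spaced 16 chips apart gives the same result as a direct 256-tap matched filter.

```python
import numpy as np


def direct_matched_filter(rx: np.ndarray, code: np.ndarray) -> np.ndarray:
    """Direct correlation against the full-length code at every offset."""
    n_out = len(rx) - len(code) + 1
    return np.array([np.dot(rx[k:k + len(code)], code) for k in range(n_out)])


def hierarchical_matched_filter(rx: np.ndarray, inner: np.ndarray,
                                outer: np.ndarray) -> np.ndarray:
    """Two-stage correlation against kron(outer, inner)."""
    span = len(inner)
    # Stage 1: short filter matched to the inner code (PN1 in the text).
    stage1 = np.array([np.dot(rx[k:k + span], inner)
                       for k in range(len(rx) - span + 1)])
    # Stage 2: combine stage-1 outputs spaced `span` chips apart,
    # weighted by the outer code (PN2 in the text).
    n_out = len(rx) - span * len(outer) + 1
    return np.array([sum(outer[m] * stage1[k + m * span]
                         for m in range(len(outer)))
                     for k in range(n_out)])


rng = np.random.default_rng(0)
inner = rng.choice([-1.0, 1.0], size=16)   # placeholder for PN1
outer = rng.choice([-1.0, 1.0], size=16)   # placeholder for PN2
full_code = np.kron(outer, inner)          # 256-chip sequence

rx = rng.standard_normal(600)
rx[100:356] += full_code                   # embed the sequence at offset 100

direct = direct_matched_filter(rx, full_code)
two_stage = hierarchical_matched_filter(rx, inner, outer)
print("outputs match:", np.allclose(direct, two_stage))
```

In this model each output needs 16 + 16 multiply-add operations instead of 256, which is consistent with the motivation given in the text for the two-stage structure.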

2) Secondary Synchronization: The secondary synchronization channel (S-SCH) consists of a sequence of 15 symbols of length 256 chips transmitted at the beginning of every slot and is repeatedly transmitted every frame. The symbols can take one of 16 values. The 15 symbols in a frame form a codeword taken from a codebook of 64 16-ary codewords. These 64 codewords correspond to the 64 code groups used in the WCDMA system and are chosen such that they have distinct phase shifts. Therefore, the correct frame boundary and code group can be detected by finding the transmitted codeword and its phase. The P-SCH is transmitted in parallel to the S-SCH and can be used as a known reference signal to decode the code transmitted on the S-SCH. The downsampled sequence is therefore correlated against the primary and secondary sequences. Unlike the case for primary synchronization, maximum ratio combining (MRC), with the P-SCH as a reference, of the shorter correlations of length 64 chips for both antennas can be used to compute the likelihood metrics for each received symbol. For target cell search, MRC of the 256-chip correlation for both antennas is used. Additional details for the secondary synchronization stage are available in [20].
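Finding the transmitted codeword and its phase amounts to a search over all codewords and all cyclic shifts. The sketch below illustrates the idea only: the codebook is a random placeholder (not the codeword table of the WCDMA standard), the likelihood metrics are noise-free, and the function name is hypothetical.

```python
import numpy as np


def detect_code_group(metrics: np.ndarray, codebook: np.ndarray) -> tuple[int, int]:
    """Return (code_group, shift) maximizing the summed symbol metrics.

    metrics[slot, symbol] is the likelihood metric of `symbol` in observed
    slot `slot`; codebook[group] is the 15-symbol codeword of that group.
    `shift` is the observed slot index at which the frame starts.
    """
    n_slots = metrics.shape[0]
    slots = np.arange(n_slots)
    best_score, best_group, best_shift = -np.inf, -1, -1
    for group, codeword in enumerate(codebook):
        for shift in range(n_slots):
            hypothesis = np.roll(codeword, shift)
            score = metrics[slots, hypothesis].sum()
            if score > best_score:
                best_score, best_group, best_shift = score, group, shift
    return best_group, best_shift


rng = np.random.default_rng(1)
codebook = rng.integers(0, 16, size=(64, 15))   # placeholder codebook

true_group, true_shift = 17, 4
observed = np.roll(codebook[true_group], true_shift)

metrics = np.zeros((15, 16))
metrics[np.arange(15), observed] = 1.0          # noise-free symbol metrics

detected = detect_code_group(metrics, codebook)
print("detected (group, shift):", detected)
```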

3) Code Searcher: At the end of each frame, the code group and frame boundary searcher indicates the most likely code group and frame offset, as well as the timing offset of the slot boundary. The code searcher then uses the common pilot channel (CPICH) to identify the cell-specific scrambling code. For the 150 blocks of 256 chips in the frame, the downsampled received signals from both antennas are correlated for 256 chips against the 8 Gold codes in the indicated code group. No convenient reference signal is available during the initial and target cell search. For initial cell search, the code hypothesis testing metric is thus produced by noncoherently combining four 64-chip correlation values for each antenna.

Fig. 5. Diversity impact on average cell search waiting time.

For each symbol, the scrambling code associated with the largest decision statistic is chosen and receives one vote. The most likely transmitted scrambling code is selected at the end of the frame by a majority vote over the 150 decisions. However, the detected scrambling code and frame timing are accepted only if the number of votes associated with the chosen scrambling code exceeds a threshold chosen to minimize the probability of false alarm. Otherwise, the candidate is rejected and the cell search continues. For target cell search, the carrier frequency offset is known and corrected by the frequency tracking loop. The correlation for each antenna can therefore be computed coherently over 256 chips and the outputs for each antenna are combined noncoherently.
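The vote-and-threshold rule for the scrambling code can be sketched in a few lines. The threshold value and the vote distributions below are illustrative assumptions; the paper does not give the threshold used on the chip.

```python
from collections import Counter
from typing import Optional, Sequence


def select_scrambling_code(decisions: Sequence[int],
                           vote_threshold: int) -> Optional[int]:
    """Majority vote over per-symbol decisions.

    Returns the winning code index, or None when the winner does not
    collect more than `vote_threshold` votes (candidate rejected).
    """
    if not decisions:
        return None
    code, votes = Counter(decisions).most_common(1)[0]
    return code if votes > vote_threshold else None


# 150 per-symbol decisions among the 8 codes of the code group.
clear_winner = [3] * 90 + [5] * 40 + [1] * 20
no_winner = [i % 8 for i in range(150)]

print(select_scrambling_code(clear_winner, vote_threshold=60))
print(select_scrambling_code(no_winner, vote_threshold=60))
```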

4) Performance: Fig. 5 shows the simulated diversity impact on the average time to complete the cell search. In this simulation, the ratio between the power allocated to the CPICH and the total transmit power is set to −10 dB and the ratio between the power allocated to the SCH and the total transmit power is set to −12 dB. The power allocated to the SCH is equally divided between the P-SCH and the S-SCH. The performance was evaluated for a flat fading channel as a function of the geometry factor, defined as the ratio between the total received power from the desired base station and the channel interference (i.e., additive white Gaussian noise and inter-cell interference). For example, if the geometry factor is −1.5 dB, a single-antenna receiver will be able to find an appropriate cell after 10 frames (100 ms), whereas a dual-antenna receiver achieves the same waiting time with a geometry factor of only −3.5 dB. Diversity thus provides a 2-dB improvement for synchronization.

D. Multipath Searcher

The WCDMA receiver employs a RAKE architecture to demodulate the received signal, where a separate correlator receiver is assigned to each detected multipath (see Section III-E). Using additional correct multipaths provides more signal energy to the RAKE receiver, while combining invalid multipaths increases the noise level. The multipath searcher unit performs a critical task by continuously monitoring the time varying channel and determining the current multipath profile. We provide a brief description of this unit in this paper; further details on its design and implementation are available in [21].

The multipath searcher algorithm consists of three pipeline stages as depicted in Fig. 6. The input for each stage consists of the oversampled output of the interpolator and is processed within one frame. Outputs are available for the next stage or for controller usage at the end of a frame. The global objective of the multipath searcher is to maximize the detection probability, i.e., the probability of detecting a valid multipath, while minimizing the probability of false alarm, i.e., the probability of declaring an invalid multipath. The propagation channel from the base station to each of the two receiver antennas is affected by independent fading conditions. Therefore, each antenna is served by an independent instance of the multipath searcher.

Fig. 6. Top-level block diagram of multipath searcher.

During the initial multipath search, a range of offsets is scanned with a resolution of half a chip to find a set of candidate multipaths. The correlation output energy for each offset is compared with two adaptive thresholds. In low SNR or fast fading channels, the statistic available in the initial search stage is only reliable for strong multipaths (CPICH level of at least −15 dB) when the threshold is set for an acceptable false-alarm probability. Having a provision for detecting strong multipaths during the initial stage is invaluable since these multipaths will be available for addition to the RAKE receiver within one frame of their appearance and can be used to mitigate the adverse effects of birth and death channel conditions. On the other hand, there exists no threshold that guarantees acceptable detection and false-alarm probabilities for weaker multipaths. A dual dwell search approach is thus used, where the correlator output energy is compared to a lower threshold to determine a set of candidates which will be further analyzed in the verification stage. Finally, the initial search stage provides an estimate of the background noise power to the controller to implement the decision stage of the multipath searcher and the RAKE finger management algorithm.
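The two-threshold comparison of the initial search stage can be sketched as follows. This is a hypothetical model: the thresholds are taken here as fixed multiples of the background noise power estimate, the multiplier values are arbitrary, and strong multipaths are assumed to remain in the candidate list; none of these choices is specified in the paper.

```python
from typing import Sequence


def classify_offsets(energies: Sequence[float], noise_power: float,
                     strong_factor: float = 8.0,
                     candidate_factor: float = 3.0) -> tuple[list[int], list[int]]:
    """Compare each offset's correlation energy with two adaptive thresholds.

    Returns (strong, candidates): offsets reported immediately as strong
    multipaths, and offsets passed on to the verification stage.
    """
    high = strong_factor * noise_power       # assumed form of the upper threshold
    low = candidate_factor * noise_power     # assumed form of the lower threshold
    strong = [k for k, e in enumerate(energies) if e >= high]
    candidates = [k for k, e in enumerate(energies) if e >= low]
    return strong, candidates


energies = [1.0, 9.0, 4.0, 2.9, 30.0]        # one value per half-chip offset
strong, candidates = classify_offsets(energies, noise_power=1.0)
print("strong:", strong)
print("candidates for verification:", candidates)
```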

For each multipath candidate provided by the initial searcher stage, the multipath searcher verification stage computes an accurate power estimate. The power estimate is obtained by averaging the correlation output energy computed in ten uniformly distributed periods. After the verification stage, the power estimate of each candidate is compared to a threshold providing a relatively high detection probability for average power multipaths (CPICH level of at least −20 dB). These multipaths can thus be detected within two frames of the multipath searcher start. However, weaker multipaths with a CPICH level in the range of −25 to −20 dB still cannot be reliably detected after the multipath searcher verification stage due to low SNR and strong cross correlation peaks.

At the end of each frame, the initial and verification stages report the offset and accurate power estimate for each candidate offset to the multipath searcher controller algorithm. These candidates are then monitored in the decision stage implemented in the controller over 3 frames so that the complete multipath profile (multipaths with a CPICH level of at least −25 dB) can be identified with a low false-alarm probability and a high detection probability. The final decision on the presence of a multipath at a specific offset is made in the detection stage based on the average power and the reporting frequency for the candidates. The high detection probability of the complete algorithm comes at the cost of a 5-frame delay in the detection process, which would be unacceptable in fast fading and birth and death channels. The provision for reporting strong and average multipaths directly in the earlier stages is thus essential for these conditions.

E. RAKE Receiver and Combiner

The dual-antenna WCDMA receiver combines time processing of multipaths with spatial processing of the signal received on both antennas to improve the system performance. However, the architecture of the space–time system should be selected to fully exploit the space and time diversity. Three types of architecture can be used to combine space and time processing: 1) antenna combining followed by time processing; 2) time processing for each antenna signal followed by antenna combining; and 3) joint antenna and time processing. The first architecture is not suitable for channels with a large rms delay spread, as expected in WCDMA. The second architecture performs better in outdoor channels since it has access to the complete diversity information in the time and space domains. On the other hand, the joint space–time processing architecture simultaneously removes the inter-symbol interference and combines the signals received on the multiple receive antennas and multipaths through, for example, a minimum mean-square error (MMSE) combiner. This architecture is slightly more complex than the second one due to its more computationally intensive weight calculation, but it does not require more correlators and has a similar combiner structure. However, its performance is better than that of the second architecture in the presence of colored interference. The WCDMA receiver described in this paper implements the joint space–time processing architecture. A RAKE correlator is assigned to the detected multipath on each antenna. All correlator outputs are then combined together in a single space–time combiner. The weights used in the combiner can be computed using either an MRC or MMSE algorithm.

The oversampled signals at the output of the interpolator for each antenna, which have been corrected for carrier and sampling frequency offsets, are distributed to the correlation engines. The correlation engine is a programmable bank of correlators where each correlator can be configured as a regular data correlator, a timing tracking correlator, or a pilot correlator. Each correlator can be independently configured to the desired type, code offset, and receive antenna. Each configuration triggers a different data flow within the correlation engine and in the interaction of the correlation engine with surrounding blocks. For optimal power management, the supervising controller has both coarse and fine control over the resources. It controls the activation sequence of the correlation engines and of the correlation resources within each block. Clock gating is used extensively to save power: any inactive resource is clock gated to minimize power consumption.

Fig. 7. Block diagram of a correlator block.

For each antenna, the cell searcher selects a set of multipaths, and independent correlator blocks are assigned to each of these multipaths. The multipath management algorithm allocates RAKE fingers to multipaths found by the searcher based on the power of the reported multipaths, the history of multipaths currently tracked, the availability of RAKE fingers, and the current bit error rate (BER), frame error rate (FER), and SNR.

The block diagram of a correlator block is shown in Fig. 7. The incoming signal is downsampled to the chip rate and correlated with orthogonal variable spreading factor (OVSF) codes to obtain the outputs for each data channel. It is also correlated with the pilot code associated with these data channels to obtain the pilot output. The oversampled signal is also used to drive the early-late timing recovery loop that controls the sampling time of the data and pilot correlators.
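The despreading step described above can be sketched as follows. This is a simplified, real-valued illustration that assumes the signal is already at the chip rate and perfectly aligned; scrambling, the pilot correlator, and the early-late loop are omitted. The function names `ovsf_codes` and `despread` are hypothetical.

```python
def ovsf_codes(sf):
    """Return the sf OVSF codes of length sf (sf must be a power of two)."""
    if sf < 1 or sf & (sf - 1):
        raise ValueError("spreading factor must be a power of two")
    codes = [[1]]
    while len(codes[0]) < sf:
        # Each parent code c yields two children: (c, c) and (c, -c).
        codes = [child for c in codes
                 for child in (c + c, c + [-x for x in c])]
    return codes


def despread(chips, code):
    """Correlate chip-rate samples with one code; one output per symbol."""
    sf = len(code)
    return [sum(chips[i + k] * code[k] for k in range(sf)) / sf
            for i in range(0, len(chips) - sf + 1, sf)]
```

Because codes of the same length are mutually orthogonal, two channels spread with different codes and summed chip by chip can each be recovered by correlating with their own code.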

Fig. 8 provides a block diagram of the combiner. The pilot correlations for each multipath and antenna are used to compute the weighting coefficients for combining the data correlation outputs. For each data channel, several multipaths are combined (in this context, a multipath is a generic name that includes actual multipaths, multiple antennas, and multiple base stations). Note that the number of combined multipaths might be adjusted on a per-channel basis depending on the quality-of-service requirement and the correlators available. Furthermore, the combining mode [normal, space–time transmit diversity (STTD), or closed-loop transmit diversity] can change on a per-channel basis. The combiner memories are designed to accommodate multipaths within a time interval that takes into account the maximum propagation delay and soft handovers. The combiner unit is also responsible for computing the channel estimate used for various measurements and for providing feedback to the frequency tracking loop and the multipath timing tracking loops.

Fig. 8. Block diagram of the combiner.

The correlator outputs of a data channel depend on its spreading factor, and each multipath is combined using the weights available for that multipath at the instant where the correlator outputs are combined. In normal mode of operation, the combiner soft output is the weighted sum of the correlator outputs over the combined multipaths.

In this paper we will not describe the combining methods and weight computation algorithms for STTD and closed-loop transmit diversity modes, due to space limitations.
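The normal-mode weighted combining can be sketched as follows. This is a minimal illustration, assuming the weights are applied as complex conjugates, which is a common convention that the text does not confirm. The function name `combine` is hypothetical.

```python
def combine(correlator_outputs, weights):
    """Return the soft symbol sum_l conj(w_l) * x_l over all multipaths."""
    if len(correlator_outputs) != len(weights):
        raise ValueError("one weight per correlator output is required")
    return sum(w.conjugate() * x for w, x in zip(weights, correlator_outputs))
```

When the weights equal the channel gains of the individual paths, the contributions add coherently and the output is the transmitted symbol scaled by the total path energy.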

Two approaches are used to compute the combining weights from the pilot correlator outputs: MRC and the least mean square (LMS) algorithm. For MRC, multipaths are weighted proportionally to their SNR. However, since the noise power levels are similar for all multipaths, the channel estimates can be used directly as weighting coefficients for MRC. The weights for a given multipath are updated every time a pilot correlation is available for this multipath. The update uses a forgetting factor that allows the tracking of fading channels and filters the noise, and the pilot correlation is multiplied by (1 − j) to remove the CPICH modulation and obtain the channel estimate.

When LMS is employed, the weights for all multipaths are updated simultaneously every time the pilot correlations for all multipaths are available. The update relies on the sign function of a complex number and includes a leakage factor, which provides better performance in fast fading channel conditions. Additional details on the combiner design and implementation can be found in [22].

Fig. 9. Performance comparison of MRC and LMS combining algorithms.

Fig. 9 compares the combiner output SNR cumulative distribution function for a voice user using a dual-antenna receiver for MRC and LMS combiners. The simulation assumes 60 users with equally distributed power. The received Ec/No was set to 6 dB in the simulation with additive white Gaussian noise. The channel conditions were fixed and the channel profile for the 3GPP standardized channel model number 3 was used. Note that for this channel a four-finger RAKE receiver is used for each receive antenna. These results show that, for this channel, a receiver using an LMS algorithm offers an improvement of approximately 1.5 dB over a conventional MRC receiver. Furthermore, for these simulations, the inter-cell interference was modeled as white noise. In a real system, however, the inter-cell interference is colored. Under these conditions, the LMS improvement would be more significant.
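The two weight-update styles compared above can be illustrated with a short sketch. The update equations below are assumptions chosen to match the verbal description (a forgetting factor for MRC, a sign function and a leakage factor for LMS); they are not taken from the paper, and the parameter names `forget`, `step`, and `leak`, the 1/2 normalisation, and the component-wise sign definition are all hypothetical.

```python
def mrc_update(w_prev, pilot, forget=0.9):
    """First-order recursive channel estimate used directly as the MRC weight.

    The pilot correlation is multiplied by (1 - j) to strip the (1 + j)
    CPICH symbol; the factor 1/2 normalises |1 + j|^2 = 2 (assumed scaling).
    """
    estimate = pilot * (1 - 1j) / 2
    return forget * w_prev + (1 - forget) * estimate


def csign(z):
    """Component-wise sign of a complex number (assumed definition)."""
    sgn = lambda v: (v > 0) - (v < 0)
    return complex(sgn(z.real), sgn(z.imag))


def leaky_sign_lms_update(weights, pilots, step=0.01, leak=0.001):
    """One leaky sign-error LMS step over all multipaths at once.

    The reference is the known CPICH symbol (1 + j); only the sign of the
    error drives the update, and `leak` slowly pulls the weights to zero.
    """
    reference = 1 + 1j
    output = sum(w.conjugate() * p for w, p in zip(weights, pilots))
    error = reference - output
    return [(1 - step * leak) * w + step * p * csign(error).conjugate()
            for w, p in zip(weights, pilots)]
```

The MRC update treats each multipath independently, whereas the LMS update adjusts all weights jointly from a single combined error, which is what lets it respond to interference that is correlated across fingers.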

F. Transmitter

The transmitter can either channelize encoded data bits from higher layers or generate preamble sequences for the random access procedure. The transmitter supports up to 6 parallel data channels at a spreading factor of 4, or one data channel at spreading factors of 4 to 256, in addition to the control channel. The scaling factors of the data and control channels are programmable. The complex data or preamble sequence at the 3.84-MHz chip rate is then scrambled. All scrambling sequences described in the 3GPP standard are supported.

Fig. 10. Block diagram of the WCDMA SoC.

The mobile terminal modulated carrier frequency shall be accurate to within ±0.1 ppm compared to the carrier frequency received from the base station. Furthermore, the same frequency source shall be used for both RF frequency generation and the chip clock. This is a strict requirement since most inexpensive crystal oscillators do not have very good frequency stability. Since, for both the terminal and the base station, the frequency sources used in the downlink and uplink are the same and the Doppler shift is also the same, the frequency error measured in the receiver is the same as the correction required for transmission. The receiver DDFS input signal word is therefore monitored and used to configure the transmitter digital correction units. An overflowing numerically controlled oscillator (NCO) is used to generate, from the free-running master clock, the synchronized chip clock controlling all transmitter units. The frequency error of the RF mixer is pre-compensated by rotating the scrambled complex sequences using a 32-point DDFS.
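The two digital correction mechanisms described above can be sketched as follows. This is an illustration only: the 32-bit accumulator width, the reading of the 32-point DDFS as a 32-entry phase table, and the names `nco_ticks`, `rotate`, and `freq_word` are assumptions, not details of the implemented design.

```python
import cmath

DDFS_POINTS = 32
DDFS_TABLE = [cmath.exp(2j * cmath.pi * k / DDFS_POINTS)
              for k in range(DDFS_POINTS)]


def nco_ticks(freq_word, num_cycles, acc_bits=32):
    """Yield True on every master-clock cycle where the accumulator overflows.

    The average tick rate is f_master * freq_word / 2**acc_bits, so the
    derived clock can be trimmed in fine steps by changing freq_word.
    """
    modulus = 1 << acc_bits
    acc = 0
    for _ in range(num_cycles):
        acc += freq_word
        overflow = acc >= modulus
        if overflow:
            acc -= modulus
        yield overflow


def rotate(samples, freq_word, acc_bits=32):
    """Rotate complex samples by a phase ramp quantised to 32 phases."""
    modulus = 1 << acc_bits
    phase = 0
    out = []
    for s in samples:
        index = (phase * DDFS_POINTS) >> acc_bits  # top 5 bits of the phase
        out.append(s * DDFS_TABLE[index])
        phase = (phase + freq_word) % modulus
    return out
```

In this sketch a frequency word of one quarter of the accumulator range produces one tick every four master-clock cycles, and the same phase-accumulator idea drives the table lookup used for rotation.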

The signal then needs to be up-sampled by a factor of 4 and filtered using an SRRCF. This is accomplished by first up-sampling the signal by a factor of 2 and filtering using a 17-tap SRRCF, followed by a 15-tap polyphase half-band filter. This implementation leads to savings of approximately 50% compared to an equivalent direct 33-tap SRRCF at 4 times the chip rate. The filtered signal is then scaled depending on the number of data channels and clipped to reduce the peak-to-average power ratio. The complex signal is finally quantized to 8 bits and sent to the I and Q digital-to-analog converters (DACs) at four times the corrected chip rate.
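One plausible way to account for the quoted savings is to count multiplications per chip. The count below is an assumption about how the comparison might be made, not the authors' own accounting: it ignores coefficient symmetry and any polyphase decomposition of the direct filter, and it only exploits the zero-valued taps of the half-band filter. The names are hypothetical.

```python
def halfband_nonzero_taps(num_taps):
    """Nonzero coefficients of an odd-length half-band filter.

    Every even-offset tap around the centre is zero except the centre itself.
    """
    half = (num_taps - 1) // 2
    return 1 + 2 * sum(1 for k in range(1, half + 1) if k % 2 == 1)


# Direct form: 33 taps evaluated for each of the 4 outputs per chip.
MULT_PER_CHIP_DIRECT = 33 * 4
# Two-stage form: 17 taps at 2x the chip rate, then the half-band at 4x.
MULT_PER_CHIP_TWO_STAGE = 17 * 2 + halfband_nonzero_taps(15) * 4
SAVING = 1 - MULT_PER_CHIP_TWO_STAGE / MULT_PER_CHIP_DIRECT
```

Under these assumptions the two-stage structure needs 70 multiplications per chip instead of 132, a reduction of about 47%, which is in line with the approximate figure quoted above.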

The digital signal for the transmitter has an error vector magnitude of 4% (the 3GPP requirement is 17.5%), and the out-of-band emission and adjacent channel leakage power ratio requirements are met with a minimum margin of 10 dB. When integrated with the analog and RF transmitter sections, the transmitter met all 3GPP requirements.

IV. SOC IMPLEMENTATION

The dual-antenna WCDMA transceiver was implemented in a SoC, which also integrates dedicated hardware blocks for coding and decoding functions, and microcontrollers to monitor and configure the modem and implement the coding and protocol layers. The SoC is described in this section, followed by an overview of the external interfaces and a description of the power management techniques.

A. SoC Overview

Fig. 10 illustrates the overall SoC, which includes the baseband modem section, the coding layer, and the protocol layer of the standard. It is based on a dual embedded microcontroller architecture (ARM922T cores). The first microcontroller (PHY) performs all the physical layer control, including control of the RF section and the digital modem. PHY is also responsible for performing some of the coding and decoding tasks in firmware. The second microcontroller (PRO) is responsible for protocol stack operations, including communication with higher layers to establish call control as well as communication with off-chip components. A dual-processor architecture allows a clear separation between the physical layer code and the protocol stack code, which simplifies code development and debugging and leads to a robust solution. This architecture also allows the master clock frequency to be dynamically scaled depending on the computational requirements of the scenario being demodulated (e.g., full rate versus low rates) and provides efficient global power management for the ASIC.

Fig. 11. Coding layer functions. (a) Downlink. (b) Uplink.

The combiner output symbols from the modem are 10-bit soft values. Control symbols are first extracted from the data channel and made available directly to the controller. In particular, the current transport format is determined by decoding the TFCI bits, and the current SNR for the data channel is computed using the dedicated pilot bits. The SNR estimate is used to determine the scaling factor used to reduce the precision from 10 to 4 bits with optimal scaling in fading conditions for soft Viterbi and/or Turbo decoding. Symbol-level processing is then performed on the 4-bit soft values.
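The precision reduction from 10-bit to 4-bit soft values can be sketched as a scale-round-saturate step. How the scaling factor is derived from the SNR estimate is not described here, so the sketch takes the scale as a given input; the function name `requantize` and the two's-complement ranges are assumptions.

```python
def requantize(soft_values, scale, in_bits=10, out_bits=4):
    """Scale signed in_bits soft values and saturate them to out_bits."""
    in_max = (1 << (in_bits - 1)) - 1
    out_max = (1 << (out_bits - 1)) - 1   # +7 for 4 bits
    out_min = -(1 << (out_bits - 1))      # -8 for 4 bits
    result = []
    for v in soft_values:
        if not -in_max - 1 <= v <= in_max:
            raise ValueError("input exceeds the declared input precision")
        q = round(v * scale)
        result.append(max(out_min, min(out_max, q)))
    return result
```

A scale that is too small collapses weak symbols to zero, while one that is too large saturates most symbols, which is why the scale needs to follow the channel conditions in fading.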

Fig. 11(a) illustrates the decoding steps as defined by the WCDMA standard. The shaded areas indicate operations that are hardware assisted by dedicated accelerator units as well as a coprocessor. The first decoding block extracts control symbols and performs data scaling and de-interleaving. Symbols are then made available to the PHY microcontroller for the following four control-intensive steps up to first de-interleaving, which is performed by an embedded coprocessor that has access to the microcontroller internal registers. This tight coupling allows for fast execution and reduced power consumption since the controller does not have to access memory. The second decoding block then performs rate matching and Turbo or trellis decoding. Fig. 11(b) illustrates the reciprocal uplink encoding as defined by the WCDMA standard. Shaded steps indicate hardware-assisted operations. The same coprocessor used for the downlink is reused to perform second and first interleaving as well as rate matching. Turbo encoding is performed by a dedicated hardware unit.

B. SoC External Interfaces

The WCDMA transceiver can be used in a number of different configurations and thus requires different interfaces (e.g., CardBus, USB, OMAP serial, UART) for the relatively high data rates. The ASIC implements a general-purpose data port that can be adapted for use with different standards. The data port implements a full SRAM-type interface with both master and slave modes. The PRO can master the interface to read/write data from a slave device. Alternatively, an external device can treat the transceiver as a simple memory-mapped device by mastering the port and writing directly to internal memory. The memory buffer is large enough to hold 10 ms worth of receive and transmit data as well as control information. A standard serial interface (SPI) is used to control the analog/RF front end, and additional GPIOs are used to control nonstandard components.

The high data rates required by WCDMA, combined with the complexity of the protocol layer, result in memory requirements that are much higher than for previous generations of standards. The ASIC interfaces to an external FLASH and an external DRAM. The FLASH holds the software for both the protocol and physical layers. However, running the software out of the relatively slow FLASH is not feasible at high data rates. Instead, the code is copied to the much faster DRAM and run from there. At the highest data rates, the code and the data use over 4 MB of memory. To save power during low data rate conditions, the external SDRAM is switched off and the code can run from FLASH.

C. Power Management Techniques

One of the main goals of the design was to provide aggressive means of power control since most 3G applications are power hungry. To address this, we utilized power control methods that can be broadly categorized into distributed and centralized methods. These techniques are in addition to the power-saving measures mentioned before, namely, the dual-processor architecture and hardware-assisted processing.

Fine-grained clock gating was inserted at the synthesis level to disable the clock to each group of flip-flops with a constant output. Banks of registers that exceed four flip-flops in depth have been gated. This approach minimizes both the clock load capacitance and register switching but requires a large number of clock gates. Careful analysis was performed to ensure that the total capacitance of the inserted clock gates is much less than that of the gated registers, so that the power savings are actually realized. This methodology is important because, as the complexity of the design increases, the number of clock gates and their drive strength increase, which can defeat the purpose of the power savings.

Centralized schemes, on the other hand, are managed by the microcontroller and affect the entire SoC. These are divided into two approaches.

1) Centralized clock gates for entire blocks can be gated on or off by writing an instruction from the microcontroller to a control register. This coarse clock gating is important to complement the distributed clock gating, since the clock-gating elements will be clocked regardless of whether the subsequent logic is active or not. By providing a master clock gate for blocks, the distributed clock-gating structures within that block are not clocked.

2) Ability to maintain functionality over a wide range of master clock frequencies. Rather than fix the master clock frequency at which the chip can provide functionality, it was our goal to design the chip such that there is a gradual and fine-grained reduction in computational capacity as a function of the frequency. Thus, by predicting the anticipated computational load, substantial power gains can be achieved by operating the chip at a reduced frequency. The ASIC was designed to run at a nominal frequency of 100 MHz; however, it maintains functionality for the range of frequencies from 64 MHz (12 RAKE fingers, 384-kbps downlink) up to 100 MHz (20 RAKE fingers, 2-Mbps downlink). An integrated PLL is used to provide the required frequency granularity.

TABLE I. SOC STATISTICS

Fig. 12. SoC die photo.
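As a rough illustration of why the frequency scaling in item 2) pays off, the standard first-order model of CMOS dynamic power can be applied to the two quoted operating points. The model and its symbols (activity factor $\alpha$, switched capacitance $C$, supply voltage $V_{DD}$) are textbook assumptions rather than measured figures, and static power and any change in switching activity are ignored.

```latex
P_{\mathrm{dyn}} = \alpha\, C\, V_{DD}^{2}\, f
\quad\Longrightarrow\quad
\frac{P_{\mathrm{dyn}}(64~\mathrm{MHz})}{P_{\mathrm{dyn}}(100~\mathrm{MHz})}
= \frac{64}{100} = 0.64
```

Under these assumptions, clocking the logic at 64 MHz instead of 100 MHz would reduce dynamic power by about 36% at a fixed supply voltage.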

V. PERFORMANCE EVALUATION

The SoC was implemented in a 0.18-μm, 1.8-V CMOS technology. Table I presents the statistics of the physical chip implementation, while the die photo is presented in Fig. 12. The chip area is dominated by the two controllers and the memory required for frame buffering at high data rates. Table II presents a detailed breakdown of the different blocks. As indicated in the table, the dual-antenna operation only affects the RAKE receiver section since, after the combiner, all contributions from both antennas are merged into symbols.

TABLE II. SOC BREAKDOWN

TABLE III. RAKE ENGINE AREA (μm²) BREAKDOWN

Table III isolates the statistics of the RAKE receiver itself and indicates the area breakdown of the major RAKE engine components. Although it is difficult to isolate the exact overhead associated with dual-antenna processing, it is clear that, relative to the entire SoC, the overhead is acceptable. For example, assuming the worst case of complete replication of hardware for a dual-antenna solution versus a single-antenna solution, the overhead in power is 40 mW, which is 7.3% of the total power of the SoC, while the area overhead is 4.4% of the entire SoC area. This is a pessimistic estimate since a large proportion of the logic that processes one antenna is reused for the dual-antenna case; however, it is useful as an upper bound on the associated overhead.
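The percentages above can be cross-checked against the chip totals quoted in the conclusion (550 mW and 72 mm²). The absolute area figure below is derived here from those totals and is not a value reported in the tables.

```latex
\frac{40~\mathrm{mW}}{550~\mathrm{mW}} \approx 0.073 = 7.3\%,
\qquad
0.044 \times 72~\mathrm{mm}^2 \approx 3.2~\mathrm{mm}^2
```

On this reading, the worst-case dual-antenna overhead is bounded by about 40 mW and roughly 3.2 mm² of silicon.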

To measure the performance of the SoC under realistic wireless conditions, a test setup incorporating a WCDMA signal generator, a wireless channel emulator, a reference board including the SoC, and a PC was built. In this setup, a standard-conforming WCDMA signal is generated, upconverted to RF, and passed at RF to the channel emulator. The channel emulator output signal is presented at RF to a board incorporating the SoC described in this paper. The board includes RF, analog, and digital sections designed with commercially available components. In particular, RFMD chip sets [23] were used for the up- and down-conversion chains, and Maxim ADCs and DACs were used [24]. BER and receive statistics are reported from the SoC to the PC for monitoring.

TABLE IV. PERFORMANCE FOR TEST CASE SCENARIOS

Fig. 13. SoC measured performance for a 384-kbps connection.

Fig. 13 illustrates the test results for single- and dual-antenna processing under flat fading conditions, at a speed of 3 km/h, with an SNR of 9 dB and a 384-kbps DCH. The experimental results show a significant improvement of 9 dB between dual- and single-antenna processing at a 1% block error rate (BLER). To validate the performance under multipath fading channel conditions, experimental tests were set up for selected 3GPP configurations [12] and tested at 1% BLER. Table IV summarizes the results and compares them with end-to-end simulation results of the proposed dual-antenna WCDMA mobile receiver. The test results confirm the simulation results and clearly demonstrate the performance improvement provided by using a dual-antenna receiver. Depending on the scenario, the gain varies from 2.5 to 10 dB. For example, for the 3GPP test case 2 at 12.2 kbps, a dual-antenna receiver would allow a base station to reduce the power allocated to the dedicated data channel by 6.7 dB. The base station will thus be able to support more users. Alternatively, a dual-antenna receiver could tolerate 6.7 dB stronger interference for the same power allocation, thereby extending the coverage area.
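To put the decibel gains above into linear terms, the conversion can be sketched as follows. The helper names `db_to_linear` and `power_fraction_after_gain` are hypothetical, and the calculation says nothing about how many additional users a base station could actually serve.

```python
def db_to_linear(gain_db):
    """Convert a power ratio in dB to a linear factor."""
    return 10 ** (gain_db / 10)


def power_fraction_after_gain(gain_db):
    """Fraction of the original transmit power still needed after the gain."""
    return 1 / db_to_linear(gain_db)
```

For the 6.7-dB example, the linear factor is about 4.7, so roughly 21% of the original dedicated-channel power would suffice, all else being equal.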

VI. CONCLUSION

A complete SoC for a WCDMA dual-antenna mobile terminal was introduced in this paper. The receiver was designed to take advantage of the available spatial diversity in several key blocks to improve the system performance. The main receiver algorithms and implementation architectures were described. Details of the SoC integration and design were also given. The SoC was fabricated in a 0.18-μm, 1.8-V CMOS technology. It occupies a total area of 72 mm² and consumes 550 mW at the maximum data rates. The ASIC was integrated in a test platform for performance validation. The UE was standard compliant, and measurements confirmed the significant improvement that can be provided by a dual-antenna receiver in a WCDMA environment.

REFERENCES

[1] H. Holma and A. Toskala, WCDMA for UMTS: Radio Access for Third Generation Mobile Communications, 3rd ed. New York: Wiley, 2004.

[2] T. S. Rappaport, Wireless Communications: Principles and Practice, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 2001.

[3] B. Razavi, “RF CMOS transceivers for cellular telephony,” IEEE Commun. Mag., vol. 41, no. 8, pp. 144–149, Aug. 2003.

[4] L. Harju, M. Kuulusa, and J. Nurmi, “Flexible implementation of a WCDMA rake receiver,” in Proc. IEEE Workshop on Signal Process. Syst. (SIPS), Oct. 2002, pp. 177–182.

[5] H.-J. Lee and D. S. Ha, “An area and power efficient RAKE receiver architecture for DSSS systems,” in Proc. IEEE Int. SoC Conf., Sep. 2003, pp. 103–106.

[6] A. ElTawil, E. Grayver, H. Zou, J. F. Frigon, G. Poberezhskiy, and B. Daneshrad, “Dual-antenna UMTS mobile station transceiver ASIC for 2 Mb/s data rate,” in Proc. IEEE ISSCC, Feb. 2003, vol. 46, pp. 146–147.

[7] M. Suzuki, M. Kawabe, T. Yano, J. Kiyota, H. Ishii, T. Tamaki, and N. Doi, “A W-CDMA baseband modem LSI with multi-engine architecture,” IEICE Trans. Electron., vol. E85-C, pp. 352–358, Feb. 2002.

[8] H. Igura, M. Hirata, J. Yamada, M. Yamashina, and S. Ono, “A low-power W-CDMA demodulator using specially-designed micro-DSPs,” in Proc. IEEE CICC, May 2002, pp. 397–400.

[9] D. H. Lee, A. Choi, J. M. Koo, J. I. Lee, and B. M. Kim, “A wide-band DS-CDMA modem for a mobile station,” IEEE Trans. Consumer Electron., vol. 45, no. 11, pp. 1259–1269, Nov. 1999.

[10] K. H. Chang, M. C. Song, H. S. Park, Y. S. Song, K. Y. Sohn, Y. H. Kim, C. I. Yeh, C. W. Yu, and D. H. Kim, “On the design of wide-band CDMA user equipment (UE) modem,” in Proc. Int. Symp. on DSP for Commun. Systems, Jan. 2002, pp. 21–25.

[11] G. D. Jo, M. J. Sheen, S. H. Lee, and K. R. Cho, “A DSP-based reconfigurable SDR platform for 3G systems,” IEICE Trans. Commun., vol. E88-B, pp. 678–686, Feb. 2005.

[12] 3rd Generation Partnership Project 2006 [Online]. Available: http://www.3gpp.org/specs/specs.htm

[13] MSM6280 chipset solution, Qualcomm, San Diego, CA, 2006 [Online]. Available: http://www.cdmatech.com/products/msm6280_chipset_solution.jsp

[14] BCM2152 HSDPA/WCDMA/EDGE/GPRS/GSM multimedia baseband processor, Broadcom, Santa Clara, CA, 2006 [Online]. Available: http://www.broadcom.com/products/Cellular/3G-Baseband-Processors/BCM2152

[15] MXC300–30: 3G single core modem platform, Freescale Semiconductor, Austin, TX, 2006 [Online]. Available: http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MXC300–30&nodeId=01J4Fsm6cyDbFf

[16] 3G UMTS/WCDMA: OMAPV2230, Texas Instruments, Dallas, TX, 2006 [Online]. Available: http://focus.ti.com/general/docs/wtbu/wtbuproductcontent.tsp?templateId=6123&navigationId=12593&contentId=4665

[17] SOFTONE-W chipset for W-CDMA/UMTS mobile terminals, Analog Devices, Norwood, MA, 2006 [Online]. Available: http://www.analog.com/en/prod/0,,SOFTONE-W,00.html

[18] R. M. Hewlitt and E. S. Swartzlantler, “Canonical signed digit representation for FIR digital filters,” in Proc. IEEE Workshop Signal Process. Syst., 2000, pp. 416–426.

[19] Y. E. Wang and T. Ottosson, “Cell search in W-CDMA,” IEEE J. Select. Areas Commun., vol. 18, no. 8, pp. 1470–1482, Aug. 2000.

[20] A. ElTawil, E. Grayver, A. Tarighat, J. F. Frigon, K. Shoarinejad, H. Zou, and D. Cabric, “Diversity processing cell searcher implementation,” in Proc. IEEE VTC, Sep. 2004, vol. 6, pp. 3900–3904.

[21] E. Grayver, J. F. Frigon, A. ElTawil, A. Tarighat, K. Shoarinejad, A. A. Abbasfar, D. Cabric, and B. Daneshrad, “Design and VLSI implementation for a WCDMA multipath searcher,” IEEE Trans. Vehic. Technol., vol. 54, no. 5, pp. 889–902, May 2005.

[22] A. Tarighat, E. Grayver, A. Eltawil, J. F. Frigon, G. Poberezhskiy, and H. Zou, “A low-power ASIC implementation of 2Mbps antenna-rake combiner for WCDMA with MRC and LMS capabilities,” in Proc. IEEE CICC, San Jose, CA, Sep. 2005, pp. 69–72.

[23] Handsets, RF Micro Devices, Greensboro, NC, 2007 [Online]. Available: http://www.rfmd.comm/handsets.asp

[24] Data converters, Maxim, Chantilly, VA, 2007 [Online]. Available: http://www.maxim-ic.com/ADCDACRef.cfm

Jean-François Frigon (S’96–M’01) received the B.Eng. degree from École Polytechnique de Montréal, Montréal, QC, Canada, in 1996, the M.A.Sc. degree from the University of British Columbia, Vancouver, BC, Canada, in 1998, and the Ph.D. degree from the University of California at Los Angeles (UCLA), in 2004.

From 2001 to 2003, he worked as the director of wireless communications systems at Innovics Wireless, where he oversaw the design of a diversity-enabled WCDMA terminal. He joined the Electrical Engineering department at École Polytechnique de Montréal in 2004, where he is currently an Assistant Professor. His research interests include wireless networks, MAC protocols, and the design of multiple-antenna digital communication systems for high-speed wireless communications.

Ahmed M. Eltawil received the B.Sc. and M.Sc. degrees in communications and electrical engineering from Cairo University, Cairo, Egypt, in 1997 and 1999, respectively, and the doctorate degree from the University of California at Los Angeles, in 2003, with a focus on VLSI architectures for wide-band wireless communications.

From January 2001 to August 2003, he was Director of ASIC Engineering at Innovics Wireless, where he led the development of the first reported diversity-enabled third-generation WCDMA mobile station. In January 2005, he joined the University of California, Irvine, as an Assistant Professor in the Electrical Engineering and Computer Science Department. His current research interests are in advanced digital circuit and signal processing techniques for communication systems, including both circuit and system design.

Dr. Eltawil is the recipient of several awards in his field, including being the first Henry Samueli faculty fellow of Engineering at the University of California.

Eugene Grayver received the B.S. degree in electrical engineering from California Institute of Technology (Caltech), Pasadena, CA, and the Ph.D. degree from University of California at Los Angeles (UCLA) in 2000.

He was one of the founding team members of a fabless semiconductor company working on low-power application-specific integrated circuits (ASICs) for multi-antenna 3G mobile receivers. In 2003, he joined The Aerospace Corporation, where he is working on flexible communications platforms. His research interests include reconfigurable implementations of digital signal processing algorithms, adaptive computing, low-power VLSI circuits for communications, and system design of wireless data communication systems. Lately, he has been concentrating on the interaction between nonlinear amplifiers and modern error correction codes. He has seven journal publications and over a dozen conference papers.

Alireza Tarighat (S’00–M’05) received the B.Sc. degree in electrical engineering from Sharif University of Technology, Tehran, Iran, in 1998, and the M.Sc. and Ph.D. degrees in electrical engineering from the University of California at Los Angeles (UCLA) in 2001 and 2005, respectively.

During the summer of 2000, he was with Broadcom, El Segundo, CA, where he worked on designing IEEE 802.11 transceivers. From 2001 to 2002, he was with Innovics Wireless, Los Angeles, CA, working on systems and ASIC development of advanced antenna diversity and rake processing for 3G WCDMA mobile terminals. Since 2005, he has been with WiLinx, Los Angeles, CA, working on systems and silicon development of UWB wireless networks. His research interests are in communication theory and signal processing, including multiple-input multiple-output (MIMO) orthogonal frequency-division multiplexing systems, multi-user MIMO wireless networks, algorithms for impairments compensation, and experimental and practical communication systems.

Mr. Tarighat was the recipient of the Gold Medal of the National Physics Olympiad, Iran, 1993, and the Honorable Mention Diploma of the 25th International Physics Olympiad, Beijing, China, 1994. He received the 2006 Outstanding Ph.D. Dissertation Award in electrical engineering from UCLA.

Hanli Zou received the B.S. degree in electronic engineering from Tsinghua University, Beijing, China, in 1996, and the M.S. and Ph.D. degrees in electrical engineering from the University of California, Los Angeles (UCLA) in 2000 and 2003, respectively, with a focus in integrated circuits and systems.

From 2001 to 2003, he worked as Senior ASIC Designer at Innovics Wireless Corporation, Los Angeles, CA, designing baseband chipsets for a diversity-enabled WCDMA transceiver. Since 2003, he has been with the Broadband VLSI group, Broadcom Corporation, Irvine, CA, as a Principal Scientist, working on advanced transceiver chips for cable, satellite, terrestrial digital TV, and home-networking applications. His research interests include system design and VLSI implementation for high-speed communications, with an emphasis on synchronization, channel estimation, equalization, and diversity processing.
