
RESEARCH Open Access

Peak-to-Average-Power-Ratio (PAPR) reduction in WiMAX and OFDM/A systems

Seyran Khademi1*, Thomas Svantesson2, Mats Viberg1 and Thomas Eriksson1

Abstract

A peak-to-average-power-ratio (PAPR) reduction method is proposed that exploits the precoding or beamforming mode in WiMAX. The method is applicable to any OFDM/A system that implements beamforming using dedicated pilots, i.e., that uses the same beamforming antenna weights for both pilots and data. Beamforming performance depends on the relative phase shift between antennas, but is unaffected by a phase shift common to all antennas. PAPR, on the other hand, changes with a common phase shift, and this paper exploits that property. An effective optimization technique based on sequential quadratic programming is proposed to compute the common phase shift. The proposed technique has several advantages compared with traditional PAPR reduction techniques: it does not require any side information and has no effect on power and bit-error rate, while providing better PAPR reduction performance than most other methods.

Keywords: WiMAX, OFDM, PTS, PAPR reduction, phase optimization, sequential quadratic programming

1. Introduction

Many recent wide-band digital communication systems use a multi-carrier technology known as orthogonal-frequency-division-multiplexing (OFDM), where the band is divided into many narrow-band channels. A key benefit of OFDM is that it can be efficiently implemented using the fast Fourier transform (FFT), and that the receiver structure becomes simple since each channel or sub-carrier can be treated as narrow-band instead of a more complicated wide-band channel. Orthogonal-frequency-division-multiple-access (OFDMA) is a similar technique, but the bands can be occupied by different users.

Although OFDM and OFDMA have many benefits contributing to their popularity, a well-known drawback is that the amplitude of the resulting time-domain signal varies with the transmitted symbols in the frequency domain. From OFDM symbol to OFDM symbol, the maximum amplitude can vary dramatically depending on the transmitted symbols. If the maximum amplitude of the time-domain signal is large, it may push the amplifier into the non-linear region, which creates many problems that reduce performance. For example, it breaks the orthogonality of the sub-carriers, which results in a substantial increase in the error rate. A common practice to avoid this peak-to-average-power-ratio (PAPR) problem is to reduce the operating point of the amplifier with a back-off margin. This back-off margin is selected so that most occurrences of high peaks do not fall in the non-linear region of the amplifier. Of course, it is desirable to have a minimum back-off margin, since operating the amplifier below full power reduces the range of the system as well as the efficiency of the amplifier.

PAPR reduction is a well-known signal processing topic in multi-carrier transmission, and a large number of techniques have been proposed in the literature during the past decades. These techniques include amplitude clipping and filtering, coding [1], tone reservation (TR) [2,3] and tone injection (TI) [2], active constellation extension (ACE) [4,5], and multiple signal representation methods, such as partial transmit sequence (PTS), selected mapping (SLM), and interleaving [6]. The existing approaches differ in terms of the requirements and restrictions they impose on the system. Therefore, careful attention must be paid to choosing a proper technique for each specific communication system.

* Correspondence: [email protected]

1Department of Signal and Systems, Chalmers University of Technology, P.C-412 96 Gothenburg, Sweden
Full list of author information is available at the end of the article

Khademi et al. EURASIP Journal on Advances in Signal Processing 2011, 2011:38
http://asp.eurasipjournals.com/content/2011/1/38

© 2011 Khademi et al; licensee Springer. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



WiMAX mobile devices (MS) are commercially available, and for the system to work, both mobile devices and basestations need to adhere to the WiMAX standard. Hence, it is not possible to modify the basestation transmission technique if doing so makes the transmission non-compliant with the standard, since existing MS would not be able to decode the transmissions correctly. For example, phase manipulation techniques such as PTS and SLM [7-9], which require coded side information to be transmitted, would not be compatible with the standard. One technique, which inserts a PAPR-reducing sequence, is part of the IEEE 802.16e standard. It is activated using the PAPR reduction/sounding zone/safety zone allocation IE. Using this technique reduces the throughput since it requires sending additional PAPR bits. It is also not part of the WiMAX profile, so it is likely not supported by the majority of handsets.

Accordingly, each of the discussed techniques is associated with a cost in terms of bandwidth and/or power. The technique proposed in this paper requires neither additional bandwidth nor additional power, while delivering equal or better PAPR reduction gain compared with other existing methods. The proposed algorithm makes use of the antenna beamforming weights and dedicated pilots at the transmitter [10]. It reduces the PAPR by modifying the cluster weights in the WiMAX data structure in a manner similar to the PTS method [7,8]. The main benefits of the proposed technique are:

• It preserves the transmitted power by adjusting only the phase of the beamforming weights per cluster.
• No extra side information regarding the phase change needs to be transmitted, due to the property of dedicated pilots.
• Not sending the phase coefficients allows for arbitrary phase shifts instead of a quantized set such as that used for PTS.
• A novel search algorithm based on gradient optimization finds the optimum phase shifts of the cluster weights.

The following presentation focuses on WiMAX, but the same technique applies to any OFDM/OFDMA system that uses a concept similar to dedicated pilots and does not explicitly announce the multiplied weights to the receiver.

The paper is organized as follows. In Sect. 2, the PAPR in an OFDM system is defined, and the data structure in the WiMAX profile and the potential capabilities of the standard are explained. In Sect. 3, the proposed PAPR reduction method is described based on the PTS technique model, and the phase optimization problem is formulated. The optimization problem is written as a conventional minimax problem with inequality constraints in Sect. 4, and a sequential quadratic programming (SQP) technique is then proposed to solve the minimax optimization. This approach breaks the complex original problem into several convex quadratic sub-problems with linear constraints. A pseudo code for a tailored SQP approach is given in Sect. 4-C. Simulation results in Sect. 6 confirm the significant PAPR reduction gain of the SQP algorithm over other techniques, and the complexity evaluation in Sect. 5 reveals the advantage of the new optimization method compared with the exhaustive search approach in PTS. Finally, the paper is concluded in Sect. 7 with a summary and a brief discussion on further research.

2. System Model

Consider an OFDM system, where the data is represented in the frequency domain. The time-domain signal s(n), n = 1, 2, ..., N, where N denotes the FFT size, is calculated from the frequency-domain symbols D(k) using an IFFT as [10]

$$
s(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} D(k)\, e^{j 2\pi k n / N}. \tag{1}
$$

Note that the frequency-domain symbols D(k) typically belong to QAM constellations. In the case of WiMAX, QPSK, 16QAM and 64QAM constellations are used. The metric that will be used to measure the peaks in the time-domain signal is the PAPR, defined as

$$
\mathrm{PAPR} = \frac{\max_{0 \le n \le N-1} |s(n)|^2}{E\{|s(n)|^2\}}. \tag{2}
$$
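As a minimal sketch of Equations (1) and (2), the following pure-Python code computes the PAPR of one OFDM symbol. It makes two assumptions that are not part of the definitions above: the expectation in Equation (2) is replaced by the sample mean over a single symbol, and oversampling is done by zero-padding the frequency-domain block before the IDFT. The function names are illustrative only.

```python
import cmath
import math


def oversampled_idft(symbols, oversampling=4):
    """IDFT of Equation (1) on the block zero-padded to oversampling * N points."""
    padded = list(symbols) + [0j] * ((oversampling - 1) * len(symbols))
    size = len(padded)
    scale = 1.0 / math.sqrt(size)
    return [
        scale * sum(padded[k] * cmath.exp(2j * math.pi * k * n / size)
                    for k in range(size))
        for n in range(size)
    ]


def papr(samples):
    """Equation (2), with E{|s(n)|^2} approximated by the mean over one symbol."""
    powers = [abs(s) ** 2 for s in samples]
    return max(powers) / (sum(powers) / len(powers))


def papr_db(samples):
    """PAPR expressed in decibels."""
    return 10.0 * math.log10(papr(samples))


# Identical symbols on all 16 sub-carriers add coherently at n = 0.
worst = papr(oversampled_idft([1 + 0j] * 16, oversampling=1))
# A single active sub-carrier gives a constant-envelope signal.
best = papr(oversampled_idft([1 + 0j] + [0j] * 15, oversampling=4))
```

The two cases bracket the range of the metric: the coherent case reaches a PAPR equal to the number of sub-carriers, while the single-tone case stays at 1 (0 dB).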

Although not explicitly written in Equation (2), it is well known that oversampling is required to accurately capture the peaks. In this paper, an oversampling factor of four is used.

The WiMAX protocol defines several different DL transmission modes, of which the DL-PUSC mode is the most widely used and is the focus here. The minimum unit of scheduling a transmission is a sub-channel, which here spans multiple clusters. One cluster spans 14 sub-carriers over two OFDM symbols, containing four pilots and 24 data symbols, as illustrated in Figure 1. For a 10 MHz system, there are a total of 60 clusters. A sub-channel is spread over eight or twelve clusters, of which only two or three data sub-carriers from each cluster are used. The sub-channel carries 48 data symbols. For example, logical sub-channel zero uses two data sub-carriers from 12 clusters over two OFDM symbols to reach 48 data symbols.



To extract frequency diversity, the WiMAX protocol specifies that the clusters in a sub-channel are spread out across the band, i.e., a distributed permutation. The WiMAX standard further specifies two main modes of transmitting pilots: common pilots and dedicated pilots. Here, dedicated pilots allow per-cluster beamforming since channel estimation is performed per cluster, whereas for common pilots channel estimation across the whole band is allowed. The presentation so far has ignored the practical detail of guard bands, which are inserted to reduce spectral leakage. In WiMAX, a number of sub-carriers at the beginning and the end of the available bandwidth do not carry any signal, leaving $N_{usable}$ sub-carriers that carry data and pilots. Although this number depends on bandwidth and transmission mode, weights that are constant across each cluster are simply applied to only the $N_{usable}$ sub-carriers.
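The mapping from per-cluster weights to per-sub-carrier weights can be sketched as follows. This is only an illustration: the FFT size of 1024 and the 60 clusters of 14 sub-carriers follow the 10 MHz example in the text, whereas the left guard-band size `left_guard=92` is an assumed value, and the DC sub-carrier and the distributed permutation are ignored.

```python
import cmath


def expand_cluster_weights(cluster_phases, fft_size=1024, cluster_width=14,
                           left_guard=92):
    """Return one weight per sub-carrier, constant within each cluster.

    left_guard is a hypothetical guard-band size; guard entries stay at zero,
    which is harmless because no data is carried there.
    """
    usable = len(cluster_phases) * cluster_width
    if left_guard + usable > fft_size:
        raise ValueError("clusters do not fit into the FFT")
    weights = [0j] * fft_size
    for c, phase in enumerate(cluster_phases):
        start = left_guard + c * cluster_width
        weights[start:start + cluster_width] = [cmath.exp(1j * phase)] * cluster_width
    return weights


phases = [0.1 * c for c in range(60)]           # illustrative phase per cluster
weights = expand_cluster_weights(phases)
num_usable = sum(1 for w in weights if w != 0)  # sub-carriers carrying a weight
```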

Figure 1 Structure of DL-PUSC permutation in WiMAX, where the transmission bandwidth is divided into 60 clusters of 14 sub-carriers over two symbols each.

3. Proposed Technique

The proposed technique exploits dedicated pilots for beamforming, which is a common feature in next-generation wireless systems. For example, in several 4G systems such as WiMAX [10], precoding or beamforming weights are not explicitly announced; instead, both pilots and data are beamformed using the same weights. In the WiMAX downlink (DL), beamforming weights are applied in units of clusters (14 sub-carriers), and in the uplink (UL) in units of tiles (four sub-carriers). Beamforming in this context is defined as sending the same message from different antennas, but using different weights per antenna. For a four-antenna BS, the weights can be written as $w_o = [e^{j\varphi_{o,1}}, e^{j\varphi_{o,2}}, e^{j\varphi_{o,3}}, e^{j\varphi_{o,4}}]^T$, where $\varphi_{o,1}$ is usually set to zero for normalization purposes. The beamforming gain for a 4 × 1 channel h becomes $|w_o^H h|^2$. It is clear that we get the same beamforming gain for the vector $w = e^{j\varphi} w_o$, since a phase rotation common to all elements does not change the squared product: $|w^H h|^2 = |w_o^H h|^2$. However, the common phase rotation has a large impact on the PAPR. Writing the resulting expression for the time-domain signal of the first antenna at sample n, using the normalization $\varphi_{o,1} = 0$, yields

$$
s_1(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} D(k)\, W_s(k)\, e^{j 2\pi k n / N}, \tag{3}
$$

where $W_s(k)$ denotes the beamforming weight on sub-carrier k, i.e., $W_s(k) = e^{j\varphi(k)}$. Since the channel is estimated using the pilots in each cluster, the beamforming weights need to be constant over each cluster, but can change from cluster to cluster, i.e., $W_s(k_0) = W_s(k_0 + 1) = \dots = W_s(k_0 + 13)$, where $k_0$ denotes the first sub-carrier in a particular cluster. In the following, we will focus on the scenario of a single transmit antenna since it simplifies the expressions. However, the method can easily be extended to scenarios with multiple transmit antennas, which is the normal mode of dedicated pilots and beamforming.

For the case of wideband weights, i.e., when the beamforming weights are the same across the whole band, the PAPR reduction method is identical and performed only once. For the typical case of narrowband weights, a different beamforming weight per cluster is used, so the PAPR reduction method is applied jointly over the transmitted signals from all antennas. Furthermore, the technique is readily extendable to single- and multi-user MIMO systems using the same concept of dedicated pilots. Although there are now multiple streams, the basestation has to transmit pilots beamformed in the same way as the data. Hence, the same technique as outlined above can be applied. For a basestation sending multiple streams to one or many receivers, the weight optimization now has to be performed jointly over the streams, but otherwise the concept is the same.

The optimization problem of calculating the weights that minimize the PAPR can now be formulated as

$$
W_s = \arg\min_{W_s} \max_{n} \left| \sum_{k=0}^{N-1} D(k)\, W_s(k)\, e^{j 2\pi k n / N} \right|^2. \tag{4}
$$

Note that for a 10 MHz WiMAX system there are 60 clusters, so there are 60 phase shifts $W_s(k) = e^{j\varphi(k)}$, where $\varphi(k) \in [0, 2\pi)$ and k = 1, 2, ..., 60.

The PAPR reduction technique proposed here is transparent to the receiver and thus does not require any modification to existing receivers and wireless standards. This is clear by writing the received signal z at the handset as

$$
z = h\, e^{j\varphi} s = h' s, \tag{5}
$$

where $h' = h e^{j\varphi}$ denotes the effective channel. The BER performance of the effective channel is identical to that of the original channel. Furthermore, since both pilots and data are transmitted with the same phase shift, the channel estimation performance is also identical. In the proposed technique, the dedicated pilots for channel estimation are used, without interfering with their original job, as an indicator that informs the receiver about the phase rotation at the transmitter. The known symbols at the allocated pilot sub-carriers are therefore phase rotated in the same way as the data sub-carriers. Note that pilot symbols already exist in the current design of WiMAX and other similar wireless standards, so no bandwidth is sacrificed for PAPR reduction. The receiver is implicitly informed, since the information is hidden in the known pilot symbols. The channel coefficients are estimated for equalization based on the received pilots, and the PAPR phase rotation is interpreted as part of the channel effect.

Moreover, the proposed technique does not impact the transmitted power since it is only a phase modification. In essence, the technique is similar to partial transmit sequence (PTS), but without the drawback of requiring side information, which would make it impossible to apply in existing communication standards such as WiMAX. These advantages make it a very attractive technique for reducing PAPR.

The dedicated pilot feature is designed for beamforming, and the standard explicitly states that only the beamformed pilots inside the beamformed clusters can be used for channel estimation and equalization. The weights differ from cluster to cluster. Since only those pilots can be used, no other side information is available or needed: in the WiMAX case, the phase change is incorporated into the channel just as any other type of beamforming weight would be. Remember that there is no difference between our beamforming weights and normal beamforming weights from a channel estimation perspective. In both cases, there is no need for extra side information. Note that it is possible to design a system different from the WiMAX dedicated-pilot setting that could use more side information, but that is outside the scope of this paper since it focuses on WiMAX.

In conclusion, cluster weights can be used to decrease the PAPR of the OFDM symbol. To preserve the average transmitted power, only the phases of the clusters are changed. These phase weights can be multiplied either before or after the IFFT blocks, and the result will be the same due to the linearity of the IFFT operation. However, it is more efficient for the optimization algorithm to apply the phase coefficients after the IFFT block. This is exactly the same approach as in PTS, which is described next. However, there are still substantial differences regarding the phase selection, sub-block partitioning, etc.
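The two invariance arguments of this section, that a common phase leaves the beamforming gain unchanged and that the receiver absorbs the rotation into its channel estimate as in Equation (5), can be checked numerically. The sketch below is noise-free and uses made-up channel values, weights, and a single-tap equalizer; none of these numbers come from the paper.

```python
import cmath
import math


def beamforming_gain(weights, channel):
    """|w^H h|^2 for complex vectors of equal length."""
    inner = sum(w.conjugate() * h for w, h in zip(weights, channel))
    return abs(inner) ** 2


# Hypothetical 4 x 1 channel and unit-modulus weights (illustrative values only).
h = [0.8 + 0.3j, -0.2 + 0.9j, 0.5 - 0.4j, -0.6 - 0.1j]
w_o = [cmath.exp(1j * p) for p in (0.0, 0.7, -1.2, 2.1)]
common = cmath.exp(1j * 1.234)            # phase shift common to all antennas
w = [common * w_i for w_i in w_o]

gain_before = beamforming_gain(w_o, h)
gain_after = beamforming_gain(w, h)

# Receiver view of Equation (5) on one cluster, without noise.
h_flat = 0.9 * cmath.exp(1j * 0.4)        # assumed flat channel over the cluster
pilot = 1 + 0j                            # known pilot symbol
data = (1 - 1j) / math.sqrt(2)            # one QPSK data symbol
rx_pilot = h_flat * common * pilot        # pilot and data share the rotation
rx_data = h_flat * common * data
h_eff = rx_pilot / pilot                  # estimate of h' = h * exp(j * phi)
data_hat = rx_data / h_eff                # one-tap equalization
```

Because the pilot carries the same rotation as the data, the equalized symbol `data_hat` equals `data` without the receiver ever learning the phase explicitly.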

A. Partial Transmit Sequence (PTS)

In the PTS technique, an input data block of N symbols is partitioned into several disjoint sub-blocks [6]. All elements in each sub-block are weighted by a phase factor associated with that sub-block, where the phase factors are selected such that the PAPR of the combined signal is minimized. Figure 2 shows the block diagram of the PTS technique. In conventional PTS, the input data block D is partitioned into M disjoint sub-blocks $D_m = [D_{m,0}, D_{m,1}, \dots, D_{m,N-1}]^T$, m = 1, 2, ..., M, such that $\sum_{m=1}^{M} D_m = D$, and the sub-blocks are combined to minimize the PAPR in the time domain. The L-times over-sampled time-domain signal of $D_m$ is obtained by taking an IDFT of length NL on $D_m$ concatenated with (L − 1)N zeros, and is denoted by $b_m = [b_{m,0}, b_{m,1}, \dots, b_{m,LN-1}]^T$, m = 1, 2, ..., M; these are called the partial transmit sequences. Complex phase factors $W_m = e^{j\varphi_m}$, m = 1, 2, ..., M, are introduced to combine the PTSs; they are represented as a vector $W = [W_1, W_2, \dots, W_M]^T$ in the block diagram. The time-domain signal after combination is given by

$$
s(n) = \sum_{m=1}^{M} W_m\, b_m(n). \tag{6}
$$
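The partitioning and the combination in Equation (6) can be illustrated with a small sketch. It assumes contiguous sub-blocks (as with clusters), omits oversampling (L = 1) for brevity, and uses arbitrary QPSK-like data; the helper names are hypothetical. It also confirms that weighting the partial transmit sequences after the IDFT gives the same signal as weighting the data before the IDFT.

```python
import cmath
import math


def idft(block):
    """Unitary IDFT, matching the scaling of Equation (1)."""
    size = len(block)
    scale = 1.0 / math.sqrt(size)
    return [
        scale * sum(block[k] * cmath.exp(2j * math.pi * k * n / size)
                    for k in range(size))
        for n in range(size)
    ]


def partition(data, num_blocks):
    """Split data into contiguous, disjoint sub-blocks D_m that sum to D."""
    size = len(data)
    if size % num_blocks:
        raise ValueError("block length must be divisible by the number of sub-blocks")
    width = size // num_blocks
    blocks = []
    for m in range(num_blocks):
        block = [0j] * size
        block[m * width:(m + 1) * width] = data[m * width:(m + 1) * width]
        blocks.append(block)
    return blocks


def combine(partials, phases):
    """Equation (6): s(n) = sum over m of W_m * b_m(n), with W_m = exp(j * phi_m)."""
    length = len(partials[0])
    return [
        sum(cmath.exp(1j * p) * b[n] for p, b in zip(phases, partials))
        for n in range(length)
    ]


data = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
phases = [0.0, 0.5, -1.0, 2.0]                     # one phase per sub-block
partials = [idft(block) for block in partition(data, 4)]
after_ifft = combine(partials, phases)

# Same weights applied to the data before a single IDFT (two symbols per block).
weighted = [cmath.exp(1j * phases[k // 2]) * data[k] for k in range(len(data))]
before_ifft = idft(weighted)
max_diff = max(abs(x - y) for x, y in zip(after_ifft, before_ifft))
```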

The objective is to find a set of phase factors that minimizes the PAPR. In general, the selection of the phase factors is limited to a set with a finite number of elements to reduce the search complexity. The set of possible phase factors is written as $P = \{ e^{j 2\pi l / K} \mid l = 0, 1, \dots, K-1 \}$, where K is the number of allowed phases. The first phase weight is set to 1 without any loss of performance, so the search is performed over the remaining M − 1 positions. The complexity increases exponentially with the number of sub-blocks M, since $K^{M-1}$ possible phase vectors are searched to find the optimum set of phases. Also, PTS needs M IDFT operations for each data block, and $\log_2(K^{M-1})$ bits of side information must be sent to the receiver. The amount of PAPR reduction depends on the number of sub-blocks and the number of allowed phase factors [9].

For each sub-block that is rotated at the transmitter, the applied phase coefficient is sent to the receiver using a code book, as explicit side information, which reduces the spectral efficiency. The receiver, in turn, uses the same code book to retrieve the applied phase from the side-information bits. The code book therefore needs to be agreed upon between transmitter and receiver at the system design phase.

PTS performs an exhaustive search among combinations of phase vectors to find the optimum weights. For example, with two allowed phase factors, ±1, the whole search space for 60 clusters contains $2^{60}$ alternative vectors, which requires a tremendous amount of computation. Here, we propose a realistic optimization algorithm based on the basic configuration of the PTS sub-blocks.

Figure 2 Block diagram of the PTS technique with M disjoint sub-blocks and phase weights to produce a minimized-PAPR signal; the quantized phase weights W are selected by exhaustive search among the possible combinations.
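To make the cost of conventional PTS concrete, the following sketch runs the exhaustive search over quantized phases on a toy problem. The partial transmit sequences are random placeholders rather than IDFT outputs of real data, and M = 4, K = 2 are chosen only so that the search finishes instantly; the function names are illustrative.

```python
import cmath
import itertools
import math
import random


def peak_power(partials, weights):
    """Largest |sum over m of W_m * b_m(n)|^2 over all samples n."""
    length = len(partials[0])
    return max(
        abs(sum(w * b[n] for w, b in zip(weights, partials))) ** 2
        for n in range(length)
    )


def exhaustive_pts(partials, num_phases):
    """Try every quantized phase vector with the first weight fixed to 1."""
    allowed = [cmath.exp(2j * math.pi * l / num_phases) for l in range(num_phases)]
    best_weights, best_peak, tried = None, float("inf"), 0
    for tail in itertools.product(allowed, repeat=len(partials) - 1):
        weights = (1 + 0j,) + tail
        tried += 1
        peak = peak_power(partials, weights)
        if peak < best_peak:
            best_weights, best_peak = weights, peak
    return best_weights, best_peak, tried


rng = random.Random(0)                      # fixed seed for reproducibility
M, K, length = 4, 2, 16
partials = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(length)]
            for _ in range(M)]
baseline = peak_power(partials, [1 + 0j] * M)
weights, best, tried = exhaustive_pts(partials, K)
side_info_bits = (M - 1) * math.log2(K)     # log2(K^(M-1))
```

The number of candidates tried equals $K^{M-1}$, which is why the same loop is impractical for 60 clusters.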



B. Formulation of the Phase Optimization Problem

The proposed PAPR reduction method is built on the PTS model, where the beamforming weights in WiMAX take the place of the phase weights in PTS and the sub-blocks represent the clusters. The matrix B is defined as an NL × M array; each entry contains the summation of the IFFT terms within one cluster. The columns of B are the IFFT output samples of the PTS sub-blocks; the number of columns equals the number of disjoint sub-blocks, and each column is multiplied by a separate phase weight. A direct calculation of the matrix B costs 60 IFFT blocks of size 1024, which means 60(1024/2) log2(1024) ≈ 3 × 10^5 complex multiplications. This can be reduced effectively by interleaving and the Cooley-Tukey FFT algorithm, as proposed in [11]. The transmitted sequence s is written as the product of the matrix B and the vector of phase weights in Equation (7).

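$$
s =
\begin{bmatrix}
b_{1,1} & b_{1,2} & \cdots & b_{1,M} \\
b_{2,1} & b_{2,2} & \cdots & b_{2,M} \\
b_{3,1} & b_{3,2} & \cdots & b_{3,M} \\
\vdots & \vdots & \ddots & \vdots \\
b_{LN,1} & b_{LN,2} & \cdots & b_{LN,M}
\end{bmatrix}
\begin{bmatrix}
e^{j\varphi_1} \\ e^{j\varphi_2} \\ e^{j\varphi_3} \\ \vdots \\ e^{j\varphi_M}
\end{bmatrix}. \tag{7}
$$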
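A small sketch of Equation (7) follows. It builds B column by column from oversampled IDFTs of contiguous sub-blocks and forms s as a matrix-vector product; the toy sizes (N = 8, M = 4, L = 2) and helper names are assumptions made for illustration. The last line reproduces the multiplication count quoted above for the direct computation of B.

```python
import cmath
import math


def idft(block):
    """Unitary IDFT of a complex list."""
    size = len(block)
    scale = 1.0 / math.sqrt(size)
    return [
        scale * sum(block[k] * cmath.exp(2j * math.pi * k * n / size)
                    for k in range(size))
        for n in range(size)
    ]


def build_b(data, num_blocks, oversampling):
    """Return B as a list of N*L rows, each holding M entries b_{n,m}."""
    size = len(data)
    width = size // num_blocks
    columns = []
    for m in range(num_blocks):
        sub_block = [0j] * size
        sub_block[m * width:(m + 1) * width] = data[m * width:(m + 1) * width]
        padded = sub_block + [0j] * ((oversampling - 1) * size)
        columns.append(idft(padded))
    return [[col[n] for col in columns] for n in range(oversampling * size)]


def transmit(matrix_b, phases):
    """Equation (7): s = B * [exp(j*phi_1), ..., exp(j*phi_M)]^T."""
    rotations = [cmath.exp(1j * p) for p in phases]
    return [sum(b * r for b, r in zip(row, rotations)) for row in matrix_b]


data = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
B = build_b(data, num_blocks=4, oversampling=2)

# With all phases at zero, s must equal the oversampled IDFT of the full block.
s_zero_phase = transmit(B, [0.0] * 4)
reference = idft(data + [0j] * 8)
max_diff = max(abs(x - y) for x, y in zip(s_zero_phase, reference))

# Direct cost of forming B for 60 clusters with 1024-point IFFTs.
direct_multiplications = 60 * (1024 // 2) * int(math.log2(1024))
```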

Here, we rewrite the optimization problem to find the optimum phase set $\varphi$ as

$$
\varphi = \arg\min_{\varphi_m} \max_{n} |s(n)|^2, \tag{8}
$$

where

$$
s(n) = \sum_{m=1}^{M} b_{n,m}\, e^{j\varphi_m}. \tag{9}
$$

The s(n) are complex values and the $\varphi_m$ are continuous phases in [0, 2π). Substituting $b_{n,m} = R_{n,m} + jI_{n,m}$ and $e^{j\varphi_m} = \cos\varphi_m + j\sin\varphi_m$ in Equation (9) and taking the square of |s(n)| results in Equation (10), where $R_{n,m}$ and $I_{n,m}$ stand for $\Re\{b_{n,m}\}$ and $\Im\{b_{n,m}\}$, respectively. This is a very important equation: it gives the squared magnitude, i.e., the power, of the transmitted output samples, and it is the multi-variable cost function to be minimized, since the largest |s(n)| determines the PAPR of the system. To emphasize its role as the objective function, $|s(n)|^2$ is replaced with $f_n(\varphi)$ in Equation (10).

$$
\begin{aligned}
f_n(\varphi) ={}& \Big(\underbrace{(R_{n,1}\cos\varphi_1 + \cdots + R_{n,M}\cos\varphi_M) - (I_{n,1}\sin\varphi_1 + \cdots + I_{n,M}\sin\varphi_M)}_{A}\Big)^2 \\
&+ \Big(\underbrace{(R_{n,1}\sin\varphi_1 + \cdots + R_{n,M}\sin\varphi_M) + (I_{n,1}\cos\varphi_1 + \cdots + I_{n,M}\cos\varphi_M)}_{B}\Big)^2
\end{aligned} \tag{10}
$$

Clearly, the multi-variable objective function is continuous and differentiable over [0, 2π), so its gradient can be derived analytically; this is a key property for developing a solution. With A and B denoting the bracketed sums in Equation (10), the partial derivatives are

$$
\frac{\partial f_n(\varphi)}{\partial \varphi_m} = -2A\,(R_{n,m}\sin\varphi_m + I_{n,m}\cos\varphi_m) + 2B\,(R_{n,m}\cos\varphi_m - I_{n,m}\sin\varphi_m). \tag{11}
$$

Knowing the gradient of the objective function, the problem can be solved using a wide range of gradient-based optimization methods. The gradient of $|s(n)|^2$ as a function of the phase vector $\varphi = [\varphi_1, \varphi_2, \dots, \varphi_M]$ is defined as the vector $\nabla f_n = \left[\frac{\partial f_n}{\partial \varphi_1}, \frac{\partial f_n}{\partial \varphi_2}, \cdots, \frac{\partial f_n}{\partial \varphi_M}\right]^T$. The Jacobian matrix is defined in Equation (12), where M is the number of sub-blocks and LN is the length of the vector s (the oversampled OFDM symbol). The nth row of this matrix is the gradient of $f_n(\varphi)$.

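$$
J =
\begin{bmatrix}
\frac{\partial f_1}{\partial \varphi_1} & \frac{\partial f_1}{\partial \varphi_2} & \cdots & \frac{\partial f_1}{\partial \varphi_M} \\
\frac{\partial f_2}{\partial \varphi_1} & \frac{\partial f_2}{\partial \varphi_2} & \cdots & \frac{\partial f_2}{\partial \varphi_M} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_{LN}}{\partial \varphi_1} & \frac{\partial f_{LN}}{\partial \varphi_2} & \cdots & \frac{\partial f_{LN}}{\partial \varphi_M}
\end{bmatrix}. \tag{12}
$$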
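The analytic gradient of Equation (11) can be sanity-checked against central finite differences. In the sketch below, A and B are taken as the real and imaginary parts of s(n), which is what the bracketed sums in Equation (10) amount to; the small matrix of $b_{n,m}$ values and the phases are arbitrary test numbers, not data from the paper.

```python
import cmath
import math


def objective(row, phases):
    """f_n(phi) = |sum over m of b_{n,m} * exp(j*phi_m)|^2, Equations (9)-(10)."""
    return abs(sum(b * cmath.exp(1j * p) for b, p in zip(row, phases))) ** 2


def gradient(row, phases):
    """Equation (11), with A = Re{s(n)} and B = Im{s(n)}."""
    s = sum(b * cmath.exp(1j * p) for b, p in zip(row, phases))
    a_term, b_term = s.real, s.imag
    return [
        -2 * a_term * (b.real * math.sin(p) + b.imag * math.cos(p))
        + 2 * b_term * (b.real * math.cos(p) - b.imag * math.sin(p))
        for b, p in zip(row, phases)
    ]


def jacobian(matrix_b, phases):
    """Equation (12): row n holds the gradient of f_n."""
    return [gradient(row, phases) for row in matrix_b]


def numerical_gradient(row, phases, step=1e-6):
    """Central finite differences, used only as a cross-check."""
    grads = []
    for m in range(len(phases)):
        up, down = list(phases), list(phases)
        up[m] += step
        down[m] -= step
        grads.append((objective(row, up) - objective(row, down)) / (2 * step))
    return grads


matrix_b = [
    [0.3 + 0.1j, -0.2 + 0.4j, 0.1 - 0.5j],
    [-0.4 + 0.2j, 0.6 + 0.0j, 0.2 + 0.3j],
]
phases = [0.3, -1.1, 2.0]
J = jacobian(matrix_b, phases)
max_error = max(
    abs(analytic - numeric)
    for row, j_row in zip(matrix_b, J)
    for analytic, numeric in zip(j_row, numerical_gradient(row, phases))
)
```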

The elements of the Jacobian matrix are given by Equation (11).

Minimax Approach. The minimax optimization in Equation (8) minimizes the largest value in a set of multi-variable functions. An initial estimate of the solution is made to start with, and the algorithm proceeds by moving towards the minimum; this is generally defined as

$$
\underset{\varphi}{\text{minimize}} \; \max_{1 \le n \le N} \{ f_n(\varphi) \}. \tag{13}
$$

To minimize the PAPR, the objective of the optimization problem is to minimize the greatest value of $|s(n)|^2$ in Equation (9), which corresponds to $\max\{f_n(\varphi)\}$ in Equation (13). Here, we reformulate the problem into an equivalent non-linear programming problem in order to solve it using a sequential quadratic programming (SQP) technique:

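$$
\begin{aligned}
\underset{\varphi}{\text{minimize}} \quad & f(\varphi) \\
\text{subject to} \quad & f_n(\varphi) \le f(\varphi).
\end{aligned} \tag{14}
$$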

In agreement with this new setting, the objective function $f(\varphi)$ is the maximum of the $f_n(\varphi)$, or equivalently the power of the largest IFFT sample in the whole OFDM sequence, which characterizes the PAPR value. The remaining samples are appended as additional constraints of the form $f_n(\varphi) \le f(\varphi)$. In fact, $f(\varphi)$ is minimized over $\varphi$ using SQP, and the additional constraints are included because we do not want other $f_n$ to grow beyond it while the maximum value is being minimized. In this way, the whole OFDM sequence is kept smaller than the value that is being minimized during the iterations.
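Equation (14) can be read as the standard epigraph form of a minimax problem. As a sketch, introducing an auxiliary scalar t (a symbol that does not appear in the original formulation) makes this structure explicit:

```latex
\begin{aligned}
\min_{\varphi,\; t} \quad & t \\
\text{subject to} \quad & f_n(\varphi) \le t \quad \text{for every sample index } n .
\end{aligned}
```

At any minimizer, t equals $\max_n f_n(\varphi)$, so this problem has the same solutions in $\varphi$ as the minimax problem in Equation (13), while both its objective and its constraints are smooth functions of the unknowns.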

4. Solving the Optimization Problem

The proposed PAPR reduction technique has the unique feature of exploiting the dedicated pilots and the channel estimation procedure, but choosing the best phase coefficients remains a new challenge. In PTS, the optimum weights are selected by an exhaustive search over a quantized set of phase options, whereas here there is no restriction on the phase coefficients and they can be selected from the continuous interval [0, 2π). An efficient optimization algorithm should therefore be used to extract the proper phase choices; the proposed algorithm is a gradient-based method, modified and adapted for the phase optimization problem of the PAPR reduction technique.

A. Sequential Quadratic Programming

SQP is one of the most popular and robust algorithms for non-linearly constrained optimization. Here, it is modified and simplified for the phase optimization problem of PAPR reduction, but the basic configuration is the same as in general SQP. The algorithm proceeds by solving a set of sub-problems, each created to minimize a quadratic model of the objective subject to a linearization of the constraints. The SQP method has been applied successfully to many practical problems; see [12-14] for an overview. An efficient implementation with good performance on many sample problems is described in [15].

The Kuhn-Tucker (KT) equations are necessary conditions for optimality in a constrained optimization problem. If the problem is a convex programming problem, then the KT equations are both necessary and sufficient for a global solution point [16]. The KT equations for the phase optimization problem are stated as follows, where the $\lambda_n$ are the Lagrange multipliers of the constraints.

$$
\nabla f(\varphi) + \sum_{n=1}^{N} \lambda_n \cdot \nabla f_n(\varphi) = 0, \tag{15}
$$

$$
\lambda_n \ge 0. \tag{16}
$$

These equations are used to form the quasi-Newton updating step, which is an important step outlined below. The quasi-Newton steps are implemented by accumulating second-order information about the KT criteria and by checking for optimality during the iterations.

The SQP implementation consists of two loops: the phase solution is updated at each iteration of the major loop, with k as the counter, and the major loop itself contains an inner QP loop that solves for the optimum search direction $d_k$.

Major loop to find φ which minimizes f(φ):
while k < maximum number of iterations do
    QP loop to determine d_k for the major loop:
    while optimal d_k not found do
        d_{l+1} = d_l + α \hat{d}_l
    end while
    φ_{k+1} = φ_k + d_k
    k = k + 1
end while

The step length α is determined within the QP iterations, which are distinguished from the major iterations by the index l as the counter; $\hat{d}_l$ denotes the search direction of the QP iteration.

The Hessian of the Lagrange function is required to form the quadratic objective function. Fortunately, it is not necessary to calculate this Hessian matrix explicitly, since it can be approximated at each major iteration using a quasi-Newton updating method, where the Hessian matrix is estimated using the information provided by gradient evaluations. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) method is one of the most attractive members of the quasi-Newton family and is frequently used in non-linear optimization. It approximates the second derivative of the objective function using Equation (17).

Quasi-Newton methods are a generalization of the secant method for finding the root of the first derivative in multidimensional problems [17]. Convergence of the multi-variable function $f(\varphi)$ can be observed dynamically by evaluating the norm of the gradient $|\nabla f(\varphi)|$. In practice, the first Hessian can be initialized with an identity matrix ($H_0 = I$), so that the first step is equivalent to a gradient descent step, while further steps are gradually refined by $H_k$, the approximation to the Hessian [18]. The updating formula for the Hessian matrix H in each major iteration is given by

$$
H_{k+1} = H_k + \frac{q_k q_k^T}{q_k^T s_k} - \frac{H_k s_k s_k^T H_k^T}{s_k^T H_k s_k}, \tag{17}
$$

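where $H_k$ is an M × M matrix, the $\lambda_n$ are the Lagrange multipliers, and

$$
q_k = \left( \nabla f(\varphi_{k+1}) + \sum_{n=1}^{N} \lambda_n \cdot \nabla f_n(\varphi_{k+1}) \right) - \left( \nabla f(\varphi_k) + \sum_{n=1}^{N} \lambda_n \cdot \nabla f_n(\varphi_k) \right), \tag{18}
$$

$$
s_k = \varphi_{k+1} - \varphi_k. \tag{19}
$$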

The Lagrange multipliers [according to Equation (16)] are non-zero and positive for active-set constraints, and zero for the others. $\nabla f_n(\varphi_k)$ is the gradient of the nth constraint at the kth major iteration. The Hessian is maintained positive definite at the solution point if $q_k^T s_k$ is positive at each update. Here, we modify $q_k$ on an element-by-element basis so that $q_k^T s_k > 0$, as proposed in [19].
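A minimal sketch of a BFGS Hessian update is given below. It uses the standard BFGS form, in which the subtracted term is $H_k s_k s_k^T H_k / (s_k^T H_k s_k)$ for a symmetric $H_k$. Instead of the element-by-element modification of $q_k$ from [19], this sketch simply skips the update when $q_k^T s_k$ is not positive; that safeguard is a simplification chosen here, not the method of the paper, and the vectors are made-up numbers.

```python
def dot(u, v):
    """Inner product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))


def mat_vec(matrix, vector):
    """Matrix-vector product for a list-of-rows matrix."""
    return [dot(row, vector) for row in matrix]


def bfgs_update(hessian, step, grad_diff, eps=1e-12):
    """Return the updated Hessian approximation H_{k+1}.

    step is s_k and grad_diff is q_k. The update is skipped when the
    curvature q_k^T s_k is not positive, so that H stays positive definite.
    """
    curvature = dot(grad_diff, step)
    if curvature <= eps:
        return [row[:] for row in hessian]
    hs = mat_vec(hessian, step)              # H_k s_k (H_k assumed symmetric)
    shs = dot(step, hs)                      # s_k^T H_k s_k
    size = len(step)
    return [
        [hessian[i][j]
         + grad_diff[i] * grad_diff[j] / curvature
         - hs[i] * hs[j] / shs
         for j in range(size)]
        for i in range(size)
    ]


H0 = [[1.0, 0.0], [0.0, 1.0]]               # identity start, as in the text
s_k = [1.0, 0.5]
q_k = [2.0, 0.5]
H1 = bfgs_update(H0, s_k, q_k)
secant_error = max(abs(x - y) for x, y in zip(mat_vec(H1, s_k), q_k))
H_skipped = bfgs_update(H0, s_k, [-1.0, 0.0])   # negative curvature: no update
```

The updated matrix satisfies the secant condition $H_{k+1} s_k = q_k$, which is the defining property of this family of updates.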



After the above update at each major iteration, a QP problem is solved to find the search direction $d_k$, which is used to minimize the SQP objective function $f(\varphi)$. The complex non-linear problem in Equation (14) is thus broken down into several convex optimization sub-problems, which can be solved with known programming techniques. The quadratic objective function q(d) can be written as

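$$
\begin{aligned}
\underset{d \in \mathbb{R}^n}{\text{minimize}} \quad & q(d) = \frac{1}{2} d^T H_k d + \nabla f(\varphi_k)^T d \\
\text{subject to} \quad & \nabla f_n(\varphi_k)^T d + f_n(\varphi_k) \le 0.
\end{aligned} \tag{20}
$$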

We generally refer to the constraints of the QP sub-problem as G(d) = A d − a, where $\nabla f_n(\varphi_k)^T$ is the nth row of the matrix A and $-f_n(\varphi_k)$ is the nth element of the vector a.

The quadratic objective function q(d) reflects the local properties of the original objective function. The main reason to use a quadratic function is that such problems are easy to solve, yet they mimic the non-linear behavior of the initial problem. A reasonable choice for the objective function is the local quadratic approximation of $f(\varphi_k)$ at the current solution point, and the obvious option for the constraints is the linearization of the current constraints of the original problem around $\varphi_k$, which together form a convex optimization problem. In the next section, we explain the QP algorithm, which is solved iteratively by updating the initial solution. The notation used in the following section is summarized here for convenience.

• $d_k$ is the search direction in the major loop, while $\hat{d}_l$ is the search direction in the QP loop.
• k is the iteration counter in the major loop and l is the counter in the QP loop.
• $\varphi_k$ is the minimization variable in the major loop; it is the phase vector in this problem.
• $d_l$ is the minimization variable in the QP problem.
• $f_n(\varphi_k)$ is the nth constraint of the original minimax problem at a solution point $\varphi_k$.
• $G(d_l) = A d_l - a$ represents the constraints of the QP sub-problem at a solution point $d_l$, and $g_n(d_l)$ is the nth constraint.
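The ingredients of the QP sub-problem in Equation (20) can be assembled as in the sketch below: the rows of A are the constraint gradients, the entries of a are the negated constraint values, and a candidate d is feasible when every entry of G(d) = A d − a is non-positive. All numerical values are hypothetical and chosen only to exercise the functions.

```python
def dot(u, v):
    """Inner product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))


def qp_objective(hessian, gradient, d):
    """q(d) = 1/2 d^T H_k d + grad f(phi_k)^T d from Equation (20)."""
    hd = [dot(row, d) for row in hessian]
    return 0.5 * dot(d, hd) + dot(gradient, d)


def linearize(constraint_gradients, constraint_values):
    """Build A (rows grad f_n(phi_k)^T) and a (entries -f_n(phi_k))."""
    matrix_a = [list(row) for row in constraint_gradients]
    vector_a = [-value for value in constraint_values]
    return matrix_a, vector_a


def residuals(matrix_a, vector_a, d):
    """G(d) = A d - a; the QP constraints require every entry to be <= 0."""
    return [dot(row, d) - a_n for row, a_n in zip(matrix_a, vector_a)]


# Hypothetical data for a problem with M = 2 phases and three constraints.
H = [[2.0, 0.0], [0.0, 1.0]]
grad_f = [1.0, -1.0]
jac = [[0.5, -0.2], [-0.3, 0.4], [0.1, 0.1]]
f_vals = [-0.4, -0.5, -0.7]

A, a = linearize(jac, f_vals)
d = [-0.2, 0.1]
q_value = qp_objective(H, grad_f, d)
g = residuals(A, a, d)
feasible = all(value <= 0 for value in g)
```

At d = 0 the residuals reduce to the constraint values $f_n(\varphi_k)$ themselves, which is a quick way to check the sign conventions of A and a.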

B. Quadratic ProgrammingIn a quadratic programming (QP) problem, a multi-vari-able quadratic function is maximized or minimized, sub-ject to a set of linear constraints on these variables.Basically, the quadratic programming problem can beformulated as: minimizing f(x) = 1/2 xT C x+ cT x withrespect to x, with linear constraints Ax ≤ a ,whichshows that every element of the vector Ax is ≤ to thecorresponding element of the vector a .

The quadratic program has a global minimizer if there exists some feasible vector $x$ satisfying the constraints, provided that $f(x)$ is bounded below on the feasible region; this is true when the matrix $C$ is positive definite. In that case the quadratic objective function $f(x)$ is strictly convex, so with linear constraints and a non-empty feasible region the problem has a unique global minimizer. If $C$ is zero, the problem becomes a linear program [20].

A variety of methods are commonly used for solving a QP problem; the active set strategy has been applied in the phase optimization algorithm. We will see how this method is suitable for problems with a large number of constraints.

In general, the active set strategy involves an objective function to optimize and a set of constraints, defined here as $g_1(d) \le 0, g_2(d) \le 0, \ldots, g_n(d) \le 0$. The collection of all $d$ that satisfy them forms the feasible region in which the optimal solution is sought. Given a point $d$ in the feasible region, a constraint $g_n(d) \le 0$ is called active at $d$ if $g_n(d) = 0$ and inactive at $d$ if $g_n(d) < 0$ (endnote b). The active set at $d$ is made up of those constraints $g_n(d)$ that are active at the current solution point.

The active set specifies which constraints control the final result of the optimization, so it plays a central role. For example, since the solution of a quadratic program is not necessarily on one of the edges of the bounding polygon, specifying the active set creates a subset of inequalities within which the solution is sought [21-23]. As a result, the complexity of the search is reduced effectively. This is why non-linearly constrained problems can often be solved with SQP in fewer iterations than unconstrained problems: the constraints limit the feasible area.

In the phase optimization problem, the QP subproblem is solved to find the vector $d_k$, which is used to form a new $\phi$ vector in the $k$th major iteration, $\phi_{k+1} = \phi_k + d_k$. Since the matrix $C$ of the general problem is replaced with a positive definite Hessian, as discussed earlier, the QP subproblem is a convex optimization problem with a unique global minimizer. This was confirmed in the simulations: for a given setting, the $d_k$ that minimizes the QP problem is always identical, regardless of the initial guess.

The QP subproblem is solved iteratively; at each step the solution is updated as $d_{l+1} = d_l + \alpha \hat{d}_l$. The active set of constraints at the $l$th iteration, $\bar{A}_l$, is used to set a basis for the search direction $\hat{d}_l$. It constitutes an estimate of the constraint boundaries at the solution point and is updated at each QP iteration. When a new constraint joins the active set, the dimension of the search space is reduced as expected.
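As a minimal sketch of the definition of active and inactive constraints, the function below classifies linear constraints of the form $A_n d - a_n \le 0$ at a given point. The function name `active_set`, the tolerance and the example data are assumptions made for illustration.

```python
def active_set(d, A, a, tol=1e-9):
    """Return the indices n for which g_n(d) = A_n d - a_n is zero (within tol)."""
    active = []
    for n, (row, a_n) in enumerate(zip(A, a)):
        g_n = sum(r * x for r, x in zip(row, d)) - a_n
        if g_n > tol:
            raise ValueError(f"constraint {n} is violated, d is not feasible")
        if abs(g_n) <= tol:
            active.append(n)  # d lies on the boundary of constraint n
    return active


# Hypothetical constraints: d0 <= 1, d1 <= 1, d0 + d1 <= 1.5
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a = [1.0, 1.0, 1.5]

print(active_set([1.0, 0.5], A, a))  # [0, 2]
print(active_set([0.0, 0.0], A, a))  # []
```

At the point (1, 0.5) the first and third constraints hold with equality, so only those two restrict the next search direction.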



Here $\hat{d}_l$ denotes the search direction in the QP iteration; it is different from $d_k$ in the major iteration of the SQP, but it plays the same role of indicating the direction to move towards the minimum. The search direction $\hat{d}_l$ in each QP iteration is calculated to minimize the quadratic objective function while remaining on any active constraint boundaries.

The feasible subspace for $\hat{d}_l$ is built from a basis $Z_l$ whose columns are orthogonal to the active set $\bar{A}_l$, i.e., $\bar{A}_l Z_l = 0$. Therefore, any linear combination of the columns of $Z_l$ constitutes a search direction that is guaranteed to remain on the boundaries of the active constraints.

The matrix $Z_l$ is formed from the last $M - P$ columns of the QR decomposition of the matrix $\bar{A}_l^T$ in Equation (21) and is given by $Z_l = Q[:, P+1:M]$. Here, $P$ is the number of active constraints and $M$ is the number of design parameters in the optimization problem, which is the number of sub-blocks in the PAPR problem.

$$Q^T \bar{A}_l^T = \begin{bmatrix} R \\ 0 \end{bmatrix}. \qquad (21)$$

The active constraints must be linearly independent, so the maximum number of possible independent equations is equal to the number of design variables; in other words, $P < M$. For more details see [19].

Finally, there are two possible situations in which the search in the QP subproblem terminates and the minimum is found: either the step length is 1, or the optimum $d_l$ has been found in the current subspace and all of its Lagrange multipliers are positive.
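The construction of the null-space basis can be sketched in a few lines. The code below is an illustration under stated assumptions: it uses modified Gram-Schmidt orthogonalization (one way of obtaining the orthogonal factor of a QR decomposition) instead of a library QR routine, and the function name `null_space_basis` is hypothetical. The rows of the active constraint matrix are orthonormalized first, then the basis is completed with unit vectors; the last $M - P$ vectors are the columns of $Z_l$.

```python
import math


def null_space_basis(A_active, tol=1e-10):
    """Return the columns of Z as a list of vectors z with A_active z = 0."""
    P, M = len(A_active), len(A_active[0])
    # Candidates: the P constraint rows, then the M unit vectors.
    candidates = [list(row) for row in A_active]
    candidates += [[1.0 if i == j else 0.0 for i in range(M)] for j in range(M)]
    Q = []
    for idx, v in enumerate(candidates):
        w = list(v)
        for q in Q:  # remove the components along the vectors found so far
            proj = sum(qi * wi for qi, wi in zip(q, w))
            w = [wi - proj * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:
            Q.append([wi / norm for wi in w])
        elif idx < P:
            raise ValueError("active constraints are linearly dependent")
        if len(Q) == M:
            break
    return Q[P:]  # the last M - P orthonormal vectors span the null space


# One active constraint (P = 1) in M = 3 dimensions: d0 + d1 = const.
A_active = [[1.0, 1.0, 0.0]]
Z = null_space_basis(A_active)
print(len(Z))  # 2 basis vectors, i.e. M - P
for z in Z:
    print(sum(r * zi for r, zi in zip(A_active[0], z)))  # approximately 0
```

A production implementation would normally rely on a Householder-based QR routine, which is numerically more robust than Gram-Schmidt and can be updated cheaply when a constraint joins or leaves the active set.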

C. SQP Pseudo Code
Here, pseudo code is provided for the SQP implementation; we will refer to it in the complexity evaluation section. As discussed in the previous parts, the algorithm consists of two loops.

Step 0 Initialization of the variables before starting the SQP algorithm

• An extra element (slack variable) is appended to the variables, so $\phi = [\phi_0, \phi_1, \phi_2, \ldots, \phi_M]$. The objective function is defined as $f(\phi) = \phi_M$ and is initialized with zero; the other elements can be any random guess.
• The initial Hessian is an identity matrix, $H_0 = I$, and the gradient of the objective function is $\nabla f(\phi_k)^T = [0, 0, \ldots, 1]$.

Step 1 Enter the major loop and repeat until the defined maximum number of iterations is exceeded.

• Calculate the objective function and constraints according to Equation (10)
• Calculate the Jacobian matrix, Equation (11)
• Update the Hessian based on Equation (17) and make sure it is positive definite
• Call the QP algorithm to find $d_k$

Step 2 Initialization of the variables before starting the QP iterations

• Find a feasible starting point for $d_0 = [d_0^0, d_0^1, \ldots, d_0^M]$ and $\hat{d}_0 = [\hat{d}_0^0, \hat{d}_0^1, \ldots, \hat{d}_0^M]$;
  Check that the constraints in the initial working set (endnote c) are not dependent; otherwise, find a new initial point $d_0$ which satisfies this initial working set.
  Calculate the initial constraints $A d_0 - a$,
  if max(constraints) > ε then
      The constraints are violated and a new $d_0$ needs to be searched
  end if
• Initialize $Q$, $R$ and $Z$, and compute the initial projected gradient $\nabla q(d_0)$ and the initial search direction $\hat{d}_0$

Step 3 Enter the QP loop and repeat until the minimum is found

• Find the distance in the search direction we can move before violating a constraint
  gsd = $A \hat{d}_l$ (gradient with respect to the search direction)
  ind = find(gsd_n > threshold)
  if isempty(ind) then
      Set the distance to the nearest constraint as zero and put $\alpha = 1$
  else
      Find the distance to the nearest constraint as follows

      $$\alpha = \min_{1 \le n \le N} \left\{ \frac{-(A_n d_l - a_n)}{A_n \hat{d}_l} \right\}. \qquad (22)$$

      Add the constraint $A_i$ (endnote d) to the active set $\bar{A}_l$
      Decompose the active set as in (21)
      Compute the subspace $Z_l = Q[:, P+1:M]$
  end if
• Update $d_{l+1} = d_l + \alpha \hat{d}_l$
• Calculate the gradient of the objective at this point, $\nabla q(d_l)$
• Check if the current solution is optimal (endnote e)
  if $\alpha = 1$ || length($\bar{A}_l$) = $M$ then
      Calculate the Lagrange multipliers $\lambda$ of the active set by solving

      $$-R_l \lambda_l = Q_l^T \nabla q(d_l). \qquad (23)$$

  end if
  if all $\lambda_i > 0$ then
      return $d_k$
  else
      Remove the constraints with $\lambda_i < 0$
  end if
• Compute the QP search direction according to the Newton step criterion,

  $$\hat{d}_l = -Z_l \left( Z_l^T H_k Z_l \right)^{-1} Z_l^T \nabla q(d_l), \qquad (24)$$

  where $Z_l^T H_k Z_l$ is the projected Hessian; see Appendix A.

Step 4 Update the solution $\phi$ for the $k$th iteration, $\phi_{k+1} = \phi_k + d_k$, and go back to Step 1.
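The step-length rule of Step 3 and Equation (22) can be sketched as follows. This is an illustrative reading of the pseudo code rather than the reference implementation: the function name `step_length` and the threshold value are assumptions, and the step is capped at 1 because Appendix A states that the step length is either 1 or smaller.

```python
def step_length(d, d_hat, A, a, active, threshold=1e-12):
    """Return (alpha, index of blocking constraint or None) for d + alpha * d_hat."""
    alpha, blocking = 1.0, None
    for n, (row, a_n) in enumerate(zip(A, a)):
        if n in active:
            continue  # the search direction stays on active boundaries
        gsd_n = sum(r * x for r, x in zip(row, d_hat))  # n-th element of A d_hat
        if gsd_n > threshold:  # moving towards the boundary of constraint n
            dist = -(sum(r * x for r, x in zip(row, d)) - a_n) / gsd_n
            if dist < alpha:
                alpha, blocking = dist, n
    return alpha, blocking


# Hypothetical constraints d0 <= 1 and d1 <= 1, starting from the origin.
A = [[1.0, 0.0], [0.0, 1.0]]
a = [1.0, 1.0]

print(step_length([0.0, 0.0], [2.0, 0.0], A, a, active=set()))  # (0.5, 0)
print(step_length([0.0, 0.0], [0.5, 0.0], A, a, active=set()))  # (1.0, None)
```

In the first call the full step would cross the boundary of constraint 0, so the step is shortened to 0.5 and that constraint would be added to the active set; in the second call the full Newton step is feasible.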

5. Complexity Analysis
The SQP algorithm is mathematically involved and can be implemented with different modifications; therefore, the complexity evaluation is not straightforward. The number of QP iterations is not fixed (endnote f) and differs for each OFDM symbol; here, the average number of QP iterations is used to evaluate the complexity. For 60 sub-blocks, 1024 sub-carriers and 64-QAM, the average is 80 QP iterations for each major SQP iteration.

Another difficulty in counting the required operations is the length of the active set, which changes during the iterations, starting from 1 and reaching at most $M$ at the end of the loop. Consequently, the sizes of $R$ in the QR decomposition and of $Z$, the basis for the search subspace, are not fixed during the process, so the complexity cannot be assessed directly for each QP iteration and some numerical estimation is necessary.

To evaluate the amount of computation needed for this technique, all steps in the pseudo code are reviewed in detail and an explicit expression is given for each part. First, the complexity of the major loop is assessed in Steps 1 and 4, and then the QP loop is evaluated separately. Finally, the complexity is derived in terms of the number of sub-blocks and major iterations with some approximation and numerical analysis.

Major loop, Steps 1 and 4

1) Objective function and constraints from Equation (10): $4M \times N$ multiplications and the same number of additions; $N$ comparisons to find the maximum of the constraints.
2) Jacobian matrix from Equation (11): $6M \times N$ multiplications and $4M \times N$ additions.
3) Hessian update, Equation (17): $2M \times N$ multiplications and $2M \times (N + 1)$ additions to calculate Equation (19); $3(M + 1)$ additions and $M$ multiplications for matrices of size $M \times 1$ to compute $q_k$ and $s_k$; $2M$ divisions and $M$ additions to update $H$.
4) The solution $\phi$ is updated, which requires $M$ additions.

QP loop, Step 3

1) Gradient with respect to the search direction: $4M \times N$ multiplications and additions to calculate gsd; $N$ comparisons to find the maximum.
2) Distance to the nearest constraint from Equation (22): $2M \times N$ multiplications and additions; $N$ comparisons to find the minimum.
3) Addition of a constraint to the active set: assume the active set has length $L - 1$; the new constraint is inserted and the matrix size becomes $M \times L$. Computing the QR decomposition of this matrix needs $2L^2(M - L/3)$ operations [24].
4) Update of the solution $d_l$, which needs $M$ additions.
5) The gradient of the objective at the new solution point needs $M^2$ multiplications and $M^2 + 1$ additions.
6) The Lagrange multipliers are obtained by solving a linear system of equations, which imposes a complexity of the order of $M^3$ [24].
7) Removal of a constraint in case of $\lambda_i < 0$: removing the constraint and recalculating the QR decomposition requires $2L^2(M - L/3)$ operations.
8) Search direction according to Equation (24): this is the solution of a system of linear equations. The size of $Z$ varies during the iterations; it starts from $M \times M$ and reduces to an $M \times 1$ matrix at the end. Accordingly, the complexity in a QP iteration can be stated as $2S^2(M + S/3)$, where $S$ is the number of columns in $Z$ at each step.

First, the computation required for the major loop is obtained as $22NM + 9M + N$. Next, the computation in the QP loop is divided into fixed and variable parts (endnote g); $(6M + 2)N + 2M^2 + M$ operations are performed in parts 1, 2, 4 and 5 in every iteration. In addition, the remaining parts involve a variable number of operations, which are evaluated separately.

To resolve the search direction in Equation (24), two states are possible: the first $M$ iterations need $0.4167M^4 + 0.6667M^3 + 0.25M^2$ operations in total, which is derived by numerical analysis and polynomial fitting, and each further iteration needs $2M$ operations. Therefore, the required number of flops can be approximated as $0.4M^3 + 0.7M^2 + 0.2M$ for each iteration. For the QR decomposition, which is done in every iteration, the procedure is the same: the first $M$ iterations need $0.25M^4 - 0.3333M^3 + 0.0833M^2$ operations and each extra iteration needs $\frac{4}{3}M^3$ flops. The dominant computation is therefore approximated as $0.25M^3$ for each QP iteration by dividing the total number of operations by $M$.

With an acceptable approximation, we claim that the Lagrange multiplier calculation can be neglected in comparison with the other dominant parts of the computation, because it mostly appears after $M + 1$ iterations; this occurs when the active set is full ($M$ constraints have been added to the active set), or sometimes when the exact step to the minimum is found. To sum up, the total number of operations needed for each QP iteration is roughly $0.65M^3 + 2.7M^2 + 6NM + 2N$, and the total complexity is shown in Table 1, where $k$ and $l$ are the numbers of major and QP iterations, respectively.

Table 1 The complexity of different algorithms to search the optimum phase set.

Algorithm    Operations
OPT PTS      $2K^M \times MN$
PSO          $2kn \times (M + 1)N$
SQP          $k \times \left( (0.65M^3 + 2.7M^2 + 6NM + 2N) \times l + (22NM + 9M + N) \right)$

There are other optimization methods that can be used to find the best phase weights. PSO is one of the methods proposed for the PTS phase search, and many modifications have been introduced to simplify the technique [25]. However, numerical optimization techniques like PSO are only applicable to PTS with a limited number of sub-blocks and sub-carriers (at most 256 sub-carriers and 16 sub-blocks), so that the algorithm converges fast enough to the optimal solution. Here there are 60 sub-blocks, and even when the allowed phase set is just ±1, the initial generated solutions span $2^{60}$ possible options. To reduce the convergence time of the optimization technique, the number of randomly generated solutions needs to be a reasonable proportion of all possible solutions, while the complexity increases linearly with the number of particles in the initial swarm population. The continuous version of PSO is implemented, and the simulation result is shown in Figure 7 for a number of computations almost equal to that of the SQP curve.

The complexity of PSO is expressed as the number of required flops in Table 1, where $k$ is the number of iterations and $n$ is the number of initial solutions, i.e., the swarm population. For more details on the complexity of PSO, see [26].

The complexity of SQP is illustrated graphically, showing the number of operations in the SQP algorithm for two OFDM symbols in time with 1024 sub-carriers. Figure 3 indicates the trend when the number of iterations increases. Predictably, when more sub-blocks are chosen to be phase rotated, the complexity rises with a sharper slope versus the number of iterations, because $M^3$ is the coefficient that dominantly defines the slope with respect to $l$. Figure 4 shows how the complexity grows almost linearly with the number of sub-blocks for a small number of iterations, while it tends to a cubic curve for a larger number of iterations.

Figure 3 Complexity of the SQP algorithm versus the number of iterations for different numbers of sub-blocks (M = 5, 12, 20, 60; vertical axis: operations, ×10^8).

Figure 4 Complexity of the SQP algorithm versus the number of sub-blocks for different numbers of iterations (1, 3, 5, 7, 9 iterations; vertical axis: operations, ×10^8).

The exhaustive search, whose complexity is shown in the first row of Table 1, is used in conventional optimal PTS and has a significantly higher cost than the proposed algorithm. Moreover, its performance is not as good as that of SQP, since the phase coefficients are optimized over a quantized phase set. The whole calculation in Equation (7) has to be repeated for every combination of phase vectors, and this requires $K^M \times MN$ additions and multiplications, where $K$ is the number of allowed phases and $M$ is the number of sub-blocks. Additionally, $K^M \times (N + 1)$ comparisons are needed to find the largest sample of each produced transmit sequence, and also to choose the minimum among all PAPRs.

To give a better perception of the PTS complexity in this context, assume the allowed phase set is ±1, so $K = 2$ and no phase rotation is required. Also, let the number of sub-blocks be $M = 60$ with the same setting as for the SQP; then approximately $10^{23}$ additions and $10^{21}$ comparisons have to be performed to find the optimum phase, which is clearly impractical. In contrast, the SQP requires about $10^8$ flops for 60 sub-blocks, which is roughly equivalent to the PTS exhaustive search with only 12 sub-blocks and two phase options. Given recent developments in DSP technology and the time schedule in the WiMAX and LTE standards, this amount of computation is affordable.

There are many methods in the literature dedicated to developing sub-optimal PTS schemes that reduce the complexity of the exhaustive search in the conventional PTS technique at the cost of performance degradation. In this paper, we introduced a systematic optimization technique to achieve the optimal solution of the phase rotation approach for PAPR reduction, which has not been studied before. Also, the proposed technique does not incur the common costs of increased BER at the receiver or increased transmit power, so the only costly part is the optimization procedure. In every other PTS technique, by contrast, side information is sent to the receiver, which causes a reduction in spectral efficiency, an increase in transmit power, or even BER degradation in case of a transmission error.

There are not many options for PAPR reduction techniques without side information, and it is not fair to compare the SQP technique with other PTS phase optimization approaches which require explicit information to be sent to the receiver.
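The operation counts of Table 1 are easy to evaluate numerically. The sketch below simply encodes the three expressions of the table; the function names are illustrative, and the parameter values are the ones quoted in this section (N = 1024 sub-carriers, on average l = 80 QP iterations per major iteration).

```python
def sqp_operations(M, N, k, l):
    """SQP row of Table 1: k major iterations with l QP iterations each."""
    per_qp = 0.65 * M**3 + 2.7 * M**2 + 6 * N * M + 2 * N
    per_major = 22 * N * M + 9 * M + N
    return k * (per_qp * l + per_major)


def exhaustive_pts_operations(M, N, K):
    """OPT PTS row of Table 1: all K^M phase combinations are tested."""
    return 2 * K**M * M * N


def pso_operations(M, N, k, n):
    """PSO row of Table 1: k iterations with a swarm of n particles."""
    return 2 * k * n * (M + 1) * N


N = 1024
print(f"SQP, M=60, one major iteration: {sqp_operations(60, N, 1, 80):.2e}")    # 4.30e+07
print(f"Exhaustive PTS, M=12, K=2:      {exhaustive_pts_operations(12, N, 2):.2e}")  # 1.01e+08
print(f"Exhaustive PTS, M=60, K=2:      {exhaustive_pts_operations(60, N, 2):.2e}")  # 1.42e+23
```

With these formulas a few major SQP iterations for 60 sub-blocks cost on the order of $10^8$ operations, the same order as an exhaustive ±1 search over only 12 sub-blocks, which matches the comparison made in the text.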

6. Simulation Results
The proposed PAPR reduction technique is simulated for an OFDMA system with 1024 sub-carriers and 64-QAM modulation, using the WiMAX data structure explained in Figure 1. The cumulative distribution function (CDF) of the PAPR is one of the most frequently used performance measures for PAPR reduction techniques. The complementary CDF (CCDF) is used here to evaluate the different methods; it denotes the probability that the PAPR of a data block exceeds a given threshold and is expressed as CCDF = 1 - CDF.

To give a better perception of the PAPR cost function, a 3-D plot is provided in Figure 5, which illustrates the variation of the PAPR, or equivalently the maximum amplitude of one OFDM symbol partitioned into two disjoint sub-blocks, versus the two phase coefficients. Predictably, two sub-blocks cannot do much for PAPR reduction; the plot is only meant to give a visual impression of the cost function to be minimized in the SQP optimization algorithm.

As can be seen, there are many local minima with slightly different levels; this is one of the promising properties of this optimization problem, because reaching a local minimum satisfies the PAPR reduction aim even though the global minimum is not found. As a result, the performance of the proposed algorithm is relatively insensitive to the initialization of the optimization.

The time domain signal of two 1024-sub-carrier OFDM symbols is shown in Figure 6, before and after the signal processing algorithm. It is clear that the proposed method reduces the magnitude variations dramatically and that the back-off margin can be much smaller.
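To make the performance measure concrete, the following sketch computes the PAPR of a time-domain OFDM symbol and an empirical CCDF value. It is a simplified illustration under stated assumptions: a direct (non-fast) inverse DFT, no oversampling, and hypothetical helper names `idft`, `papr_db` and `ccdf`; the simulations in this paper use 1024 sub-carriers and the WiMAX data structure.

```python
import cmath
import math


def idft(X):
    """Direct inverse DFT of the frequency-domain symbols X (O(N^2), small N only)."""
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]


def papr_db(x):
    """Peak-to-average power ratio of the time-domain samples x, in dB."""
    power = [abs(s) ** 2 for s in x]
    return 10.0 * math.log10(max(power) / (sum(power) / len(power)))


def ccdf(papr_values, threshold_db):
    """Empirical Pr{PAPR > threshold}, i.e. 1 - CDF at the threshold."""
    exceed = sum(1 for p in papr_values if p > threshold_db)
    return exceed / len(papr_values)


# A single active sub-carrier gives a constant envelope (PAPR close to 0 dB).
single_tone = [0, 1, 0, 0, 0, 0, 0, 0]
# Identical symbols on all sub-carriers add coherently (worst case, 10*log10(N) dB).
all_equal = [1] * 8

print(round(papr_db(idft(single_tone)), 6))
print(round(papr_db(idft(all_equal)), 2))      # 9.03
print(ccdf([5.0, 7.0, 9.0, 11.0], 8.0))        # 0.5
```

The two test symbols bracket the range of possible PAPR values for eight sub-carriers, which is why the peak level depends so strongly on the transmitted data and on any phase weights applied to the sub-blocks.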

A. Performance of Different Algorithms
The performance of four different optimization techniques is illustrated in Figure 7 by CCDF curves. Once the Jacobian of the cost function is defined, the optimization problem can be treated with different optimization methods. The SQP is the best solution for the problem in terms of PAPR reduction performance, but the least square error (LSE) approach can also be used to reduce the peak amplitude of the signal with much less complexity. Its performance is not as good as that of the SQP algorithm, but it is still comparable with existing PAPR reduction techniques [6].

Figure 5 3D power cost function of PAPR for a random OFDM symbol which is divided into two sub-blocks with corresponding phase coefficients φ1 and φ2 (horizontal axes: φ1 and φ2 in radians).

The LSE algorithm minimizes the objective function $f(x) = (f_1(x))^2 + (f_2(x))^2 + \cdots + (f_N(x))^2$, which is the sum of the squared OFDM sub-carrier amplitudes (endnote h). The components are forced to be equal to minimize the sum, so the large samples are pushed to a specific level, whereas the smaller ones become larger. One of the optimization methods examined for searching the phase coefficients in PTS is particle swarm optimization (PSO) [27]. The gain achieved by PSO is slightly better than that of LSE, but PSO is expensive to implement, especially when the number of sub-blocks is large. The simulation results show that, for the same amount of computation, PSO is 2 dB worse than SQP when the initial particle number is n = 100 and k = 50 iterations are used [26].

Figure 6 The comparison between the time domain OFDM symbol before and after the PAPR reduction procedure for 60 clusters and ten iterations. (a) Original OFDM signal (b) Reduced PAPR signal

If the search for the global minimum can be performed in each OFDM symbol, then the CCDF curve improves to some degree. In our test, each OFDM symbol was processed 100 times with different initial guesses and the result with the smallest PAPR was selected. The result in Figure 7 (advanced SQP) shows an overall improvement of about 0.5 dB. In this case, the PAPR of the system can almost be considered a deterministic value, since the CCDF curve is almost vertical.

B. Evaluation of Effective Parameters in SQP Performance
Figure 8 shows the performance of the SQP algorithm at the point $\Pr\{\mathrm{PAPR} > \mathrm{PAPR}_0\} = 10^{-4}$ for 10,000 random OFDM symbols with 64-QAM modulation versus the number of major iterations. The vertical axis represents the PAPR reduction gain in dB, which is the difference between the original CCDF curve and the processed signal curve at the probability indicated in Figure 7. As can be seen, most of the gain is obtained in the first iteration, and after more than ten iterations the progress becomes slower.

Figure 9 shows the degradation in PAPR reduction when the number of sub-blocks is reduced. As explained earlier, each cluster can be phase rotated, and this rotation is reversed at the receiver in the channel equalization process. To bring down the complexity, the same phase coefficient is assigned to several adjacent clusters, which simplifies the optimization algorithm. For instance, 30 sub-blocks means two clusters within one sub-block, with each sub-block weighted by a specific phase coefficient. In practice, there cannot be 120 phase coefficients or sub-blocks, because that would mean one cluster has two phase weights, which cannot be compensated at the receiver according to the WiMAX standard. Nevertheless, a 120 sub-block configuration is simulated in Figure 9 to show the trend of the PAPR reduction gain versus the number of sub-blocks.
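The grouping of adjacent clusters into sub-blocks can be sketched as a simple expansion of the optimized phase vector. The function name `expand_to_clusters` is hypothetical, and the default of 60 clusters follows the configuration used in the simulations; the check rejects configurations, such as 120 sub-blocks, that would require more than one phase weight per cluster.

```python
def expand_to_clusters(subblock_phases, n_clusters=60):
    """Repeat each sub-block phase for the adjacent clusters that share it."""
    n_sub = len(subblock_phases)
    if n_sub > n_clusters or n_clusters % n_sub != 0:
        raise ValueError("each cluster must carry exactly one phase weight")
    per_block = n_clusters // n_sub
    return [phase for phase in subblock_phases for _ in range(per_block)]


# Three sub-blocks spread over six clusters: two adjacent clusters per sub-block.
print(expand_to_clusters([0.1, 0.2, 0.3], n_clusters=6))
# [0.1, 0.1, 0.2, 0.2, 0.3, 0.3]

# 30 sub-blocks for 60 clusters: each phase is shared by two clusters.
print(len(expand_to_clusters([0.0] * 30)))  # 60
```

Fewer sub-blocks shrink the number of optimization variables M, which lowers the cubic term in the complexity at the price of the reduced gain shown in Figure 9.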

Figure 8 SQP PAPR reduction gain versus number of major iterations when Pr{PAPR > PAPR0} = 10^-4 (horizontal axis: number of iterations; vertical axis: PAPR reduction gain in dB).

Figure 7 The comparison between CCDF curves of different PAPR reduction algorithms (curves: Default, Advanced-SQP, SQP, PSO, LSE; horizontal axis: PAPR in dB; vertical axis: P(PAPR > PAPR0)). Advanced-SQP outperforms the other methods with 6.2 dB gain, while the LSE gives the least PAPR reduction gain of 3.4 dB.

Finally, the PAPR reduction performance in terms of the CCDF curve does not change with different initial guesses, because the maximum over all 10,000 simulated OFDM symbols defines the CCDF curve at low probabilities Pr{PAPR > PAPR0}, and this does not depend on the initial solution. Within each OFDM symbol, however, the minimum can be found by examining various starting points, and the performance can be improved, as the advanced-SQP curve in Figure 7 illustrates.

7. Concluding Remarks
We introduced a precoding PAPR reduction technique that is applicable to OFDM/A communication systems using dedicated pilots. We developed the technique for a WiMAX system, but it is applicable to OFDM/A systems in general where dedicated pilots and data are both beamformed. Beamforming performance depends on the relative phase shift between antennas but is unaffected by a phase shift common to all antennas. PAPR, on the other hand, changes with a common phase shift, and the PAPR reduction technique proposed in this paper is based on this property. Each cluster within the WiMAX data structure is weighted with a proper phase coefficient, and the coefficients are optimized to minimize the PAPR of the time domain transmitted signal.

The proposed technique comes with interesting unique features, making it a very appealing method, especially for standard-constrained applications. No side information is sent to the receiver, so the throughput is not affected, and neither the transmitted power nor the bit error rate increases, which are otherwise common drawbacks of many PAPR reduction techniques. Moreover, an optimization technique for finding the best weights was proposed. The PAPR reduction problem was formulated as a minimax problem that was solved by deriving the gradient analytically and modifying the SQP algorithm to solve the optimization.

The SQP algorithm works effectively with a large PAPR reduction gain. At the cost of a smaller PAPR reduction gain, it is possible to reduce the computational complexity of the technique by using other gradient-based optimization techniques. Even lower complexity can be achieved using a least squares-based formulation, but simulation results indicated a substantial performance loss compared with the SQP approach. The SQP itself can be implemented in different ways to simplify the algorithm, and several steps can be done in parallel for a more practical hardware implementation.

Appendix A
Calculation of the search direction $\hat{d}_l$
The procedure for deriving the search direction of the QP is explained in [19] and is included here for convenience. Once $Z_l$ is derived, a new search direction $\hat{d}_l$ is sought that minimizes the QP objective function $q(d)$; it is a linear combination of the columns of $Z_l$ and lies in the null space of the active constraints. Thus, the quadratic objective function can be reformulated as a function of some vector $b$ by substituting $\hat{d}_l = Z_l b$ in the general QP problem:

$$q(b) = \frac{1}{2} b^T Z_l^T H Z_l b + c^T Z_l b. \qquad (25)$$

Differentiating with respect to $b$ yields $\nabla q(b) = Z_l^T H Z_l b + Z_l^T c$, where $\nabla q(b)$ is referred to as the projected gradient of the quadratic function, because it is the gradient projected onto the subspace defined by $Z_l$. The minimum of the function $q(b)$ in the subspace defined by $Z_l$ occurs when $\nabla q(b) = 0$, which is the solution of the system of linear equations

$$Z_l^T H Z_l b = -Z_l^T c. \qquad (26)$$

Solving Equation (26) for $b$ at each QP iteration gives $\hat{d}_l$, and the step is then taken as $d_{l+1} = d_l + \alpha \hat{d}_l$. Since the objective is a quadratic function, there are only two choices of step length $\alpha$: it is either 1 along the search direction $\hat{d}_l$ or less than 1. If a step of length 1 can be taken without violating the constraints, then this is the exact step to the minimum of the quadratic function. Otherwise, the distance to the nearest constraint is found and the solution is moved along the search direction by that distance, as in Equation (22).
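The projected Newton step of Equations (25) and (26) can be checked on a small example. The sketch below is illustrative only: the helper names `solve` and `search_direction` are assumptions, the linear system is solved by plain Gaussian elimination with partial pivoting, and the basis $Z_l$ is passed as a list of its column vectors.

```python
import math


def solve(mat, rhs):
    """Solve mat x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(mat[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[r][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def search_direction(H, c, Z):
    """Return d_hat = Z b, where b solves (Z^T H Z) b = -Z^T c, Equation (26)."""
    S, M = len(Z), len(c)
    HZ = [[sum(H[i][j] * z[j] for j in range(M)) for i in range(M)] for z in Z]
    ZtHZ = [[sum(Z[r][i] * HZ[s][i] for i in range(M)) for s in range(S)]
            for r in range(S)]
    Ztc = [sum(z[i] * c[i] for i in range(M)) for z in Z]
    b = solve(ZtHZ, [-v for v in Ztc])
    return [sum(b[s] * Z[s][i] for s in range(S)) for i in range(M)]


# Toy data: H = I, c = (-1, -3), one active constraint d0 + d1 = 0,
# whose null space is spanned by (1, -1) / sqrt(2).
H = [[1.0, 0.0], [0.0, 1.0]]
c = [-1.0, -3.0]
Z = [[1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0)]]

print([round(v, 6) for v in search_direction(H, c, Z)])  # [-1.0, 1.0]
```

The result can be verified by hand: along the line $d = (t, -t)$ the objective becomes $t^2 + 2t$, which is minimized at $t = -1$, i.e. at $d = (-1, 1)$.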

Figure 9 SQP PAPR reduction performance versus number of sub-blocks when Pr{PAPR > PAPR0} = 10^-4 (horizontal axis: number of sub-blocks; vertical axis: PAPR reduction gain in dB).

Endnotes
a The general aim of this modification is to distort the elements of $q_k$, which contribute to a positive definite update, as little as possible. Therefore, in the initial phase of the modification, the most negative element of the element-wise product of $q_k$ and $s_k$ is repeatedly halved. This procedure is continued until $q_k^T s_k$ is greater than or equal to a small negative tolerance. If, after this procedure, $q_k^T s_k$ is still not positive, $q_k$ is modified by adding a vector $v$ multiplied by a constant scalar $w$, and $w$ is increased systematically until $q_k^T s_k$ becomes positive; see [19].
b Equality constraints are always active, but there are no equality constraints in this phase optimization problem.
c When it is not the first major iteration, the active set is not empty.
d Here $i$ is the index of the minimum in (22), which indicates the active constraint to be added.
e The term "length" indicates the number of rows in $\bar{A}_l$, or equivalently the number of active constraints.
f The QP is a convex optimization problem, so the iterations proceed until the optimum is found, but a modification of the algorithm can be used in which the number of iterations is fixed.
g The fixed operations belong to those matrices whose sizes do not change during the iterations, while other matrices such as $Z$ have variable size and hence different complexity during the iterations.
h This is the simplest scenario, but other modifications can be made to develop a more elaborate version of LSE.

Author details
1 Department of Signal and Systems, Chalmers University of Technology, P.C-412 96 Gothenburg, Sweden
2 ArrayComm, LLC, 2025 Gateway Place 348, San Jose, CA, 95110, USA

Competing Interests
The authors declare that they have no competing interests.

S K was born on September 22, 1981 in Kermanshah, Iran. She received the B.S. degree in Electrical Engineering with Communication minor, in 2005 from State University of Tabriz in Iran with dissertation in the field of satellite communication titled as frequency reuse in dual polarized satellite systems. She got her M.S. degree in Communication Engineering from Chalmers University of Technology in Gothenburg, Sweden in 2010. She is a PhD student at Delft University of Technology now and working on signal processing applications for wireless communication, her master thesis was related to PAPR reduction techniques for WiMAX systems.

T S (S’98,M’01) was born in Trollhättan, Sweden, in 1972. He received the M.S. degree in electrical engineering in 1996 and the Ph.D. degree in signal processing in 2001 from Chalmers University of Technology, Gothenburg, Sweden. During 2001-2005 he conducted research on adaptive antennas in wireless communications, channel modeling and probing of systems employing transmit and receive antenna arrays at Brigham Young University (BYU) and University of California, San Diego (UCSD). Since 2005, he is with ArrayComm, San Jose, CA developing adaptive antenna algorithms for emerging broadband technologies such as WiMAX and 3GPP LTE.

M V received the PhD degree in Automatic Control from Linköping University, Sweden in 1989. He has held academic positions at Linköping University and visiting Scholarships at Stanford University and Brigham Young University, USA. Since 1993, Dr. Viberg is a professor of Signal Processing at Chalmers University of Technology, Sweden. During 1999-2004 he served as Department Chair. Since May 2011, he holds a position as Vice President of Chalmers University of Technology. Dr. Viberg’s research interests are in Statistical Signal Processing and its various applications, including Antenna Array Signal Processing, System Identification, Wireless Communications, Radar Systems and Automotive Signal Processing. Dr. Viberg has served in various capacities in the IEEE Signal Processing Society, including chair of the Technical Committee (TC) on Signal Processing Theory and Methods (2001-2003), vice-chair (2009-2010) and chair (from 2011) of the TC on Sensor Array and Multichannel, Associate Editor of the Transactions on Signal Processing (2004-2005), member of the Awards Board (2005-2007), and member at large of the Board of Governors (2010-). Dr. Viberg has received 2 Paper Awards from the IEEE Signal Processing Society (1993 and 1999 respectively), and the Excellent Research Award from the Swedish Research Council (VR) in 2002. Dr Viberg is a Fellow of the IEEE since 2003, and his research group received the 2007 EURASIP European Group Technical Achievement Award. In 2008, Dr. Viberg was elected into the Royal Swedish Academy of Sciences (KVA).

T E was born on April 7, 1964 in Skovde, Sweden. He received the M.Sc. degree in Electrical Engineering in 1990, and the Ph.D. degree in Information Theory in 1996, both from Chalmers University of Technology, Gothenburg, Sweden. He was at AT&T Labs - Research from 1997 to 1998, and in 1998 and 1999 he was working on a joint research project with the Royal Institute of Technology and Ericsson Radio Systems AB. In 2003 and 2004, he was a guest professor at Yonsei University in Seoul, South Korea. Currently, he is an Associate Professor (docent) at the department of Signals and Systems, Chalmers University of Technology. His research interests include communication systems, source coding and information theory. Specific interests include multiple description quantization, speaker recognition, channel quality feedback, and system modeling of non-ideal hardware components.

Received: 12 November 2010 Accepted: 9 August 2011 Published: 9 August 2011

References1. K Patterson, Generalized reed-muller codes and power control in OFDM

modulation. IEEE Trans Inf Theory 46, 104–120 (1997)2. J Tellado, Multicarrier Modulation with Low Peak to Average Power

Applications to xDSL and Broadband Wireless (Kluwer Academic, Norwell,MA, 2000)

3. A Behravan, Evaluation and Compensation of Nonlinear Distortion inMulticarrier Communication Systems, PhD thesis, Chalmers University ofTechnology, Department of Signals and Systems, Communication SystemGroup, Gothenburg, Sweden (2006)

4. C Ciochina, F Buda, H Sari, An analysis of OFDM peak power reductiontechniques for WiMAX systems, in Proceedings of IEEE InternationalConference on Communications 46, 104–120 (2006)

5. BS Krongold, DL Jones, PAR reduction in OFDM via active constellationextension. IEEE Trans Broadcast. 3, 258–268 (2003)

6. SH Han, JH Lee, An overview of peak-to-average power ratio reductiontechniques for multicarrier transmission. IEEE Wirel Commun Mag. 12,56–65 (2005). doi:10.1109/MWC.2005.1421929

7. C Tellambura, Phase optimization criterion for reducing peak-to-averagepower ratio in OFDM. IET Electron Lett. 34, 169–170 (1998). doi:10.1049/el:19980163

8. LJ Cimini Jr, NR Sollenberger, Peak-to-average-power ratio reduction of anOFDM signal using partial transmit sequences. IEEE Commun Lett. 4, 86–88(2000). doi:10.1109/4234.831033

9. SH Mller, JB Huber, A novel peak power reduction scheme for OFDM, inProceedings of IEEE PIMRC. 3, 1090–1094 (1997)

10. JG Andrews, A Ghosh, R Muhamed, Fundamentals of WiMAX: UnderstandingBroadband Wireless Networking (Prentice Hall, 2007)

11. S Kang, J Kim, E Joo, A novel sub-block partition scheme for partial transmitsequence OFDM. IEEE Trans Commun. 45, 333–338 (1999)

12. R Fletcher, Practical Methods of Optimization, 2nd edn. (Wiley-Interscience,2000)

13. P Gill, W Murray, MH Wright, Practical Optimization (Academic Press, 1981)

14. MJD Powell, Variable Metric Methods for Constrained Optimization (Springer Verlag, 1983)

15. K Schittkowski, NLQPL: a FORTRAN-subroutine solving constrained nonlinear programming problems. Ann Oper Res. 5, 485–500 (1985)

16. HW Kuhn, AW Tucker, Nonlinear programming, in Proceedings of Second Berkeley Symposium on Mathematical Statistics and Probability, 481–492 (1951)

17. Z Yi, Ab-initio Study of Semi-conductor and Metallic Systems: From Density Functional Theory to Many Body Perturbation Theory. PhD thesis, University of Osnabruck, Department of Physics, Osnabruck, Germany (2009)

18. N Amjady, F Keynia, Application of a new hybrid neuro-evolutionary system for day-ahead price forecasting of electricity markets. Appl Soft Comput. 10, 784–792 (2010). doi:10.1016/j.asoc.2009.09.008

19. Matlab optimization toolbox user guide, constrained optimization, Ch. 6 (The MathWorks. Inc 1984–2010) pp. 227–225

20. KG Murty, Linear Complementarity, Linear and Nonlinear Programming, Sigma Series in Applied Mathematics, vol. 3. (Heldermann Verlag, Berlin, 1988)


21. P Gill, W Murray, M Wright, Numerical Linear Algebra and Optimization, vol. 1 (Addison-Wesley, 1991)

22. J Nocedal, SJ Wright, Numerical Optimization. Operations Research and Financial Engineering, 2nd edn. (Springer Verlag, 2006)

23. YJ Qu, BG Hu, RBF networks for nonlinear models subject to linear constraints, in IEEE International Conference on Granular Computing, 482–487 (2009)

24. GH Golub, CV Loan, Matrix Computations, 3rd edn. (Johns Hopkins University Press, Baltimore, MD, 1996)

25. Y Wang, W Chen, C Tellambura, A PAPR reduction method based on artificial bee colony algorithm for OFDM signals. IEEE Trans Wirel Commun. 9, 2994–2999 (2010)

26. S Khademi, OFDM peak-to-average-power-ratio reduction in WiMAX systems. Master’s thesis, Chalmers University of Technology, Department of Signals and Systems, Communication System Group, Gothenburg, Sweden (2011)

27. J Kennedy, R Eberhart, Particle swarm optimization, in Proceedings of IEEE International Conference on Neural Networks. 46, 1942–1945 (1995)

doi:10.1186/1687-6180-2011-38

Cite this article as: Khademi et al.: Peak-to-Average-Power-Ratio (PAPR) reduction in WiMAX and OFDM/A systems. EURASIP Journal on Advances in Signal Processing 2011 2011:38.
