7/29/2019 Design4CSTM_jrnl
Design of Coded Space-Time Modulation
Zhiyuan Wu and Xiaofeng Wang
Abstract
In this paper, we study the coded space-time modulation for multiple-input multiple-output wireless
channels. For simpler design and flexible rate-versus-performance tradeoff, conventional encoders are
used before a linear space-time modulator. A joint iterative receiver based on the turbo principle is
assumed, which precludes the use of Tarokh's design criteria for space-time codes. Using extrinsic
information transfer charts, design criteria that concern both the data rate and error performance are
developed. These criteria are much easier to apply than the well-known Tarokh criteria. It is shown
that the use of outer encoders significantly simplifies the design of linear space-time coding/modulation.
Based on the new design criteria, an optimal space-time linear dispersion modulation scheme is pre-
sented. In addition, the tradeoff between constellation size and symbol rate for a given overall data rate
is discussed. Simulation results are provided to verify the new design criteria and to demonstrate the
merits of the proposed coded space-time modulation.
Index Terms
MIMO, space-time coding, EXIT chart, linear dispersion codes.
I. INTRODUCTION
Recently, there has been a major research thrust of developing multiple-input multiple-output
(MIMO) transmission schemes to exploit the increased capacity of multiple-antenna wireless
channels [1][2]. As a result, numerous MIMO transmission schemes (e.g., [3]-[13]) have been
developed. Most existing designs fall into two categories: performance-oriented
schemes that exploit spatial diversity, such as space-time trellis codes (STTCs)
[3], space-time block codes (STBCs) [4][5] and space-time turbo trellis codes (ST Turbo TCs)
[6]-[8], and rate-oriented schemes that capitalize on the MIMO fading channel capacity, such as
Bell Labs layered space-time (BLAST) architectures [9]-[11] and linear dispersion codes (LDCs)
October 26, 2006 DRAFT
[12][13]. However, many of these existing space-time (ST) codes suffer from design difficulty,
performance loss and/or high decoding complexity.
With recent progress in MIMO transmission, it has been recognized that the idea of the
powerful Turbo codes, first proposed in [14], can also be applied in MIMO systems to achieve
near-capacity performance. Both parallel and serial concatenated schemes have been proposed.
In parallel concatenated systems [6]-[8], the information bit stream is passed through two or
more encoders with different permutations and then punctured and multiplexed at the transmit
antennas. In serial concatenated systems such as [15], some form of outer encoding is applied
before space-time coding/modulation. Such a serial concatenation is often preferable due to its
simpler design and greater flexibility in rate-versus-performance tradeoff. In fact, any space-time
transmission scheme can be considered as an outer encoder serially concatenated before an inner
space-time mapper or modulator that maps a number of input symbols onto a space-time matrix
before transmission. For example, an ST Turbo TC can be viewed as an outer turbo channel
encoder serially concatenated before an inner V-BLAST ST modulator [10].
To emphasize the modulation flavor of the inner space-time process when applied after outer
encoding, we will call such a serially concatenated system coded space-time modulation
(CSTM). Often, conventional encoders such as convolutional codes, trellis-coded modulation
(TCM), and turbo codes designed for single-input single-output (SISO) channels can be used to
provide extra redundancy and to simplify the design of the inner space-time modulator. In order
to decouple the correlation between outer encoding and inner space-time modulation, interleaving
is often applied to the encoded bits or symbols. Such a concatenated coding system possesses
many advantages of both the conventional codes and the inner space-time modulation. On the
one hand, conventional outer codes can provide large coding gain and time diversity; on the
other hand, space-time coding/modulation provides guaranteed spatial diversity gain to combat
fading. Together, they enable a variety of design targets in performance, bandwidth efficiency,
complexity, and tradeoffs among them.
In a CSTM system, the use of an interleaver makes it impractical to evaluate the coding gain and
diversity gain of the overall system based on Tarokh's criteria. Furthermore, Tarokh's criteria were
developed based on maximum-likelihood (ML) decoding, which is too complex to be practical
for a concatenated system. At best, a joint iterative receiver based on the turbo principle can be
used [15]. In such a case, Tarokh's criteria no longer apply either to the overall concatenated
system or to the inner space-time modulation alone. Hence, it is important to study the design
of the inner space-time modulator when used in a concatenated system.
Although any existing space-time code can be a potential candidate for the inner space-time
modulation, a particularly desirable choice is linear dispersion (LD) codes. This is because they
subsume many existing block codes as special cases, allow suboptimal linear receivers with
greatly reduced complexity, and provide a flexible rate-versus-performance tradeoff [12]. Hence,
in this paper, using the idea of the extrinsic information transfer (EXIT) chart pioneered by S. ten
Brink [16][17], we consider the design of the inner LD space-time modulator when concatenated
with an outer code under the assumption of a joint iterative (turbo) receiver. Although the EXIT
chart technique developed for SISO systems has been used in the study of some specific MIMO
systems such as [18], it cannot be directly used for more general cases. Unlike in SISO systems,
the outputs of the inner ST modulator often have different statistics. By extending the existing
EXIT chart technique to MIMO transmission systems, it is shown that the inner space-time
modulator shall a) maximize the average mutual information between a bit and the received
signal and b) optimize the pairwise error performance of the codeword pairs that differ in only
one symbol within a modulation block. Criterion a) is similar to those used in the existing high-rate
schemes while b) is unique to CSTM. These two criteria concern the channel capacity
and performance, respectively, and together reflect a joint optimization of both data rate and
error rate. It is worth noting that criterion b) is much simpler to apply than Tarokh's criteria,
since the latter require an optimization over all possible codeword pairs. Based on the
proposed criteria, an optimal LD space-time modulator can be obtained. Several design examples
are provided to demonstrate the merits of the proposed CSTM scheme and to verify the design
criteria as well.
The rest of the paper is organized as follows. In section II, preliminaries of this research
including the system model and the joint iterative receiver are introduced. In section III, we
extend the existing EXIT chart to MIMO transmission systems. In section IV, we propose design
criteria for the inner ST modulation in a CSTM system, provide design examples, and discuss
the design of constellation versus symbol rate for a given data rate. In section V, simulation
results are presented. Finally, conclusions are drawn in section VI.
II. PRELIMINARIES
A. System Model
In this study, a block fading channel model is assumed where the channel remains constant within
one modulation block but may change from block to block. That is, the channel is not necessarily
constant within a coding frame, which often consists of a large number of modulation blocks.
Furthermore, the channel is assumed to be a Rayleigh flat fading channel with $N_t$ transmit and
$N_r$ receive antennas. Let us denote the complex gain from transmit antenna $n$ to receive antenna
$m$ by $h_{mn}$ and collect them to form an $N_r \times N_t$ channel matrix $\mathbf{H} = [h_{mn}]$, known perfectly to
the receiver but unknown to the transmitter. The entries of $\mathbf{H}$ are assumed to be independent and
identically distributed (i.i.d.) circularly symmetric complex Gaussian random variables with zero mean and unit variance.
The CSTM under investigation is a serial concatenation of an outer encoder and an inner
ST modulator as shown in Fig. 1(a), which subsumes many MIMO transmission schemes as its
special cases. In a CSTM system, the information bits are first encoded, shuffled by an interleaver
and then mapped into symbols. After that, the symbol stream is parsed into blocks of length L.
A symbol vector associated with one modulation block is denoted by $\mathbf{x} = [x_1, x_2, \ldots, x_L]^T$ with
$x_i \in \Omega = \{\Omega_m \mid m = 0, 1, \ldots, 2^Q - 1,\ Q \ge 1\}$ (i.e., a complex constellation of size $2^Q$, such as
$2^Q$-QAM). The average symbol energy is assumed to be 1, i.e., $\frac{1}{2^Q}\sum_{m=0}^{2^Q-1} |\Omega_m|^2 = 1$. Each block
of symbols will be mapped by the inner ST modulator to a matrix of size $N_t \times T$ and
then transmitted over the $N_t$ transmit antennas over $T$ channel uses. The system model in Fig.
1(a) is often called bit-interleaved coded modulation (BICM) [19][20]. Another CSTM scheme
under consideration is to apply a constellation mapper right after the outer encoder and before
a symbol-level interleaver, which will be called symbol-interleaved coded modulation (SICM).
As mentioned before, we consider LD ST modulation for various reasons. An LD ST modulator
is defined by its $L$ dispersion matrices $\mathbf{M}_i = [\mathbf{m}_{i1}, \mathbf{m}_{i2}, \ldots, \mathbf{m}_{iT}]$, each of size $N_t \times T$, and the corresponding
output matrix of one modulation block is given by
$$\mathbf{X} = \sum_{i=1}^{L} \mathbf{M}_i x_i \tag{1}$$
With a constellation of size $2^Q$, the data rate of the inner space-time modulator is $R_m = Q \cdot L/T$
bits per channel use and the data rate of the overall concatenated system is $R = R_c R_m$ bits per
channel use, where $R_c \le 1$ is the coding rate of the outer encoder.
Hence, one can adjust the symbol rate $L/T$, the number of bits per symbol $Q$, and the coding rate $R_c$ to meet
different requirements on data rate and performance. Since the inner ST modulation is linear,
suboptimal linear receivers can be used for demodulation [12]. It can also be observed that the
space-time mapping schemes used in the existing layered space-time architectures, e.g., [9][11],
are LD modulation. Hence the proposed CSTM with LD ST modulation subsumes existing
layered space-time schemes as special cases.
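The mapping in (1) and the rate expressions above can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy dispersion set (each symbol placed on one antenna in one time slot) and the QPSK labelling are assumptions made only for this example.

```python
import numpy as np

def ld_modulate(dispersion, x):
    """Eq. (1): X = sum_i M_i x_i, with dispersion matrices stacked as (L, Nt, T)."""
    return np.einsum("i,ijk->jk", x, dispersion)

def modulation_matrix(dispersion):
    """G = [vec(M_1), ..., vec(M_L)], where vec() stacks the columns of a matrix."""
    return np.column_stack([M.reshape(-1, order="F") for M in dispersion])

def data_rates(Q, L, T, Rc):
    """Return (R_m, R) in bits per channel use: R_m = Q*L/T and R = R_c*R_m."""
    Rm = Q * L / T
    return Rm, Rc * Rm

# Hypothetical toy setup: Nt = T = 2 and L = Nt*T = 4 dispersion matrices,
# each with a single unit entry (one symbol per antenna per time slot).
Nt, T = 2, 2
L = Nt * T
dispersion = np.zeros((L, Nt, T), dtype=complex)
for i in range(L):
    dispersion[i, i % Nt, i // Nt] = 1.0

qpsk = np.array([1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]) / np.sqrt(2)  # unit average energy
x = qpsk[[0, 1, 2, 3]]

X = ld_modulate(dispersion, x)          # Nt x T block to transmit
G = modulation_matrix(dispersion)       # NtT x L modulation matrix
Rm, R = data_rates(Q=2, L=L, T=T, Rc=0.5)
```

In this toy setup `vec(X)` equals `G @ x`, so the linear map from the symbol vector to the transmitted block is captured entirely by `G`.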
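At the receiver, the received signals associated with one modulation block can be written as
$$\mathbf{Y} = \sqrt{P/N_t}\,\mathbf{H}\mathbf{X} + \mathbf{Z} = \sqrt{P/N_t}\,\mathbf{H}\sum_{i=1}^{L} \mathbf{M}_i x_i + \mathbf{Z} \tag{2}$$
where $\mathbf{Y}$ is a complex matrix of size $N_r \times T$ whose $(m, n)$-th entry is the received signal at
receive antenna $m$ and time instant $n$, $\mathbf{Z}$ is the additive white Gaussian noise (AWGN) matrix
with i.i.d. circularly symmetric complex Gaussian elements of zero mean and variance $\sigma_z^2$, and $P$ is the
average energy per channel use at each receive antenna. Let $\mathrm{vec}(\cdot)$ be the operator that forms a
column vector by stacking the columns of a matrix and define $\mathbf{y} = \mathrm{vec}(\mathbf{Y})$, $\mathbf{z} = \mathrm{vec}(\mathbf{Z})$, and
$\mathbf{m}_i = \mathrm{vec}(\mathbf{M}_i)$; then (2) can be rewritten as
$$\mathbf{y} = \sqrt{P/N_t}\,\bar{\mathbf{H}}\mathbf{G}\mathbf{x} + \mathbf{z} = \sqrt{P/N_t}\,\tilde{\mathbf{H}}\mathbf{x} + \mathbf{z} \tag{3}$$
where $\bar{\mathbf{H}} = \mathbf{I}_T \otimes \mathbf{H}$ with $\otimes$ as the Kronecker product operator, $\tilde{\mathbf{H}} = \bar{\mathbf{H}}\mathbf{G}$, and $\mathbf{G} = [\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_L]$ will
be referred to as the modulation matrix. Since the average energy of the signal per channel use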
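at a receive antenna is assumed to be $P$, we have $\mathrm{tr}(\mathbf{G}\mathbf{G}^H) = N_t T$. Denoting $\mathbf{h}_i = (\mathbf{I}_T \otimes \mathbf{H})\mathbf{m}_i$ as the
$i$-th column vector of the equivalent channel matrix $\tilde{\mathbf{H}} = (\mathbf{I}_T \otimes \mathbf{H})\mathbf{G}$, the above equation can also be written as
$$\mathbf{y} = \sqrt{P/N_t}\sum_{i=1}^{L} \mathbf{h}_i x_i + \mathbf{z} \tag{4}$$
B. Joint Iterative Demodulation/Decoding Receiver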
The computational complexity of an optimal ML-based receiver can be impractical for a
CSTM system due to the use of the interleaver between the inner ST modulator and the outer
encoder. At best, an iterative joint decoding and demodulation receiver, as illustrated in Fig.
1(b), can be employed. In the figure and the following discussion, variables with subscript 1
are associated with the inner ST demodulator and variables with subscript 2 are associated
with the outer decoder, and subscripts a (or A), e (or E) and d stand for a priori, extrinsic,
and a posteriori, respectively.
The joint demodulation and decoding is an iterative process. In each iteration, the extrinsic
information, which is the a posteriori information less the a priori information, is exchanged
between the two constituent components, the inner demodulator and the outer decoder. The
extrinsic information from one component is used as the a priori of the other component. After
a sufficient number of iterations, neither the inner ST demodulator nor the outer decoder can
benefit from the exchange of the extrinsic information any longer and this phenomenon is often
called convergence. Once convergence is reached, the outer decoder will perform hard decision
to generate the decoded information bits.
Normally, log-likelihood ratios (LLRs) are used in information exchange. For BICM, the bit
LLRs can be calculated as $L^b = \ln[\Pr(b = 1)/\Pr(b = 0)]$. For SICM, since the symbols are
taken from a constellation of size $2^Q$, each symbol has a $(2^Q - 1)$-tuple whose $m$-th element
is the logarithm of the ratio of the probability of the symbol taking the $m$-th constellation point $\Omega_m$ over that of the
symbol taking the point $\Omega_0$, i.e. $L^s(x = \Omega_m) = \ln[\Pr(x = \Omega_m)/\Pr(x = \Omega_0)]$.
Below, we consider the extrinsic LLRs for the inner ST demodulator and the outer decoder.
1) The extrinsic LLR of the inner ST demodulator
Given the a priori bit LLRs $\mathbf{L}^b_{a1} = [L^b_{a1,1}, L^b_{a1,2}, \ldots, L^b_{a1,LQ}]$ from the outer decoder at the last
iteration and the corresponding channel observation $\mathbf{y}$, using the MAP criterion, the extrinsic LLR
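of the $j$-th bit in symbol $x_i$ can be calculated as
$$L^b_{e1,(i-1)Q+j} = \underbrace{\ln\frac{\Pr(b_{(i-1)Q+j} = 1 \mid \mathbf{y}, \mathbf{L}^b_{a1})}{\Pr(b_{(i-1)Q+j} = 0 \mid \mathbf{y}, \mathbf{L}^b_{a1})}}_{L^b_{d1,(i-1)Q+j}} - L^b_{a1,(i-1)Q+j} \tag{5}$$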
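Similarly, given the a priori symbol LLRs $\mathbf{L}^s_{a1} = [L^s_{a1,1}, L^s_{a1,2}, \ldots, L^s_{a1,L}]$ and the observation
$\mathbf{y}$, one can compute the extrinsic LLR of symbol $x_i$ in the block using the MAP criterion as
$$L^s_{e1,i}(\Omega_m) = \underbrace{\ln\frac{\Pr(x_i = \Omega_m \mid \mathbf{y}, \mathbf{L}^s_{a1})}{\Pr(x_i = \Omega_0 \mid \mathbf{y}, \mathbf{L}^s_{a1})}}_{L^s_{d1,i}(\Omega_m)} - L^s_{a1,i}(\Omega_m) = \ln\frac{\sum_{\mathbf{x}_I \in \mathcal{X}} \exp\Big(-\frac{\|\mathbf{y} - \sqrt{P/N_t}\,\tilde{\mathbf{H}}_I\mathbf{x}_I - \sqrt{P/N_t}\,\mathbf{h}_i\Omega_m\|^2}{\sigma_z^2} + \sum_{k \in \mathcal{K}} L^s_{a1,k}(x_k)\Big)}{\sum_{\mathbf{x}_I \in \mathcal{X}} \exp\Big(-\frac{\|\mathbf{y} - \sqrt{P/N_t}\,\tilde{\mathbf{H}}_I\mathbf{x}_I - \sqrt{P/N_t}\,\mathbf{h}_i\Omega_0\|^2}{\sigma_z^2} + \sum_{k \in \mathcal{K}} L^s_{a1,k}(x_k)\Big)} \tag{6}$$
where $\tilde{\mathbf{H}}_I$ is the matrix obtained by removing the $i$-th column from the equivalent channel matrix $\tilde{\mathbf{H}} = (\mathbf{I}_T \otimes \mathbf{H})\mathbf{G}$ in (3), $\mathbf{x}_I = [x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_L]^T$,
$\mathcal{X}$ is the set of $2^{Q(L-1)}$ column symbol vectors, and $\mathcal{K}$ is the set of indices of $\mathbf{x}_I$, i.e. $\mathcal{K} = \{k \mid 1 \le k \le L \text{ and } k \ne i\}$.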
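The symbol-level MAP rule in (6) can be evaluated by brute force for very small blocks. The sketch below is only an illustration under stated assumptions: the function name, the argument layout and the two-symbol BPSK test channel are hypothetical, and the enumeration over all $2^{Q(L-1)}$ interfering vectors is exactly what makes this approach impractical for realistic sizes.

```python
import itertools
import numpy as np

def symbol_extrinsic_llr(y, H_eq, i, constellation, La_sym, P, Nt, sigma2):
    """Brute-force evaluation of (6) for symbol i.

    H_eq          : equivalent channel, columns h_1..h_L (assumed name)
    constellation : array of 2^Q points, constellation[0] is the reference point
    La_sym        : (L, 2^Q) a priori symbol LLRs with La_sym[:, 0] == 0
    Returns the 2^Q extrinsic LLRs relative to constellation[0].
    """
    L = H_eq.shape[1]
    M = len(constellation)
    others = [k for k in range(L) if k != i]
    amp = np.sqrt(P / Nt)
    metric = np.full(M, -np.inf)  # log of the sums inside (6)
    for idx in itertools.product(range(M), repeat=L - 1):
        x_I = constellation[list(idx)]
        interference = amp * H_eq[:, others] @ x_I
        # A priori term of the other symbols only, so the output is extrinsic.
        prior = sum(La_sym[k, m] for k, m in zip(others, idx))
        for m in range(M):
            r = y - interference - amp * H_eq[:, i] * constellation[m]
            metric[m] = np.logaddexp(metric[m], -np.vdot(r, r).real / sigma2 + prior)
    return metric - metric[0]

# Hypothetical noise-free check: BPSK, L = 2 symbols, fixed equivalent channel.
Nt = 2
const = np.array([1.0 + 0j, -1.0 + 0j])
H_eq = np.array([[1.0, 0.5], [0.2, 1.0]], dtype=complex)
x_true = const[[1, 0]]                      # symbol 0 is point 1, symbol 1 is point 0
y = np.sqrt(1.0 / Nt) * H_eq @ x_true
La = np.zeros((2, 2))                       # no a priori information
Le0 = symbol_extrinsic_llr(y, H_eq, 0, const, La, P=1.0, Nt=Nt, sigma2=0.1)
Le1 = symbol_extrinsic_llr(y, H_eq, 1, const, La, P=1.0, Nt=Nt, sigma2=0.1)
```

With no noise, `Le0[1]` is positive (the transmitted point 1 is favoured over the reference point) and `Le1[1]` is negative (the reference point was transmitted).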
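Let $L^s_{a1,i}(\Omega_m) = \sum_{j \in \mathcal{J}_m} L^b_{a1,j}$, where $\mathcal{J}_m$ is the set of indices of the bits in $\Omega_m$ equal to 1. Then
from (5) and (6), one can write
$$L^b_{e1,(i-1)Q+j} = \ln\frac{\sum_{\Omega^+_m \in \Omega_{b_j=1}} \exp\big(L^s_{e1,i}(\Omega^+_m) + L^s_{a1,i}(\Omega^+_m)\big)}{\sum_{\Omega^-_m \in \Omega_{b_j=0}} \exp\big(L^s_{e1,i}(\Omega^-_m) + L^s_{a1,i}(\Omega^-_m)\big)} - L^b_{a1,(i-1)Q+j} \tag{7}$$
where $\Omega_{b_j=1}$ is the set of constellation points whose $j$-th bit is equal to 1 while $\Omega_{b_j=0}$ is
the set of constellation points whose $j$-th bit is equal to 0. From (7), the MAP inner ST detector
for a BICM scheme can be implemented by adding a process following the symbol-level MAP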
detector as defined in (6). This process calculates the extrinsic bit LLR using the output extrinsic
LLR of the corresponding symbol from the symbol-level detector and the a priori LLRs of the
bits in that symbol. This observation will be useful in the later development of this study. In
addition to the optimal MAP algorithm, other linear suboptimal methods are also available for
reduced complexity such as [21][22].
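The post-processing step described by (7), which turns symbol-level extrinsic LLRs into bit-level extrinsic LLRs, can be sketched as follows. The function name and the `labels` array (one row of 0/1 bit labels per constellation point, with the reference point labelled all-zero) are assumptions introduced for this illustration.

```python
import numpy as np

def bit_extrinsic_from_symbol(Le_sym, La_bits, labels):
    """Eq. (7) for one symbol.

    Le_sym  : (2^Q,) symbol extrinsic LLRs relative to the reference point
    La_bits : (Q,) a priori LLRs of the bits in this symbol
    labels  : (2^Q, Q) 0/1 bit labels; row 0 is assumed to be all zeros
    """
    # Symbol a priori LLR: sum of the a priori LLRs of the bits equal to 1.
    La_sym = labels @ La_bits
    Ld_sym = Le_sym + La_sym
    Q = labels.shape[1]
    Le_bits = np.empty(Q)
    for j in range(Q):
        num = np.logaddexp.reduce(Ld_sym[labels[:, j] == 1])
        den = np.logaddexp.reduce(Ld_sym[labels[:, j] == 0])
        Le_bits[j] = num - den - La_bits[j]
    return Le_bits

# Hypothetical check with Q = 2: when the symbol LLR is a sum of per-bit terms,
# the bit extrinsic LLRs recover those terms regardless of the a priori input.
labels = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
ell = np.array([0.4, -1.1])
Le_sym = labels @ ell
Le_bits = bit_extrinsic_from_symbol(Le_sym, np.array([2.0, -0.5]), labels)
```

The log-domain sums use `np.logaddexp.reduce` to avoid overflow for large LLR magnitudes.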
2) The extrinsic LLR of the outer decoder
Unlike for the inner ST demodulator, the computation of extrinsic LLRs for the outer decoder is based only on the input
a priori LLRs from the inner ST demodulator in a CSTM system. The associated optimal MAP
algorithm was developed by Bahl et al. in [23] and will not be described in this paper.
Moreover, other suboptimal algorithms are also available, such as [24][25].
III. EXIT CHART FOR MIMO TRANSMISSION
In this section, by means of the EXIT chart, we examine the convergence behavior of the
iterative demodulation/decoding procedure for a CSTM system as described in the last section.
In Fig. 2, a typical EXIT chart of the type of iterative receivers for a CSTM with two input
symbols per LD modulation block is given. The EXIT chart illustrates the trajectory of the
exchange of extrinsic information measured as the mutual information between the LLRs and
the bits in the corresponding symbols. For instance, the extrinsic information of the outer decoder
is $I_{E2} = I(x; L_{e2})$. It is important to notice that the extrinsic LLR outputs of the symbols in
a modulation block may have different statistics, depending on the channel $\mathbf{H}$ and the dispersion
matrices $\{\mathbf{M}_i\}$. For BICM, the statistics of extrinsic bit LLR outputs will also depend on the
constellation pattern. Hence, different symbols may have different extrinsic information transfer
functions and $I_{E2}$ is a function of the two a priori inputs, i.e., $I_{E2} = T(I'_{E1}, I''_{E1})$.
The following observations are useful for the analysis of the MIMO transmission system.
Observation 1: If the interleaver is random and the constraint length is sufficiently large, the
outer decoder yields the same amount of extrinsic information for all the symbols (or bits for
bit-level decoder) and the exact amount depends on the average a priori information of the
inputs.
Remarks: The above observation implies that $I_{E2}$ in the example shown in Fig. 2 is the same for
all the outputs and is a function of $\bar{I}_{E1} = (I'_{E1} + I''_{E1})/2$. The reasons are as follows.
If the constraint length is large, the output extrinsic LLRs will depend, to about an equal degree,
on the a priori information of a large number of neighbors. Furthermore, if the interleaver is
random, these neighbors are evenly distributed among the input groups. Hence, different symbols
tend to have the same amount of extrinsic information. In addition, a specific neighbor (e.g., the
symbol immediately next to the symbol of concern at the decoder output) may be associated
with any of the input groups with equal probability. Consequently, the a priori information of
any neighbor is the average a priori information of all the input groups and, hence, in Fig. 2,
$I_{E2} = T\big((I'_{E1} + I''_{E1})/2\big)$.
To verify the above hypothesis, two BCJR/MAP decoders as depicted in Fig. 3 were set
up. For decoder (a) in Fig. 3(a), two streams of independent a priori symbol LLRs with
mutual information $I(x; L_{a1})$ and $I(x; L_{a2})$, respectively, were sent to a BCJR/MAP decoder and
then two corresponding streams of extrinsic LLRs were calculated by the decoder. BCJR/MAP
decoder (b) in Fig. 3(b) had only one stream of a priori symbol LLRs with mutual information
$$I(x; L_a) = [I(x; L_{a1}) + I(x; L_{a2})]/2 \tag{8}$$
Since there is a one-to-one correspondence between the LLRs and the corresponding soft output
$\bar{x}$ of the transmitted symbols measured as
$$\bar{x} = \Big[1 + \sum_{m=1}^{2^Q-1} \exp(L(\Omega_m))\Big]^{-1} \sum_{m=0}^{2^Q-1} \Omega_m \exp(L(\Omega_m)) \tag{9}$$
the mutual information between the symbols and the corresponding LLRs is equal to that between
the symbols and the corresponding soft outputs $\bar{x}$, i.e., $I(x; L_e) = I(x; \bar{x}(L_e))$. Hence, instead
of using extrinsic LLRs directly, histograms of $\bar{x}$ were generated for convenience to reflect the statistical nature
of the corresponding extrinsic LLRs. In the simulation, a BPSK constellation and
a convolutional code with coding rate 1/2 and constraint length 5 were used. Following the
Gaussian consistency [17], the a priori LLRs were generated as symbols corrupted by zero-mean
AWGN with different variances. The simulation results are shown in Fig. 4. From the
figure, the histograms of the output extrinsic averages $\bar{x}$ of the two streams of input LLRs for
decoder (a) coincide with each other and they also coincide with the histogram for decoder (b).
In summary, the simulation results suggest that the extrinsic information output of the outer
BCJR/MAP decoder only depends on the average a priori information input.
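The a priori inputs used in such an experiment can be sketched as follows for BPSK. This is a minimal illustration that covers only the generation of consistent Gaussian LLRs and a sample estimate of their mutual information; it does not implement the BCJR/MAP decoders of Fig. 3. The function names, the parameter `sigma` and the sample size are assumptions, and the estimator $1 - E[\log_2(1 + e^{-xL})]$ is valid only when the LLRs are true (consistent) LLRs, as they are by construction here.

```python
import numpy as np

def consistent_gaussian_llrs(bits, sigma, rng):
    """A priori LLRs under the Gaussian consistency model for BPSK.

    With x = 2b - 1 and L = ln[Pr(b=1)/Pr(b=0)], the LLR is drawn as
    L = (sigma^2 / 2) x + sigma n, with n ~ N(0, 1).
    """
    x = 2.0 * bits - 1.0
    return 0.5 * sigma**2 * x + sigma * rng.standard_normal(bits.shape)

def mutual_information(bits, llrs):
    """Sample estimate of I(b; L) in bits: 1 - E[log2(1 + exp(-x L))]."""
    x = 2.0 * bits - 1.0
    return 1.0 - np.mean(np.logaddexp(0.0, -x * llrs)) / np.log(2.0)

rng = np.random.default_rng(1)
bits = rng.integers(0, 2, size=200_000)
# Mutual information for a few hypothetical values of sigma.
mi = {s: mutual_information(bits, consistent_gaussian_llrs(bits, s, rng))
      for s in (0.0, 1.0, 3.0)}
```

Sweeping `sigma` traces out the full range of a priori information from 0 to 1 bit, which is how the horizontal axis of an EXIT curve is typically populated.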
Observation 2: (a) The convergence point is where the average EXIT curve of the inner ST
demodulator and the EXIT curve of the outer decoder meet. (b) The a posteriori LLRs associated
with different symbols in a modulation block sent to the final decision device may have different
statistics determined by their own EXIT curves.
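Remarks: This observation is a direct result of Observation 1. For instance, at the convergence
point in Fig. 2, let us denote by $I^c_{E2}$ the value of the extrinsic information of the outer
decoder, and by $I^{\prime c}_{E1}$ and $I^{\prime\prime c}_{E1}$ the values of extrinsic information corresponding to the two symbols
of the inner demodulator. We then have the following relationship
$$\begin{aligned} I^c_{E2} &= T_{E2}\big((I^{\prime c}_{E1} + I^{\prime\prime c}_{E1})/2\big) \\ I^{\prime c}_{E1} &= T'_{E1}(I^c_{E2}) \\ I^{\prime\prime c}_{E1} &= T''_{E1}(I^c_{E2}) \end{aligned} \tag{10}$$
where $I = T(x)$ indicates that $I$ is a function of $x$. Hence, the a posteriori LLRs of the symbols
associated with the dashed curve and the dash-dot curve at convergence are
$$\begin{aligned} L'_{d2} &= L^{\prime c}_{e2} + L^{\prime c}_{e1} \\ L''_{d2} &= L^{\prime\prime c}_{e2} + L^{\prime\prime c}_{e1} \end{aligned} \tag{11}$$
where $L^{\prime c}_{e1}$ and $L^{\prime\prime c}_{e1}$ correspond to $I^{\prime c}_{E1}$ and $I^{\prime\prime c}_{E1}$ respectively, and both $L^{\prime c}_{e2}$ and $L^{\prime\prime c}_{e2}$ correspond to
$I^c_{E2}$. In (11), $L^{\prime c}_{e2}$ and $L^{\prime\prime c}_{e2}$ have the same statistics as predicted by Observation 1, but $L^{\prime c}_{e1}$ and
$L^{\prime\prime c}_{e1}$ may have different statistics.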
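The fixed-point relationship in (10) can be illustrated with toy transfer functions. The three curves below are purely hypothetical stand-ins (a logistic "cliff" for the outer decoder and two straight lines for the two inner-demodulator symbols), with information normalized to the range 0 to 1; they are not measured EXIT curves from the paper.

```python
import math

def T_outer(a):
    """Hypothetical outer-decoder curve: two plateaus joined by a steep cliff."""
    return 1.0 / (1.0 + math.exp(-12.0 * (a - 0.5)))

def T_inner_1(a):
    """Hypothetical straight-line curve of the first symbol."""
    return 0.55 + 0.35 * a

def T_inner_2(a):
    """Hypothetical straight-line curve of the second symbol."""
    return 0.45 + 0.35 * a

def convergence_point(tol=1e-12, max_iter=1000):
    """Iterate the exchange of extrinsic information until (10) holds."""
    i_e2 = 0.0  # no a priori information at the first iteration
    for _ in range(max_iter):
        # The outer decoder only sees the average of the two inner outputs.
        avg = 0.5 * (T_inner_1(i_e2) + T_inner_2(i_e2))
        new = T_outer(avg)
        done = abs(new - i_e2) < tol
        i_e2 = new
        if done:
            break
    return i_e2, T_inner_1(i_e2), T_inner_2(i_e2)

i_e2_c, i_e1_first, i_e1_second = convergence_point()
```

At the returned point the outer decoder output is common to both symbols, while the two inner values differ, which mirrors part (b) of Observation 2.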
Although the above observations were described for a symbol-level decoder such as a decoder
for TCM, they also apply to bit-level decoders if all the quantities associated with symbols in
the above observations are replaced by the corresponding quantities associated with bits. We are
now ready to consider the design of inner ST modulation. As illustrated in Fig. 2, the
EXIT curve of a powerful conventional outer code often presents two flat plateaus at the two ends and
a sharp cliff in the middle [17]. In contrast, the EXIT curves of the inner ST demodulator
are close to straight lines due to its short encoding block length $L$.
By Observation 1, to let the trajectory of the iterative receiver snake through the bottleneck in
the middle and thus reach the second plateau region of the $I_{E2}$ curve, one can seek to maximize
$\frac{1}{LQ} E\big(\sum_{i=1}^{L} I(x_i; L^s_{e1,i})\big)$ for SICM and
$\frac{1}{LQ} E\big(\sum_{i=1}^{L}\sum_{j=1}^{Q} I(b_{(i-1)Q+j}; L^b_{e1,(i-1)Q+j})\big)$ for BICM, where the
expectation $E(\cdot)$ is taken over the channel $\mathbf{H}$. As described in Section II-B, the extrinsic LLR of a
bit can be calculated using the extrinsic LLR of its associated symbol and the a priori LLRs of the
other bits in the symbol. For a given constellation and a priori information, maximizing the sum
extrinsic information of the bits in symbol $x$, i.e. $\sum_{j=1}^{Q} I(b_j; L^b_{e1,j})$, can be well approximated by
maximizing the extrinsic information of the symbol $I(x; L^s_{e1})$. In summary, we seek to maximize
$\frac{1}{LQ} E\big(\sum_{i=1}^{L} I(x_i; L^s_{e1,i})\big)$ for both BICM and SICM.
By Observation 2, one should minimize the outage probability of the a posteriori information
(i.e. $I(x; L_{d2})$) for each symbol (or bit for BICM). Let $I_{E1}(a)$ denote the extrinsic information
in the inner ST modulation block when the input $I_{A1} = a$ in the following discussion. From (11),
when $I^c_{E2}$ is large, i.e. the convergence point is located at the second plateau of the outer decoder,
$I^c_{E2}$ will change little regardless of the a priori information $I_{E1}$. Hence, minimizing the outage
probability of the a posteriori information $I(x; L_{d2})$ can be approximated by minimizing the
outage probability of the extrinsic information $I_{E1}$ at convergence, or equivalently, maximizing
$\Pr(I_{E1}(I^c_{E2}) \ge \gamma)$ for a certain value $\gamma$. Noting that when the trajectory of the iterative receiver
reaches the second plateau, $I^c_{E2} \approx Q$, we seek to maximize $\Pr(I_{E1}(Q) \ge \gamma)$ for any symbol.
Again, for a given constellation, minimizing the outage probability of the extrinsic information on
bits for BICM can be approximated by minimizing the outage probability of the extrinsic information on symbols.
In summary, we have two optimization problems concerning the design of the inner ST
modulator:
a) maximizing the average extrinsic information per bit for any given a priori information,
i.e. maximizing $\bar{I}_{E1}(I_{A1}) = \frac{1}{LQ} E\big(\sum_{i=1}^{L} I(x_i; L^s_{e1,i})\big)$;
b) minimizing the outage probability of the extrinsic information of a symbol in the modulation
block at perfect a priori information, i.e. maximizing $\Pr(I_{E1}(Q) \ge \gamma)$.
IV. DESIGN OF INNER ST MODULATION
In this section, we consider the design of the inner ST modulator by solving the two opti-
mization problems as described in the last section.
A. Maximizing the Average Extrinsic Information Per Bit
To make the optimization problems tractable and independent of the constellation, we assume
i.i.d. Gaussian inputs. The results can be used as design guidelines for practical input symbols
drawn from finite alphabets. In general, maximizing the average extrinsic information for any
given a priori information is difficult due to the unknown statistics of the a priori symbol LLRs.
However, noting that the EXIT curves of the inner ST demodulator are monotonic and close
to a straight line, we seek to maximize the average extrinsic information at the starting point,
i.e. $\bar{I}_{E1}(0)$. This helps ensure that the trajectory of the iterative receiver passes through the narrow
tunnel in the middle to reach the second plateau region of the EXIT chart.
Under the assumption of i.i.d. Gaussian inputs, the extrinsic information of the input symbol
$x_i$ when there is no a priori information is
$$I(x_i; \mathbf{y} \mid \mathbf{H}) = \log\Big(1 + \frac{P}{N_t}\,\mathbf{h}_i^H \mathbf{R}_i^{-1} \mathbf{h}_i\Big) \tag{12}$$
where $\mathbf{R}_i$ is the autocorrelation matrix of the interference and AWGN given by
$$\mathbf{R}_i = \frac{P}{N_t}\,\tilde{\mathbf{H}}_I \tilde{\mathbf{H}}_I^H + \sigma_z^2 \mathbf{I} \tag{13}$$
where $\tilde{\mathbf{H}}_I$, defined in (6), is the equivalent channel matrix with its $i$-th column removed. Clearly, the mutual information given in (12) is a function of
the channel $\mathbf{H}$. Hence, we seek to find the modulation matrix $\mathbf{G}$ to maximize
$$\bar{I}_{E1}(0) = \frac{1}{LQ} E\Big(\sum_{i=1}^{L} I(x_i; \mathbf{y})\Big) \tag{14}$$
where the expectation is taken with respect to the channel $\mathbf{H}$. When the number of bits in a
modulation block (i.e. $LQ$) is fixed, it is equivalent to maximizing $E\big(\sum_{i=1}^{L} I(x_i; \mathbf{y})\big)$, which is
the ergodic sum capacity of channel $\mathbf{H}$ when the $L$ data streams are independently demodulated.
This capacity will be called the uncooperated sum capacity, which is always smaller than or equal
to the conventional cooperated sum capacity. Concerning the uncooperated sum capacity, we
have the following theorem.
Theorem 1: The ergodic uncooperated sum capacity of channel $\mathbf{H}$ is achieved if and only if
the modulation matrix $\mathbf{G}$ satisfies
$$\mathbf{G}\mathbf{G}^H = \mathbf{I}_{N_t T} \tag{15}$$
Proof: See Appendix I.
Noting that matrix $\mathbf{G}$ is of size $N_t T \times L$, equation (15) holds only if $L \ge N_t T$. To keep the
demodulation complexity low, we will only consider $L = N_t T$ in the sequel. In this case, (15)
implies that
$$\mathrm{tr}(\mathbf{M}_m^H \mathbf{M}_n) = \begin{cases} 1, & m = n \\ 0, & m \ne n \end{cases} \tag{16}$$
Note that it can be shown that a modulation matrix $\mathbf{G}$ satisfying (15) also maximizes the average
mutual information for any given a priori information in SICM or BICM. It is also interesting
to note that a modulation matrix $\mathbf{G}$ satisfying (15) also achieves the cooperated sum capacity
given by [1][2]
$$C = E\Big[\log\det\Big(\mathbf{I}_{N_r} + \frac{P}{N_t \sigma_z^2}\,\mathbf{H}\mathbf{H}^H\Big)\Big] \tag{17}$$
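The quantities in (12), (13) and (17) can be evaluated by Monte Carlo averaging over Rayleigh channels. The sketch below is an illustration under stated assumptions: the function names, the SNR values, the number of trials and the two modulation matrices are hypothetical. `G_unitary` (the identity) satisfies (15), while `G_lopsided` has the same trace but places all energy in the first channel use; logarithms are taken to base 2.

```python
import numpy as np

def equivalent_channel(H, G, T):
    """(I_T kron H) G, whose i-th column is h_i = vec(H M_i)."""
    return np.kron(np.eye(T), H) @ G

def uncooperated_sum(H_eq, P, Nt, sigma2):
    """Sum over i of log2(1 + (P/Nt) h_i^H R_i^{-1} h_i), per (12) and (13)."""
    n, L = H_eq.shape
    rho = P / Nt
    total = 0.0
    for i in range(L):
        h = H_eq[:, i]
        H_I = np.delete(H_eq, i, axis=1)          # remove the i-th column
        R = rho * H_I @ H_I.conj().T + sigma2 * np.eye(n)
        sinr = rho * np.real(np.vdot(h, np.linalg.solve(R, h)))
        total += np.log2(1.0 + sinr)
    return total

def cooperated_sum(H_eq, P, Nt, sigma2):
    """log2 det(I + P/(Nt sigma2) H_eq H_eq^H) for one modulation block."""
    n = H_eq.shape[0]
    A = np.eye(n) + (P / (Nt * sigma2)) * H_eq @ H_eq.conj().T
    return np.linalg.slogdet(A)[1] / np.log(2.0)

rng = np.random.default_rng(0)
Nt = Nr = T = 2
P, sigma2, trials = 10.0, 1.0, 200
G_unitary = np.eye(Nt * T, dtype=complex)                          # satisfies (15)
G_lopsided = np.diag([np.sqrt(2), np.sqrt(2), 0, 0]).astype(complex)  # same trace

unc_unitary = unc_lopsided = coop_unitary = 0.0
for _ in range(trials):
    H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)
    Hu = equivalent_channel(H, G_unitary, T)
    Hl = equivalent_channel(H, G_lopsided, T)
    unc_unitary += uncooperated_sum(Hu, P, Nt, sigma2) / trials
    unc_lopsided += uncooperated_sum(Hl, P, Nt, sigma2) / trials
    coop_unitary += cooperated_sum(Hu, P, Nt, sigma2) / trials
```

In this toy comparison the unitary choice yields the larger uncooperated sum, and that sum stays below the cooperated value for the same block, in line with the discussion around Theorem 1.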
B. Minimizing the Outage Probability of the Extrinsic Information
Theorem 2: The probability $\Pr(I_{E1}(Q) \ge \gamma)$ is maximized only when $\mathbf{M}_i\mathbf{M}_i^H$ is full rank
with identical nonzero eigenvalues for all $i$.
Proof: See Appendix II.
Interestingly, the set of dispersion matrices satisfying the conditions in Theorem 2 also
optimizes the pairwise error performance. With perfect feedback ($I_{A1} \approx Q$) for the symbols
other than the symbol of concern, say $x_i$, the detection is based on the following observation
obtained by perfectly cancelling interference from other symbols within the same block, i.e.,
$$\mathbf{y}_i = \mathbf{y} - \sqrt{P/N_t}\sum_{l \ne i} \mathbf{h}_l x_l = \sqrt{P/N_t}\,\mathbf{h}_i x_i + \mathbf{z} \tag{18}$$
Since $\mathbf{h}_i = \mathrm{vec}(\mathbf{H}\mathbf{M}_i)$, the modified Euclidean distance between a pair of transmitted
symbol vectors differing only at position $i$ in a modulation block is
$$d^2(x_i, x'_i \mid \mathbf{H}) = |x_i - x'_i|^2 \sum_{m=1}^{N_r} \mathbf{h}_m^H \mathbf{M}_i \mathbf{M}_i^H \mathbf{h}_m \tag{19}$$
where $\mathbf{h}_m^H$ is the $m$-th row vector of $\mathbf{H}$. Following a procedure similar to that of Tarokh et al. in [3],
it can be readily found that, to minimize the error probability of $x_i$, one needs to maximize
the rank of $\mathbf{M}_i$ and the product of the nonzero eigenvalues of $\mathbf{M}_i\mathbf{M}_i^H$, just like the Rank and
Determinant criteria in [3]. When $\mathrm{tr}(\mathbf{M}_i\mathbf{M}_i^H)$ is fixed, the maximum coding gain is achieved
when all the eigenvalues of $\mathbf{M}_i\mathbf{M}_i^H$ are equal.
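The distance in (19) and the equal-eigenvalue argument can be checked numerically. The following is a small sketch under assumptions: the function name, the random channel and the example dispersion matrix are hypothetical and serve only to show that the row-wise quadratic form equals $|x_i - x'_i|^2 \|\mathbf{H}\mathbf{M}_i\|_F^2$, and that, for a fixed trace, equal eigenvalues give a larger product than skewed ones.

```python
import numpy as np

def modified_distance(H, M_i, x, x_alt):
    """Eq. (19): |x - x'|^2 * sum_m h_m^H M_i M_i^H h_m, with h_m^H the m-th row of H."""
    MMh = M_i @ M_i.conj().T
    quad = sum(np.real(row @ MMh @ row.conj()) for row in H)
    return abs(x - x_alt) ** 2 * quad

rng = np.random.default_rng(2)
H = (rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))) / np.sqrt(2)
M_i = np.diag([1, 1j]) / np.sqrt(2)      # hypothetical matrix with unit Frobenius norm

d2 = modified_distance(H, M_i, 1.0, -1.0)
# Same quantity through h_i = vec(H M_i): squared norm of the effective signature.
d2_direct = abs(1.0 - (-1.0)) ** 2 * np.linalg.norm(H @ M_i, "fro") ** 2

# For a fixed trace of 1, compare the eigenvalue products of two hypothetical spectra.
product_equal = float(np.prod([0.5, 0.5]))
product_skewed = float(np.prod([0.9, 0.1]))
```

The comparison of `product_equal` and `product_skewed` is just the arithmetic-geometric mean inequality applied to the eigenvalues of $\mathbf{M}_i\mathbf{M}_i^H$.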
In summary, we have the following two criteria for the design of the LD ST modulator.
Capacity Criterion: The symbol rate of the LD ST modulator must be $N_t$ symbols per channel
use. Furthermore, the dispersion matrices shall be chosen such that each has a Frobenius norm of 1 and
the trace of the Hermitian product of any pair of distinct dispersion matrices is 0.
Error-Performance Criterion: For the best error performance, the matrices $\mathbf{M}_i\mathbf{M}_i^H$
for any $i$ must be full rank with identical eigenvalues.
If the modulation block length $T$ is $N_t$ or greater, full rank can be easily guaranteed, while
if $T$ is less than $N_t$, full rank is impossible. The Error-Performance Criterion also suggests that the
minimum modulation block length shall be $N_t$.
C. Design Examples
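To demonstrate our design criteria, three inner ST modulation design examples are provided below.
In all three schemes, $T = N_t$ and $L = N_t^2$. Let us denote
$$\mathbf{P} = \begin{bmatrix} \mathbf{0}_{1\times(N_t-1)} & 1 \\ \mathbf{I}_{N_t-1} & \mathbf{0}_{(N_t-1)\times 1} \end{bmatrix}, \qquad \mathbf{S} = \begin{bmatrix} 1 & \mathbf{0}_{1\times(N_t-1)} \\ \mathbf{0}_{(N_t-1)\times 1} & \mathbf{0}_{(N_t-1)\times(N_t-1)} \end{bmatrix},$$
and $\mathbf{F} = [f_{mn}]$ as the DFT matrix of size $N_t \times N_t$ with $(m, n)$-th entry $f_{mn} = \frac{1}{\sqrt{N_t}}\exp(2\pi j(m-1)(n-1)/N_t)$. The dispersion matrices of the three
schemes are listed below.
Scheme 1: This is an optimal design. The associated dispersion matrices are
$$\mathbf{M}_{(k-1)N_t+i} = \mathrm{diag}[\mathbf{f}_k]\,\mathbf{P}^{(i-1)} \tag{20}$$
for $k = 1, 2, \ldots, N_t$ and $i = 1, 2, \ldots, N_t$, where $\mathbf{f}_k$ denotes the $k$-th column vector of $\mathbf{F}$.
Scheme 2: The dispersion matrices of the full-diversity and full-rate scheme [13] are
$$\mathbf{M}_{(k-1)N_t+i} = \alpha^{k-1}\,\mathrm{diag}[\mathbf{f}_k]\,\beta^{i-1}\,\mathbf{P}^{(i-1)} \tag{21}$$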
for $k = 1, 2, \ldots, N_t$ and $i = 1, 2, \ldots, N_t$, where $\alpha$ and $\beta$ are constellation dependent and are
chosen to guarantee full diversity [13].
Scheme 3: The dispersion matrices of the threaded BLAST scheme [11] are
$$\mathbf{M}_{(k-1)N_t+i} = \mathbf{P}^{(k-1)+(i-1)}\,\mathbf{S}\,\mathbf{P}^{(i-1)} \tag{22}$$
for $k = 1, 2, \ldots, N_t$ and $i = 1, 2, \ldots, N_t$.
It can be readily checked that although all three schemes satisfy the Capacity Criterion
and preserve the original MIMO channel capacity, only the first two schemes satisfy the Error-
Performance Criterion. Without an outer encoder, Scheme 2 achieves full diversity gain with
appropriate $\alpha$ and $\beta$ and hence outperforms Scheme 1. However, in the presence of an outer coder,
the two schemes should perform similarly.
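The claims about Scheme 1 and Scheme 3 can be checked numerically with a short sketch. The function names are assumptions, and the exponent of the last factor in Scheme 3 is taken as it appears in the text, which is itself an assumption given the damaged source; the rank and orthogonality checks below do not depend on the sign of that exponent.

```python
import numpy as np

def shift_matrix(Nt):
    """P = [[0, 1], [I, 0]]: cyclic shift with a 1 in the top-right corner."""
    return np.roll(np.eye(Nt), 1, axis=0)

def scheme1(Nt):
    """M_{(k-1)Nt+i} = diag[f_k] P^(i-1), with f_k the k-th column of the DFT matrix."""
    idx = np.arange(Nt)
    F = np.exp(2j * np.pi * np.outer(idx, idx) / Nt) / np.sqrt(Nt)
    P = shift_matrix(Nt)
    return [np.diag(F[:, k]) @ np.linalg.matrix_power(P, i)
            for k in range(Nt) for i in range(Nt)]

def scheme3(Nt):
    """M_{(k-1)Nt+i} = P^((k-1)+(i-1)) S P^(i-1), as transcribed in (22)."""
    P = shift_matrix(Nt)
    S = np.zeros((Nt, Nt))
    S[0, 0] = 1.0
    return [np.linalg.matrix_power(P, k + i) @ S @ np.linalg.matrix_power(P, i)
            for k in range(Nt) for i in range(Nt)]

def satisfies_capacity_criterion(Ms):
    """tr(M_m^H M_n) = 1 if m == n else 0, i.e. G^H G = I with G = [vec(M_i)]."""
    G = np.column_stack([M.reshape(-1, order="F") for M in Ms])
    return np.allclose(G.conj().T @ G, np.eye(G.shape[1]))

def satisfies_error_criterion(Ms):
    """Each M_i M_i^H is full rank with identical eigenvalues (a scaled identity)."""
    Nt = Ms[0].shape[0]
    return all(np.allclose(M @ M.conj().T, np.trace(M @ M.conj().T) / Nt * np.eye(Nt))
               and np.linalg.matrix_rank(M) == Nt for M in Ms)

Nt = 3
s1, s3 = scheme1(Nt), scheme3(Nt)
results = {
    "scheme1": (satisfies_capacity_criterion(s1), satisfies_error_criterion(s1)),
    "scheme3": (satisfies_capacity_criterion(s3), satisfies_error_criterion(s3)),
}
```

Each Scheme 3 matrix has a single nonzero entry and therefore rank 1, which is why it fails the second check while still passing the first.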
D. Trade-Off Between Constellation Size and Modulation Symbol Rate
For a given inner ST modulation rate $R_m = LQ/T$ in bits per channel use, there exists a trade-off between
the constellation size (through $Q$) and the symbol rate $L/T$. From Theorem 1, $\bar{I}_{E1}(0)$ is maximized only when
$L/T \ge N_t$. In fact, it can also be shown that $\bar{I}_{E1}(0)$ is monotonic with respect to $L/T$. Hence,
for a given $R_m$, the minimal integer $Q$ satisfying $Q \ge R_m/N_t$ shall be selected. If $R_m < N_t Q$
with the chosen $Q$, a subset of dispersion matrices shall be selected from the optimal design.
To verify this design method, a simulation was set up for a system with $N_t = N_r = T = 2$
under the ergodic Rayleigh flat fading channel. In this simulation, we compared the uncooperated
sum capacity of Scheme 1, which is an optimal design, the Alamouti scheme [5], and an orthogonal
scheme obtained by selecting a pair of orthogonal dispersion matrices from Scheme 1. The corresponding
numerical results are presented in Fig. 5 for target modulation rates $R_m$ of 3 and 4 bits
per channel use, respectively. When $R_m = 3$, the optimal design uses QPSK modulation and
chooses 3 out of the 4 possible dispersion matrices in (20). As can be seen from these figures, the
performance difference between the optimal and the other two schemes grows
as $R_m$ increases.
In summary, for a given inner ST modulation rate, one seeks to choose a constellation size
as small as possible until the symbol rate reaches $N_t$.
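The selection rule above amounts to a one-line computation. In the sketch below, the function name is an assumption; it returns the smallest integer $Q$ with $Q \ge R_m/N_t$ and the number of dispersion matrices $L = R_m T/Q$ needed to reach the target rate.

```python
import math

def choose_constellation(Rm, Nt, T):
    """Smallest integer Q with Q >= Rm/Nt, and the resulting L = Rm*T/Q.

    L may come out non-integer for some targets; such cases would need a
    different Q or T and are not handled in this sketch.
    """
    Q = math.ceil(Rm / Nt)
    L = Rm * T / Q
    return Q, L

# The two targets considered in the text, with Nt = T = 2.
Q3, L3 = choose_constellation(Rm=3, Nt=2, T=2)   # QPSK with 3 of the 4 dispersion matrices
Q4, L4 = choose_constellation(Rm=4, Nt=2, T=2)   # QPSK with all 4 dispersion matrices
```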
V. SIMULATION RESULTS AND DISCUSSIONS
The EXIT characteristics of the three design examples, as well as their error performance,
are compared in this section to demonstrate our design criteria. Instead of the prohibitively
complex MAP detection in (5) and (6), the soft-interference-cancellation minimum mean square
error (SIC-MMSE) detection algorithm [22] was used as the inner demodulator in the simulation.
In Fig. 6, the EXIT characteristics of the three schemes are compared under several fixed
channel conditions. In the figure, $E_b/N_0 = 0$ dB ($E_b$ is the transmitted power per bit here),
$N_t = N_r = 2$, and QPSK constellation were applied, and the channel coefficient matrix was
assumed to have the form
$H = \begin{bmatrix} 1 & \rho\cos\theta \\ 0 & \rho\sin\theta \end{bmatrix}$.
Results were obtained for various values of the magnitude $\rho$ and the angle $\theta$. In the
EXIT charts, the magnitude determines the ending point of the transfer function of the inner ST
demodulator, while the angle affects the starting point when the magnitude is fixed. As can be
observed, all the schemes have similar starting points under the same channel condition, which is
expected since they all satisfy the Capacity Criterion. The small variation in the starting points
among different schemes is due to the use of a suboptimal demodulation algorithm (i.e., SIC-MMSE).
However, since Scheme 3 does not comply with the Error-Performance Criterion, under most channel
conditions it has smaller extrinsic information than Schemes 1 and 2, i.e., its outage probability
of the extrinsic information is larger. It can also be seen from the figure that the first two
schemes significantly outperform the third scheme, particularly when the gain of the second
transmit antenna is significantly smaller than that of the first transmit antenna.
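As a rough intuition for the roles of the two channel parameters, the following Python sketch builds the fixed 2x2 test channel and inspects its columns. The names `rho` and `theta_deg` are labels assumed here for the magnitude and the angle, and the helper functions are hypothetical, not part of the original simulation code. The energy of the second column depends only on the magnitude, while the overlap between the two columns (the mutual interference seen when no a priori information is available) depends on the angle.

```python
import math


def channel(rho, theta_deg):
    """Fixed test channel [[1, rho*cos(theta)], [0, rho*sin(theta)]]."""
    t = math.radians(theta_deg)
    return [[1.0, rho * math.cos(t)], [0.0, rho * math.sin(t)]]


def column_gain(h, col):
    """Squared norm of one column, i.e. the gain of one transmit antenna."""
    return sum(abs(row[col]) ** 2 for row in h)


def column_correlation(h):
    """Inner product of the two columns (real-valued for this channel)."""
    return sum(row[0] * row[1] for row in h)


for rho in (0.2, 1.0):
    for theta in (10, 90):
        h = channel(rho, theta)
        print(rho, theta, column_gain(h, 1), column_correlation(h))
```

At an angle of 90 degrees the two columns are orthogonal, and at 10 degrees they are nearly aligned, while the gain of the second antenna stays at the squared magnitude in both cases.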
We now consider fading channels. In Fig. 7(a) and 7(b), the cumulative distribution functions
(CDFs) of the extrinsic information in the two extreme cases, i.e., $I_{E1}(0)$ and $I_{E1}(Q)$,
under Rayleigh flat fading channels are given, respectively. In the figure, a channel is said to
be a slow fading channel if it remains constant within a coding frame but changes independently
from frame to frame, while a channel is a fast fading channel if it remains constant within a
modulation block but varies independently from block to block. In the simulation,
$E_b/N_0 = 3$ dB, $N_t = N_r = 2$, and QPSK constellation were applied. It can be seen from the
simulation results that all three schemes have similar statistics for $I_{E1}(0)$, but
significantly different statistics for $I_{E1}(Q)$. The simulation results for
$I_{E1}(Q)$ in Fig. 7(b) demonstrate that the first two schemes outperform Scheme 3 significantly
under both slow and fast fading channels. Although Scheme 2 outperforms Scheme 1 in an
uncoded system, their performance curves in a coded system are almost indistinguishable. For
both $I_{E1}(0)$ and $I_{E1}(Q)$, the performance in a fast fading channel is significantly better
than in a slow fading channel. This is expected because the ergodic capacity can be approached,
and temporal diversity can be exploited by the outer encoder, only when the channel varies
significantly within a coding frame.
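The two fading models defined above differ only in how often a new channel realization is drawn. A minimal Python sketch, assuming unit-variance complex Gaussian channel entries and using hypothetical helper names (`rayleigh_matrix`, `frame_channels`), is:

```python
import random


def rayleigh_matrix(rng, n_rx, n_tx):
    """One Nr x Nt channel draw with i.i.d. CN(0, 1) entries."""
    s = 0.5 ** 0.5  # standard deviation of each real/imaginary part
    return [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n_tx)]
            for _ in range(n_rx)]


def frame_channels(rng, n_blocks, n_rx=2, n_tx=2, fading="slow"):
    """Channel seen by each modulation block of one coding frame."""
    if fading == "slow":
        # Constant within the coding frame; a new draw is made per frame.
        h = rayleigh_matrix(rng, n_rx, n_tx)
        return [h] * n_blocks
    if fading == "fast":
        # Constant within a modulation block; independent from block to block.
        return [rayleigh_matrix(rng, n_rx, n_tx) for _ in range(n_blocks)]
    raise ValueError("fading must be 'slow' or 'fast'")


rng = random.Random(1)
slow = frame_channels(rng, 4, fading="slow")
fast = frame_channels(rng, 4, fading="fast")
print(all(h == slow[0] for h in slow))   # same realization for every block
print(len({str(h) for h in fast}))       # number of distinct realizations
```

Calling `frame_channels` once per coding frame reproduces the frame-to-frame independence of the slow fading model.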
Finally, in Fig. 8, the frame error rates (FERs) of the three schemes under different Rayleigh
ergodic flat fading channels are compared. In the simulation, $N_t = N_r = T = 2$, QPSK
constellation, and 200 symbols per coding frame were assumed. A convolutional code with coding
rate 1/2 and constraint length 4 is used as the outer encoder. Its generator polynomials are
H1(D) = 04, H2(D) = 13. Consistent with the EXIT chart analysis, the first two schemes
perform indistinguishably from each other but outperform the third scheme significantly after
sufficient iterations. It is also clear that the three schemes perform closely after the first
iteration since they all comply with the Capacity Criterion. Again, the performance of all these
schemes in a fast fading channel is significantly better than that in a slow fading channel.
VI. CONCLUSION
A coded space-time modulation scheme with a conventional outer encoder for MIMO wireless
communications has been investigated. Using the EXIT chart technique, the design of the inner
space-time modulator has been studied under the assumption of a joint iterative receiver. Two
design criteria are derived that relate to the channel capacity and the error performance,
respectively. To guarantee the convergence of the iterative receiver, the uncooperated sum
capacity of the inner ST modulation must be maximized. Once convergence is achieved, the error
performance is optimized by maximizing the rank and determinant of the dispersion matrix of each
individual symbol. The latter Error-Performance Criterion is much easier to apply than the
well-known Tarokh's Rank and Determinant Criteria in [3]. The two proposed criteria together
allow a complete design of the system with respect to both the data rate and the error
performance. Specifically, it is shown that for a given inner ST modulation rate in bits, the
constellation size shall be minimized
until the maximum symbol rate, i.e., $N_t$, is reached. The proposed design criteria have been
verified by design examples and simulation results.
APPENDIX I
PROOF OF THEOREM 1
Under the assumption of i.i.d. Gaussian input symbols $\{x_i\}$, we have
$R_G = E(G x x^H G^H) = G G^H$. Since $R_G$ is nonnegative definite, it can be decomposed as
$R_G = Q \Lambda Q^H$, where $Q$ is a unitary matrix and $\Lambda = \mathrm{diag}[\lambda_i]$,
$i = 1, 2, \ldots, N_t T$. Since $R_G = Q \Lambda Q^H$ will have the same uncooperated sum
capacity as $\Lambda$, we need only consider
$R_G = G G^H = \Lambda$. (23)
For the energy constraint, we have $\sum_{i=1}^{N_t T} \lambda_i = N_t T$.
By substituting (13) into (12), we obtain
$I(x_i; y | H) = \log\left(1 + \frac{P}{N_t} h_i^H \left(\frac{P}{N_t} H_I H_I^H + \sigma_z^2 I\right)^{-1} h_i\right)$ (24)
where $H_I$ is defined in (6). Further, substituting (4) and (23) into (24) and using the inverse
of a small-rank adjustment of a matrix in [30], we can write
$I(x_i; y | H) = -\log\left(1 - \frac{P}{N_t} \lambda_i h_i^H R^{-1} h_i\right)$ (25)
where $h_i$ is the $i$-th column vector of $H$ in (3) and
$R = \frac{P}{N_t} \sum_{j=1}^{N_t T} \lambda_j h_j h_j^H + \sigma_z^2 I = \frac{P}{N_t} H \Lambda H^H + \sigma_z^2 I$. (26)
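The step from (24) to (25) can be spelled out with the rank-one inverse update (Sherman-Morrison formula). The shorthand $c$, $R_{\setminus i}$ and $b_i$ below is introduced only for this sketch and does not appear in the original text; it also assumes that the interference-plus-noise matrix in (24) equals $R$ in (26) with the $i$-th term removed.

```latex
% Assumed shorthand (not in the original): c = P/N_t and
% R_{\setminus i} = R - c \lambda_i h_i h_i^H, the interference-plus-noise term of (24).
\begin{align*}
b_i &= c\,\lambda_i\, h_i^H R_{\setminus i}^{-1} h_i, \\
R^{-1} &= R_{\setminus i}^{-1}
  - \frac{c\,\lambda_i\, R_{\setminus i}^{-1} h_i h_i^H R_{\setminus i}^{-1}}{1 + b_i}, \\
c\,\lambda_i\, h_i^H R^{-1} h_i &= b_i - \frac{b_i^2}{1 + b_i} = \frac{b_i}{1 + b_i}, \\
\log(1 + b_i) &= -\log\left(1 - c\,\lambda_i\, h_i^H R^{-1} h_i\right).
\end{align*}
```

The last line is the identity that turns the signal-to-interference-plus-noise form of (24) into the form of (25), which involves the full matrix $R$ only.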
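Hence, the uncooperated sum capacity $I_{sum}$ can be expressed as
$I_{sum} = \max_{\Lambda} E\left[-\log \prod_{i=1}^{N_t T} \left(1 - \frac{P}{N_t} \lambda_i h_i^H R^{-1} h_i\right)\right]$. (27)
Noting that $1 - \frac{P}{N_t} \lambda_i h_i^H R^{-1} h_i$ is the $i$-th diagonal entry of the
matrix $I - \frac{P}{N_t} \Lambda^{1/2} H^H R^{-1} H \Lambda^{1/2}$, we seek to minimize
$J(\Lambda) = E \log \Delta\left(I - \frac{P}{N_t} \Lambda^{1/2} H^H R^{-1} H \Lambda^{1/2}\right)$ (28)
where $\Delta(X)$ denotes the product of the diagonal entries of matrix $X$. Substituting (26)
into (28) and using the inverse of a small-rank adjustment in [30], we have
$J(\Lambda) = E \log \Delta\left(\left(I + \frac{P}{N_t \sigma_z^2} \Lambda^{1/2} H^H H \Lambda^{1/2}\right)^{-1}\right)$. (29)
It can be easily proven that
$(\lambda_i - \lambda_j)\left(\frac{\partial J}{\partial \lambda_i} - \frac{\partial J}{\partial \lambda_j}\right) \ge 0$, $1 \le i, j \le N_t T$.
This, with the fact that $J(\Lambda)$ is symmetric with respect to $\lambda_i$,
$1 \le i \le N_t T$, shows that $J(\Lambda)$ is Schur-convex [27][28]. Hence, $\Lambda$ must be of
the form $I_{N_t T}$. By the energy constraint $\mathrm{tr}(G G^H) = N_t T$, this is possible
only if (15) is satisfied.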
APPENDIX II
PROOF OF THEOREM 2
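If the a priori information is perfect, we can obtain from (6)
$I_{E1}(Q) = \log\left(1 + \frac{P}{N_t \sigma_z^2} h_i^H h_i\right)$. (30)
From (30), maximizing the probability that $I_{E1}(Q)$ exceeds a given threshold is equivalent to
maximizing the probability that $\|h_i\|^2$ exceeds the corresponding threshold. Noting that
$h_i = H m_i$, we have
$\|h_i\|^2 = \sum_{m=1}^{N_r} h_m^H M_i M_i^H h_m$ (31)
where $h_m$ is the $m$-th column vector of $H^H$. The following decomposition is assumed:
$M_i M_i^H = U \Lambda U^H$ (32)
where $\Lambda = \mathrm{diag}[\lambda_n]$, $n = 1, 2, \ldots, N_t$. By Lemma 5 in [1], since
$U^H h_m$ has the same distribution as $h_m$, we need only consider $M_i M_i^H = \Lambda$. Then we have
$\|h_i\|^2 = \sum_{n=1}^{N_t} \lambda_n \sum_{m=1}^{N_r} |h_{mn}|^2$. (33)
For the power constraint, we have $\sum_{n=1}^{N_t} \lambda_n = 1$.
By [29], since the random variables $\sum_{m=1}^{N_r} |h_{mn}|^2$ in (33) are identically
chi-square distributed with $2N_r$ degrees of freedom, the probability that $\|h_i\|^2$ exceeds
the threshold is maximized if all the eigenvalues of $M_i M_i^H$ are equal. With
$\mathrm{tr}(M_i M_i^H) = 1$, this implies that $\lambda_i = 1/N_t$, $1 \le i \le N_t$.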
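The effect of the eigenvalue spread in (33) can be illustrated with a small Monte Carlo experiment. The sketch below is not the authors' simulation: the function name, the threshold value 0.5, and the two eigenvalue profiles are assumptions chosen for illustration, with unit-variance complex Gaussian channel entries and a threshold in the low-gain (outage) regime.

```python
import random


def prob_gain_at_least(eigs, n_rx, threshold, trials=20000, seed=0):
    """Estimate Pr(sum_n eigs[n] * sum_m |h_mn|^2 >= threshold), h_mn ~ CN(0, 1)."""
    rng = random.Random(seed)
    s = 0.5 ** 0.5  # standard deviation of each real/imaginary part
    hits = 0
    for _ in range(trials):
        gain = 0.0
        for lam in eigs:
            # Energy collected over the receive antennas for one eigen-direction.
            energy = sum(rng.gauss(0.0, s) ** 2 + rng.gauss(0.0, s) ** 2
                         for _ in range(n_rx))
            gain += lam * energy
        if gain >= threshold:
            hits += 1
    return hits / trials


# Nt = Nr = 2, eigenvalues constrained to sum to 1 as in the power constraint.
p_equal = prob_gain_at_least([0.5, 0.5], n_rx=2, threshold=0.5)
p_unequal = prob_gain_at_least([1.0, 0.0], n_rx=2, threshold=0.5)
print(p_equal, p_unequal)
```

For this threshold the equal-eigenvalue profile yields the larger probability, which is consistent with the conclusion $\lambda_i = 1/N_t$; the comparison is only a numerical illustration for one operating point, not a proof.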
REFERENCES
[1] I. E. Telatar, "Capacity of multi-antenna Gaussian channels," Eur. Trans. Telecom., vol. 10, pp. 585-595, Nov. 1999.
[2] G. J. Foschini and M. J. Gans, "On limits of wireless communications in a fading environment when using multiple antennas," Wireless Personal Communications, vol. 6, no. 3, pp. 311-335, 1998.
[3] V. Tarokh, N. Seshadri, and A. Calderbank, "Space-time codes for high data rate wireless communications: Performance criterion and code construction," IEEE Trans. Inform. Theory, vol. 44, pp. 744-765, Mar. 1998.
[4] V. Tarokh, H. Jafarkhani, and A. R. Calderbank, "Space-time block code from orthogonal designs," IEEE Trans. Inform. Theory, vol. 45, pp. 1456-1467, July 1999.
[5] S. Alamouti, "A simple transmitter diversity scheme for wireless communications," IEEE J. Select. Areas Commun., vol. 16, pp. 1451-1458, Oct. 1998.
[6] Y. Liu and M. Fitz, "Space-time turbo codes," 13th Annual Allerton Conf. on Commun., Control and Computing, Sep. 1999.
[7] D. Cui and A. M. Haimovich, "Design and performance of turbo space-time coded modulation," IEEE GLOBECOM'00, vol. 3, pp. 1627-1631, Nov. 2000.
[8] D. Tujkovic, "Recursive space-time trellis codes for turbo coded modulation," Proc. of GlobeCom 2000, San Francisco.
[9] G. J. Foschini, "Layered space-time architecture for wireless communication in fading environments when using multiple antennas," Bell Labs Tech. J., vol. 1, no. 2, pp. 41-59, 1996.
[10] G. D. Golden, G. J. Foschini, R. A. Valenzuela, and P. W. Wolniansky, "Detection algorithm and initial laboratory results using V-BLAST space-time communication architecture," Electron. Lett., vol. 35, pp. 14-16, Jan. 1999.
[11] H. El Gamal and A. R. Hammons Jr., "A new approach to layered space-time coding and signal processing," IEEE Trans. Inf. Theory, vol. 47, pp. 2321-2334, Sep. 2001.
[12] B. Hassibi and B. Hochwald, "High-rate codes that are linear in space and time," IEEE Trans. Inform. Theory, vol. 48, pp. 1804-1824, July 2002.
[13] X. Ma and G. B. Giannakis, "Full-Diversity Full-Rate Complex-Field Space-Time Coding," IEEE Trans. Signal Processing, vol. 51, no. 11, pp. 2917-2930, July 2003.
[14] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit error correcting coding and decoding: Turbo codes," in Proc. IEEE Int. Conf. Commun., vol. 2, pp. 1064-1070, Geneva, Switzerland, May 1993.
[15] B. M. Hochwald and S. ten Brink, "Achieving Near-Capacity on a Multiple-Antenna Channel," IEEE Trans. Comm., vol. 51, pp. 389-399, Mar. 2003.
[16] S. ten Brink, "Convergence of iterative decoding," Electron. Lett., vol. 35, no. 13, pp. 1117-1118, Jun. 1999.
[17] S. ten Brink, "Convergence behavior of iteratively decoded parallel concatenated codes," IEEE Trans. Commun., vol. 49, pp. 1727-1737, Oct. 2001.
[18] A. van Zelst, R. van Nee, and G. A. Awater, "Turbo-BLAST and its performance," in Proc. Vehicle Technology Conf., vol. 2, pp. 1282-1286, May 2001.
[19] X. Li and J. A. Ritcey, "Bit-interleaved coded modulation with iterative decoding," in Proc. Int. Conf. Communications, pp. 858-862, June 1999.
[20] A. M. Tonello, "Space-time bit-interleaved coded modulation with an iterative decoding strategy," in Proc. Vehicle Technology Conf., pp. 473-478, Sept. 2000.
[21] X. Wang and H. Poor, "Iterative (turbo) soft interference cancellation and decoding for coded CDMA," IEEE Trans. Comm., vol. 47, pp. 1046-1061, July 1999.
[22] M. Tuchler, A. Singer, and R. Koetter, "Minimum mean square error equalization using a priori information," IEEE Trans. Signal Processing, vol. 50, pp. 673-683, Mar. 2002.
[23] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal decoding of linear codes for minimizing symbol error rate," IEEE Trans. Inf. Theory, vol. 20, no. 2, pp. 284-287, Mar. 1974.
[24] U. Fincke and M. Pohst, "Improved methods for calculating vectors of short length in a lattice, including a complexity analysis," Math. Comput., vol. 44, pp. 463-471, Apr. 1985.
[25] B. Vucetic and J. Yuan, Turbo Codes: Principles and Applications, Kluwer, 2000.
[26] L. Zheng and D. Tse, "Diversity and multiplexing: A fundamental tradeoff in multiple antenna channels," IEEE Trans. Inform. Theory, vol. 49, pp. 1073-1096, May 2003.
[27] A. W. Marshall and I. Olkin, Inequalities: Theory of Majorization and Its Applications, Academic Press, Inc. (London) Ltd., 1979.
[28] H. Boche and E. A. Jorswieck, "On Schur-convexity of expectation of weighted sum of random variables with applications," Journal of Inequalities in Pure and Applied Mathematics, vol. 5, no. 2, Article 46, 2004.
[29] M. E. Bock, P. Diaconis, F. W. Huffer, and M. D. Perlman, "Inequalities for linear combinations of Gamma random variables," Canadian J. Statistics, vol. 15, pp. 387-395, 1987.
[30] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, 1985.
[31] T. M. Cover and J. A. Thomas, Elements of Information Theory, New York: Wiley, 1991.
[32] T. Rappaport, Wireless Communications: Principles and Practice, 2nd ed., Prentice Hall, 2001.
[33] J. Proakis, Digital Communications, 4th ed., New York: McGraw-Hill.
Fig. 1. System Block Diagram.
[Figure: EXIT chart with both axes ranging from 0 to 1. Axis labels: IE1(IA1), where output IE1 becomes input IA2; IE2(IA2), where output IE2 becomes input IA1. Curves: outer decoder; IE1 of symbol group 1; IE1 of symbol group 2; combined IE1 (average of the two symbol groups).]
Fig. 2. A typical EXIT chart of joint iterative demodulation/decoding receiver for QPSK constellation: two input symbols per
modulation block.
Fig. 3. BCJR/MAP decoding with different a priori LLR inputs: I(La; x) = (1/2) [I(La1; x) + I(La2; x)]
(a) Case 1: var1 = 0.4, var2 = 9.0 and var = 1.4
(b) Case 2: var1 = 2.0, var2 = 4.0 and var = 2.7
Fig. 4. Histograms of x for the BCJR/MAP decoder (a) and (b). x1, x2 and x are the soft outputs corresponding to
Le1, Le2 in BCJR/MAP decoder (a) and Le in BCJR/MAP decoder (b), respectively. var1, var2 and var are the associated
variances of the AWGN.
[Figure, two panels: uncooperated sum capacity (bits/channel use) versus $P/\sigma_z^2$ (dB), from 0 to 12 dB. Curves in (a): Gaussian-opt, QPSK-opt, Gaussian-orth1, 8PSK-orth1, Gaussian-orth2, 8PSK-orth2. Curves in (b): Gaussian-opt, QPSK-opt, Gaussian-orth1, 16QAM-orth1, Gaussian-orth2, 16QAM-orth2.]
(a) Target Rm = 3 bits per channel use
(b) Target Rm = 4 bits per channel use
Fig. 5. Uncooperated sum capacity versus $P/\sigma_z^2$ for Nt = Nr = T = 2 under the Rayleigh ergodic flat fading channel. Dashed
curves correspond to Gaussian inputs and solid lines correspond to specific constellations. "opt", "orth1" and "orth2" correspond
to Scheme 1, the Alamouti scheme, and the orthogonal scheme (a subset of Scheme 1), respectively.
[Figure, two panels: average IE (extrinsic info.) versus IA (a priori info.), with IA from 0 to 2. Curves: Schemes 1, 2 and 3, each at channel angles of 10° and 90°.]
(a) when the channel magnitude is 0.2
(b) when the channel magnitude is 1.0
Fig. 6. The EXIT characteristics of the three schemes under various channels H.
[Figure, two panels: CDF (0 to 1) versus extrinsic information. Panel (a) horizontal axis: IE1(0) (extrinsic info. without a priori), from 1 to 2. Panel (b) horizontal axis: IE1(Q) (extrinsic info. with perfect a priori), from 1.6 to 2. Curves: Schemes 1, 2 and 3, each under slow and fast fading.]
(a) IE1(0)
(b) IE1(Q)
Fig. 7. Comparison of the CDFs of IE1(0) and IE1(Q) for the three schemes under different Rayleigh ergodic flat fading
channels. "fast" indicates the fast fading channel and "slow" indicates the slow fading channel.
[Figure: FER (logarithmic scale, 10^-3 to 10^0) versus Eb/No (dB), from 0 to 10 dB. Curves: Schemes 1, 2 and 3, each after the 1st and the 8th iteration, under fast and slow fading.]
Fig. 8. FER comparison of the three schemes under different Rayleigh ergodic flat fading channels. "fast" indicates the fast fading
channel and "slow" indicates the slow fading channel.