11.9 zero-mean gaussian noise is the worst additive noise · worst additive noise • we will show...
TRANSCRIPT
11.9 Zero-Mean Gaussian Noise is the Worst Additive Noise
Worst Additive Noise
• We will show that in terms of the capacity of the system, the zero-mean
Gaussian noise is the worst additive noise given that the noise vector has
a fixed correlation matrix.
• The diagonal elements of the correlation matrix specify the power of the
individual noise variables.
• The other elements in the matrix give a characterization of the correlation
between the noise variables.
Worst Additive Noise
• We will show that in terms of the capacity of the system, the zero-mean
Gaussian noise is the worst additive noise given that the noise vector has
a fixed correlation matrix.
• The diagonal elements of the correlation matrix specify the power of the
individual noise variables.
• The other elements in the matrix give a characterization of the correlation
between the noise variables.
Worst Additive Noise
• We will show that in terms of the capacity of the system, the zero-mean
Gaussian noise is the worst additive noise given that the noise vector has
a fixed correlation matrix.
• The diagonal elements of the correlation matrix specify the power of the
individual noise variables.
• The other elements in the matrix give a characterization of the correlation
between the noise variables.
Worst Additive Noise
• We will show that in terms of the capacity of the system, the zero-mean
Gaussian noise is the worst additive noise given that the noise vector has
a fixed correlation matrix.
• The diagonal elements of the correlation matrix specify the power of the
individual noise variables.
• The other elements in the matrix give a characterization of the correlation
between the noise variables.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
___________________
___________________
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
__________________________
__________________________
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
_________________
_________________
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Zero-Mean Gaussian System
1. Consider a system of correlated Gaussian channels
with noise vector Z
⇤ ⇠ N(0, K), and so
˜KZ
⇤ = K.
2. Let C⇤be the capacity of the system.
3. Let X
⇤be the zero-mean Gaussian input vector that
achieves the capacity.
4. Let Y
⇤be the output of the system with X
⇤as
input, i.e.,
Y
⇤= X
⇤+ Z
⇤.
Alternative System
1. Consider a system exactly the same as the Zero-
Mean Gaussian System except that the noise vector
Z, which has the same correlation matrix as Z
⇤, may
neither be zero-mean nor Gaussian.
2. Assume that the joint pdf of Z exists.
3. Let C be the capacity of the system.
4. Let Y be the output of the system with X
⇤as input,
i.e.,
Y = X
⇤+ Z.
Main Idea
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
• For the Zero-Mean Gaussian System, C⇤ = I(X⇤;Y⇤).
• For the alternative system, I(X⇤;Y) C.
• We will show that I(X⇤;Y⇤) I(X⇤;Y).
• Hence,C⇤ = I(X⇤;Y⇤) I(X⇤;Y) C.
Theorem 11.32 For a fixed zero-mean Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if and only if Z = Z
⇤ ⇠ N (0, K).
Lemma 11.33 Let X be a zero-mean random vector and
Y = X+ Z
where Z is independent of X. Then
˜KY =
˜KX +
˜KZ.
Remark The scalar case has been proved in the proof of Theorem 11.21.
Lemma 11.34 LetY⇤ ⇠ N (0,K) andY be any random vector with correlation
matrix K. Then
ZfY⇤
(y) log fY⇤(y)dy =
Z
SY
fY(y) log fY⇤(y)dy.
Remark A similar technique has been used in proving Theorems 2.50 and
10.41 (maximum entropy distributions).
Lemma 11.33 Let X be a zero-mean random vector and
Y = X+ Z
where Z is independent of X. Then
˜KY =
˜KX +
˜KZ.
Remark The scalar case has been proved in the proof of Theorem 11.21.
Lemma 11.34 LetY⇤ ⇠ N (0,K) andY be any random vector with correlation
matrix K. Then
ZfY⇤
(y) log fY⇤(y)dy =
Z
SY
fY(y) log fY⇤(y)dy.
Remark A similar technique has been used in proving Theorems 2.50 and
10.41 (maximum entropy distributions).
Lemma 11.33 Let X be a zero-mean random vector and
Y = X+ Z
where Z is independent of X. Then
˜KY =
˜KX +
˜KZ.
Remark The scalar case has been proved in the proof of Theorem 11.21.
Lemma 11.34 LetY⇤ ⇠ N (0,K) andY be any random vector with correlation
matrix K. Then
ZfY⇤
(y) log fY⇤(y)dy =
Z
SY
fY(y) log fY⇤(y)dy.
Remark A similar technique has been used in proving Theorems 2.50 and
10.41 (maximum entropy distributions).
Lemma 11.33 Let X be a zero-mean random vector and
Y = X+ Z
where Z is independent of X. Then
˜KY =
˜KX +
˜KZ.
Remark The scalar case has been proved in the proof of Theorem 11.21.
Lemma 11.34 LetY⇤ ⇠ N (0,K) andY be any random vector with correlation
matrix K. Then
ZfY⇤
(y) log fY⇤(y)dy =
Z
SY
fY(y) log fY⇤(y)dy.
Remark A similar technique has been used in proving Theorems 2.50 and
10.41 (maximum entropy distributions).
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Theorem 2.50 Let
p⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
X
x2Sp
p(x)ri(x) = ai for 1 i m. (1)
Then p⇤ maximizes H(p) over all probability distribu-
tion p on S subject to (1).
Sketch of Proof
H(p⇤) � H(p)
= �X
x2Sp⇤(x) ln p
⇤(x) +
X
x2Sp
p(x) ln p(x)
= �X
x2Sp
p(x) ln p⇤(x) +
X
x2Sp
p(x) ln p(x)
=
X
x2Sp
p(x) ln
p(x)
p⇤(x)
= D(pkp⇤)
� 0.
Remark The key step is to establish that
X
x2Sp⇤(x) ln p
⇤(x) =
X
x2Sp
p(x) ln p⇤(x).
Theorem 10.41 Let
f⇤(x) = e
��0
�Pm
i=1
�iri(x)
for all x 2 S, where �0
,�1
, · · · ,�m are chosen such
that
Z
Sf
ri(x)f(x)dx = ai for 1 i m. (2)
Then f⇤maximizes h(f) over all pdf f defined on S,
subject to the constraints in (2).
Remark The key step is to establish that
Z
Sf⇤(x) ln f
⇤(x)dx =
Z
Sf
f(x) ln f⇤(x)dx. (3)
Theorem 10.45 Let X be a vector of n continuous
random variables with correlation matrix
˜K. Then
h(X) 1
2
log
h(2⇡e)
n| ˜K|i
with equality if and only if X ⇠ N(0, ˜K).
Then (3) and Theorem 10.45 together imply
Lemma 11.34 Let Y
⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY
⇤ (y) log fY
⇤ (y)dy =
Z
SY
fY
(y) log fY
⇤ (y)dy.
Data Processing Inequality for Informational Divergence
Theorem Let X,X 0, Y , and Y 0be real random variables such that fXY and
fX0Y 0exist, and fY |X = fY 0|X0
. Then
D(fXkfX0) � D(fY kfY 0
).
Proof Exercise.
fY |XX
X 0 Y 0
Y
fY |X
Data Processing Inequality for Informational Divergence
Theorem Let X,X 0, Y , and Y 0be real random variables such that fXY and
fX0Y 0exist, and fY |X = fY 0|X0
. Then
D(fXkfX0) � D(fY kfY 0
).
Proof Exercise.
fY |XX
X 0 Y 0
Y
fY |X
Data Processing Inequality for Informational Divergence
Theorem Let X,X 0, Y , and Y 0be real random variables such that fXY and
fX0Y 0exist, and fY |X = fY 0|X0
. Then
D(fXkfX0) � D(fY kfY 0
).
Proof Exercise.
fY |XX
X 0 Y 0
Y
fY |X
Data Processing Inequality for Informational Divergence
Theorem Let X,X 0, Y , and Y 0be real random variables such that fXY and
fX0Y 0exist, and fY |X = fY 0|X0
. Then
D(fXkfX0) � D(fY kfY 0
).
Proof Exercise.
fY |XX
X 0 Y 0
Y
fY |X
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Z� � N (0, K)
+X⇤ Y⇤
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Lemma 11.33 Let X be a zero-mean random vector
and
Y = X + Z
where Z is independent of X. Then
˜KY =
˜KX +
˜KZ.
Z� � N (0, K)
+X⇤ Y⇤
Z : K̃Z = K
+X⇤ Y
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
_____
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
_____
_____________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
_______
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
_______
___________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
______
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
______
__________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
____
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
____
____________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Lemma 11.34 Let Y⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY⇤ (y) log fY⇤ (y)dy =
Z
SYfY(y) log fY⇤ (y)dy.
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Lemma 11.34 Let Y⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY⇤ (y) log fY⇤ (y)dy =
Z
SYfY(y) log fY⇤ (y)dy.
______
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Lemma 11.34 Let Y⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY⇤ (y) log fY⇤ (y)dy =
Z
SYfY(y) log fY⇤ (y)dy.
______
_____
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Lemma 11.34 Let Y⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY⇤ (y) log fY⇤ (y)dy =
Z
SYfY(y) log fY⇤ (y)dy.
______
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Lemma 11.34 Let Y⇤ ⇠ N(0, K) and Y be any ran-
dom vector with correlation matrix K. Then
ZfY⇤ (y) log fY⇤ (y)dy =
Z
SYfY(y) log fY⇤ (y)dy.
______
_____
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
____________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
____________________
__________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
____________________
__________________
_____________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
___________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
___________________
____________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
___________________
____________________
________________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
_____________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
_____________________
__________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
_______________________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
_______________________
__________
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
+ Y
X⇤
Z
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
+ Y
X⇤
Z
+ Y⇤
X⇤
Z⇤
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
+ Y
X⇤
Z
+ Y⇤
X⇤
Z⇤
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
+ Y
X⇤
Z
+ Y⇤
X⇤
Z⇤
Theorem 11.32 For a fixed zero-mean
Gaussian random vector X
⇤, let
Y = X
⇤+ Z,
where the joint pdf of Z exists and Z is
independent of X
⇤. Under the constraint
that the correlation matrix of Z is equal
to K, where K is any symmetric positive
definite matrix, I(X
⇤;Y) is minimized if
and only if Z = Z
⇤ ⇠ N(0, K).
Proof
1. Since EZ
⇤= 0,
˜KZ
⇤ = KZ
⇤ = K. Therefore, Z
⇤and Z have the same correlation matrix.
2. By noting that X
⇤has zero mean, we apply
Lemma 11.33 to see that Y
⇤and Y have the same
correlation matrix.
3. The theorem is proved by considering
I(X
⇤;Y
⇤) � I(X
⇤;Y)
= h(Y
⇤) � h(Z
⇤) � h(Y) + h(Z)
= �Z
fY
⇤ (y) log fY
⇤ (y)dy +
ZfZ
⇤ (z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
= �Z
fY
(y) log fY
⇤ (y)dy +
Z
SZ
fZ
(z) log fZ
⇤ (z)dz
+
ZfY
(y) log fY
(y)dy �Z
SZ
fZ
(z) log fZ
(z)dz
=
Zlog
0
@fY
(y)
fY
⇤ (y)
1
A fY
(y)dy �Z
SZ
log
0
@fZ
(z)
fZ
⇤ (z)
1
A fZ
(z)dz
= D(fY
kfY
⇤ ) � D(fZ
kfZ
⇤ )
0.
+ Y
X⇤
Z
+ Y⇤
X⇤
Z⇤