TRANSCRIPT
Sensitivity in Communication Networks: An Optimization Theoretic Perspective
Malcolm Egan, INSA Lyon, INRIA
29th June 2018
Motivation
Figure: Zimmermann, IEEE Transactions on Communications, 28(4), 1980.
Communication network design is aided by conceptual models with general structure:
- Open Systems Interconnection (OSI)
- TCP/IP
Motivation
- Another, complementary model of communication:
Figure: Roth, Introduction to Coding Theory.
Motivation
- These models simplify design by breaking it into three steps:
  1. Optimize each component.
  2. Optimize the coupling between each component.
  3. Repeat.
- Why? Joint optimization of the whole system is often intractable.
- Think of user scheduling + energy harvesting + power control + security + ...

Question: Is there a way to characterize the impact of the coupling between optimized components?
A Generic Problem in Networking
Consider two sets X and Y.
\[
\begin{aligned}
\underset{x \in X}{\text{maximize}} \quad & f(x; y) \\
\text{subject to} \quad & g_i(x; y) \le 0, \; i \in I, \\
& h_j(x; y) = 0, \; j \in J.
\end{aligned}
\]
f, g_i, h_j are continuous on X × Y.
The parameter x ∈ X is the optimization variable.
The parameter y ∈ Y is the coupling variable or side information.
The function f (x ; y) is the network utility function.
Example I: Power Control in Parallel Channels
Figure: LaSorte et al., Proc. GLOBECOM 2008.
Figure: Lozano et al., IEEE Trans. Info. Theory, 2006.
Example I: Power Control in Parallel Channels
\[
\begin{aligned}
\underset{p_k \in \mathbb{R},\; k=1,2,\dots,n}{\text{maximize}} \quad & \sum_{k=1}^{n} \frac{1}{2} \log\!\left(1 + \frac{|h_k|^2 p_k}{\sigma_N^2 + I_k}\right) \\
\text{subject to} \quad & \sum_{k=1}^{n} p_k \le P_{\max}, \\
& p_k \ge 0, \quad k = 1, 2, \dots, n.
\end{aligned}
\]
The optimization variable is the power vector p = [p_1, ..., p_n].
The coupling variables are I_k, P_max, and σ_N.
The network utility function is the sum-rate.
Example II: Queue Stability
Figure: Moghadam et al., IEEE ICC, 2015
Figure: Afshang et al., Asilomar, 2015
Example II: Queue Stability
\[
\begin{aligned}
\underset{p_0,\,\beta_0,\,m_0}{\text{maximize}} \quad & p_0\, \rho_0 \log(1+\beta_0)\, e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}} \\
\text{s.t.} \quad & \left(1 - e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}}\right)^{1+m_0} \le \epsilon, \quad
\theta_0 = \frac{p_0\, e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}}}{1 - \left(1 - e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}}\right)^{1+m_0}}, \\
& \theta_0 \ge p_0 \left[\,\sum_{i=1}^{1+m_0} \binom{1+m_0}{i} (-1)^{i+1} e^{-\rho_p \lambda_0 (i-1) \kappa d^2 \beta_0^{2/\alpha}}\right]^{-1} > \mu_0.
\end{aligned}
\]
Nardelli, P., et al., “Throughput optimization in wireless networks under stability and packet loss constraints,” IEEE Transactions on Mobile Computing, vol. 13, no. 8, pp. 1883-1895, 2014.
Example II: Queue Stability
\[
\begin{aligned}
\underset{p_0,\,\beta_0,\,m_0}{\text{maximize}} \quad & p_0\, \rho_0 \log(1+\beta_0)\, e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}} \\
\text{s.t.} \quad & \left(1 - e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}}\right)^{1+m_0} \le \epsilon, \quad
\theta_0 = \frac{p_0\, e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}}}{1 - \left(1 - e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}}\right)^{1+m_0}}, \\
& \theta_0 \ge p_0 \left[\,\sum_{i=1}^{1+m_0} \binom{1+m_0}{i} (-1)^{i+1} e^{-\rho_p \lambda_0 (i-1) \kappa d^2 \beta_0^{2/\alpha}}\right]^{-1} > \mu_0.
\end{aligned}
\]
The optimization variables are the channel access probability, SINR threshold, and number of retransmissions.

The coupling variables include the density of the network and packet arrival rates.

The network utility function is the throughput.
Example III: Information Capacity in Non-Gaussian Noise
\[
y(t) = \underbrace{r_0^{-\eta/2} h_0(t)\, x_0(t)}_{\text{user}} \;+\; \underbrace{N(t)}_{\mathcal{CN}(0,\sigma^2)} \;+\; \underbrace{\sum_{i \in \Phi \setminus \{0\}} r_i^{-\eta/2} h_i(t)\, x_i(t)}_{I(t)}
\]
Egan, M., et al., “Wireless communication in dynamic interference,” Proc. GLOBECOM, 2017.

de Freitas, M., et al., “Capacity bounds for additive symmetric alpha-stable noise channels,” IEEE Trans. Info. Theory, 2017.
Example III: Information Capacity in Non-Gaussian Noise
Consider a stationary, memoryless additive noise channel
Y = X + N
\[
\sup_{\mu \in \mathcal{P}} \; I(\mu, P_{Y|X}) \quad \text{subject to} \quad \mu \in \Lambda.
\]
The optimization variable is μ, an element of the set of probability measures on (C, B(C)).
The coupling variables are constraints Λ and the channel PY |X .
The network utility function is the mutual information.
A Strategy
- Consider a differentiable function f : R^n → R, which admits a Taylor series representation
\[
f(x + \|e\| \bar{e}) = f(x) + \|e\|\, Df(x)^T \bar{e} + o(\|e\|),
\]
where \bar{e} = e/\|e\| is unit norm.
- This yields
\[
|f(x + \|e\| \bar{e}) - f(x)| \le \|Df(x)\| \|e\| + o(\|e\|),
\]
i.e., the sensitivity.
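As a quick numerical sanity check (with a hypothetical smooth utility f chosen for this sketch, not one from the talk), the first-order sensitivity bound can be verified by finite differences:

```python
import numpy as np

# Hypothetical smooth utility: f(x) = log(1 + x1*x2) on positive inputs.
def f(x):
    return np.log(1.0 + x[0] * x[1])

def grad_f(x):
    d = 1.0 + x[0] * x[1]
    return np.array([x[1] / d, x[0] / d])

x = np.array([2.0, 3.0])
e = np.array([1.0, -1.0])
e_bar = e / np.linalg.norm(e)              # unit-norm direction

for t in [1e-1, 1e-2, 1e-3]:               # t plays the role of ||e||
    actual = abs(f(x + t * e_bar) - f(x))  # actual change in utility
    bound = np.linalg.norm(grad_f(x)) * t  # first-order sensitivity bound
    print(t, actual, bound)                # actual stays below the bound
```

The gap between the actual change and the bound is the o(‖e‖) term, which shrinks faster than t.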
Question: what is the directional derivative of the optimal value function of an optimization problem?
A Strategy
- For vector-valued, smooth optimization problems there is a well-developed theory.
- E.g., the following classical proposition.
Proposition
Let the real-valued function f(x, y) : R^n × R → R be twice differentiable on a compact convex subset X of R^{n+1} and strictly concave in x. Let x*(y) be the maximizer of f(·, y) on X and denote ψ(y) = f(x*(y), y). Then the derivative of ψ(y) is
\[
\psi'(y) = f_y(x^*(y), y).
\]
- Generalizations due to Danskin and Gol'shtein:
  - Non-convexity.
  - Vector coupling parameters.
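The proposition can be checked numerically on a toy instance (the function below is an invented example, chosen only because its maximizer has a closed form):

```python
import numpy as np

# Illustrative instance: f(x, y) = -x^2 + 2xy is strictly concave in x
# with maximizer x*(y) = y, so psi(y) = f(x*(y), y) = y^2.
def f(x, y):
    return -x**2 + 2.0 * x * y

def x_star(y):
    # argmax_x f(x, y), available in closed form here
    return y

def psi(y):
    return f(x_star(y), y)

y = 1.5
h = 1e-6
# numerical derivative of the optimal value function
psi_prime = (psi(y + h) - psi(y - h)) / (2.0 * h)
# proposition: psi'(y) = f_y(x*(y), y) = 2 * x*(y)
envelope = 2.0 * x_star(y)
print(psi_prime, envelope)   # both close to 3.0
```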
A Strategy
The strategy applies directly to problems in networking with optimization and coupling parameters in R^n.
For example:
- Example I: Power control
- Example II: Queue stability
Example III has optimization variables that do not lie in R^n. We will return to this case later.
Example I: Power Control in Parallel Channels
Recall we are interested in the optimization problem:
\[
\begin{aligned}
R^*(I) = \underset{p_k \in \mathbb{R},\; k=1,2,\dots,n}{\text{maximize}} \quad & \sum_{k=1}^{n} \frac{1}{2} \log\!\left(1 + \frac{|h_k|^2 p_k}{\sigma_N^2 + I}\right) \\
\text{subject to} \quad & \sum_{k=1}^{n} p_k \le P_{\max}, \\
& p_k \ge 0, \quad k = 1, 2, \dots, n.
\end{aligned}
\]
Let p_k^* be the optimal solution for I = 0.
Applying the strategy, we obtain
\[
|R^*(I) - R^*(0)| \le \left|\, \sum_{k=1}^{n} \frac{1}{2}\, \frac{|h_k|^2 p_k^*}{\sigma_N^2}\, \frac{1}{\sigma_N^2 + |h_k|^2 p_k^*} \right| |I| + o(I).
\]
Example I: Power Control in Parallel Channels
\[
|R^*(I) - R^*(0)| \le \left|\, \sum_{k=1}^{n} \frac{1}{2}\, \frac{|h_k|^2 p_k^*}{\sigma_N^2}\, \frac{1}{\sigma_N^2 + |h_k|^2 p_k^*} \right| |I| + o(I).
\]
Remarks:
1. To understand the impact of the interference, the optimization problem only needs to be solved once.
2. The impact of the interference is an approximately linear change in the bound.
3. The approach generalizes to MIMO channels (more later).
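Remark 1 can be sketched numerically. The block below assumes illustrative channel gains and a standard water-filling solver (none of the numbers come from the talk): the problem is solved once at I = 0, and the rate loss for small I is then bounded without re-solving.

```python
import numpy as np

def waterfill(gains, pmax):
    """Water-filling allocation maximizing sum_k 0.5*log(1 + g_k*p_k)
    subject to sum(p_k) <= pmax, p_k >= 0, via bisection on the level."""
    lo, hi = 0.0, pmax + 1.0 / gains.min()
    for _ in range(100):
        mu = 0.5 * (lo + hi)
        p = np.maximum(mu - 1.0 / gains, 0.0)
        if p.sum() > pmax:
            hi = mu
        else:
            lo = mu
    return np.maximum(lo - 1.0 / gains, 0.0)

def sum_rate(p, h2, sigma2, I):
    return 0.5 * np.sum(np.log(1.0 + h2 * p / (sigma2 + I)))

# Illustrative parameters (not from the talk)
h2 = np.array([1.0, 0.5, 2.0, 0.8])   # |h_k|^2
sigma2 = 1.0
pmax = 5.0

p_star = waterfill(h2 / sigma2, pmax)          # optimal powers at I = 0
R0 = sum_rate(p_star, h2, sigma2, 0.0)

# Sensitivity slope from the bound: evaluated at the I = 0 solution only.
slope = abs(np.sum(0.5 * (h2 * p_star / sigma2) / (sigma2 + h2 * p_star)))

for I in [0.01, 0.05, 0.1]:
    p_I = waterfill(h2 / (sigma2 + I), pmax)   # re-solve only to check
    actual = abs(sum_rate(p_I, h2, sigma2, I) - R0)
    print(I, actual, slope * I)                # actual loss vs. linear bound
```

Since each per-channel rate is convex and decreasing in I, the optimal value R*(I) is convex, so the actual loss never exceeds the linear term slope·|I| in this instance.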
Example II: Queue Stability
Recall we are now concerned with the optimization problem
\[
\begin{aligned}
\underset{p_0,\,\beta_0,\,m_0}{\text{maximize}} \quad & p_0\, \rho_0 \log(1+\beta_0)\, e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}} \\
\text{s.t.} \quad & \left(1 - e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}}\right)^{1+m_0} \le \epsilon, \quad
\theta_0 = \frac{p_0\, e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}}}{1 - \left(1 - e^{-\rho_p \lambda_0 \kappa d^2 \beta_0^{2/\alpha}}\right)^{1+m_0}}, \\
& \theta_0 \ge p_0 \left[\,\sum_{i=1}^{1+m_0} \binom{1+m_0}{i} (-1)^{i+1} e^{-\rho_p \lambda_0 (i-1) \kappa d^2 \beta_0^{2/\alpha}}\right]^{-1} > \mu_0.
\end{aligned}
\]
This is a non-convex problem. What is the impact of the probability that the transmitter has a non-empty queue, ρ_0?
Example II: Queue Stability
Theorem (Danskin's Theorem)
Let 𝒳 be a metric space and U a normed space, and let X be a compact subset of 𝒳. Suppose that for all x ∈ X the function f(x, ·) is differentiable, and that f(x, u) and D_u f(x, u) are continuous on X × U. If U is finite dimensional, then the optimal value function v(u) = inf_{x∈X} f(x, u) is directionally differentiable, with
\[
v'(u, d) = \min_{x \in S(u)} D_u f(x, u)\, d,
\]
where v'(u, d) is the directional derivative of v(u) in the direction d, and S(u) ⊂ X is the set of optimal x ∈ X for fixed u ∈ U.
(Note: the feasible set X is independent of u.)
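A toy illustration of the theorem (an invented instance, not from the talk): the optimal value function below is non-differentiable at u = 0, but Danskin's formula still delivers its directional derivatives.

```python
import numpy as np

# f(x, u) = x * u on the compact set X = [-1, 1] (discretized here).
# v(u) = min_x x*u = -|u| is not differentiable at u = 0.
X = np.linspace(-1.0, 1.0, 2001)

def v(u):
    return np.min(X * u)

def danskin_directional(u, d, tol=1e-9):
    # S(u): minimizers of f(., u) over X; at u = 0 every x is optimal
    vals = X * u
    S = X[vals <= vals.min() + tol]
    return np.min(S * d)

u, d = 0.0, 1.0
t = 1e-6
numeric = (v(u + t * d) - v(u)) / t          # one-sided difference quotient
print(numeric, danskin_directional(u, d))    # both equal -1.0 here
```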
Example II: Queue Stability
Translating this result into our setting, let Λ(ρ_{0,r}) be the set of optimal (p_0^*, β_0^*, m_0^*) (channel access probability, SINR threshold, and number of retransmissions) for a given ρ_{0,r}.
Then, for any ρ0 ∈ [0, 1],
\[
|R^*(\rho_{0,r}) - R^*(\rho_0)| \le \left|\, \max_{(p_0^*,\, \beta_0^*,\, m_0^*) \in \Lambda(\rho_{0,r})} p_0^* \log(1+\beta_0^*)\, e^{-\rho_p \lambda_0 \kappa d^2 (\beta_0^*)^{2/\alpha}} \right| |\rho_{0,r} - \rho_0| + o(|\rho_{0,r} - \rho_0|)
\]
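A sketch of this bound under simplifying assumptions: all parameter values below are invented, the exponent's ρ_p is treated as a constant independent of ρ_0 (so the objective is linear in ρ_0), and only the outage constraint is enforced, via a coarse grid search rather than a true solver.

```python
import numpy as np
from itertools import product

# Illustrative parameters only (not from the talk).
rho_p, lam0, kappa, d, alpha, eps = 0.5, 0.1, 1.0, 1.0, 4.0, 0.3

def success_prob(beta0):
    return np.exp(-rho_p * lam0 * kappa * d**2 * beta0**(2.0 / alpha))

def utility(p0, beta0):
    # throughput per unit rho_0: p0 * log(1 + beta0) * P_success
    return p0 * np.log(1.0 + beta0) * success_prob(beta0)

def feasible(beta0, m0):
    # outage constraint (1 - P_success)^(1 + m0) <= eps
    return (1.0 - success_prob(beta0))**(1 + m0) <= eps

grid = [(p0, b0, m0)
        for p0, b0, m0 in product(np.linspace(0.05, 1.0, 20),
                                  np.linspace(0.1, 10.0, 50),
                                  range(1, 4))
        if feasible(b0, m0)]

def R_star(rho0):
    return rho0 * max(utility(p0, b0) for p0, b0, m0 in grid)

g_max = max(utility(p0, b0) for p0, b0, m0 in grid)  # Danskin-type slope
rho0, rho0_r = 0.6, 0.7
actual = abs(R_star(rho0_r) - R_star(rho0))
print(actual, g_max * abs(rho0_r - rho0))  # equal here, by linearity in rho_0
```

Because the objective is linear in ρ_0 under this simplification, the o(|ρ_{0,r} − ρ_0|) term vanishes and the bound is tight.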
Example II: Queue Stability
Remarks:
1. It is possible to apply sensitivity analysis techniques to non-convex problems.
2. Provides a means of decoupling the problem of ensuring the queue is occupied from the problem of designing the access protocol.
3. In principle, it is possible to work in the infinite-dimensional setting if an appropriate derivative exists.
When do infinite dimensional problems arise?
Example III: Information Capacity in Non-Gaussian Noise
- Consider general constraints (allowing for continuous inputs).
\[
C(\Lambda) = \sup_{\mu \in \mathcal{P}} \; I(X; Y) \quad \text{subject to} \quad \mu \in \Lambda,
\]
- E.g., Λ = Λ_p = {μ : E_μ[|X|^p] ≤ b}.
- The discrete approximation of C(Λ) is then defined as
\[
C(\Lambda_\Delta) = \sup_{\mu \in \mathcal{P}} \; I(X; Y) \quad \text{subject to} \quad \mu \in \Lambda_\Delta,
\]
where
\[
\Lambda_\Delta = \bigcup_{\bar{\Delta} > \Delta} \mathcal{P}(\bar{\Delta}\mathbb{Z}) \cap \Lambda, \qquad \Delta\mathbb{Z} = \{\Delta z : z \in \mathbb{Z}\}.
\]
Example III: Information Capacity in Non-Gaussian Noise
- The capacity sensitivity is
\[
C_{\Lambda \to \Lambda_\Delta} = |C(\Lambda) - C(\Lambda_\Delta)|,
\]
- i.e., the cost of discreteness.
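As a numerical illustration of the cost of discreteness (a sketch only: an amplitude constraint stands in for a moment constraint, a single lattice ΔZ is used rather than the full constraint set, and all parameters are invented), Blahut-Arimoto can be run on progressively finer input lattices:

```python
import numpy as np

# Gaussian channel with inputs restricted to Delta*Z intersected with
# an amplitude constraint [-A, A]; capacity computed by Blahut-Arimoto.
A, sigma = 2.0, 1.0
y_grid = np.linspace(-A - 5 * sigma, A + 5 * sigma, 400)

def capacity_lattice(delta, iters=300):
    xs = np.arange(-np.floor(A / delta), np.floor(A / delta) + 1) * delta
    # channel law W(y|x), normalized over the discretized output grid
    W = np.exp(-(y_grid[None, :] - xs[:, None])**2 / (2 * sigma**2))
    W /= W.sum(axis=1, keepdims=True)
    p = np.full(len(xs), 1.0 / len(xs))      # input distribution
    for _ in range(iters):                   # Blahut-Arimoto updates
        q = p @ W                            # output marginal
        D = np.sum(W * np.log(W / q[None, :]), axis=1)  # KL(W(.|x) || q)
        p = p * np.exp(D)
        p /= p.sum()
    q = p @ W
    return np.sum(p * np.sum(W * np.log(W / q[None, :]), axis=1))  # nats

for delta in [2.0, 1.0, 0.5, 0.25]:
    # capacity per lattice spacing; non-decreasing as delta shrinks,
    # up to numerical tolerance, since the lattices here are nested
    print(delta, capacity_lattice(delta))
```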
Question: how do we bound the capacity sensitivity when the constraint is perturbed?

Egan, M., Perlaza, S.M. and Kungurtsev, V., “Capacity sensitivity in additive non-Gaussian noise channels,” Proc. IEEE International Symposium on Information Theory, Aachen, Germany, Jun. 2017.
Egan, M. and Perlaza, S.M., “Capacity approximation of continuouschannels by discrete inputs,” Proc. CISS Invited Paper, 2018.
Example III: Information Capacity in Non-Gaussian Noise
Question: how do we bound the capacity sensitivity when the constraint is perturbed?
A recipe:
(i) Establish continuity.
(ii) Obtain a bound via regular subgradients.
Example III: Information Capacity in Non-Gaussian Noise
Theorem (Egan, Perlaza 2018)
Let Λ be a non-empty compact subset of P. If the mutual information I(·, p_N) is weakly continuous on Λ, then C(Λ_Δ) → C(Λ) as Δ → 0.
(i) Gaussian model:
- p_N(x) = (1/√(2πσ²)) exp(−x²/(2σ²)), σ > 0.
- Λ = {μ : E_μ[X²] ≤ b}, b > 0.

(ii) Cauchy model:
- p_N(x) = 1/(πγ(1 + (x/γ)²)), γ > 0.
- Λ = {μ : E_μ[|X|^r] ≤ b}, b > 0.

(iii) Inverse Gaussian model:
- p_N(x) = √(λ/(2πx³)) exp(−λ(x−γ)²/(2γ²x)), x > 0, λ, γ > 0.
- Λ = {μ : E_μ[X] ≤ b}, b > 0.
Example III: Information Capacity in Non-Gaussian Noise
Definition
Consider a function f : R^n → R and a point x̄ ∈ R^n with f(x̄) finite. A vector v ∈ R^n is a regular subgradient of f at x̄, denoted v ∈ ∂f(x̄), if there exists δ > 0 such that for all x ∈ B_δ(x̄),
\[
f(x) \ge f(\bar{x}) + v^T (x - \bar{x}) + o(|x - \bar{x}|).
\]
- Related to subgradients in convex optimization.
- What are conditions for existence?
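A small numerical check of the definition (an invented example, not from the talk): for f(x) = |x| at x̄ = 0, any v with |v| ≤ 1 satisfies the subgradient inequality, and the o(|x − x̄|) term is not even needed.

```python
import numpy as np

# For f(x) = |x| at xbar = 0, v is a regular subgradient iff |v| <= 1,
# since |x| >= v*x holds exactly on a neighborhood of 0.
f = np.abs
xbar = 0.0
xs = np.linspace(-1.0, 1.0, 10001)   # sample points in B_delta(xbar)

def is_regular_subgradient(v):
    return bool(np.all(f(xs) >= f(xbar) + v * (xs - xbar) - 1e-12))

print(is_regular_subgradient(0.5))   # True
print(is_regular_subgradient(1.5))   # False: slope exceeds 1
```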
Example III: Information Capacity in Non-Gaussian Noise
Theorem (Rockafellar and Wets 1997)
Suppose f : R^n → R is finite and lower semicontinuous at x̄ ∈ R^n. Then there exists a sequence x_k →_f x̄ with ∂f(x_k) ≠ ∅ for all k.
Rockafellar, R. and Wets, R., Variational Analysis. Berlin Heidelberg: Springer-Verlag, 1997.
Example III: Information Capacity in Non-Gaussian Noise
Theorem (Egan, Perlaza 2018)
Suppose that Λ is a non-empty compact subset of P and the mutual information I : P → R is weakly continuous on Λ. If C = sup_{μ∈Λ} I(μ, p_N) < ∞, then for all ε > 0 there exists v ∈ R such that, for Δ sufficiently small,
\[
C(\Lambda) - C(\Lambda_\Delta) - \epsilon \le |v| \Delta + o(\Delta)
\]
holds.
Discussion: Design Implications
- Desirable to optimize the power control p:
\[
\begin{aligned}
R^*(\mathrm{vec}(\mathbf{W})) = \max_{\mathbf{p}} \quad & \sum_{i=1}^{k} \log\!\left(1 + \frac{|\mathbf{h}_i^\dagger \mathbf{w}_i|^2\, \frac{p_i}{\|\mathbf{w}_i\|^2}}{\sigma^2 + \sum_{j=1,\, j \ne i}^{k} |\mathbf{h}_i^\dagger \mathbf{w}_j|^2\, \frac{p_j}{\|\mathbf{w}_j\|^2}}\right) \\
\text{subject to} \quad & \sum_{i=1}^{k} p_i \le p_{\max}, \quad p_i \ge 0, \; i = 1, \dots, k. \qquad (1)
\end{aligned}
\]
- No simple solution! (Except when W = I.)
- What is the effect of changing the precoder?
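A sketch of problem (1) for k = 2 users, with randomly drawn channels and a brute-force search over the power split (all values are illustrative): the optimal rate under the identity precoder is compared against perturbed precoders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-user instance of problem (1); channels drawn at random for this sketch.
k, sigma2, pmax = 2, 1.0, 5.0
H = (rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))) / np.sqrt(2)

def sum_rate(W, p):
    Wn = W / np.linalg.norm(W, axis=0, keepdims=True)   # unit-norm columns
    G = np.abs(H.conj() @ Wn)**2          # G[i, j] = |h_i^dag w_j|^2
    rates = 0.0
    for i in range(k):
        interf = sigma2 + sum(G[i, j] * p[j] for j in range(k) if j != i)
        rates += np.log(1.0 + G[i, i] * p[i] / interf)
    return rates

def R_star(W, n_grid=201):
    # exhaustive search over the power split p = (t, pmax - t)
    ts = np.linspace(0.0, pmax, n_grid)
    return max(sum_rate(W, np.array([t, pmax - t])) for t in ts)

W0 = np.eye(k, dtype=complex)
E = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
for enorm in [0.01, 0.1, 0.5]:
    W = W0 + enorm * E / np.linalg.norm(E)     # perturbed precoder
    print(enorm, abs(R_star(W) - R_star(W0)))  # rate change per perturbation size
```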
Discussion: Design Implications
Figure: Plot of rate loss for varying error norm ‖e‖, with pmax = 5 dB and n = k antennas. (Axes: error norm from 10⁻³ to 10⁰ on a log scale versus average rate loss in nats, 0 to 0.6; curves: approximate UB for n = 4 and n = 2, LB for n = 4 and n = 2.)
Discussion: Design Implications
In the MIMO optimization problem, using more complicated (but higher-performance) signal processing leads to higher-complexity algorithms.

Sensitivity analysis provides a means of understanding the complexity-performance tradeoffs involved in system optimization.

Sensitivity analysis allows us to ask when the performance gains are worth the complexity costs.
Conclusions
The design of modern communication networks involves the optimization of many coupled components.

Sensitivity analysis of optimization problems provides tools to understand how coupling impacts optimized components.
Key applications of sensitivity analysis:
- Coupling parameter selection.
- Impact of imperfect models.
A well-developed theory is waiting to be applied.