Experiments in Robust Optimization
Daniel Bienstock
Columbia University
10-26-06
Motivation
Robust Optimization
Optimization under parameter (data) uncertainty
Ben-Tal and Nemirovski; El Ghaoui et al.; Bertsimas et al.
Uncertainty is modeled by assuming that the data is not known precisely, and will instead lie in known sets.
Example: a coefficient $a_i$ is uncertain. We allow $a_i \in [l_i, u_i]$.
Typically, a minimization problem becomes a min-max problem.
Motivation
Example: Linear Programs with Row-Wise Uncertainty
Ben-Tal and Nemirovski, 1999
$$\min\ c^T x \quad \text{subject to:} \quad Ax \ge b \ \ \forall A \in U, \qquad x \in X$$
$U$ = uncertainty set → the $i$th row of $A$ belongs to an ellipsoidal set $E_i$,
e.g. $\sum_j \alpha_{ij}^2 (a_{ij} - \bar a_{ij})^2 \le 1$
→ can be solved using SOCP techniques
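As an illustration (not from the talk), here is a minimal cvxpy sketch of that robust counterpart. It assumes row $i$ deviates as $a_i = \bar a_i + d$ with $\sum_j \alpha_{ij}^2 d_j^2 \le 1$, so the worst case of $a_i^T x$ is $\bar a_i^T x - \|(x_j/\alpha_{ij})_j\|_2$, one second-order-cone constraint per row; all data below is hypothetical, and $X$ is taken to be the nonnegative orthant.

```python
import cvxpy as cp
import numpy as np

np.random.seed(0)
m, n = 5, 10
abar = np.random.rand(m, n) + 1.0   # nominal rows of A (hypothetical data)
alpha = 5.0 * np.ones((m, n))       # ellipsoid scalings alpha_ij
b = np.ones(m)
c = np.random.rand(n)

x = cp.Variable(n, nonneg=True)     # X = nonnegative orthant here
# worst case of a_i^T x over the ellipsoid is abar_i^T x - || x / alpha_i ||_2
cons = [abar[i] @ x - cp.norm(cp.multiply(1.0 / alpha[i], x), 2) >= b[i]
        for i in range(m)]
cp.Problem(cp.Minimize(c @ x), cons).solve()
print(x.value)
```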
Motivation
Other forms of optimization under uncertainty
Stochastic programming
Adversarial queueing, online optimization
"Risk-aware" optimization
Optimization of utility functions as a substitute for handling infeasibilities
Motivation
Scenario I: Stability
Data is fairly accurate, though possibly noisy – small errors are possible
[figure]
→ Idiosyncratic decisions and small changes in data could have a major impact
Motivation
Scenario II: Hedging
Significant, but within order-of-magnitude, data uncertainty
Example: a certain parameter, $\alpha$, is volatile. Its long-term average is 1.5, but we could expect changes on the order of 0.3.
Possibly more than just noise
Could use deviations to our advantage, especially if there are several uncertain parameters that act "correlated"
Are we guarding against risk, or are we hedging?
Motivation
Scenario III: Insurance
Real-world data can exhibit undesirable and unexpected behavior
Classical goal: how can we protect without becoming too risk-averse?
Need to clearly spell out the desired tradeoff between risk and performance
Magnitude and geometry of risk are not the same
Motivation
Application: Portfolio Optimization
$$\min\ \lambda x^T Q x - \mu^T x \quad \text{subject to:} \quad Ax \ge b$$
$\mu$ = vector of "returns", $Q$ = "covariance" matrix
$x$ = vector of "asset weights"
$Ax \ge b$: general linear constraints
$\lambda \ge 0$ = "risk-aversion" multiplier
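For concreteness, a small cvxpy sketch of this nominal mean-variance problem with made-up data; the general constraints $Ax \ge b$ are instantiated here as a budget constraint plus long-only bounds (an assumption, not from the slides).

```python
import cvxpy as cp
import numpy as np

np.random.seed(1)
n = 8
F = np.random.randn(n, 3)
Q = F @ F.T + 0.05 * np.eye(n)   # a PSD stand-in for the covariance matrix
mu = np.random.rand(n)           # stand-in return vector
lam = 10.0                       # risk-aversion multiplier

x = cp.Variable(n)
prob = cp.Problem(cp.Minimize(lam * cp.quad_form(x, Q) - mu @ x),
                  [cp.sum(x) == 1, x >= 0])  # budget + long-only as "Ax >= b"
prob.solve()
print(x.value, prob.value)
```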
Motivation
Robust Portfolio Optimization
Goldfarb and Iyengar, 2001
→ $Q$ and $\mu$ are uncertain
Robust Problem
$$\min_x \Big\{ \max_{Q \in \mathcal{Q}} \lambda x^T Q x - \min_{\mu \in E} \mu^T x \Big\} \quad \text{subject to:} \quad \sum_j x_j = 1, \ x \ge 0$$
→ When $\mathcal{Q}$ is an ellipsoid and $E$ is a product of intervals, the robust problem can be solved as an SOCP
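A partial sketch of the interval piece only, under simplifying assumptions: with $x \ge 0$ and $E = \prod_j [\mu_j^{lo}, \mu_j^{hi}]$, the inner $\min_{\mu \in E} \mu^T x$ equals $(\mu^{lo})^T x$ and can be substituted in directly. The ellipsoidal max over $Q$ needs the Goldfarb and Iyengar SOCP machinery and is not reproduced; $Q$ is held fixed below, and the data is hypothetical.

```python
import cvxpy as cp
import numpy as np

np.random.seed(2)
n = 8
F = np.random.randn(n, 3)
Q = F @ F.T + 0.05 * np.eye(n)   # Q held fixed (the max over Q is omitted)
mu_nom = np.random.rand(n)
mu_lo = mu_nom - 0.1             # hypothetical lower interval endpoints

x = cp.Variable(n)
# since x >= 0, the worst return vector in the box is mu_lo
prob = cp.Problem(cp.Minimize(10.0 * cp.quad_form(x, Q) - mu_lo @ x),
                  [cp.sum(x) == 1, x >= 0])
prob.solve()
```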
Motivation
Linear Programs with Row-Wise Uncertainty
Bertsimas and Sim, 2002
$$\max\ c^T x \quad \text{subject to:} \quad Ax \le b \ \ \forall A \in U, \qquad x \ge 0$$
→ in every row $i$, at most $\Gamma_i$ coefficients can change: $\bar a_{ij} - \delta_{ij} \le a_{ij} \le \bar a_{ij} + \delta_{ij}$
→ for all other coefficients: $a_{ij} = \bar a_{ij}$.
Robust problem can be formulated as a (larger) linear program.
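A cvxpy sketch of that larger linear program, on hypothetical data. For $x \ge 0$, the worst-case protection of row $i$ (the max of $\sum_j \delta_{ij} x_j$ over at most $\Gamma_i$ deviating coefficients) dualizes into variables $z_i, p_{ij}$:

```python
import cvxpy as cp
import numpy as np

np.random.seed(3)
m, n = 4, 6
abar = np.random.rand(m, n)      # nominal coefficients
delta = 0.1 * abar               # deviation magnitudes delta_ij
Gamma = 2.0 * np.ones(m)         # at most 2 coefficients change per row
b = np.ones(m)
c = np.random.rand(n)

x = cp.Variable(n, nonneg=True)
z = cp.Variable(m, nonneg=True)
p = cp.Variable((m, n), nonneg=True)
cons = []
for i in range(m):
    # robust row i: nominal term + dualized protection term <= b_i
    cons.append(abar[i] @ x + Gamma[i] * z[i] + cp.sum(p[i]) <= b[i])
    cons.append(z[i] + p[i] >= cp.multiply(delta[i], x))
cp.Problem(cp.Maximize(c @ x), cons).solve()
```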
Motivation
Equivalent formulation
$$\max\ c^T x \quad \text{subject to:} \quad Ax \le b \ \ \forall A \in U, \qquad x \ge 0$$
→ in every row $i$, exactly $\Gamma_i$ coefficients change: $a_{ij} = \bar a_{ij} + \delta_{ij}$
→ for all other coefficients: $a_{ij} = \bar a_{ij}$.
Data-driven Model definition
Robust Portfolio Optimization
A different uncertainty model
→ Want to model that deviations of the returns $\mu_j$ from their nominal values are rare but could be significant
A simple example
Parameters: $0 \le \gamma \le 1$, integer $N \ge 0$; for each asset $j$: $\bar\mu_j$ = expected return, $0 \le \delta_j$ small (possibly zero)
Well-behaved asset $j$: $\bar\mu_j - \delta_j \le \mu_j \le \bar\mu_j + \delta_j$
Misbehaving asset $j$: $(1 - \gamma)\bar\mu_j \le \mu_j \le \bar\mu_j$
At most $N$ assets misbehave
Data-driven Model definition
A more comprehensive setting
Parameters: $0 \le \gamma_1 \le \gamma_2 \le \ldots \le \gamma_K \le 1$, integers $0 \le n_i \le N_i$, $1 \le i \le K$
for each asset $j$: $\bar\mu_j$ = expected return
between $n_i$ and $N_i$ assets $j$ satisfy:
$$(1 - \gamma_i)\bar\mu_j \le \mu_j \le (1 - \gamma_{i-1})\bar\mu_j, \quad \text{for each } i \ge 1 \ (\gamma_0 = 0)$$
Data-driven Model definition
A more comprehensive setting
Parameters: $0 \le \gamma_1 \le \gamma_2 \le \ldots \le \gamma_K \le 1$, integers $0 \le n_i \le N_i$, $1 \le i \le K$; for each asset $j$: $\bar\mu_j$ = expected return
between $n_i$ and $N_i$ assets $j$ satisfy: $(1 - \gamma_i)\bar\mu_j \le \mu_j \le (1 - \gamma_{i-1})\bar\mu_j$
$\sum_j \mu_j \ge \Gamma \sum_j \bar\mu_j$; $\Gamma > 0$ a parameter
(R. Tutuncu) For $1 \le h \le H$, a set ("tier") $T_h$ of assets, and a parameter $\Gamma_h > 0$:
for each $h$, $\sum_{j \in T_h} \mu_j \ge \Gamma_h \sum_{j \in T_h} \bar\mu_j$
Note: only downward changes are modeled
Data-driven Benders
General methodology:
Benders' decomposition (= cutting-plane algorithm)
Generic problem: $\min_{x \in X} \max_{d \in D} f(x, d)$
→ Maintain a finite subset $\hat D$ of $D$ (a "model")
GAME
1. Implementor: solve $\min_{x \in X} \max_{d \in \hat D} f(x, d)$, with solution $x^*$
2. Adversary: solve $\max_{d \in D} f(x^*, d)$, with solution $\hat d$
3. Add $\hat d$ to $\hat D$, and go to 1.
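In code, the game is a short loop; this is a schematic sketch in which solve_implementor and solve_adversary stand in for the concrete subproblems described later, and d0 is a nominal seed scenario.

```python
def cutting_plane(solve_implementor, solve_adversary, d0,
                  tol=1e-6, max_iters=1000):
    """Generic min-max loop: D_hat is the growing finite 'model' of D."""
    D_hat = [d0]                                   # seed with nominal data
    for _ in range(max_iters):
        x_star, lower = solve_implementor(D_hat)   # min_x max_{d in D_hat}
        d_new, upper = solve_adversary(x_star)     # max_{d in D} f(x*, d)
        if upper - lower <= tol:                   # model exact at x*: stop
            return x_star, upper
        D_hat.append(d_new)                        # add violating scenario
    return x_star, upper
```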
Data-driven Benders
Why this approach
Decoupling of implementor and adversary yields considerably simpler, and smaller, problems
Decoupling allows us to use more sophisticated uncertainty models
If the number of iterations is small, the implementor's problem is a small "convex" problem
Most progress will be achieved in initial iterations – permits "soft" termination criteria
Data-driven Algorithm
Implementor's problem
A convex quadratic program
At iteration $m$, solve
$$\min\ \lambda x^T Q x - r \quad \text{subject to:} \quad Ax \ge b, \qquad r \le \mu_{(i)}^T x, \ i = 1, \ldots, m$$
Here, $\mu_{(1)}, \ldots, \mu_{(m)}$ are given return vectors
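A direct cvxpy transcription of this QP (a sketch; argument shapes and names are assumptions):

```python
import cvxpy as cp

def solve_implementor(mus, Q, A, b, lam):
    """Implementor's QP given return vectors mu_(1), ..., mu_(m) in mus."""
    n = Q.shape[0]
    x = cp.Variable(n)
    r = cp.Variable()                       # worst (smallest) modeled return
    cons = [A @ x >= b] + [r <= mu_i @ x for mu_i in mus]
    prob = cp.Problem(cp.Minimize(lam * cp.quad_form(x, Q) - r), cons)
    prob.solve()
    return x.value, prob.value
```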
Data-driven Algorithm
Adversarial problem: A mixed-integer program
$x^*$ = given asset weights
$$\min\ \sum_j x_j^* \mu_j$$
Subject to:
$$\bar\mu_j \Big(1 - \sum_i \gamma_i y_{ij}\Big) \le \mu_j \le \bar\mu_j \Big(1 - \sum_i \gamma_{i-1} y_{ij}\Big), \quad \forall j$$
$\sum_i y_{ij} \le 1$, $\forall j$ (each asset in at most one segment)
$n_i \le \sum_j y_{ij} \le N_i$, $1 \le i \le K$ (segment cardinalities)
$\sum_{j \in T_h} \mu_j \ge \Gamma_h \sum_{j \in T_h} \bar\mu_j$, $1 \le h \le H$ (tier ineqs.)
$\mu_j$ free, $y_{ij} = 0$ or $1$, all $i, j$
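A matching cvxpy sketch of the adversarial MIP, in the same notation; here gam = [0, γ₁, ..., γ_K] (a numpy array) so gam[:-1] supplies the γ_{i−1} terms, tiers/Gammas encode the tier inequalities, and a MIP-capable solver (e.g. CBC or GLPK_MI) is needed for the Boolean variables.

```python
import cvxpy as cp

def solve_adversary(x_star, mu_bar, gam, n_lo, n_hi, tiers, Gammas):
    """Minimize sum_j x*_j mu_j over the segment/tier uncertainty set."""
    K, n = len(gam) - 1, len(mu_bar)
    mu = cp.Variable(n)
    y = cp.Variable((K, n), boolean=True)  # y[i-1, j]: asset j in segment i
    lo = cp.multiply(mu_bar, 1 - gam[1:] @ y)   # mu_bar_j (1 - sum gamma_i y_ij)
    hi = cp.multiply(mu_bar, 1 - gam[:-1] @ y)  # mu_bar_j (1 - sum gamma_{i-1} y_ij)
    cons = [lo <= mu, mu <= hi, cp.sum(y, axis=0) <= 1]
    cons += [cp.sum(y[i]) >= n_lo[i] for i in range(K)]
    cons += [cp.sum(y[i]) <= n_hi[i] for i in range(K)]
    cons += [cp.sum(mu[T]) >= G * mu_bar[T].sum()   # tier inequalities
             for T, G in zip(tiers, Gammas)]
    prob = cp.Problem(cp.Minimize(x_star @ mu), cons)
    prob.solve()
    return mu.value, prob.value
```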
Data-driven Experiments
Example: 2464 assets, 152-factor model. CPU time: 500 seconds
10 segments (a "heavy tail"); 6 tiers: the top five deciles lose at most 10% each, total loss ≤ 5%
[figure]
Data-driven Experiments
Same run
2464 assets, 152 factors; 10 segments, 6 tiers
[figure]
Data-driven Experiments
Summary of average problems with 3-4 segments, 2-3 tiers (all times in seconds)

      columns   rows   iterations     time   imp. time   adv. time
 1        500     20           47     1.85        1.34        0.46
 2        500     20            3     0.09        0.01        0.03
 3        703    108            1     0.29        0.13        0.04
 4        499    140            3     3.12        2.65        0.05
 5        499     20           19     0.42        0.21        0.17
 6       1338     81            7     0.45        0.17        0.08
 7       2019    140            8    41.53       39.6         0.36
 8       2443    153            2    12.32        9.91        0.07
 9       2464    153          111   100.81       60.93       36.78
Data-driven Approximation
Why the adversarial problem is "easy"
($x^*$ = given asset weights)
$$\min\ \sum_j x_j^* \mu_j$$
Subject to:
$$\bar\mu_j \Big(1 - \sum_i \gamma_i y_{ij}\Big) \le \mu_j \le \bar\mu_j \Big(1 - \sum_i \gamma_{i-1} y_{ij}\Big)$$
$\sum_i y_{ij} \le 1$, $\forall j$ (each asset in at most one segment)
$n_i \le \sum_j y_{ij} \le N_i$, $\forall i$ (segment cardinalities)
$\sum_{j \in T_h} \mu_j \ge \Gamma_h \big(\sum_{j \in T_h} \bar\mu_j\big)$, $\forall h$ (tier inequalities)
$\mu_j$ free, $y_{ij} = 0$ or $1$, all $i, j$
Data-driven Approximation
Why the adversarial problem is "easy"
($K$ = no. of segments, $H$ = no. of tiers)
Theorem. For every fixed $K$ and $H$, and for every $\varepsilon > 0$, there is an algorithm that finds a solution to the adversarial problem with relative optimality error $\le \varepsilon$, in time polynomial in $\varepsilon^{-1}$ and $n$ (= no. of assets).
Data-driven Approximation
The simplest case
$$\max\ \sum_j x_j^* \delta_j \quad \text{subject to:} \quad \sum_j \delta_j \le \Gamma, \qquad 0 \le \delta_j \le u_j y_j, \ y_j = 0 \text{ or } 1, \text{ all } j, \qquad \sum_j y_j \le N$$
· · · a cardinality-constrained knapsack problem
B. (1995), De Farias and Nemhauser (2004)
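A minimal MIP sketch of this simplest case on hypothetical data: the binary $y_j$ switches the continuous deviation $\delta_j$ on, with at most $N$ nonzero deviations (a MIP-capable solver is required).

```python
import cvxpy as cp
import numpy as np

np.random.seed(4)
n, N, Gamma = 10, 3, 1.5
xs = np.random.rand(n)           # given asset weights x*_j
u = np.random.rand(n)            # upper bounds u_j

d = cp.Variable(n, nonneg=True)  # deviations delta_j
y = cp.Variable(n, boolean=True) # support indicators
prob = cp.Problem(cp.Maximize(xs @ d),
                  [cp.sum(d) <= Gamma,
                   d <= cp.multiply(u, y),   # delta_j = 0 unless y_j = 1
                   cp.sum(y) <= N])
prob.solve()                     # needs a MIP solver, e.g. CBC or GLPK_MI
print(d.value, prob.value)
```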
Data-driven More experiments
What is the impact of the uncertainty model?
All runs on the same data set with 1338 columns and 81 rows
1 segment: (200, 0.5); robust random return = 4.57, 157 assets
2 segments: (200, 0.25), (100, 0.5); robust random return = 4.57, 186 assets
2 segments: (200, 0.2), (100, 0.6); robust random return = 3.25, 213 assets
2 segments: (200, 0.1), (100, 0.8); robust random return = 1.50, 256 assets
1 segment: (100, 1.0); robust random return = 1.24, 281 assets
VAR Definition
Ambiguous chance-constrained models
1. The implementor chooses a vector $x^*$ of assets
2. The adversary chooses a probability distribution $P$ for the returns vector
3. A random returns vector $\mu$ is drawn from $P$
→ Implementor wants to choose $x^*$ so as to minimize value-at-risk (conditional value-at-risk, etc.)
Erdogan and Iyengar (2004), Calafiore and Campi (2004)
→ We want to model correlated errors in the returns
VAR Definition
Uncertainty set
Given a vector $x^*$ of assets, the adversary
1. Chooses a vector $w \in \mathbb{R}^n$ ($n$ = no. of assets) with $0 \le w_j \le 1$ for all $j$.
2. Chooses a random variable $0 \le \delta \le 1$
→ Random return: $\mu_j = \bar\mu_j (1 - \delta w_j)$ ($\bar\mu$ = nominal returns).
Definition: Given reals $\nu$ and $0 \le \theta \le 1$, the value-at-risk of $x^*$ is the real $\rho \ge 0$ such that
$$\text{Prob}(\nu - \mu^T x^* \ge \rho) \ge \theta$$
→ The adversary wants to maximize VAR
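To make the definition concrete, a small helper (an illustration, not from the talk) that evaluates this value-at-risk when $\delta$ has a finite distribution, as in the adversarial model introduced later: the loss at $\delta_i$ is $\nu - \sum_j \bar\mu_j(1 - \delta_i w_j)x_j^*$, and VAR is the largest loss whose tail probability is at least $\theta$ (ties between equal losses are ignored in this sketch).

```python
import numpy as np

def value_at_risk(x_star, mu_bar, w, deltas, pi, nu, theta):
    """VaR of x* when delta takes value deltas[i] with probability pi[i]."""
    losses = np.array([nu - np.sum(mu_bar * (1 - d * w) * x_star)
                       for d in deltas])
    order = np.argsort(losses)                # losses in ascending order
    tail = np.cumsum(pi[order][::-1])[::-1]   # ~ P(loss >= losses[order[i]])
    ok = losses[order][tail >= theta]         # losses exceeded w.p. >= theta
    return max(ok.max(), 0.0) if ok.size else 0.0   # definition asks rho >= 0
```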
VAR Definition
OR ...
Given a vector $x^*$ of assets, the adversary
1. Chooses a vector $w \in \mathbb{R}^n$ ($n$ = no. of assets) with $0 \le w_j \le W$ for all $j$.
2. Chooses a random variable $0 \le \delta \le 1$
→ Random return: $\mu_j = \bar\mu_j - \delta w_j$ ($\bar\mu$ = nominal returns).
Definition: Given reals $\nu$ and $0 \le \theta \le 1$, the value-at-risk of $x^*$ is the real $\rho \ge 0$ such that
$$\text{Prob}(\nu - \mu^T x^* \ge \rho) \ge \theta$$
→ The adversary wants to maximize VAR
VAR Definition
The classical factor model for returns
$$\mu = \bar\mu + V^T f + \varepsilon$$
where
$\bar\mu$ = expected return,
$V$ = "factor exposure matrix",
$f$ = a bounded random variable,
$\varepsilon$ = residual errors
$V$ is $r \times n$ with $r \ll n$.
VAR Adversarial problem
→ Random return$_j$ = $\bar\mu_j (1 - \delta w_j)$, where $0 \le w_j \le 1$ $\forall j$, and $0 \le \delta \le 1$ is a random variable.
A discrete distribution:
We are given fixed values $0 = \delta_0 \le \delta_1 \le \ldots \le \delta_K = 1$; example: $\delta_i = i/K$
Adversary chooses $\pi_i = \text{Prob}(\delta = \delta_i)$, $0 \le i \le K$
The $\pi_i$ are constrained: we have fixed bounds, $\pi_i^l \le \pi_i \le \pi_i^u$ (and possibly other constraints)
Tier constraints: for sets ("tiers") $T_h$ of assets, $1 \le h \le H$, we require $E\big(\delta \sum_{j \in T_h} w_j\big) \le \Gamma_h$ (given), or, $\big(\sum_i \delta_i \pi_i\big) \sum_{j \in T_h} w_j \le \Gamma_h$
Cardinality constraint: $w_j > 0$ for at most $N$ indices $j$
VAR Adversarial problem
The adversarial problem is "easy"
($K$ = no. of points in the discrete distribution, $H$ = no. of tiers)
Theorem
Without the cardinality constraint, for each fixed $K$ and $H$ the adversarial problem can be solved as a polynomial number of linear programs.
With the cardinality constraint, for each fixed $K$ and $H$ the adversarial problem can be solved as a polynomial number of knapsack problems.
VAR Adversarial problem
Adversarial problem as an MIP
Recall: random return$_j$ is $\mu_j = \bar\mu_j (1 - \delta w_j)$, where $\delta = \delta_i$ (given) with probability $\pi_i$ (chosen by the adversary), $0 \le \delta_0 \le \delta_1 \le \ldots \le \delta_K = 1$ and $0 \le w$
$$\min_{\pi, w, V}\ \min_{1 \le i \le K} V_i$$
Subject to
$0 \le w_j \le 1$, all $j$; $\pi_i^l \le \pi_i \le \pi_i^u$, all $i$; $\sum_i \pi_i = 1$,
$V_i = \sum_j \bar\mu_j (1 - \delta_i w_j) x_j^*$, if $\pi_i + \pi_{i+1} + \ldots + \pi_K \ge 1 - \theta$
$V_i = M$ (large), otherwise
$\big(\sum_i \delta_i \pi_i\big) \sum_{j \in T_h} w_j \le \Gamma_h$, for each tier $h$
VAR Adversarial problem
Approximation
$$\Big(\sum_i \delta_i \pi_i\Big) \sum_{j \in T_h} w_j \le \Gamma_h, \quad \text{for each tier } h \qquad (*)$$
Let $N > 0$ be an integer. For $1 \le k \le N$, write
$$\frac{k}{N} \sum_{j \in T_h} w_j \le \Gamma_h + M(1 - z_{hk}), \quad \text{where}$$
$z_{hk} = 1$ if $\frac{k-1}{N} < \sum_i \delta_i \pi_i \le \frac{k}{N}$,
$z_{hk} = 0$ otherwise, $\sum_k z_{hk} = 1$,
and $M$ is large
Lemma. Under reasonable conditions, replacing $(*)$ with this system changes the value of the problem by at most a factor of $(1 + \frac{1}{N})$.
VAR
Implementor's problem
Find a near-optimal solution with minimum value-at-risk
Nominal problem:
$$v^* = \min_x\ \lambda x^T Q x - \bar\mu^T x \quad \text{subject to:} \quad Ax \ge b$$
VAR
Implementor's problem
Find a near-optimal solution with minimum value-at-risk
Given asset weights $x$, we have: value-at-risk $\ge \rho$ if the adversary can produce a return vector $\mu$ with
$$\text{Prob}(\nu - \mu^T x \ge \rho) \ge \theta$$
where $\nu$ is a fixed reference value.
VAR
Implementor's problem
Find a near-optimal solution with minimum value-at-risk
Implementor's problem at iteration $r$:
$$\min\ V$$
Subject to:
$$\lambda x^T Q x - \bar\mu^T x \le (1 + \varepsilon) v^*$$
$$Ax \ge b$$
$$V \ge \nu - \sum_j \bar\mu_j \big(1 - \delta_{i(t)} w_j^{(t)}\big) x_j, \quad t = 1, 2, \ldots, r - 1$$
Here, $\delta_{i(t)}$ and $w^{(t)}$ are the adversary's output at iteration $t < r$.
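In cvxpy form (a sketch; scenarios is assumed to hold the pairs $(\delta_{i(t)}, w^{(t)})$ returned by the adversary so far):

```python
import cvxpy as cp

def solve_var_implementor(scenarios, Q, A, b, mu_bar, lam, eps, v_star, nu):
    n = Q.shape[0]
    x, V = cp.Variable(n), cp.Variable()
    cons = [lam * cp.quad_form(x, Q) - mu_bar @ x <= (1 + eps) * v_star,
            A @ x >= b]
    cons += [V >= nu - (mu_bar * (1 - d * w)) @ x   # one cut per scenario
             for d, w in scenarios]
    prob = cp.Problem(cp.Minimize(V), cons)
    prob.solve()
    return x.value, V.value
```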
VAR Runs
First set of experiments
1338 assets, 41 factors, 81 rows
problem: find a "near optimal" solution with minimum value-at-risk, for a given threshold probability $\theta$
experiment: investigate different values of $\theta$
"near optimal": want solutions that are at most 1% more expensive than optimal
random variable $\delta$: [distribution shown in figure]
VAR Runs
1338 assets, 41 factors, 81 rows, ≤ 1% suboptimality

 θ     time (sec.)   iters.     VAR      VAR as %
.89        1.18         2     3.43131     71.50
.90        1.42         3     3.74498     78.04
.91        1.42         3     3.74498     78.04
.92        3.47        11     4.05669     84.53
.93        6.35        29     4.05721     84.54
.94        6.37        20     4.05721     84.54
.95       26.59        51     4.35481     90.74
.96       26.25        51     4.35481     90.74
.97       26.20        51     4.35481     90.74
.98       33.07        58     4.63938     96.67
.99       33.11        58     4.63938     96.67
VAR Runs
Second set of experiments
Fix $\theta = 0.90$ but vary the suboptimality criterion
[figure]
VAR Runs
Typical convergence behavior [figure]
VAR Runs
Heavy tail, proportional error (100 points): [figure]
Heavy tail, constant error (100 points): [figure]
VAR Runs

       Proportional          Constant
 θ     Time (sec.)  Its     Time (sec.)  Its
.85        4.79       8        11.84      11
.86        2.32       3         8.27       8
.87        6.40      10         9.55      11
.88        4.34       4        18.10      24
.89        8.00      14         5.85       6
.90        2.58       4        13.54      20
.91        4.79       9        16.31      23
.92        7.99      15        13.13      22
.93       13.43      27        22.47      40
.94       10.04      15        21.99      40
.95        9.59      16        11.90      25
.96        6.63      17        29.89      54
.97       48.43     110        16.45      35
.98       20.25      53        20.25      45
.99       22.02      52        21.89      47
VAR Harder case
A difficult case
2464 columns, 152 factors, 3 tiers
time = 6191 seconds
258 iterations
implementor time = 6123 seconds, adversarial time = 20 seconds
VAR Harder case
Implementor runtime [figure]
VAR Harder case
Implementor's problem at iteration $r$
$$\min\ V$$
Subject to:
$$\lambda x^T Q x - \bar\mu^T x \le (1 + \varepsilon) v^*$$
$$Ax \ge b$$
$$V \ge \nu - \sum_j \bar\mu_j \big(1 - \delta_{i(t)} w_j^{(t)}\big) x_j, \quad t = 1, 2, \ldots, r - 1$$
Here, $\delta_{i(t)}$ and $w^{(t)}$ are the adversary's output at iteration $t < r$.
VAR Harder case
Implementor's problem at iteration $r$
Approximate version
$$\min\ V$$
Subject to:
$$2\lambda\, x_{(k)}^T Q x - \lambda\, x_{(k)}^T Q x_{(k)} - \bar\mu^T x \le (1 + \varepsilon) v^*, \quad \forall k < r$$
$$Ax \ge b$$
$$V \ge \nu - \sum_j \bar\mu_j \big(1 - \delta_{i(k)} w_j^{(k)}\big) x_j, \quad \forall k < r$$
Here, $\delta_{i(k)}$ and $w^{(k)}$ are the adversary's output at iteration $k < r$, and $x_{(k)}$ is the implementor's output at iteration $k$.
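The quadratic budget constraint is thus replaced by tangent cuts at the previous iterates, valid by convexity: $x^T Q x \ge 2 x_{(k)}^T Q x - x_{(k)}^T Q x_{(k)}$. As a sketch:

```python
import cvxpy as cp

def variance_cut(x, x_k, Q, mu_bar, lam, eps, v_star):
    """Tangent cut of lam x^T Q x - mu^T x <= (1+eps) v* at the point x_k."""
    # convexity: x^T Q x >= 2 x_k^T Q x - x_k^T Q x_k, so this cut is valid
    return (2 * lam * (Q @ x_k) @ x - lam * (x_k @ Q @ x_k)
            - mu_bar @ x <= (1 + eps) * v_star)
```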
VAR Harder case
Does it work?
Before: 258 iterations, 6191 seconds
Linearized: 1776 iterations, 3969 seconds
VAR Harder case
Averaging
$x_{(k)}$ is the implementor's output at iteration $k$.
Define $y_{(1)} = x_{(1)}$
For $k > 1$, $y_{(k)} = \lambda\, x_{(k)} + (1 - \lambda)\, y_{(k-1)}$, $0 \le \lambda \le 1$
Input $y_{(k)}$ to the adversary
Old ideas; see also Nesterov, Nemirovski (2003)
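As code, the averaging step is one line (a sketch assuming numpy arrays; lam here is the averaging weight from this slide, not the risk-aversion multiplier):

```python
def averaged_iterate(x_new, y_prev, lam=0.5):
    """Running convex combination fed to the adversary: y_(1) = x_(1)."""
    return x_new if y_prev is None else lam * x_new + (1.0 - lam) * y_prev
```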
VAR Harder case
Does it work?
Default: 258 iterations, 6191 seconds
Linearized: 1776 iterations, 3969 seconds
Averaging plus Linearized: 860 iterations, 530 seconds
Future work
Other robust models
Min-max expected loss with orthogonal missing factor
Random return = $\bar\mu \bullet (1 - \delta w)$, where $-1 \le w_j \le 1$ $\forall j$, and $0 \le \delta \le 1$ is a random variable.
normalization constraints, e.g. $\sum_j w_j = 0$
errors in the covariance matrix $Q$
robust problem: → $\min_x \max_{Q \in \mathcal{Q}} \lambda x^T Q x - \mu^T x$
Future work
On-going work: a provably good version of Benders’ algorithm