
COM S 672: Advanced Topics in Computational Models of Learning – Optimization for Learning

Lecture Note 9: Higher-Order Methods – II

Jia (Kevin) Liu

Assistant Professor

Department of Computer Science

Iowa State University, Ames, Iowa, USA

Fall 2017


Outline

In this lecture:

Quasi-Newton methods

Interior-point methods


Quasi-Newton Theory

Key idea: maintain an approximation to the Hessian that is filled in using information gained on successive steps, and generate H-conjugate directions

Suppose f(x) = c^T x + (1/2) x^T H x, where H ≻ 0

Define p_k = x_{k+1} − x_k and q_k = ∇f(x_{k+1}) − ∇f(x_k). Note that

    H p_k = H(x_{k+1} − x_k) = (c + H x_{k+1}) − (c + H x_k) = q_k

Construct an estimate B_k of H satisfying B_k p_j = q_j for all steps j taken thus far. Thus:

    H^{-1} B_k p_j = H^{-1} q_j = p_j

This implies (H^{-1} B_k) p_j = p_j, ∀ j = 1, ..., k − 1, i.e., p_1, ..., p_{k−1} are eigenvectors of H^{-1} B_k with eigenvalue 1

Hence, (H^{-1} B_{n+1}) p_k = p_k, ∀ k = 1, ..., n
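As a quick numerical illustration of the relation H p_k = q_k above (a minimal sketch, not from the lecture; the NumPy setup and all names are illustrative), the gradient difference of a quadratic equals H times the step, for any pair of points:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    M = rng.standard_normal((n, n))
    H = M @ M.T + n * np.eye(n)          # a positive definite Hessian
    c = rng.standard_normal(n)
    grad = lambda x: c + H @ x           # gradient of f(x) = c^T x + (1/2) x^T H x

    x_k = rng.standard_normal(n)
    x_next = rng.standard_normal(n)      # any next iterate
    p_k = x_next - x_k
    q_k = grad(x_next) - grad(x_k)
    print(np.allclose(q_k, H @ p_k))     # True: the secant relation H p_k = q_k holds exactly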


Quasi-Newton Theory

Suppose that p_1, ..., p_n are linearly independent

Denote P = [p_1 p_2 · · · p_n] ∈ R^{n×n}. Then we have

    (H^{-1} B_{n+1}) P = P,

which implies H^{-1} B_{n+1} = I, i.e., B_{n+1} = H: the approximation exactly recovers the true Hessian after n steps

Thus, the goal of quasi-Newton methods is to find a sequence {B_k} of approximate Hessians satisfying, for all k,

    B_k p_j = q_j,   ∀ j = 1, ..., k − 1,

which is termed the quasi-Newton equation or secant equation

Once B_k is determined, find d_k satisfying B_k d_k = −∇f(x_k). It can be shown that the generated directions d_1, ..., d_n are H-conjugate


Quasi-Newton Theory

From the secant equation, designing quasi-Newton methods boils down to:

- Given some B_k ⪰ 0 such that B_k p_j = q_j, ∀ j = 1, ..., k − 1
- Want to find a B_{k+1} ⪰ 0 such that B_{k+1} p_j = q_j, ∀ j = 1, ..., k

Key idea: try B_{k+1} = B_k + C_k for some correction matrix C_k

- This implies B_k p_j + C_k p_j = q_j, ∀ j = 1, ..., k, i.e.,

      C_k p_j = 0,   for j = 1, ..., k − 1
      C_k p_k = q_k − B_k p_k

These two equations give rise to a variety of quasi-Newton methods:

- Broyden family (Broyden–Fletcher–Goldfarb–Shanno (BFGS) update)
- Davidon–Fletcher–Powell (DFP) method (dual construction of the Broyden family)
- See [BSS, Ch. 8.8] for an excellent treatment of quasi-Newton theory


Broyden-Fletcher-Goldfarb-Shanno (BFGS) Update

Try the following correction matrix C_k^BFGS:

    C_k^BFGS = (q_k q_k^T)/(q_k^T p_k) − (B_k p_k p_k^T B_k)/(p_k^T B_k p_k)

Obtained independently by Broyden, Fletcher, Goldfarb, and Shanno in 1970, hence the name BFGS

Highly successful due to its efficiency and robustness; implemented in many numerical optimizers (e.g., MATLAB, R, GNU C regression libraries, ...)
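As a sanity check (an illustrative sketch under the quadratic model; the test setup is made up, not the lecture's), one can run BFGS with exact line searches on a random quadratic and verify both the secant equation and the n-step recovery of H:

    import numpy as np

    def bfgs_correction(B, p, q):
        # C_k^BFGS = q q^T / (q^T p) - (B p)(B p)^T / (p^T B p)
        Bp = B @ p
        return np.outer(q, q) / (q @ p) - np.outer(Bp, Bp) / (p @ Bp)

    rng = np.random.default_rng(1)
    n = 4
    M = rng.standard_normal((n, n))
    H = M @ M.T + n * np.eye(n)            # true Hessian of the quadratic
    c = rng.standard_normal(n)
    grad = lambda x: c + H @ x

    B = np.eye(n)                          # initial approximation B_1 = I
    x = rng.standard_normal(n)
    for _ in range(n):
        g = grad(x)
        d = np.linalg.solve(B, -g)         # B_k d_k = -grad f(x_k)
        alpha = -(g @ d) / (d @ H @ d)     # exact line search for a quadratic
        x_new = x + alpha * d
        p, q = x_new - x, grad(x_new) - grad(x)
        B = B + bfgs_correction(B, p, q)
        x = x_new

    print(np.allclose(B @ p, q))           # latest secant equation B_{k+1} p_k = q_k holds
    print(np.linalg.norm(B - H))           # ~ 0: B recovers H after n steps (up to roundoff)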


Implementing BFGS in Practice

Having found B_{k+1}, find d_{k+1} by solving B_{k+1} d_{k+1} = −∇f(x_{k+1}), i.e.,

    d_{k+1} = −B_{k+1}^{-1} ∇f(x_{k+1})

Often more convenient to update the inverse sequence {D_k} ≜ {B_k^{-1}} directly:

- Let D_1 = B_1^{-1} = I
- In iteration k, given D_k, compute D_{k+1} as follows:

      D_{k+1} = [B_{k+1}]^{-1} = [B_k + C_k^BFGS]^{-1} = [B_k + a_1 b_1^T + a_2 b_2^T]^{-1},    (1)

  where a_1 = q_k/(q_k^T p_k), b_1 = q_k, a_2 = −(B_k p_k)/(p_k^T B_k p_k), and b_2 = B_k p_k
- Eq. (1) shows that B_{k+1} can be obtained from B_k with a rank-two update


Implementing BFGS in Practice

Therefore, D_{k+1} can be computed by two sequential applications of the Sherman–Morrison–Woodbury (SMW) matrix inverse formula:

    [A + a b^T]^{-1} = A^{-1} − (A^{-1} a b^T A^{-1}) / (1 + b^T A^{-1} a)

Note: in general, the SMW inverse formula is advantageous when A^{-1} is known or cheap to compute (e.g., diagonal, sparse, structured, etc.)
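A quick numerical check of the rank-one SMW identity (an illustrative sketch, not from the slides), using a diagonal A so that A^{-1} is cheap:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 6
    A = np.diag(rng.uniform(1.0, 2.0, n))       # cheap-to-invert (diagonal) matrix
    a, b = rng.standard_normal(n), rng.standard_normal(n)

    A_inv = np.diag(1.0 / np.diag(A))
    smw = A_inv - np.outer(A_inv @ a, b @ A_inv) / (1.0 + b @ A_inv @ a)
    print(np.allclose(smw, np.linalg.inv(A + np.outer(a, b))))   # True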

As a result, we obtain the following BFGS update for the sequence {D_k}:

    D_{k+1} = D_k + (1 + (q_k^T D_k q_k)/(p_k^T q_k)) (p_k p_k^T)/(p_k^T q_k) − (D_k q_k p_k^T + p_k q_k^T D_k)/(p_k^T q_k),

where the correction added to D_k is denoted C̄_k^BFGS

Can prove superlinear local convergence for BFGS (and other quasi-Newton methods): ||x_{k+1} − x*|| / ||x_k − x*|| → 0. Not as fast as Newton, but fast!
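The inverse update can be checked against the direct rank-two update numerically; a minimal sketch (the test pair p, q is made up) verifying that D_{k+1} is the inverse of B_{k+1} and satisfies the inverse secant equation D_{k+1} q_k = p_k:

    import numpy as np

    def bfgs_inverse_update(D, p, q):
        # D_{k+1} = D_k + (1 + q^T D q / p^T q) p p^T / (p^T q) - (D q p^T + p q^T D) / (p^T q)
        pq = p @ q
        Dq = D @ q
        return (D + (1.0 + q @ Dq / pq) * np.outer(p, p) / pq
                  - (np.outer(Dq, p) + np.outer(p, Dq)) / pq)

    rng = np.random.default_rng(3)
    n = 5
    B, D = np.eye(n), np.eye(n)                  # B_1 = D_1 = I
    p = rng.standard_normal(n)
    q = p + 0.1 * rng.standard_normal(n)         # any pair with p^T q > 0
    assert p @ q > 0                             # curvature condition

    Bp = B @ p
    B_new = B + np.outer(q, q) / (q @ p) - np.outer(Bp, Bp) / (p @ Bp)
    D_new = bfgs_inverse_update(D, p, q)
    print(np.allclose(D_new, np.linalg.inv(B_new)))   # the two updates are consistent
    print(np.allclose(D_new @ q, p))                  # inverse secant equation D_{k+1} q_k = p_k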


L-BFGS

In BFGS (and other quasi-Newton methods), we need n × n storage space to maintain the approximate Hessian B_k (or the approximate inverse D_k)

Still expensive when n is large. Enter the limited-memory BFGS (L-BFGS)!

L-BFGS does not store B_k or D_k. Rather, it only keeps the pairs p_k and q_k from the last few iterations (say 5 to 10), and reconstructs the matrix action as needed

- Take an initial B_0 or D_0 and assume m steps have been taken since
- Compute products such as B_k p_k via a series of inner and outer products with the stored vectors p_{k−j} and q_{k−j} from the last m iterations, j = 1, ..., m − 1

Attractive when n is large (typical in machine learning problems). Requires 2mn storage and O(mn) linear-algebra operations, plus the cost of function and gradient evaluations and the line search

No superlinear convergence proof, but good behavior has been observed in many applications (see [Liu & Nocedal, '89], [Nocedal & Wright, Ch. 7.2])
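Below is a minimal sketch of the standard two-loop recursion from [Nocedal & Wright, Ch. 7.2], which applies the implicit inverse approximation to the current gradient using only the stored pairs; the function name and interface are illustrative, not the lecture's:

    import numpy as np

    def lbfgs_direction(grad_k, pairs, gamma=1.0):
        # Two-loop recursion: returns d_k = -D_k grad_k using only the stored (p_j, q_j) pairs.
        # pairs: list of (p_j, q_j) from the last m iterations, oldest first
        # gamma: scaling of the initial approximation D_k^0 = gamma * I
        d = grad_k.copy()
        saved = []
        for p, q in reversed(pairs):                 # first loop: newest to oldest
            rho = 1.0 / (q @ p)
            alpha = rho * (p @ d)
            d -= alpha * q
            saved.append((rho, alpha, p, q))
        d *= gamma                                   # apply the initial approximation D_k^0
        for rho, alpha, p, q in reversed(saved):     # second loop: oldest to newest
            beta = rho * (q @ d)
            d += (alpha - beta) * p
        return -d                                    # search direction d_k = -D_k grad f(x_k)

With gamma = 1 and all pairs kept (m = k), the recursion reproduces exactly the direction −D_k ∇f(x_k) that the full inverse-BFGS update produces when started from D_1 = I.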


Interior-Point Methods

Consider the following constrained minimization problem:

    Minimize    f(x)
    subject to  g_i(x) ≤ 0,  i = 1, ..., m
                A x = b

where:

- f and g_i are convex and twice continuously differentiable
- A ∈ R^{p×n} with rank(A) = p
- Assume that the optimal value p* is finite and attained
- Assume that the problem is strictly feasible (Slater's condition), i.e., ∃ x̃ with

      x̃ ∈ dom f,   g_i(x̃) < 0, i = 1, ..., m,   A x̃ = b,

  hence strong duality holds and the dual optimum is attained


Logarithmic Barrier Function

Reformulate the problem via an indicator function:

    Minimize    f(x) + Σ_{i=1}^m I_−(g_i(x))
    subject to  A x = b,

where I_−(u) = 0 if u ≤ 0 and I_−(u) = ∞ otherwise (indicator function of R_−)

Consider the approximation through a logarithmic barrier:

    Minimize    f(x) − (1/µ) Σ_{i=1}^m log(−g_i(x))
    subject to  A x = b,

where µ > 0 is a parameter


The Log Barrier Approximate Problem

An equality constrained problem

For µ > 0, −(1/µ) log(−u) is a smooth approximation of I_−(·)

The approximation gets better as µ → ∞


Properties of Log Barrier Function

    φ(x) = − Σ_{i=1}^m log(−g_i(x)),    dom φ = {x | g_i(x) < 0, i = 1, ..., m}

Convex (by the composition rules of convexity)

Twice continuously differentiable, with derivatives:

    ∇φ(x)  = − Σ_{i=1}^m (1/g_i(x)) ∇g_i(x)

    ∇²φ(x) =   Σ_{i=1}^m (1/g_i(x)²) ∇g_i(x) ∇g_i(x)^T  −  Σ_{i=1}^m (1/g_i(x)) ∇²g_i(x)
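For affine constraints g_i(x) = a_i^T x − b_i (so ∇g_i = a_i and ∇²g_i = 0), these formulas are easy to code; the following NumPy sketch (illustrative, not from the slides) evaluates φ, ∇φ, and ∇²φ and checks the gradient against central finite differences:

    import numpy as np

    def log_barrier(x, A_ineq, b_ineq):
        # phi(x) = -sum_i log(b_i - a_i^T x), with g_i(x) = a_i^T x - b_i < 0
        slack = b_ineq - A_ineq @ x                      # -g_i(x) > 0 for strictly feasible x
        value = -np.sum(np.log(slack))
        grad = A_ineq.T @ (1.0 / slack)                  # equals -sum_i (1/g_i) a_i
        hess = A_ineq.T @ np.diag(1.0 / slack**2) @ A_ineq
        return value, grad, hess

    rng = np.random.default_rng(4)
    m, n = 8, 3
    A_ineq = rng.standard_normal((m, n))
    x = rng.standard_normal(n)
    b_ineq = A_ineq @ x + rng.uniform(0.5, 1.5, m)       # makes x strictly feasible

    _, g, _ = log_barrier(x, A_ineq, b_ineq)
    eps = 1e-6
    fd = np.array([(log_barrier(x + eps * e, A_ineq, b_ineq)[0]
                    - log_barrier(x - eps * e, A_ineq, b_ineq)[0]) / (2 * eps)
                   for e in np.eye(n)])
    print(np.allclose(g, fd, atol=1e-4))                 # gradient matches finite differences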


Central Path

For µ > 0, define x*(µ) as the solution of

    Minimize    µ f(x) + φ(x)
    subject to  A x = b

Assume that x*(µ) exists and is unique for all µ > 0

The central path is defined as {x*(µ) | µ > 0}

Example: Central path for an LP

[Figure omitted: central path of an LP with constraints a_i^T x ≤ b_i; at each x*(µ), the hyperplane c^T x = c^T x*(µ) is tangential to the level curve of φ through x*(µ).]

Dual Points on Central Path

For x = x*(µ), optimality of the centering problem gives a w such that

    µ ∇f(x) − Σ_{i=1}^m (1/g_i(x)) ∇g_i(x) + A^T w = 0,    A x = b

Then x*(µ) minimizes the Lagrangian

    L(x, u*(µ), v*(µ)) = f(x) + Σ_{i=1}^m u_i*(µ) g_i(x) + v*(µ)^T (A x − b),

where u_i*(µ) ≜ 1/(−µ g_i(x*(µ))) and v*(µ) ≜ w/µ

This confirms the intuitive idea that f(x*(µ)) → p* as µ → ∞, since

    p* ≥ θ(u*(µ), v*(µ)) = L(x*(µ), u*(µ), v*(µ)) = f(x*(µ)) − m/µ

(weak duality, with θ the dual function; each term u_i*(µ) g_i(x*(µ)) equals −1/µ, so they sum to −m/µ), which implies f(x*(µ)) − p* ≤ m/µ ↓ 0 as µ → ∞


Interpretation as Perturbed KKT System

The primal-dual points x = x*(µ), u = u*(µ), and v = v*(µ) satisfy:

    (ST):       ∇f(x) + Σ_{i=1}^m u_i ∇g_i(x) + A^T v = 0
    (1/µ-CS):   u_i g_i(x) = −1/µ,  i = 1, ..., m
    (PF):       g_i(x) ≤ 0, i = 1, ..., m,   A x = b
    (DF):       u ≥ 0,  v unconstrained

That is, the only difference from the KKT conditions is that (1/µ-CS) replaces (CS): u_i g_i(x) = 0


Force Field Interpretation

Consider the following "centering" problem (without equality constraints):

    Minimize    µ f(x) − Σ_{i=1}^m log(−g_i(x))

It admits the following force field interpretation:

µ f(x) is the potential of the force field F_0(x) = −µ ∇f(x)

−log(−g_i(x)) is the potential of the force field F_i(x) = (1/g_i(x)) ∇g_i(x)

The forces balance at x*(µ):

    F_0(x*(µ)) + Σ_{i=1}^m F_i(x*(µ)) = 0


Force Field Interpretation

Example:   Minimize c^T x   subject to a_i^T x ≤ b_i, i = 1, ..., m

The objective force field is constant: F_0(x) = −µ c

Each constraint force field decays as the inverse distance to the constraint hyperplane:

    F_i(x) = −a_i / (b_i − a_i^T x),    ||F_i(x)||_2 = 1 / dist(x, H_i),

where H_i = {x | a_i^T x = b_i}


The Barrier Method

1. Initialization: a strictly feasible x (interior point), µ = µ_0 > 0, γ > 1, tolerance ε > 0

2. Centering step: compute x*(µ) by minimizing µf + φ, subject to A x = b. Update x = x*(µ)

3. Stop if m/µ < ε. Otherwise, let µ = γµ and go to Step 2

Remarks:

Terminates with f(x) − p* ≤ ε (following from f(x*(µ)) − p* ≤ m/µ)

Centering is usually done using Newton's method, starting at the current x

The choice of γ involves a trade-off: a large γ means fewer outer iterations but more inner (Newton) iterations; typical values: γ ∈ [10, 20]

As µ gets larger (nearer the optimal solution), it becomes harder and harder for Newton's method to converge (due to ill-conditioning at large µ)


Note: it is not necessary to solve the centering problem for x*(µ) exactly at each outer iteration; a minimal sketch of the method follows below.

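As an illustration only (assuming no equality constraints and affine inequalities A x ≤ b; the LP, the helper names, and the parameter choices are made up, not the lecture's), here is a minimal sketch of the barrier method with damped Newton centering:

    import numpy as np

    def newton_centering(x, c, A, b, mu, tol=1e-8, max_iter=50):
        # Minimize mu * c^T x - sum_i log(b_i - a_i^T x) by damped Newton with backtracking
        for _ in range(max_iter):
            slack = b - A @ x
            grad = mu * c + A.T @ (1.0 / slack)
            hess = A.T @ np.diag(1.0 / slack**2) @ A
            dx = np.linalg.solve(hess, -grad)
            decrement = -grad @ dx                         # Newton decrement squared
            if decrement / 2.0 < tol:
                break
            phi = lambda z: mu * c @ z - np.sum(np.log(b - A @ z))
            t = 1.0
            while np.any(b - A @ (x + t * dx) <= 0):       # stay strictly feasible
                t *= 0.5
            while phi(x + t * dx) > phi(x) - 0.25 * t * decrement:
                t *= 0.5                                   # backtracking (Armijo) line search
            x = x + t * dx
        return x

    def barrier_method(c, A, b, x0, mu0=1.0, gamma=10.0, eps=1e-6):
        x, mu, m = x0, mu0, A.shape[0]
        while m / mu >= eps:                               # stop once the gap bound m/mu < eps
            x = newton_centering(x, c, A, b, mu)           # centering step, warm-started at x
            mu *= gamma                                    # increase mu
        return x

    # Small test: minimize x1 - 2 x2 over the box [-1, 1]^2 (solution at the corner (-1, 1))
    c = np.array([1.0, -2.0])
    A = np.vstack([np.eye(2), -np.eye(2)])                 # encodes x <= 1 and -x <= 1
    b = np.ones(4)
    print(np.round(barrier_method(c, A, b, x0=np.zeros(2)), 4))   # approximately [-1, 1]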

Convergence Analysis

Number of outer (centering) iterations:

    ⌈ log(m/(ε µ_0)) / log γ ⌉

plus the initial centering step (to compute x*(µ_0))

Convergence of the centering problem

    Minimize    µ f(x) + φ(x)

follows the convergence analysis of Newton's method:

- µf + φ must have closed sublevel sets for µ ≥ µ_0
- the classical analysis requires strong convexity and a Lipschitz condition
- the analysis via self-concordance requires self-concordance of µf + φ
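For a concrete feel for the outer-iteration bound (the parameter values below are made up for illustration):

    import math

    m, eps, mu0, gamma = 100, 1e-6, 1.0, 20.0
    outer_iters = math.ceil(math.log(m / (eps * mu0)) / math.log(gamma))
    print(outer_iters)   # 7 outer centering iterations for these parameters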


Feasibility and Phase I Methods

Feasibility problem: find x such that

    g_i(x) ≤ 0, i = 1, ..., m,    A x = b                                  (2)

Phase I: computes a strictly feasible starting point for the barrier method

    Minimize_{x, s}   s
    subject to        g_i(x) ≤ s, i = 1, ..., m                            (3)
                      A x = b

- If (x, s) is feasible for (3) with s < 0, then x is strictly feasible for (2)
- If the optimal value p̄* of (3) is positive, then (2) is infeasible
- If p̄* = 0 and attained, then problem (2) is feasible (but not strictly)
- If p̄* = 0 and not attained, then problem (2) is infeasible
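When the g_i are affine, the Phase I problem (3) is itself an LP; a small illustrative sketch (assuming SciPy's linprog is available; the box-constraint data are made up):

    import numpy as np
    from scipy.optimize import linprog

    # Inequalities a_i^T x <= b_i describing the box [-1, 1]^2
    A_ineq = np.vstack([np.eye(2), -np.eye(2)])
    b_ineq = np.ones(4)
    m, n = A_ineq.shape

    # Phase I LP in the variable z = (x, s): minimize s subject to a_i^T x - s <= b_i
    c = np.zeros(n + 1); c[-1] = 1.0
    A_ub = np.hstack([A_ineq, -np.ones((m, 1))])
    res = linprog(c, A_ub=A_ub, b_ub=b_ineq, bounds=[(None, None)] * (n + 1))

    x_feas, s_opt = res.x[:n], res.x[-1]
    print(s_opt < 0, np.round(x_feas, 3))   # s* = -1 < 0, so x = (0, 0) is strictly feasible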


Primal-Dual Interior-Point Methods

Primal-dual interior-point methods are another class of interior-point methods, powerful for linear and convex quadratic programming

Consider the following linearly constrained quadratic programming problem:

    Minimize    c^T x + (1/2) x^T Q x
    subject to  A x = b,   x ≥ 0

where Q is symmetric PSD (LP is the special case Q = 0)

The KKT conditions are that there exist u and v such that:

    Q x + c − A^T u − v = 0,    A x = b,    (x, v) ≥ 0,    x_i v_i = 0,  i = 1, ..., n

Defining

    X ≜ Diag(x_1, ..., x_n),    V ≜ Diag(v_1, ..., v_n),

we can rewrite the last condition as X V e = 0, where e = [1, 1, ..., 1]^T


Primal-Dual Interior-Point Methods

Thus, the KKT conditions can be rewritten as a square system of constrained nonlinear equations:

    [ Q x + c − A^T u − v ]
    [ A x − b             ]  =  0,        (x, v) ≥ 0
    [ X V e               ]

Primal-dual interior-point methods generate iterates (x_k, u_k, v_k) with:

- (x_k, v_k) > 0 (i.e., interior)
- Each step (Δx_k, Δu_k, Δv_k) is a Newton step on a perturbed version of the equations (the perturbation eventually goes to zero)
- Use a step size α_k to maintain (x_{k+1}, v_{k+1}) > 0. Set

      (x_{k+1}, u_{k+1}, v_{k+1}) = (x_k, u_k, v_k) + α_k (Δx_k, Δu_k, Δv_k)


Primal-Dual Interior-Point Methods

The perturbed Newton step is obtained from a linear system:

    [ Q     −A^T   −I  ] [ Δx_k ]     [ r_k^(x) ]
    [ A      0      0  ] [ Δu_k ]  =  [ r_k^(u) ]
    [ V_k    0     X_k ] [ Δv_k ]     [ r_k^(v) ]

where

    r_k^(x) = −(Q x_k + c − A^T u_k − v_k)
    r_k^(u) = −(A x_k − b)
    r_k^(v) = −X_k V_k e + σ_k µ_k e

Here, r_k^(x), r_k^(u), r_k^(v) are the current residuals, µ_k = (x_k^T v_k)/n is the current duality measure (average complementarity gap), and σ_k ∈ (0, 1] is a centering parameter

There is a lot of structure in this system that can be exploited for algorithm design. More efficient than the barrier method if high accuracy is needed

See [Wright, '97] for a description of primal-dual interior-point methods
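To make one such step concrete, here is an illustrative NumPy sketch (not from the lecture; the random QP data and the fraction-to-boundary step rule are assumptions) that assembles and solves a single perturbed Newton system:

    import numpy as np

    rng = np.random.default_rng(6)
    n, p = 4, 2
    M = rng.standard_normal((n, n))
    Q = M @ M.T                          # symmetric PSD
    c = rng.standard_normal(n)
    A = rng.standard_normal((p, n))
    x = rng.uniform(0.5, 1.5, n)         # strictly positive primal iterate
    v = rng.uniform(0.5, 1.5, n)         # strictly positive dual slacks
    u = rng.standard_normal(p)
    b = A @ x                            # make the equality constraint consistent

    sigma = 0.5                          # centering parameter in (0, 1]
    mu = x @ v / n                       # current duality measure
    r_x = -(Q @ x + c - A.T @ u - v)
    r_u = -(A @ x - b)
    r_v = -x * v + sigma * mu            # -X V e + sigma mu e, componentwise

    K = np.block([[Q, -A.T, -np.eye(n)],
                  [A, np.zeros((p, p)), np.zeros((p, n))],
                  [np.diag(v), np.zeros((n, p)), np.diag(x)]])
    dx, du, dv = np.split(np.linalg.solve(K, np.concatenate([r_x, r_u, r_v])), [n, n + p])

    # fraction-to-boundary rule: choose a step size keeping (x, v) strictly positive
    steps, vals = np.concatenate([dx, dv]), np.concatenate([x, v])
    neg = steps < 0
    alpha_max = np.min(-vals[neg] / steps[neg]) if neg.any() else np.inf
    alpha = min(1.0, 0.99 * alpha_max)
    x, u, v = x + alpha * dx, u + alpha * du, v + alpha * dv
    print(min(x.min(), v.min()) > 0)     # True: the new iterate stays in the interior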


Interior-Point Methods for Learning Problems

Interior-point methods were used early for compressed sensing, regularized least squares, and SVMs:

SVM with hinge loss formulated as a QP and solved with a primal-dual interior-point method (e.g., [Gertz & Wright, '03], [Fine & Scheinberg, '01], [Ferris & Munson, '02])

Compressed sensing and LASSO variable selection formulated as a bound-constrained QP solved by a primal-dual method, or as an SOCP solved by a barrier method (e.g., [Candès & Romberg, '05])

However, they were mostly superseded by first-order methods due to the increasingly large size of machine learning problems:

Stochastic gradient descent (low accuracy, simple data access)

Gradient projection with sparsity regularization and prox-gradient methods in compressed sensing (require only matrix-vector multiplications)

Perhaps we are just a few clever ideas away from reviving interior-point methods?


Next Class

Sparse/Regularized Optimization


Check BFGS:

    C_k^BFGS = (q_k q_k^T)/(q_k^T p_k) − (B_k p_k)(B_k p_k)^T/(p_k^T B_k p_k)    (a rank-two update)

Want to show: C_k p_j = 0 for j = 1, ..., k − 1, and C_k p_k = q_k − B_k p_k

Check of the second condition:

    C_k p_k = q_k (q_k^T p_k)/(q_k^T p_k) − B_k p_k (p_k^T B_k p_k)/(p_k^T B_k p_k) = q_k − B_k p_k  ✓

(For j < k, the first condition C_k p_j = 0 follows from q_k^T p_j = 0 and p_k^T B_k p_j = p_k^T q_j = 0, which hold on a quadratic with exact line searches.)

The BFGS update can also be derived from a "minimal change" principle: among all symmetric matrices satisfying the secant equation, choose the new approximation closest to the current one in a suitable weighted norm (with the curvature condition p_k^T q_k > 0 ensuring positive definiteness).


Barrier method / path-following method:

[Fiacco & McCormick '68]: Sequential Unconstrained Minimization Technique (SUMT)

Khachiyan '79: ellipsoid method

Karmarkar '84 (at Bell Labs): "Interior-point method for LP"

Nesterov & Nemirovski: a special class of barriers (self-concordant) can encode any convex set, so the number of iterations is bounded by a polynomial in both the dimension of the problem and the accuracy.