squarem - an r package for accelerating slowly convergent fixed-point ... · background...

15
Background Acceleration of Convergence Results SQUAREM An R package for Accelerating Slowly Convergent Fixed-Point Iterations Including the EM and MM algorithms Ravi Varadhan 1 1 Division of Geriatric Medicine & Gerontology Johns Hopkins University Baltimore, MD, USA UseR! 2010 NIST, Gaithersburg, MD July 22, 2010 Varadhan SQUAREM

Upload: dangcong

Post on 13-Dec-2018

219 views

Category:

Documents


0 download

TRANSCRIPT

BackgroundAcceleration of Convergence

Results

SQUAREMAn R package for Accelerating Slowly Convergent

Fixed-Point Iterations Including the EM and MMalgorithms

Ravi Varadhan1

1Division of Geriatric Medicine & GerontologyJohns Hopkins University

Baltimore, MD, USA

UseR! 2010NIST, Gaithersburg, MD

July 22, 2010

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

Speed Is Not All That It’s Cranked Up To Be

Evil deeds do not prosper; the slow man catches upwith the swift - Homer (Odyssey)

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

Fixed-Point IterationsExamples

What is a Fixed-Point Iteration?

xk+1 = F (xk ), k = 0,1, . . . .

F : Ω ⊂ Rp 7→ Ω, and differentiable

Most (if not all) iterations are FPIWe are interested in contractive FPIGuaranteed convergence: xk → x∗

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

Fixed-Point IterationsExamples

EM Algorithm

Let y , z, x , be observed, missing, and complete data,respectively.The k -th step of the iteration:

θk+1 = argmax Q(θ|θk ); k = 0,1, . . . ,

where

Q(θ|θk ) = E [Lc(θ)|y , θk ],

=

∫Lc(θ)f (z|y , θk )dz,

Ascent property: Lobs(θk+1) ≥ Lobs(θk )

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

Fixed-Point IterationsExamples

MM Algorithm

A majorizing function, g(θ| θk ):

f (θk ) = g(θk | θk ),

f (θk ) ≤ g(θ| θk ), ∀ θ.

To minimize f (θ), construct a majorizing function andminimize it (MM)

θk+1 = argmax g(θ|θk ); k = 0,1, . . .

Descent property: f (θk+1) ≤ f (θk )

Is EM a subclass of MM or are they equivalent? It avoidsthe E-step.

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

Fixed-Point IterationsExamples

Least Squares Multidimensional Scaling

Minimize : σ(X ) =12

n∑ n∑wij(δij − dij(X ))2

over all m × p matrices X , where: dij =√∑p

k=1(xik − xjk )2

Jan de Leeuw’s SMACOF algorithm: ξk+1 = F (ξ),Has descent property: σ(ξk+1) < σ(ξk )

An instance of MM algorithm

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

Fixed-Point IterationsExamples

BLP Contraction Mapping

Previous Talk!

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

Fixed-Point IterationsExamples

Power Method

To find the eigenvector corresponding to the largest (inmagnitude) eigenvalue of an n × n matrix, A.

Not all that academic - Google’s PageRank algorithm!xk+1 = A.xk/‖A.xk‖Stop if ‖xk+1 − xk‖ ≤ εDominant eigenvalue (Rayleigh quotient) = 〈A x∗,x∗〉

〈x∗,x∗〉

Geometric convergence with rate ∝ |λ1||λ2|

Power method does not converge if |λ1| = |λ2|, butSQUAREM does!

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

R PackageResults

Why Accelerate Convergence?

These FPI are globally convergentConvergence is linear: Rate = [ρ(J(x∗))]−1

Slow convergence when spectral radius, ρ(J(x∗)), is largeNeed to be accelerated for practical applicationWithout compromising on global convergenceWithout additional information (e.g. gradient, Hessian,Jacobian)

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

R PackageResults

SQUAREM

An R package implementing a family of algorithms forspeeding-up any slowly convergent multivariate sequenceEasy to useIdeal for high-dimensional problemsInput: fixptfn = fixed-point mapping FOptional Input: objfn = objective function (if any)Two main control parameter choices: order of extrapolationand monotonicityAvailable on R-forge under optimizer project.install.packages(”SQUAREM”, repos =”http://R-Forge.R-project.org”)

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

R PackageResults

Upshot

SQUAREM works great!Significant acceleration (depends on the linear rate of F )Globally convergent (especially, first-order locallynon-monotonic schemes)Finds the same or (sometimes) better fixed-points than FPI(e.g. EM, SMACOF, Power method)

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

Multidimensional Scaling: SMACOFPower Method for Dominant Eigenvector

SMACOF Results

Mores code data (de Leeuw 2008). 36 Morse signals compared- 630 dissimilarities & 69 parameters

Table: A comparison of the different schemes.

Scheme # Fevals # ObjEvals CPU (sec) ObjfnValueSMACOF 1549 1549 471 0.0593SQ1 213 141 55 0.0593SQ2 140 57 32 0.0593SQ3 113 33 24 0.0457SQ3* 113 0 19 0.0457

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

Multidimensional Scaling: SMACOFPower Method for Dominant Eigenvector

Power Method - Part I

Generated a 1000× 1000 (arbitrary) matrix with eigenvalues asfollows:eigvals <- c(2, 1.99, runif(997, 0, 1.9), -1.8)

A cool algorithm using the Soules matrix!

Table: A comparison of the different schemes: Average of 100simulations

Scheme # Fevals CPU (sec) ConvergedPower 1687 8.8 100SQ1 165 0.88 100SQ2 121 0.69 100SQ3 115 0.65 100

Varadhan SQUAREM

BackgroundAcceleration of Convergence

Results

Multidimensional Scaling: SMACOFPower Method for Dominant Eigenvector

Power Method - Part II

Generated a 100× 100 (arbitrary) matrix with eigenvalues asfollows:eigvals <- c(2, 1.99, runif(97, 0, 1.9), -2)

Table: A comparison of the different schemes: Average of 100simulations

Scheme # Fevals CPU (sec) ConvergedPower 50000 3.46 0SQ1 178 0.023 100SQ2 130 0.031 100SQ3 122 0.027 100

Varadhan SQUAREM

Appendix For Further Reading

For Further Reading I

R. Varadhan, and C. RolandScandinavian Journal of Statistics.2008.

C. Roland, R.Varadhan, and C.E. FrangakisNumerical Mathematics.2007.

Varadhan SQUAREM