Extrapolation Methods for Accelerating PageRank Computations

Sepandar D. Kamvar

Taher H. Haveliwala

Christopher D. Manning

Gene H. Golub

Stanford University

Results:

1. The Official Site of the San Francisco Giants

Search: Giants

Results:

1. The Official Site of the New York Giants

Motivation

Problem: Speed up PageRank

Motivation: Personalization, “Freshness”

Note: PageRank computations don’t get faster as computers do.

Outline

Definition of PageRank

Computation of PageRank

Convergence Properties

Outline of Our Approach

Empirical Results

Link Counts

Linked by 2 Important Pages

Linked by 2 Unimportant Pages

Sep’s Home Page

Taher’s Home Page

Yahoo!, CNN, DB Pub Server, CS361

Definition of PageRank

The importance of a page is given by the importance of the pages that link to it.

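$x_i = \sum_{j \in B_i} \frac{1}{N_j} x_j$

where $x_i$ is the importance of page i, $B_i$ is the set of pages j that link to page i, $N_j$ is the number of outlinks from page j, and $x_j$ is the importance of page j.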

Definition of PageRank

[Figure: link graph with nodes Yahoo!, CNN, DB Pub Server, Taher, and Sep; labels shown include link weights 1/2, 1/2, 1, 1 and ranks of 0.1]

PageRank Diagram

Initialize all nodes to rank $x_i^{(0)} = 1/n$

PageRank Diagram

Propagate ranks across links (multiplying by link weights)

PageRank Diagram

$x_i^{(1)} = \sum_{j \in B_i} \frac{1}{N_j} x_j^{(0)}$

PageRank Diagram

$x_i^{(2)} = \sum_{j \in B_i} \frac{1}{N_j} x_j^{(1)}$

PageRank Diagram

After a while…

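Computing PageRank

Initialize:

$x_i^{(0)} = 1/n$

Repeat until convergence:

$x_i^{(k+1)} = \sum_{j \in B_i} \frac{1}{N_j} x_j^{(k)}$

where $x_i$ is the importance of page i, $B_i$ is the set of pages j that link to page i, $N_j$ is the number of outlinks from page j, and $x_j$ is the importance of page j.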
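As a rough illustration of the update described above, the following Python sketch propagates ranks over a tiny link graph until the ranks stop changing. The three-page graph, the function name `pagerank_by_propagation`, the input format, and the tolerance are assumptions made for this example and do not come from the slides. The sketch also assumes every page has at least one outlink.

```python
def pagerank_by_propagation(outlinks, tol=1e-12, max_iter=1000):
    """Iterate x_i <- sum over pages j linking to i of x_j / N_j.

    `outlinks` maps each page to the list of pages it links to
    (hypothetical input format; every page needs at least one outlink).
    """
    pages = list(outlinks)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start every page at rank 1/n
    for _ in range(max_iter):
        new_rank = {p: 0.0 for p in pages}
        for j, targets in outlinks.items():
            share = rank[j] / len(targets)  # x_j / N_j
            for i in targets:
                new_rank[i] += share  # page j passes its share to page i
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:  # stop once the ranks no longer move
            break
    return rank


# Hypothetical graph: page 0 links to 1 and 2, page 1 to 2, page 2 to 0.
example_graph = {0: [1, 2], 1: [2], 2: [0]}
ranks = pagerank_by_propagation(example_graph)
```

For this particular graph the fixed point can be worked out by hand: pages 0 and 2 each end up with rank 0.4 and page 1 with rank 0.2.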

Matrix Notation

[Figure: the update written as a matrix-vector product; entries shown: 0, .2, 0, .3, 0, 0, .1, .4, 0, .1]

Find x that satisfies:

$x = P^T x$

Power Method

Initialize:

$x^{(0)} = (1/n, \ldots, 1/n)^T$

Repeat until convergence:

$x^{(k+1)} = P^T x^{(k)}$

A side note

PageRank doesn’t actually use $P^T$. Instead, it uses $A = cP^T + (1-c)E^T$.

So the PageRank problem is really:

Find x that satisfies: $x = Ax$

rather than:

Find x that satisfies: $x = P^T x$

Power Method

And the algorithm is really . . .

Initialize:

$x^{(0)} = (1/n, \ldots, 1/n)^T$

Repeat until convergence:

$x^{(k+1)} = A x^{(k)}$
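A minimal Python sketch of this iteration follows, under assumptions the slides do not spell out: c is a damping factor between 0 and 1, every entry of E is 1/n (uniform teleportation), P is row-stochastic, and no page lacks outlinks. The function name `power_method` and the 3-page matrix are hypothetical. The matrix A is never built explicitly; each step applies $cP^T$ and the $(1-c)E^T$ term directly.

```python
def power_method(P, c=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for x = A x with A = c P^T + (1 - c) E^T.

    Assumes P is row-stochastic and E has every entry equal to 1/n,
    so (E^T x)_i = sum(x) / n for every i.
    """
    n = len(P)
    x = [1.0 / n] * n  # uniform starting vector
    for _ in range(max_iter):
        total = sum(x)
        new_x = [
            # (P^T x)_i = sum_j P[j][i] * x[j], plus the teleportation term
            c * sum(P[j][i] * x[j] for j in range(n)) + (1.0 - c) * total / n
            for i in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x


# Hypothetical row-stochastic link matrix: P[j][i] is the weight of link j -> i.
P = [
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
x = power_method(P)
```

Applying the teleportation term as a scalar added to every entry keeps each step proportional to the number of links, which matters when A is huge and P is sparse.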

Power Method

Express $x^{(0)}$ in terms of eigenvectors of A

[Figure: successive power-method iterates, including $x^{(2)}$]

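Why does it work?

Imagine our n x n matrix A has n linearly independent eigenvectors $u_i$:

$A u_i = \lambda_i u_i$

Then, you can write any n-dimensional vector as a linear combination of the eigenvectors of A (with $u_1$ scaled so that its coefficient is 1):

$x^{(0)} = u_1 + \alpha_2 u_2 + \ldots + \alpha_n u_n$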

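Why does it work?

From the last slide:

$x^{(0)} = u_1 + \alpha_2 u_2 + \ldots + \alpha_n u_n$

To get the first iterate, multiply $x^{(0)}$ by A:

$x^{(1)} = A x^{(0)} = A u_1 + \alpha_2 A u_2 + \ldots + \alpha_n A u_n$

First eigenvalue is 1:

$\lambda_1 = 1; \quad 1 > |\lambda_2| \geq \ldots$

Therefore:

$x^{(1)} = u_1 + \alpha_2 \lambda_2 u_2 + \ldots + \alpha_n \lambda_n u_n$

where $\lambda_2, \ldots, \lambda_n$ are all less than 1 in magnitude.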

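Power Method

$x^{(0)} = u_1 + \alpha_2 u_2 + \ldots + \alpha_n u_n$

$x^{(1)} = u_1 + \alpha_2 \lambda_2 u_2 + \ldots + \alpha_n \lambda_n u_n$

$x^{(2)} = u_1 + \alpha_2 \lambda_2^2 u_2 + \ldots + \alpha_n \lambda_n^2 u_n$

Convergence

$x^{(k)} = u_1 + \alpha_2 \lambda_2^k u_2 + \ldots + \alpha_n \lambda_n^k u_n$

The smaller $|\lambda_2|$, the faster the convergence of the Power Method.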
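To make the dependence on the second eigenvalue concrete, here is a small Python sketch that estimates how many iterations are needed before the slowest-decaying term shrinks below a tolerance. It ignores the coefficients of the eigenvector expansion and assumes the second eigenvalue has magnitude c, which is what the note on the second eigenvalue of the Google matrix cited later in these slides indicates for web-like graphs. The function name `iterations_needed` and the tolerance are choices made for this example.

```python
import math


def iterations_needed(lambda2, tol):
    """Smallest k with |lambda2|**k <= tol (expansion coefficients ignored)."""
    return math.ceil(math.log(tol) / math.log(abs(lambda2)))


# Assumption: |lambda_2| equals the damping factor c.
for c in (0.85, 0.99):
    print(c, iterations_needed(c, 1e-3))
```

Under these assumptions, reaching a tolerance of 1e-3 takes about 43 iterations at c = 0.85 and about 688 at c = 0.99.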

Our Approach

[Figure: components of the current iterate along eigenvectors u1 through u5]

Estimate the components of the current iterate in the directions of the second and third eigenvectors ($u_2$, $u_3$), and eliminate them.

Why this approach?

For traditional problems:

A is smaller, often dense.

$\lambda_2$ is often close to $\lambda_1$, making the power method slow.

In our problem:

A is huge and sparse.

More importantly, $\lambda_2$ is small (see note 1).

Therefore, the Power Method is actually much faster than other methods.

1. “The Second Eigenvalue of the Google Matrix,” dbpubs.stanford.edu/pub/2003-20.

Using Successive Iterates

[Figure: components of successive iterates along eigenvectors u1 through u5; after extrapolation, x’ = u1]

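How do we do this?

Assume $x^{(k)}$ can be written as a linear combination of the first three eigenvectors ($u_1$, $u_2$, $u_3$) of A.

Compute an approximation to the {$u_2$, $u_3$} components, and subtract it from $x^{(k)}$ to get $x^{(k)\prime}$.

Assume

Assume that $x^{(k)}$ can be represented by the first 3 eigenvectors of A:

$x^{(k)} = u_1 + \alpha_2 u_2 + \alpha_3 u_3$

$x^{(k+1)} = A x^{(k)} = u_1 + \alpha_2 \lambda_2 u_2 + \alpha_3 \lambda_3 u_3$

$x^{(k+2)} = u_1 + \alpha_2 \lambda_2^2 u_2 + \alpha_3 \lambda_3^2 u_3$

$x^{(k+3)} = u_1 + \alpha_2 \lambda_2^3 u_2 + \alpha_3 \lambda_3^3 u_3$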

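Linear Combination

Let’s take some linear combination of the iterates $x^{(k+1)}$, $x^{(k+2)}$, $x^{(k+3)}$:

$\beta_1 x^{(k+1)} + \beta_2 x^{(k+2)} + \beta_3 x^{(k+3)}$

$= \beta_1 (u_1 + \alpha_2 \lambda_2 u_2 + \alpha_3 \lambda_3 u_3)$

$+ \beta_2 (u_1 + \alpha_2 \lambda_2^2 u_2 + \alpha_3 \lambda_3^2 u_3)$

$+ \beta_3 (u_1 + \alpha_2 \lambda_2^3 u_2 + \alpha_3 \lambda_3^3 u_3)$

Rearranging Terms

We can rearrange the terms to get:

$\beta_1 x^{(k+1)} + \beta_2 x^{(k+2)} + \beta_3 x^{(k+3)}$

$= (\beta_1 + \beta_2 + \beta_3) u_1$

$+ \alpha_2 (\beta_1 \lambda_2 + \beta_2 \lambda_2^2 + \beta_3 \lambda_2^3) u_2$

$+ \alpha_3 (\beta_1 \lambda_3 + \beta_2 \lambda_3^2 + \beta_3 \lambda_3^3) u_3$

Goal: Find $\beta_1, \beta_2, \beta_3$ so that the coefficients of $u_2$ and $u_3$ are 0, and the coefficient of $u_1$ is 1.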
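One way to find such coefficients without knowing the eigenvalues is sketched below in Python. It is an illustration under stated assumptions, not necessarily the exact procedure of the paper. If the current iterate lies in the span of three eigenvectors with eigenvalues 1, $\lambda_2$, $\lambda_3$, then four successive iterates satisfy a cubic relation whose coefficients can be fit by least squares. Dividing that cubic by $(\lambda - 1)$ gives weights that cancel the second and third eigenvector components. The function name `quadratic_extrapolation`, the use of normal equations, and the 3 x 3 example matrix are all choices made for this example.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def quadratic_extrapolation(x0, x1, x2, x3):
    """Combine x1, x2, x3 so that the 2nd and 3rd eigenvector parts cancel.

    x0..x3 are four successive power-method iterates. Returns the
    extrapolated vector and the weights applied to x1, x2, x3.
    """
    y1 = [b - a for a, b in zip(x0, x1)]
    y2 = [b - a for a, b in zip(x0, x2)]
    y3 = [b - a for a, b in zip(x0, x3)]
    # Least-squares fit of g1*y1 + g2*y2 = -y3 via 2x2 normal equations.
    a11, a12, a22 = dot(y1, y1), dot(y1, y2), dot(y2, y2)
    b1, b2 = -dot(y1, y3), -dot(y2, y3)
    det = a11 * a22 - a12 * a12
    g1 = (b1 * a22 - a12 * b2) / det
    g2 = (a11 * b2 - a12 * b1) / det
    g3 = 1.0
    # Divide the fitted cubic by (lambda - 1) to get the quadratic weights.
    betas = [g1 + g2 + g3, g2 + g3, g3]
    total = sum(betas)
    betas = [b / total for b in betas]  # make the coefficient of u1 equal 1
    x_new = [betas[0] * p + betas[1] * q + betas[2] * r
             for p, q, r in zip(x1, x2, x3)]
    return x_new, betas


# Hypothetical column-stochastic matrix with eigenvalues 1, 0.4, and 0.3.
A = [
    [0.5, 0.2, 0.1],
    [0.3, 0.6, 0.3],
    [0.2, 0.2, 0.6],
]
x0 = [1.0, 0.0, 0.0]
x1 = matvec(A, x0)
x2 = matvec(A, x1)
x3 = matvec(A, x2)
x_star, betas = quadratic_extrapolation(x0, x1, x2, x3)
```

Because this example matrix has only three eigenvectors, the three-eigenvector assumption holds exactly and the extrapolated vector matches the dominant eigenvector (5/21, 9/21, 7/21) up to rounding. On a large matrix the assumption is only approximate, which is why further power iterations are still needed afterwards.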

Summary

We make an assumption about the current iterate.

Solve for the dominant eigenvector as a linear combination of the next three iterates.

We use a few iterations of the Power Method to “clean it up”.

Empirical Results

Results

Quadratic Extrapolation speeds up convergence. Extrapolation was only used 5 times!

Results

Extrapolation dramatically speeds up convergence for high values of c (c = .99).

Take-home message

Speeds up PageRank by a fair amount, but not by enough for true Personalized PageRank.

Ideas are useful for further speedup algorithms.

Quadratic Extrapolation can be used for a whole class of problems.

The End

Paper available at http://dbpubs.stanford.edu/pub/2003-16
