
Post on 15-Jan-2016


1

On the Eigenvalue Power Law

Milena Mihail, Georgia Tech

&

Christos Papadimitriou, U.C. Berkeley

2

Internet Measurement and Models

Network and application studies need properties and models of Internet graphs and Internet traffic.

Shift of networking paradigm: open, decentralized, dynamic.

Intense measurement efforts. Intense modeling efforts.

[Figure labels: Routers, WWW, P2P]

3

Internet & WWW Graphs

[Figure: web pages such as http://www.XXX.net, http://www.YYY.com, http://www.ZZZ.edu, and http://www.XXX.com, connected by hyperlinks.]

Routers exchanging traffic. Web pages and hyperlinks.

10K – 300K nodes. Average degree ~ 3.

4

Real Internet Graphs

CAIDA http://www.caida.org

Average Degree = Constant

A Few Degrees VERY LARGE

Degrees not sharply concentrated around their mean.

5

Degree-Frequency Power Law

[Figure: log-log plot of frequency versus degree.]

WWW measurement: Kumar et al 99

Internet measurement: Faloutsos et al 99

E[d] = const., but no sharp concentration.

6

Degree-Frequency Power Law

[Figure: log-log plot of frequency versus degree.]

E[d] = const., but no sharp concentration.

Erdős–Rényi: sharp concentration.

Models by Kumar et al 00, Bollobás et al 01, Fabrikant et al 02
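The contrast between the two kinds of degree distribution can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the graph size, the average degree of 3, and the Zipf-like sequence d_i ≈ d_max / i are values chosen here, not taken from the slides. It samples an Erdős–Rényi graph and compares its largest degree with that of a heavy-tailed degree sequence that has a similar mean.

```python
import random

def er_degrees(n, avg_degree, seed=0):
    """Sample G(n, p) with p = avg_degree / (n - 1) and return the degree list."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)
    deg = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                deg[u] += 1
                deg[v] += 1
    return deg

def zipf_degrees(n, d_max):
    """Illustrative heavy-tailed sequence: the i-th largest degree is about d_max / i."""
    return [max(1, round(d_max / i)) for i in range(1, n + 1)]

# All parameters below are assumptions made for this example.
n = 1500
er = er_degrees(n, avg_degree=3.0)
zipf = zipf_degrees(n, d_max=500)

for name, deg in (("Erdos-Renyi", er), ("Zipf-like", zipf)):
    mean = sum(deg) / len(deg)
    print(f"{name}: mean degree {mean:.2f}, max degree {max(deg)}")
```

Both sequences have a small constant mean, but only the Erdős–Rényi degrees stay close to it; the Zipf-like sequence has a few degrees that are far larger than the mean.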

7

Rank-Degree Power Law

[Figure: log-log plot of degree versus rank; points labeled UUNET, Sprint, C&W USA, AT&T, BBN.]

Internet measurement: Faloutsos et al 99

8

Eigenvalue Power Law

[Figure: log-log plot of eigenvalue versus rank.]

Internet measurement: Faloutsos et al 99

9

This Paper: Large Degrees & Eigenvalues

[Figure: log-log plot of the largest degrees and the largest eigenvalues versus rank; points labeled UUNET, Sprint, C&W USA, AT&T, BBN.]


11

Principal Eigenvector of a Star

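[Figure: a star with d leaves, showing the entries of its principal eigenvector: 1 at each leaf and sqrt(d) at the center.]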
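A quick check of the star's principal eigenpair, as a minimal sketch: the helper names `star_adjacency` and `matvec` and the choice d = 9 are introduced here for illustration. The vector with sqrt(d) at the center and 1 at each leaf satisfies A v = sqrt(d) v, and because all its entries are positive it is the principal eigenvector (Perron-Frobenius).

```python
import math

def star_adjacency(d):
    """Adjacency matrix of a star: vertex 0 is the center, vertices 1..d are leaves."""
    n = d + 1
    A = [[0] * n for _ in range(n)]
    for leaf in range(1, n):
        A[0][leaf] = A[leaf][0] = 1
    return A

def matvec(A, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

d = 9                                   # example value, chosen so sqrt(d) is exact
A = star_adjacency(d)
v = [math.sqrt(d)] + [1.0] * d          # candidate principal eigenvector
lam = math.sqrt(d)                      # candidate principal eigenvalue

Av = matvec(A, v)
residual = max(abs(y - lam * x) for y, x in zip(Av, v))
print(lam, residual)
```

The residual is zero up to floating-point error, confirming that a star whose center has degree d has principal eigenvalue sqrt(d).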

12

Large Degrees


13

Large Eigenvalues


14

Main Result of the Paper

The largest eigenvalues of the adjacency matrix of a graph whose large degrees are power law distributed (Zipf) are also power law distributed.

Explains Internet measurements.

Negative implications for the spectral filtering method in information retrieval.
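In symbols, the main result can be sketched as follows. The exponent name \alpha is introduced here for illustration, and the precise conditions and error terms belong to the paper's theorem, which this transcript does not reproduce. The relation \lambda_i \approx \sqrt{d_i} is the one suggested by the star example, where a vertex of degree d contributes a principal eigenvalue \sqrt{d}.

```latex
d_i \propto i^{-\alpha}
\quad\Longrightarrow\quad
\lambda_i \approx \sqrt{d_i} \propto i^{-\alpha/2}
\qquad \text{for the first few ranks } i .
```

On a log-log plot against rank, the eigenvalue line would then have half the slope of the degree line.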

15

Random Graph Model

let

Connectivity analyzed by Chung & Lu ‘01


18

Theorem:

For large enough

with probability at least

19

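Proof: Step 1: Decomposition

[Figure: the graph decomposed into Vertex Disjoint Stars, LR-extra, RR, and LL, with LR-extra being what remains of LR once the Vertex Disjoint Stars are removed.]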

20

Proof: Step 2: Vertex Disjoint Stars

The degree of each of the Vertex Disjoint Stars is sharply concentrated around its mean d_i.

Hence its principal eigenvalue is sharply concentrated around sqrt(d_i).
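The role of the vertex-disjoint stars can be checked numerically. This is a sketch under assumptions: the function `disjoint_stars` and the degree values d_i = 60 / i are made up for the example, and the stars here are deterministic rather than random. The adjacency matrix of a union of disjoint stars is block diagonal, so its largest eigenvalues are exactly sqrt(d_i), and a Zipf-like degree sequence yields a power law for the eigenvalues with half the slope.

```python
import numpy as np

def disjoint_stars(degrees):
    """Adjacency matrix of vertex-disjoint stars, one star per entry of `degrees`."""
    n = sum(d + 1 for d in degrees)
    A = np.zeros((n, n))
    start = 0
    for d in degrees:
        center = start
        leaves = np.arange(start + 1, start + 1 + d)
        A[center, leaves] = 1.0
        A[leaves, center] = 1.0
        start += d + 1
    return A

# Illustrative Zipf-like degrees d_i = 60 / i (values chosen for the example).
degrees = [60 // i for i in range(1, 7)]        # [60, 30, 20, 15, 12, 10]
A = disjoint_stars(degrees)

eigvals = np.linalg.eigvalsh(A)                 # returned in ascending order
top = eigvals[::-1][:len(degrees)]              # largest eigenvalues first
expected = np.sqrt(np.array(degrees, dtype=float))

# Fit straight lines on log-log axes: log(value) versus log(rank).
ranks = np.arange(1, len(degrees) + 1)
deg_slope = np.polyfit(np.log(ranks), np.log(degrees), 1)[0]
eig_slope = np.polyfit(np.log(ranks), np.log(top), 1)[0]
print(top, expected, deg_slope, eig_slope)
```

In the random model the star degrees are only concentrated around d_i rather than equal to it, so the match would be approximate instead of exact.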

21

Proof: Step 3: LL, RR, LR-extra

LR-extra has max degree

LL has

edges

RR has max degree


23

Proof: Step 4: Matrix Perturbation Theory

Vertex Disjoint Stars have principal eigenvalues

All other parts have max eigenvalue

QED
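The slides do not name the perturbation result used in this step. One standard tool that fits it is Weyl's inequality for symmetric matrices, sketched here with A standing for the adjacency matrix of the Vertex Disjoint Stars and E for the sum of the remaining parts (both symbols are introduced for illustration and are not the authors' notation):

```latex
\left| \lambda_i(A + E) - \lambda_i(A) \right| \;\le\; \| E \|_2
\qquad \text{for every } i ,
```

where the eigenvalues of both matrices are sorted in the same order and \| E \|_2 is the spectral norm of E. Under this reading, if every remaining part has a small maximum eigenvalue in absolute value, the largest eigenvalues of the whole graph stay close to the principal eigenvalues of the stars.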


25

Implication for Info Retrieval

Term-Norm Distribution Problem : Spectral filtering, without preprocessing, reveals only the large degrees.

Local information.

No “latent semantics”.

26

Implication for Information Retrieval

Term-Norm Distribution Problem: Application specific preprocessing (normalization of degrees) reveals clusters:

WWW: related to searching, Kleinberg 97

IR, collaborative filtering, …

Internet: related to congestion, Gkantsidis et al 02

Open: Formalize “preprocessing”.
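The slides leave the exact form of the preprocessing open. As one possible reading of "normalization of degrees", the sketch below applies the common symmetric normalization D^{-1/2} A D^{-1/2}. This choice, the function name `normalized_adjacency`, and the small example graph are assumptions made here, not something the slides specify.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D^{-1/2} A D^{-1/2}, where D is the diagonal matrix of degrees.

    Assumes every vertex has degree at least 1.
    """
    deg = A.sum(axis=1)
    scale = 1.0 / np.sqrt(deg)
    return A * scale[:, None] * scale[None, :]

# Hypothetical example graph: a hub with 100 leaves, plus a triangle through the hub.
n = 103
A = np.zeros((n, n))
A[0, 1:101] = A[1:101, 0] = 1.0                 # hub 0 with leaves 1..100
for u, v in [(0, 101), (101, 102), (102, 0)]:   # triangle 0-101-102
    A[u, v] = A[v, u] = 1.0

raw_top = np.linalg.eigvalsh(A)[-1]                         # dominated by the hub degree
norm_top = np.linalg.eigvalsh(normalized_adjacency(A))[-1]  # equals 1 for a connected graph
print(raw_top, norm_top)
```

Without normalization the top eigenvalue is at least the square root of the hub's degree, so it mostly reflects that one large degree. After normalization the top eigenvalue is 1 regardless of the degrees, which removes the hub's dominance from the spectrum.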
