A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan), Mugizi Rwebangira
TRANSCRIPT
![Page 1: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/1.jpg)
1
A Random-Surfer Web-Graph Model
(Joint work with Avrim Blum & Hubert Chan)
Mugizi Rwebangira
![Page 2: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/2.jpg)
2
The Web as a Graph
Consider the World Wide Web as a graph, with web pages as nodes and hyperlinks between pages as edges.
[Figure: example pages (links.html, resume.html, index.html, http://cnn.com) drawn as nodes.]
![Page 3: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/3.jpg)
3
Studying the Web
Since the Web emerged there has been a lot of interest in:
1. Empirically studying properties of the Web Graph.
2. Modeling the Web Graph mathematically.
Benefits of Generative Models:
1. Simulation – When real data is scarce
2. Extrapolation – How will the graph change?
3. Understanding – Inspire further research on real data
![Page 4: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/4.jpg)
4
Power Law
The distribution of a random variable X follows a power law if $\Pr[X = k] \sim C k^{-\alpha}$.
Here $f(x) \sim g(x)$ means $\lim_{x \to \infty} f(x)/g(x) = 1$,
e.g. $(x+1) \sim (x+2)$.
Example: $\Pr[X = k] = k^{-2}$
![Page 5: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/5.jpg)
5
Power Law: $\Pr[X = k] = k^{-2}$
![Page 6: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/6.jpg)
6
Power Law
$\Pr[X = k] \sim C k^{-\alpha}$
$\log \Pr[X = k] \sim \log C - \alpha \log k$
$\Pr[X = k] = k^{-2}$
$\log \Pr[X = k] = -2 \log k$
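The log-log relationship can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper names `loglog_slope` and `example_pmf` are made up for illustration. It measures the slope of log Prob[X=k] against log k for the slide's example, which should equal -α = -2.

```python
import math

def loglog_slope(pmf, k1, k2):
    """Slope of log pmf(k) against log k between two points k1 < k2."""
    rise = math.log(pmf(k2)) - math.log(pmf(k1))
    run = math.log(k2) - math.log(k1)
    return rise / run

def example_pmf(k):
    # The slide's example: Prob[X = k] = k^(-2)
    return k ** -2.0

slope = loglog_slope(example_pmf, 10, 1000)
print(slope)  # close to -2, up to floating-point rounding
```

On a log-log plot a power law therefore appears as a straight line whose slope is the negated exponent.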
![Page 7: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/7.jpg)
7
Power Law: Log-Log plot
![Page 8: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/8.jpg)
8
Power Law contd.
More general definition:
$\Pr[X \ge k] \sim C k^{-\alpha}$
Particularly useful if X takes on real values.
Sometimes referred to as “heavy tailed” or “scale free.”
![Page 9: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/9.jpg)
9
Power Laws in Degree distribution
Let G be a graph.
Let $X_k$ be the proportion of nodes with degree k in G.
Then if $X_k \sim C k^{-\alpha}$
we say that G has a power-law degree distribution.
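As an illustration of the quantity being defined, here is a small sketch that computes the proportion of nodes with each degree from a list of node degrees. The function name `degree_proportions` and the sample degree list are hypothetical, chosen only for this example.

```python
from collections import Counter

def degree_proportions(degrees):
    """Map each degree k to the fraction of nodes that have degree k."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in sorted(counts.items())}

# Hypothetical degree list for a 6-node graph
props = degree_proportions([1, 1, 1, 2, 2, 4])
for k, x_k in props.items():
    print(k, x_k)
```

Plotting these proportions against k on log-log axes is the usual first check for a power-law degree distribution.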
![Page 10: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/10.jpg)
10
Properties of the Web Graph
A power-law degree distribution has been observed in a wide variety of graphs, including citation networks, social networks, and protein-protein interaction networks.
It has also been observed in the Web Graph. [Barabási & Albert]
![Page 11: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/11.jpg)
11
Outline
• Background/Previous Work
• Motivation
• Models
• Theoretical results
• Experimental results
• Conclusions
![Page 12: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/12.jpg)
12
Classic Random Graph Models
• In the G(n,p) random graph model:
1. There are n nodes.
2. There is an edge between any two nodes with probability p.
• It was proposed by Erdős and Rényi in the 1960s.
![Page 13: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/13.jpg)
13
Online G(n,p)
In this model each new node makes k connections to existing nodes uniformly at random.
For this talk we will focus on k = 1,
hence the graph will be a tree.
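The k = 1 process can be simulated in a few lines. This is a sketch under stated assumptions rather than the authors' code: nodes are numbered from 0, and the names `online_uniform_tree` and `in_degrees` are invented for illustration.

```python
import random

def online_uniform_tree(n, seed=0):
    """Grow a tree: node t (t >= 1) links to a uniformly random earlier node.

    Returns parent, where parent[t] is the node that t connected to.
    Node 0 is the initial node and has no parent (None).
    """
    rng = random.Random(seed)
    parent = [None]
    for t in range(1, n):
        parent.append(rng.randrange(t))  # uniform over nodes 0..t-1
    return parent

def in_degrees(parent):
    """Count how many later nodes attached to each node."""
    deg = [0] * len(parent)
    for p in parent[1:]:
        deg[p] += 1
    return deg

parent = online_uniform_tree(1000)
print(max(in_degrees(parent)))
```

Because every new node adds exactly one edge to an earlier node, the result is always a tree on n nodes with n - 1 edges.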
![Page 14: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/14.jpg)
14
Online G(n,p)
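[Figure: the tree at T=1, T=2, T=3 and T=4. At T=3 the new node attaches to each of the two existing nodes with probability ½; at T=4 it attaches to each of the three existing nodes with probability ⅓.]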
![Page 15: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/15.jpg)
15
Properties of Online G(n,p)
• $X_k$ = proportion of nodes with degree k
$E[X_k] = \Theta(1/2^k)$
• E[degree of first node] = $1 + 1/2 + 1/3 + 1/4 + \dots + 1/n = \Theta(\log n)$
• E[max degree] = $\Theta(\log n)$
NOT POWER LAWED!!
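The logarithmic growth of the first node's expected degree comes from the harmonic sum 1 + 1/2 + ... + 1/n. The sketch below (the helper name `harmonic` is an illustrative choice) compares that sum with ln n for the n = 100,000 used in the experiments.

```python
import math

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))

n = 100_000
h = harmonic(n)
# H_n - ln(n) approaches the Euler-Mascheroni constant (about 0.5772)
gap = h - math.log(n)
print(h, math.log(n), gap)
```

Even at n = 100,000 the sum is only about 12, which is why no node in this model acquires the very large degrees that a power law would predict.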
![Page 16: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/16.jpg)
16
Online G(n,p) (n=100,000, average of 100 runs)
![Page 17: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/17.jpg)
17
Preferential Attachment
In the Preferential Attachment model, each new node connects to the existing nodes with a probability proportional to their degree.
[Barabási & Albert]
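A common way to sample "proportional to degree" is to keep a list in which every node appears once per unit of degree and pick a uniform entry. The sketch below uses that trick; it assumes one edge per new node and a starting node with a self-loop (so its degree is 2, counting in-degree plus out-degree). The function name `preferential_attachment_tree` is hypothetical.

```python
import random

def preferential_attachment_tree(n, seed=0):
    """Return the list of node degrees after n nodes have arrived."""
    rng = random.Random(seed)
    endpoints = [0, 0]   # the self-loop on node 0 contributes degree 2
    degree = [2]
    for t in range(1, n):
        target = rng.choice(endpoints)  # probability proportional to degree
        endpoints.extend((target, t))   # the new edge adds 1 to each endpoint
        degree[target] += 1
        degree.append(1)
    return degree

deg = preferential_attachment_tree(1000)
print(max(deg))
```

The endpoint list grows by two entries per step, so each draw costs constant time instead of a scan over all degrees.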
![Page 18: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/18.jpg)
18
Preferential Attachment
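Degree = in-degree + out-degree
[Figure: the graph at T=1 through T=4. In the T=3 panel the existing nodes have degrees 3 and 1, and the new node attaches to them with probabilities ¾ and ¼. In the T=4 panel the existing nodes have degrees 4, 1 and 1, and the attachment probabilities are 2/3, 1/6 and 1/6.]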
![Page 19: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/19.jpg)
19
Preferential Attachment
Preferential Attachment gives a power-law degree distribution. [Mitzenmacher, Cooper & Frieze 03, KRRSTU00]
E[degree of 1st node] = √n
![Page 20: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/20.jpg)
20
Preferential Attachment
![Page 21: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/21.jpg)
21
Other Models
Kumar et al. proposed the “copying model.” [KRRSTU00]
Leskovec et al. proposed a “forest fire” model, which has some similarities to this work. [LKF05]
![Page 22: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/22.jpg)
22
Outline
• Background/Previous Work
• Motivation
• Models
• Theoretical results
• Experimental results
• Conclusions
![Page 23: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/23.jpg)
23
Motivating Questions
Why would a new node connect to nodes of high degree?
- Are high degree nodes more attractive?
- Or are there other explanations?
How does a new node find out what the high degree nodes are?
![Page 24: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/24.jpg)
24
Motivating Questions
Motivating Observation:
• Suppose each page has a small probability p of being interesting.
• Suppose a user does an (undirected) random walk until they find an interesting page.
• If p is small then this is the same as preferential attachment.
• What about other processes and directed graphs?
![Page 25: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/25.jpg)
25
Outline
• Background/Previous Work
• Motivation
• Models
• Theoretical results
• Experimental results
• Conclusions
![Page 26: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/26.jpg)
26
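Directed 1-step Random Surfer, p=.5
T=1: Start with a single node with a self-loop.
For each new node:
1. Choose a node uniformly at random.
2. With probability p connect to it.
3. With probability (1-p) connect to its neighbor.
[Figure: the graph at T=1, T=2 and T=3. At T=3 the new node connects to the first node with probability (½)(½) + (½)(½) + (½)(½) = ¾ and to the second node with probability ¼.]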
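The ¾ and ¼ for T=3 can be reproduced exactly. The following sketch is illustrative only: it assumes every node has a single out-edge (the first node's out-edge is its self-loop), and the name `one_step_attachment_probs` is made up for this example.

```python
from fractions import Fraction

def one_step_attachment_probs(out_neighbor, p):
    """Exact probability that the next node links to each existing node.

    out_neighbor[w] is the single node that w points to
    (the first node points to itself via its self-loop).
    """
    t = len(out_neighbor)
    probs = [Fraction(0)] * t
    for w in range(t):
        probs[w] += Fraction(1, t) * p                      # link to the chosen node
        probs[out_neighbor[w]] += Fraction(1, t) * (1 - p)  # link to its neighbor
    return probs

# State at T=2: node 0 has a self-loop, node 1 points to node 0.
probs = one_step_attachment_probs([0, 0], Fraction(1, 2))
print(probs)  # [Fraction(3, 4), Fraction(1, 4)]
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with the hand calculation.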
![Page 27: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/27.jpg)
27
Directed 1-step Random Surfer
It turns out this model is a mixture of connecting to nodes uniformly at random and preferential attachment.
It has a power-law degree distribution.
But taking one step is not very natural.
What about doing a real random walk?
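The mixture claim for the 1-step model can be made explicit with a short calculation. This is a sketch under the assumption that every node has out-degree 1, so that "its neighbor" is unique; the symbols $w$ (the uniformly chosen node) and $\mathrm{indeg}_t(u)$ (the in-degree of u when t nodes exist, with the self-loop counted) are introduced here for illustration.

```latex
\begin{align*}
\Pr[\text{new node links to } u]
  &= \underbrace{p \cdot \Pr[w = u]}_{\text{connect to the chosen node}}
   + \underbrace{(1-p) \cdot \Pr[w \text{ points to } u]}_{\text{connect to its neighbor}} \\
  &= p \cdot \frac{1}{t} + (1-p) \cdot \frac{\mathrm{indeg}_t(u)}{t}.
\end{align*}
```

The first term is uniform attachment. Since the in-degrees sum to t, the second term is attachment with probability proportional to in-degree, i.e. a preferential-attachment term.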
![Page 28: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/28.jpg)
28
Directed Coin Flipping model
1. Pick a node uniformly at random.
2. Flip a coin of bias p. If HEADS connect to current node, else walk to neighbor.
[Figure: a NEW NODE arrives and a RANDOM STARTING NODE is chosen in a graph with nodes A, B, C and D. 1. COIN TOSS: TAIL (at node A). 2. COIN TOSS: TAIL (at node B). 3. COIN TOSS: HEAD (at node C).]
![Page 29: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/29.jpg)
29
Directed Coin Flipping model
1. At time 1, we start with a single node with a self-loop.
2. At time t, we choose a node u uniformly at random.
3. We then flip a coin of bias p.
4. If the coin comes up heads, we connect to the current node.
5. Else we walk to a random neighbor and go to step 3.
“each page has equal probability p of being interesting to us”
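The five steps above can be sketched as a simulation. This is an illustrative implementation, not the authors' code: it assumes every node has out-degree 1 (so the "random neighbor" in step 5 is the unique out-neighbor), numbers nodes from 0, requires p > 0 so the walk terminates, and uses the hypothetical name `directed_coin_flipping`.

```python
import random

def directed_coin_flipping(n, p, seed=0):
    """Return (out_neighbor, in_degree) after n nodes have arrived.

    Assumes 0 < p <= 1; with p = 0 the walk would never stop.
    """
    rng = random.Random(seed)
    out_neighbor = [0]   # step 1: node 0 starts with a self-loop
    in_degree = [1]      # the self-loop counts toward node 0's in-degree
    for t in range(1, n):
        current = rng.randrange(t)           # step 2: uniform starting node
        while rng.random() >= p:             # steps 3 and 5: tails, keep walking
            current = out_neighbor[current]  # follow the single out-edge
        out_neighbor.append(current)         # step 4: heads, connect here
        in_degree[current] += 1
        in_degree.append(0)
    return out_neighbor, in_degree

out, indeg = directed_coin_flipping(2000, 0.5)
print(max(indeg))
```

Each walk stops after a geometrically distributed number of steps with mean (1-p)/p, and a walk that reaches node 0 simply stays there on the self-loop until the coin comes up heads.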
![Page 30: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/30.jpg)
30
Outline
• Background/Previous Work
• Motivation
• Models
• Theoretical results
• Experimental results
• Conclusions
![Page 31: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/31.jpg)
31
Is Directed Coin-Flipping Power-lawed?
We don’t know … but we do have some partial results ...
![Page 32: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/32.jpg)
32
Virtual Degree
Definitions:
Let $l_i(u)$ be the number of level-i descendants of node u.
$l_1(u)$ = # of children
$l_2(u)$ = # of grandchildren, etc.
Let $\beta = (\beta_1, \beta_2, \dots)$ be a sequence of real numbers with $\beta_1 = 1$.
Then $v(u) = 1 + \beta_1 l_1(u) + \beta_2 l_2(u) + \beta_3 l_3(u) + \dots$
We’ll call v(u) the “Virtual degree of u with respect to $\beta$.”
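To make the definition concrete, here is a sketch that computes the level counts and the virtual degree from a parent array. The representation (a `parent` list with `None` for the root), the function names, and the sample tree are all assumptions made for illustration.

```python
def level_counts(parent, u):
    """Return [l_1(u), l_2(u), ...] for the tree given by parent[]."""
    children = {}
    for node, par in enumerate(parent):
        if par is not None and par != node:  # skip the root and any root self-loop
            children.setdefault(par, []).append(node)
    counts, frontier = [], children.get(u, [])
    while frontier:
        counts.append(len(frontier))
        frontier = [c for node in frontier for c in children.get(node, [])]
    return counts

def virtual_degree(parent, u, beta):
    """v(u) = 1 + sum_i beta(i) * l_i(u), where beta(i) gives the weight for level i >= 1."""
    levels = level_counts(parent, u)
    return 1 + sum(beta(i) * l for i, l in enumerate(levels, start=1))

# Hypothetical tree: node 0 has 2 children (1, 2) and 4 grandchildren (3, 4, 5, 6).
parent = [None, 0, 0, 1, 1, 2, 2]
levels = level_counts(parent, 0)
v = virtual_degree(parent, 0, lambda i: 0.5 ** i)
print(levels, v)  # [2, 4] 3.0
```

Passing the weights as a function rather than a fixed list avoids having to know the depth of the tree in advance.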
![Page 33: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/33.jpg)
33
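Virtual Degree
[Figure: a tree rooted at u in which u has 2 children and 4 grandchildren.]
$v(u) = 1 + \beta_1 (2) + \beta_2 (4) + \beta_3 (0) + \beta_4 (0) + \dots$
Here 2 is the # of children and 4 is the # of grandchildren.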
![Page 34: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/34.jpg)
34
Virtual Degree
Easy observation: If we set $\beta_i = (1-p)^i$ then the expected increase in deg(u) is proportional to v(u).
Expected increase in deg(u) $= p/t + (1-p)\,p\,l_1(u)/t + (1-p)^2\,p\,l_2(u)/t + \dots = (p/t)\,v(u)$
![Page 35: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/35.jpg)
35
Virtual Degree
Theorem: There always exist $\beta_i$ such that
1. For i ≥ 1, $|\beta_i| \le 1$.
2. As i → ∞, $\beta_i \to 0$ exponentially.
3. The expected increase in v(u) is proportional to v(u).
Recurrence: $\beta_1 = 1$, $\beta_2 = p$, $\beta_{i+1} = \beta_i - (1-p)\,\beta_{i-1}$
E.g., for p=¾, $\beta_i$ = 1, 3/4, 1/2, 5/16, 3/16, 7/64, ...
for p=½, $\beta_i$ = 1, 1/2, 0, -1/4, -1/4, -1/8, 0, 1/16, …
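The listed values can be regenerated from the recurrence β₁ = 1, β₂ = p, βᵢ₊₁ = βᵢ − (1−p)βᵢ₋₁. A minimal sketch using exact rational arithmetic follows; the function name `beta_sequence` is an illustrative choice.

```python
from fractions import Fraction

def beta_sequence(p, count):
    """First `count` terms beta_1, beta_2, ... of the recurrence."""
    p = Fraction(p)
    betas = [Fraction(1), p]
    while len(betas) < count:
        betas.append(betas[-1] - (1 - p) * betas[-2])
    return betas[:count]

half = beta_sequence(Fraction(1, 2), 8)
three_quarters = beta_sequence(Fraction(3, 4), 6)
print([str(b) for b in half])            # 1, 1/2, 0, -1/4, -1/4, -1/8, 0, 1/16
print([str(b) for b in three_quarters])  # 1, 3/4, 1/2, 5/16, 3/16, 7/64
```

Both runs match the sequences quoted on the slide, including the sign changes that appear for p = ½.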
![Page 36: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/36.jpg)
36
Virtual Degree, continued
Let $v_t(u)$ be the virtual degree of node u at time t and $t_u$ be the time when node u first appears.
Theorem: For any node u and time $t \ge t_u$, $E[v_t(u)] = \Theta((t/t_u)^p)$
So, the expected virtual degrees follow a power law.
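A rough sketch of where the exponent p can come from, under the assumption (not stated explicitly on the slides) that the expected increase in the virtual degree at time s is exactly $(p/s)\,v_s(u)$ and that $v_{t_u}(u) = 1$ when u first appears:

```latex
\begin{align*}
E[v_{s+1}(u)] &= \Bigl(1 + \frac{p}{s}\Bigr)\, E[v_s(u)], \\
E[v_t(u)] &= \prod_{s=t_u}^{t-1} \Bigl(1 + \frac{p}{s}\Bigr)
           = \exp\Bigl(\sum_{s=t_u}^{t-1} \ln\Bigl(1 + \frac{p}{s}\Bigr)\Bigr) \\
          &= \exp\bigl(p \ln(t/t_u) + O(1)\bigr)
           = \Theta\bigl((t/t_u)^p\bigr).
\end{align*}
```

The third line uses $\ln(1+x) = x - O(x^2)$ together with $\sum_{s=t_u}^{t-1} 1/s = \ln(t/t_u) + O(1)$.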
![Page 37: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/37.jpg)
37
Actual Degree
We can also obtain lower bounds on the expected values of the actual degrees:
Theorem: For any node u and time $t \ge t_u$, $E[\text{degree}(u)] \ge \Omega((t/t_u)^{p(1-p)})$
![Page 38: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/38.jpg)
38
Outline
• Background/Previous Work
• Motivation
• Models
• Theoretical results
• Experimental results
• Conclusions
![Page 39: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/39.jpg)
39
Experiments
• Random graphs of n=100,000 nodes
• Compute statistics averaged over 100 runs.
• k=1 (Every node has out-degree 1)
![Page 40: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/40.jpg)
40
Online Erdős-Rényi
![Page 41: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/41.jpg)
41
Directed 1-Step Random Surfer, p=3/4
![Page 42: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/42.jpg)
42
Directed 1-Step Random Surfer, p=1/2
![Page 43: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/43.jpg)
43
Directed 1-Step Random Surfer, p=1/4
![Page 44: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/44.jpg)
44
Directed Coin Flipping, p=1/2
![Page 45: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/45.jpg)
45
Directed Coin Flipping, p=1/4
![Page 46: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/46.jpg)
46
Undirected Coin Flipping, p=1/2
![Page 47: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/47.jpg)
47
Undirected Coin Flipping, p=0.05
![Page 48: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/48.jpg)
48
Outline
• Background/Previous Work
• Motivation
• Models
• Theoretical results
• Experimental results
• Conclusions
![Page 49: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/49.jpg)
49
Conclusions
Directed random-walk models appear to generate power laws in experiments, and we have partial theoretical results to support this.
Power laws can naturally emerge, even if all nodes have the same intrinsic “attractiveness”.
![Page 50: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/50.jpg)
50
Open questions
•Can we prove that the degrees in the directed coin-flipping model do indeed follow a power law?
•Analyze degree distribution for the undirected coin-flipping model with p=1/2?
•Suppose page i has “interestingness” $p_i$. Can we analyze the degree as a function of t, i and $p_i$?
![Page 51: 1 A Random-Surfer Web-Graph Model (Joint work with Avrim Blum & Hubert Chan) Mugizi Rwebangira](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649d1a5503460f949ef8a1/html5/thumbnails/51.jpg)
51
Questions?