UNIVERSITY OF JYVÄSKYLÄ
Optimal Resource Discovery Paths of Gnutella2
The IEEE 22nd International Conference on Advanced Information Networking and Applications (AINA 2008), 27.3.2008
Mikko Vapa, research student
P2P Computing Group
Department of Mathematical Information Technology
www.mit.jyu.fi/cheesefactory
Resource Discovery Problem
• In the peer-to-peer (P2P) resource discovery problem, any node in the network can possess resources and can also query these resources from other nodes
[Figure: a network of four nodes, Node 1 to Node 4. Node 1 asks: "Where is ?"]
A Simple Solution for the Problem
• The most studied P2P network, Gnutella, for example, used the Breadth-First Search (BFS) flooding algorithm, which sends the query to all neighbors
• Result: all resources in the network can be found, but the network gets congested and many of the packets are useless
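[Figure: BFS flooding in the four-node network. Node 1 asks "Where is ?" and Query packets are sent over the links between Node 1, Node 2, Node 3 and Node 4. Reply packets carry the answers "Node 4: I have it!" and "Node 2: I have it! Node 4: Node 4 has it too!"]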
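The cost of flooding can be illustrated with a small simulation. This is a minimal sketch, not Gnutella's actual protocol code: the function name `bfs_flood`, the TTL handling and the link layout of the four nodes are assumptions made for illustration (the slide does not list the links). Each node forwards the query to every neighbor except the one it came from, and duplicate copies are still counted as packets.

```python
from collections import deque

def bfs_flood(adjacency, source, holders, ttl):
    """Flood a query from `source`; return (query_packets, holders_found)."""
    seen = {source}
    frontier = deque([(source, None, ttl)])  # (node, sender, hops left)
    packets = 0
    found = set()
    while frontier:
        node, sender, hops_left = frontier.popleft()
        if node != source and node in holders:
            found.add(node)
        if hops_left == 0:
            continue  # TTL exhausted: do not forward further
        for neighbor in adjacency[node]:
            if neighbor == sender:
                continue  # never send the query back where it came from
            packets += 1  # duplicates cost a packet too
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, node, hops_left - 1))
    return packets, found

# Hypothetical link layout for Node 1..4 (assumed, not taken from the slide).
adjacency = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 4], 4: [2, 3]}
packets, found = bfs_flood(adjacency, source=1, holders={2, 4}, ttl=2)
print(packets, sorted(found))
```

On this assumed topology the flood reaches both holders but spends several packets on links that lead to no new node, which is the congestion problem described above.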
Steiner Minimum Tree Problem
• Optimal paths for resource discovery can be found with a non-distributed algorithm, which requires global knowledge of the topology and the resources
• More precisely, the problem can be formulated as the task of finding a Steiner Minimum Tree (SMT) in a graph:
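One common textbook way to state the Steiner Minimum Tree problem is sketched below. The notation is an assumption chosen to match the symbols V, R, T and w(T) used in the example on the next slide; the edge-weight function w and the set notation V(T), E(T) are introduced here only for illustration.

```latex
\text{Given } G = (V, E),\quad w : E \to \mathbb{R}_{>0},\quad R \subseteq V:
\qquad
\min_{T}\; w(T) = \sum_{e \in E(T)} w(e)
\quad \text{subject to } T \text{ being a tree in } G \text{ with } R \subseteq V(T).
```

Here R is the set of terminals (the querying node and the nodes holding the resource), and vertices of V outside R may be used as relay nodes when that makes the tree cheaper.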
Steiner Minimum Tree Problem
• V = {Node 1, Node 2, Node 3, Node 4}
• R = {Node 1, Node 2, Node 4}
• min T = ({Node 1, Node 2, Node 4}, {1-2, 2-4})
• min w(T) = 2
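[Figure: the same four-node network using only the Steiner minimum tree. Node 1 asks "Where is ?"; two Query packets and two Reply packets are needed. The replies are "Node 4: I have it!" and "Node 2: I have it! Node 4: Node 4 has it too!"]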
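For a graph this small, the Steiner minimum tree can be checked by exhaustive search. The sketch below is only practical for tiny inputs (it tries every subset of edges) and is not the algorithm used in the study. The edge list is a hypothetical topology with unit weights, and the names `connects` and `steiner_minimum_tree` are made up for this example.

```python
from itertools import combinations

def connects(edge_subset, terminals):
    """True if all terminals lie in one component of the chosen edges."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, _ in edge_subset:
        parent[find(u)] = find(v)
    return len({find(t) for t in terminals}) == 1

def steiner_minimum_tree(edges, terminals):
    """Brute force: cheapest edge subset that connects all terminals."""
    best = None
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            if connects(subset, terminals):
                weight = sum(w for _, _, w in subset)
                if best is None or weight < best[0]:
                    best = (weight, list(subset))
    return best

# Hypothetical topology (assumed): unit-weight links between the four nodes.
edges = [(1, 2, 1), (1, 3, 1), (2, 3, 1), (2, 4, 1), (3, 4, 1)]
weight, tree = steiner_minimum_tree(edges, terminals={1, 2, 4})
print(weight, tree)
```

With positive edge weights the cheapest connecting subset never contains a cycle, so the result is a tree. On the assumed topology it agrees with the values stated above: edges 1-2 and 2-4, with w(T) = 2.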
Rooted k-Steiner Minimum Tree Problem
• The SMT locates all resources in the network, but if only k instances of the matching resource need to be found, the problem becomes the k-Steiner Minimum Tree problem
• In addition, the problem is rooted: the root defines which node starts the query
MST k-Steiner Minimum Tree Algorithm
• The MST k-Steiner Minimum Tree algorithm was developed to find an approximate solution:
Algorithm: MST k-Steiner Minimum Tree
Input: A connected graph G = (V, E), a terminal set R ⊆ V, a root vertex r ∈ R and an integer k with 2 ≤ k ≤ |R|
Output: A Steiner tree T for R in G, rooted at the vertex r and containing k terminal vertices.
(1) Add one node to the graph G and connect it to all terminal nodes contained in R with an edge having cost 0. The result is denoted as graph G_V.
(2) Replace G_V with the minimum spanning tree of G_V.
(3) Compute the shortest path between two terminal nodes by iterating over all edges of E in G and constructing the corresponding triplets. Transform the resulting triplets into graph G_R.
(4) Compute a k-minimum spanning tree approximation T_R from G_R, rooted at the vertex r and containing k vertices of R.
(5) Transform T_R into a subtree T of G by replacing each edge of T_R by the corresponding shortest path.
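The overall shape of the steps above (shortest paths between terminals, a k-terminal tree on top of them, then expansion back to paths in G) can be sketched with a much simpler heuristic. This is explicitly not the MST k-Steiner algorithm from the slide: steps (1) to (3) are swapped for one Dijkstra run per terminal, and step (4) is swapped for a Prim-style greedy choice. The function names and the example graph are hypothetical, and the sketch does not prune cycles that could appear when shortest paths overlap.

```python
import heapq

def dijkstra(graph, source):
    """Distances and predecessors from source; graph is {node: {nbr: cost}}."""
    dist, prev, heap = {source: 0}, {}, [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def path_edges(prev, source, target):
    """Edges of the shortest path source -> target, as unordered pairs."""
    edges, node = [], target
    while node != source:
        edges.append(frozenset((prev[node], node)))
        node = prev[node]
    return edges

def rooted_k_steiner_sketch(graph, terminals, root, k):
    """Greedy stand-in: grow a tree from root until it holds k terminals."""
    assert root in terminals and 2 <= k <= len(terminals)
    search = {t: dijkstra(graph, t) for t in terminals}
    in_tree, tree_edges = {root}, set()
    while len(in_tree) < k:
        # Cheapest terminal-to-terminal shortest path leaving the tree.
        _, src, dst = min(
            ((search[s][0][t], s, t)
             for s in in_tree for t in terminals
             if t not in in_tree and t in search[s][0]),
            key=lambda candidate: candidate[0],
        )
        in_tree.add(dst)
        tree_edges.update(path_edges(search[src][1], src, dst))
    weight = sum(graph[u][v] for u, v in map(tuple, tree_edges))
    return in_tree, tree_edges, weight

# Hypothetical graph: r* are terminals, m1 is a relay node without resources.
graph = {
    "r1": {"m1": 1, "r2": 5},
    "m1": {"r1": 1, "r3": 1},
    "r2": {"r1": 5, "r3": 2},
    "r3": {"m1": 1, "r2": 2},
}
found2, edges2, weight2 = rooted_k_steiner_sketch(graph, {"r1", "r2", "r3"}, "r1", 2)
found3, edges3, weight3 = rooted_k_steiner_sketch(graph, {"r1", "r2", "r3"}, "r1", 3)
print(weight2, weight3)
```

In the example, the relay node m1 ends up in the tree even though it holds no resource, which is what distinguishes a Steiner tree from a spanning tree over the terminals alone.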
MST k-Steiner Minimum Tree Algorithm
Time complexity: O(E log E), where E = number of edges in a graph G
Worst-case approximation ratio: 2 R, where R = available resources
[Figure: worked example of the algorithm on a weighted graph with nodes labeled r1 to r5 and m1 to m3, shown in six panels: Graph G; Graph G_V after step (1); Graph G_V after step (2); Graph G_R after step (3); Tree T_R after step (4); Tree T after step (5).]
Simulation Scenarios
Scenario             PL10000     N10000    Gnutella2
Distribution         Power-Law   Normal    -
Nodes                10000       10000     74297
Edges                19997       19997     609036
Largest hub          161         11        360
Resources            1000        1000      10
Resource instances   39994       39994     43216
Queries              100         100       100
Diameter             8           10        12
Query Packets for Gnutella2 with ~75000 nodes
• The MST k-Steiner Minimum Tree algorithm shows that current local search algorithms for peer-to-peer networks are far from the optimal paths
[Figure: packets per query (logarithmic axis, 1 to 1000000) against % of resources (0 to 100). Legend: DQP BFS HDSRWSA k-Steiner k]
Hops for Gnutella2 with ~75000 nodes
• MST k-Steiner does not use the shortest paths to locate resources
[Figure: hops (0 to 70) against % of resources (0 to 100). Legend: k-Steiner, DQP, BFS]
[Figure: two panels showing the same query, titled "Highest Degree Search" and "K-Steiner Minimum Tree"]
The k-Steiner tree algorithm locates 9 resource instances with 11 query packets. For this query, the approximated solution is also the optimal solution. HDS uses almost twice as many query packets for this query.
Future Work
• Conducting an extensive survey of related work in graph theory on k-Steiner Minimum Trees, and modifying the problem to support multiple resource instances on the same node (Prize Collecting Steiner Tree problem with Quota)
• What makes the resource discovery problem hard in P2P networks is that only local information is available
  – It would be interesting to know how close to the optimum algorithms can get using only local knowledge
• A record of the global network topology is used in the Open Shortest Path First (OSPF) IP routing protocol, which applies Dijkstra's algorithm to compute the shortest paths
  – It might be possible to adapt the MST k-Steiner tree algorithm to P2P networks
  – In this case, information about the resources needs to be at least partially cached in the nodes