![Page 1: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/1.jpg)
Decentralized and Dynamic Community Formation in
P2P Networks & Performance of Community Based
Caching
Chepchumba S. Limo
May 6, 2015
1
Committee Members:
Anura Jayasumana (Advisor)
Liuqing Yang
Christos Papadopoulos
MSc. Thesis Defense
![Page 2: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/2.jpg)
Overview
• Introduction
• Motivation and Problem Statement
• Dynamic Group Discovery Algorithm
• Simulation Implementation
• Case Studies and Results
• Conclusion and Future Work
2
![Page 3: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/3.jpg)
Introduction: Peer-to-Peer Networks
3
• Example of overlay network
• File transfer P2P networks
– Lookup/resource discovery
• P2P messaging
– PUT <key, value>
– GET(key) → value
• GET(key) → node ID
• GET(key) → data
• Mitigate resource discovery
– Distributed Hash Tables (DHTs)
– Caching Schemes
| Key | Value |
|---|---|
| 123 | Jack & Jill |
| A34 | Avengers |
| BC5 | Spiderman |
| 24F | Beyoncé |

GET(123) → Jack & Jill
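The PUT/GET exchange above can be sketched as a toy key-value overlay. This is a minimal illustration, not the protocol used in the thesis: the class name `ToyDHT`, the 16-bit ID space, and the SHA-1 mapping of keys to node IDs are all assumptions made for the example.

```python
import hashlib

ID_SPACE = 2 ** 16  # assumed size of the identifier ring (hypothetical)


class ToyDHT:
    """Toy overlay: each key is stored on the first node ID at or after its hash."""

    def __init__(self, node_ids):
        self.node_ids = sorted(node_ids)
        self.storage = {node_id: {} for node_id in self.node_ids}

    def responsible_node(self, key):
        # Map the key onto the ring, then apply a Chord-style successor rule.
        point = int(hashlib.sha1(key.encode()).hexdigest(), 16) % ID_SPACE
        for node_id in self.node_ids:
            if node_id >= point:
                return node_id
        return self.node_ids[0]  # wrap around the ring

    def put(self, key, value):
        self.storage[self.responsible_node(key)][key] = value

    def get(self, key):
        return self.storage[self.responsible_node(key)].get(key)


dht = ToyDHT([100, 20000, 45000])
dht.put("123", "Jack & Jill")
print(dht.get("123"))
```

A real DHT resolves `responsible_node` over several overlay hops; here the lookup is collapsed into one local call to keep the sketch short.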
![Page 4: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/4.jpg)
Introduction: Caching
4
• Distributed Hash Tables (DHT)
– Efficient
– Highly scalable
– Self organizing
• Caching
– Favor popular resources relative to entire network
– But traffic modeled by Zipf’s distribution
![Page 5: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/5.jpg)
Introduction: Caching
5
• Subset of nodes that share similar interests are said to
form a community
• Communities exist naturally
• Community Based Caching (CBC) algorithm proposed
– Exploits existence of communities when caching
– More nodes benefit from caching
![Page 6: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/6.jpg)
Introduction: Communities
6
Structured P2P Network (Chord) Unstructured P2P Networks
![Page 7: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/7.jpg)
Introduction: Communities
7
• In file sharing P2P networks:
– Node_A interested in music and software
– Node_B interested in music and movies
– Node_A and Node_B form music community (community appears)
– Node_C with music and software interest joins network
– Node_C should join music community with Node_A and Node_B (community grows)
– Node_A, Node_B, Node_C leave network (community disappears)
• Properties of communities:
1. Naturally occurring
2. Dynamic
3. Nodes/users can belong to multiple communities
4. Nodes/users can join/leave communities at will
![Page 8: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/8.jpg)
Overview
• Introduction
• Motivation and Problem Statement
• Dynamic Group Discovery Algorithm
• Simulation Implementation
• Case Studies and Results
• Conclusion and Future Work
8
![Page 9: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/9.jpg)
Motivation
9
1. Community Based Caching (CBC) algorithm tested under
limiting conditions, i.e.:
i. Static community assignment
ii. Nodes/users couldn’t change membership
iii. Nodes limited to being members of only 1 community
iv. Community membership based on websites queried (arguably weak
similarity)
![Page 10: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/10.jpg)
Motivation
10
2. Limitations of existing dynamic community formation algorithms:
i. Centralized node for maintenance
ii. Complicated computations
iii. Additional messaging to establish community membership
iv. Limited to being members of one community at a time
3. Basis for establishing similarities for community formation:
i. Websites queried – weak
ii. Personal interests
iii. Acquired interests
![Page 11: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/11.jpg)
Problem Statement
11
• Motivation summary:
i. CBC tested under stringent conditions
ii. Existing algorithms have limitations
iii. Consider other bases of community formation
• Contribution
– Decentralized community discovery algorithm
• Considers community properties, i.e., naturally occurring, dynamic, members of multiple communities & join/leave communities at will
• Overcomes limitations of existing algorithms
• Utilizes already existing PUT and GET messages – no additional messaging needed
– Special key generation technique
• Disseminates group information
– Test CBC under more realistic conditions
• Network with churn
![Page 12: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/12.jpg)
Overview
• Introduction
• Motivation and Problem Statement
• Dynamic Group Discovery Algorithm
• Simulation Implementation
• Case Studies and Results
• Conclusion and Future Work
12
![Page 13: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/13.jpg)
Dynamic Group Discovery (DGD)
13
• Key has embedded meta-data on the type of resource it represents
• No additional security risk added if algorithm is publicly known
![Page 14: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/14.jpg)
Dynamic Group Discovery (DGD)
14
• Key generation
– Last 12 bits of the key needed for the three levels of group identification:
• Level 1 (mandatory): general classification e.g., music, movies etc.
• Level 2 (optional): specify geographical location e.g., U.S.A, Canada etc.
• Level 3 (optional): specify genre e.g., comedy, jazz etc.
| Key | Group ID Level 1 | Group ID Level 2 | Group ID Level 3 | Final Key |
|---|---|---|---|---|
| 0123456789abcdef | music => 1 | Canada => 2 | hip-hop => 3 | 0123456789abcdef 123 |
| a123456789bcdef0 | music => 1 | N/A => 0 | blues => 9 | a123456789bcdef0 109 |
| b123456789bcdef0 | movies => 3 | USA => 3 | comedy => 6 | b123456789bcdef0 236 |
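The key layout above can be sketched as follows. The slides state only that the last 12 bits carry three levels of group identification; splitting them into 4 bits per level, writing each level as one hexadecimal digit, and treating 0 as "not specified" are assumptions read off the example table. The function names are hypothetical.

```python
BITS_PER_LEVEL = 4  # assumption: 12 bits split evenly across the three levels


def make_final_key(base_key_hex, level1, level2=0, level3=0):
    """Append the three group ID levels to a base key as three hex digits."""
    for level in (level1, level2, level3):
        if not 0 <= level < 2 ** BITS_PER_LEVEL:
            raise ValueError("each level must fit in 4 bits")
    if level1 == 0:
        # Level 1 is mandatory on the slide; 0 is assumed to mean N/A.
        raise ValueError("level 1 is mandatory")
    suffix = "".join(format(level, "x") for level in (level1, level2, level3))
    return base_key_hex + suffix


def extract_group_id(final_key_hex):
    """Recover (level 1, level 2, level 3) from the last 12 bits of a key."""
    return tuple(int(digit, 16) for digit in final_key_hex[-3:])


key = make_final_key("0123456789abcdef", level1=1, level2=2, level3=3)
print(key, extract_group_id(key))
```

Because the group ID is read directly from the key, any node that sees a PUT or GET can classify it without extra messages, which is the property the algorithm relies on.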
![Page 15: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/15.jpg)
Dynamic Group Discovery (DGD)
15
• Built on top of structured P2P
– Guaranteed performance compared to unstructured
– Chord used
![Page 16: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/16.jpg)
Dynamic Group Discovery (DGD)
16
• Goal of DGD is to allow community formation in structured
P2P networks
![Page 17: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/17.jpg)
Dynamic Group Discovery (DGD)
17
• Establish group interest
– Personal interests
• Two thresholds used:
– λ = threshold for personal interests
– μ = threshold for acquired interests
• λ << μ
void forward(key, msg, nextHop*)
{
    if msg type == GET
    {
        extract group ID from key;
        // is there interest, i.e. a group finger that is not this node?
        if group ID is in group ID finger table and finger is not pointing to me
        {
            if hops < (log2 N)/2
            {
                set nextHop using entries in group ID finger table;
            }
            else
            {
                use Chord to set nextHop;
            }
        }
        else
        {
            keep track of specific GET message;
            use Chord to set nextHop;
            if specific GET messages received λ OR μ times
                send FIND GROUP request;
        }
    }
}
| Group ID | Finger to Group Member | Frequency of Use |
|---|---|---|
| 120 | node A | N/A |
| 239 | node A | N/A |
| 912 | node A | N/A |
| 122 | node X | 0 |
| 122 | node Y | 7 |
| 122 | node Z | 9 |
| 912 | node X | 20 |
| 235 | node E | 4 |
| 420 | node G | 1 |
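The forwarding decision in the pseudocode can be sketched in Python as below. Several details are assumptions, not statements from the slides: the class and method names are hypothetical, λ is applied to a node's own queries and μ to queries it forwards for others, the first usable group finger is chosen as the next hop, and the group ID is taken to be the last three hexadecimal digits of the key.

```python
import math


class DgdNode:
    """Per-node state needed for the forwarding decision (illustrative only)."""

    def __init__(self, node_id, network_size, lam, mu):
        self.node_id = node_id
        self.hop_limit = math.log2(network_size) / 2  # (log2 N)/2 from the slide
        self.lam = lam  # threshold for personal interests
        self.mu = mu    # threshold for acquired interests
        self.group_fingers = {}     # group ID -> list of node IDs
        self.own_counts = {}        # group ID -> own GETs seen
        self.forwarded_counts = {}  # group ID -> forwarded GETs seen
        self.find_group_requests = []  # group IDs for which FIND GROUP was sent

    def forward_get(self, key, hops, chord_next_hop, own_query):
        group_id = key[-3:]
        # A finger only counts as "interest" if it points to another node.
        fingers = [f for f in self.group_fingers.get(group_id, [])
                   if f != self.node_id]
        if fingers:
            # Use the community early in the lookup, then fall back to Chord.
            return fingers[0] if hops < self.hop_limit else chord_next_hop
        counts = self.own_counts if own_query else self.forwarded_counts
        threshold = self.lam if own_query else self.mu
        counts[group_id] = counts.get(group_id, 0) + 1
        if counts[group_id] == threshold:
            self.find_group_requests.append(group_id)
        return chord_next_hop


node = DgdNode("A", network_size=1024, lam=2, mu=20)
node.group_fingers["122"] = ["X", "Y", "Z"]
print(node.forward_get("0123456789abcdef122", hops=1,
                       chord_next_hop="C", own_query=False))
```

With λ much smaller than μ, a node asks to join a community quickly for resources it requests itself, and only after sustained traffic for resources it merely relays.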
![Page 18: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/18.jpg)
Dynamic Group Discovery (DGD)
18
• Maintaining group ID finger table
– Limited resources
• σ = max number of fingers
| Group ID | Finger to Group Member | Frequency of Use |
|---|---|---|
| 120 | node A | N/A |
| 239 | node A | N/A |
| 912 | node A | N/A |
| 122 | node X | 0 |
| 122 | node Y | 7 |
| 122 | node Z | 9 |
| 912 | node X | 20 |
| 235 | node E | 4 |
| 420 | node G | 1 |
void handle_FINDGROUP_response(groupID, finger)
{
    if <group ID, finger> pair already exists // done to avoid duplicates
        return;
    if group ID finger table is at capacity
    {
        if a single least-used finger can be identified
            delete it;
        else // i.e. multiple fingers share the same lowest frequency of use
            pick one of them at random and delete it;
    }
    add newly found finger;
}
σ = 3
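The replacement policy above can be sketched as follows. The sketch assumes that σ limits the number of fingers kept per group ID (the table shows exactly three fingers for group 122 with σ = 3) and that a newly added finger starts with a use count of 0; neither is stated explicitly on the slide, and the function name is hypothetical.

```python
import random

SIGMA = 3  # max number of fingers, as on the slide


def handle_findgroup_response(table, group_id, finger, sigma=SIGMA, rng=random):
    """Insert a finger, evicting a least-used one when the group is full.

    `table` maps group ID -> {finger: frequency of use}.
    """
    fingers = table.setdefault(group_id, {})
    if finger in fingers:
        return  # avoid duplicate <group ID, finger> pairs
    if len(fingers) >= sigma:
        lowest = min(fingers.values())
        candidates = [f for f, uses in fingers.items() if uses == lowest]
        # One candidate: it is the least used. Several: break the tie at random.
        del fingers[rng.choice(candidates)]
    fingers[finger] = 0


table = {"122": {"node X": 0, "node Y": 7, "node Z": 9}}
handle_findgroup_response(table, "122", "node W")
print(table["122"])
```

In the example, node X has the lowest frequency of use, so it is the one replaced by node W.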
![Page 19: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/19.jpg)
Overview
• Introduction
• Motivation and Problem Statement
• Dynamic Group Discovery Algorithm
• Simulation Implementation
• Case Studies and Results
• Conclusion and Future Work
19
![Page 20: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/20.jpg)
Simulation Implementation
20
• Oversim
– Flexible network simulation framework
– Popular event driven simulator for P2P networks
• Keys and queries generated external to the simulator
– Able to indirectly control community size and symmetry
• Keys and queries generation:
1. Determine desired symmetry and size
2. Generate random keys with desired symmetry
3. Sort keys based on level 1 identification
4. Assign Zipf’s α parameter per community
5. Generate queries
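The generation steps above can be sketched as a small offline script. This is not the generator used in the thesis: the function names, the 64-bit random base keys, the fixed "00" for levels 2 and 3, and ranking keys by their position in the list are all assumptions made for the example.

```python
import random


def generate_keys(num_keys, group_shares, rng):
    """Create random keys whose level 1 group ID follows the requested shares."""
    keys = []
    for group_id, share in group_shares.items():
        for _ in range(round(num_keys * share)):
            base = format(rng.getrandbits(64), "016x")
            # Levels 2 and 3 are left unspecified (0) in this sketch.
            keys.append(base + format(group_id, "x") + "00")
    keys.sort(key=lambda k: k[-3])  # sort by level 1 identification
    return keys


def zipf_queries(keys, alpha, num_queries, rng):
    """Draw queries so that the key of rank r has weight 1 / r**alpha."""
    weights = [1 / rank ** alpha for rank in range(1, len(keys) + 1)]
    return rng.choices(keys, weights=weights, k=num_queries)


rng = random.Random(0)
shares = {1: 0.4, 2: 0.4, 3: 0.2}  # asymmetrical communities
keys = generate_keys(100, shares, rng)
group1 = [k for k in keys if k[-3] == "1"]
queries = zipf_queries(group1, alpha=1.0, num_queries=50, rng=rng)
print(len(keys), len(group1), len(queries))
```

Choosing a separate α per community controls how strongly each community's queries concentrate on its most popular keys.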
![Page 21: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/21.jpg)
Simulation Implementation
21
void handle_PUT_event()
{
    key = read from key file;
    if key is unspecified // i.e. all keys have been read from key file
    {
        schedule GET event;
    }
    else
    {
        extract group ID information from key;
        add entry to group ID finger table;
        create PUT message and send it out to network;
        schedule next PUT event;
    }
}
![Page 22: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/22.jpg)
Simulation Implementation
22
void handle_GET_event()
{
    select group ID file to read from; // based on personal interest
    query = read key from query file;
    if query is unspecified // i.e. all queries have been read from query file
    {
        return; // do nothing
    }
    else
    {
        create GET message with query;
        send out GET message;
        schedule next GET event;
    }
}
![Page 23: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/23.jpg)
Simulation Implementation
23
void handle_REMOVE_request(groupID, finger, maxHops, curHops)
{
    if <group ID, finger> pair exists
        delete entry;
    if curHops < maxHops
    {
        curHops++;
        forward message to all nodes in group ID finger table;
    }
}
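The hop-limited propagation of a REMOVE request can be sketched over an in-memory set of nodes. The data layout (`nodes` maps a node ID to its group ID finger table) and the function name are hypothetical, and forwarding to every finger in the receiving node's table is one literal reading of the pseudocode. Fingers that are not present in `nodes` are treated as departed and skipped.

```python
def handle_remove_request(nodes, node_id, group_id, finger, max_hops, cur_hops=0):
    """Delete <group ID, finger> at node_id, then forward while hops remain.

    `nodes` maps node ID -> {group ID: set of fingers}.
    """
    table = nodes[node_id]
    table.get(group_id, set()).discard(finger)
    if cur_hops < max_hops:
        # Forward to every node this node still has a finger to.
        neighbors = sorted({f for fingers in table.values() for f in fingers})
        for neighbor in neighbors:
            if neighbor in nodes and neighbor != node_id:
                handle_remove_request(nodes, neighbor, group_id, finger,
                                      max_hops, cur_hops + 1)


nodes = {
    "A": {"122": {"X", "B"}},
    "B": {"122": {"X", "A"}},
    "C": {"122": {"X"}},
}
handle_remove_request(nodes, "A", "122", "X", max_hops=1)
print(nodes)
```

With `max_hops=1` the request reaches B through A's table but not C, which no node within one hop points to, so C keeps its stale finger until it learns of the departure another way.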
![Page 24: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/24.jpg)
Overview
• Introduction
• Motivation and Problem Statement
• Dynamic Group Discovery Algorithm
• Simulation Implementation
• Case Studies and Results
• Conclusion and Future Work
24
![Page 25: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/25.jpg)
Case 1: Varying Number of Nodes
25
• Between 500 and 10,000 nodes
• 40,000 keys used with following distribution:
– 40% group 1
– 40% group 2
– 20% shared equally among groups 3 to 9
• Maximum group ID finger table size per node = 160
• λ = 2
• μ = 20
• σ = 3
Asymmetrical communities
![Page 26: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/26.jpg)
Case 1: Varying Number of Nodes
26
![Page 27: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/27.jpg)
Case 2: Varying Community Size
27
• 2,000 nodes
• 40,000 keys divided into 2 sections, i.e., 80% in section 1 and 20% in section 2
– Run 1: one community in section 1 and eight communities in section 2
– Run 2: two communities in section 1 and seven communities in section 2
– Run 3: …
• Maximum group ID finger table size per node = 160
• λ = 2
• μ = 20
• σ = 3
![Page 28: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/28.jpg)
Case 2: Varying Community Size
28
![Page 29: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/29.jpg)
Case 3: Introducing Churn
29
• Between 500 and 10,000 nodes
• 40,000 keys used with following distribution:
– 40% group 1
– 40% group 2
– 20% shared equally among groups 3 to 9
• Maximum group ID finger table size per node = 160
• λ = 2
• μ = 20
• σ = 3
Asymmetrical communities
![Page 30: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/30.jpg)
Case 3: Introducing Churn
30
![Page 31: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/31.jpg)
Overview
• Introduction
• Motivation and Problem Statement
• Dynamic Group Discovery Algorithm
• Simulation Implementation
• Case Studies and Results
• Conclusion and Future Work
31
![Page 32: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/32.jpg)
Conclusion
32
1. Decentralized dynamic group discovery algorithm
2. Key generation with embedded group ID information
3. Improve lookup performance for queries resolved using cache data
• Stronger community basis i.e. personal and acquired interests
• Without churn
4. Easy implementation of dynamic group discovery
• Utilizes already existing messages
• Additional computation – extracting group ID information
5. Great potential for robust caching solution
![Page 33: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/33.jpg)
Future Work
33
1. Optimize entries in group ID finger table
• Consider distance of new finger relative to other fingers in other tables
2. Consider location of next hop of group member
• Applicable for structured P2P networks
3. DGD needs to know exactly how many nodes are in the network
• if finger not pointing to me and hops < (log2 N)/2
• Find solution to determine number of nodes in network with churn
4. Introduce churn in measurable manner
• Better characterize DGD’s performance
5. Test DGD with other types of P2P networks
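To illustrate why item 3 matters, the hop limit (log2 N)/2 can be evaluated for a few network sizes. The helper name is hypothetical; the sizes are the smallest and largest networks from the case studies plus the 2,000-node network from Case 2.

```python
import math


def group_hop_limit(num_nodes):
    """Hops during which group fingers are preferred over Chord: (log2 N)/2."""
    return math.log2(num_nodes) / 2


for n in (500, 2000, 10000):
    print(n, round(group_hop_limit(n), 2))
```

The limit grows only logarithmically with N, but it is compared against an integer hop count, so a node that underestimates or overestimates N under churn can switch back to Chord one hop too early or too late.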
![Page 34: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/34.jpg)
Thank you!
• Dr. Jayasumana
• Liuqing Yang
• Christos Papadopoulos
• Friends and co-workers at CSU and Dot Hill
• Family
34
![Page 35: MSc. Thesis Defense and Dynamic Community Formation in P2P Networks & Performance of Community Based Caching Chepchumba S. Limo May 6, 2015 1 Committee Members:](https://reader031.vdocuments.us/reader031/viewer/2022022012/5b19582d7f8b9a23258c9857/html5/thumbnails/35.jpg)
QUESTIONS?
35