![Page 1: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/1.jpg)
Gossip-based Partitioning and Replication Middle-ware for Online Social Networks
Muhammad Anis Uddin Nasir (EMDC/ICT/LCN)
Supervisor: Šarūnas Girdzijauskas
Examiner: Johan Montelius
![Page 2: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/2.jpg)
Online Social Networks
04/18/2023 Muhammad Anis Uddin Nasir- Gossip-based Partitioning and Replication Middle-ware
• Vertices
• Edges
• Metadata
[Figure: example social graph with vertices Ioanna, Antonio, Vaidas, Aras, Vasia, Anis, Mudit, Manos, Leandro, and Johan]
![Page 3: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/3.jpg)
Existing Solutions
• Relational databases
- MySQL Cluster
• Key-value stores
- Cassandra, Amazon Dynamo
• Document databases
- MongoDB, CouchDB
• Graph databases
- Neo4j, Titan
![Page 4: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/4.jpg)
Why Are Existing Solutions Not Enough?
[Figure: example social graph with ten vertices, numbered 1 to 10]
![Page 5: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/5.jpg)
Why Are Existing Solutions Not Enough?
• Random partitioning
• Social requests
- E.g., gathering the news feed from all friends
• Enforcing Data Locality
• Random partitioning can lead to full replication!
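A minimal Python sketch of why random placement drifts toward full replication, assuming the graph is given as an adjacency dict. The names `random_partition` and `replicas_for_locality` are illustrative and not part of the presented middleware. To keep a social request local, each server needs a replica of every friend whose master lives on another server:

```python
import random


def random_partition(vertices, servers, rng):
    """Assign every vertex to a uniformly random server (hypothetical helper)."""
    return {v: rng.choice(servers) for v in vertices}


def replicas_for_locality(adj, master_of):
    """Return {server: vertices that must be replicated there}.

    A vertex u is replicated on a server when one of that server's
    masters has u as a neighbour and u is mastered somewhere else.
    """
    replicas = {server: set() for server in set(master_of.values())}
    for v, neighbours in adj.items():
        for u in neighbours:
            if master_of[u] != master_of[v]:
                replicas[master_of[v]].add(u)
    return replicas
```

Calling `replicas_for_locality(adj, random_partition(adj, servers, random.Random(0)))` on a well-connected graph shows how quickly the replica sets grow when neighbours are scattered across servers.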
[Figure: the ten-vertex graph under random partitioning; masters 1 to 10 are spread across the servers and every vertex also appears as a replica (1' to 10'), i.e., full replication]
![Page 6: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/6.jpg)
Social Graphs are not Random
Graphs with small-world properties
![Page 7: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/7.jpg)
Graph Partitioning
![Page 8: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/8.jpg)
JA-BE-JA: Edge-Cut
[Figure: vertices 1 to 7 partitioned across Server A and Server B, with replicas 1', 3', 4', 6', and 7']
• Edge Cut = 3 links, 3+2=5 replicas to maintain
![Page 9: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/9.jpg)
SPAR: Minimizing Replicas
[Figure: vertices 1 to 7 partitioned across Server A and Server B, with replicas 2', 3', 5', and 6']
• Edge Cut = 4 links, 2+2=4 replicas to maintain
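The two objectives can rank partitions differently. The sketch below is a hedged illustration in Python: the edge list is a hypothetical seven-vertex graph chosen so that the two partitions reproduce the counts on these slides (3 cut links with 5 replicas versus 4 cut links with 4 replicas); it is not the graph drawn in the figures, and the function names are assumptions.

```python
def edge_cut(edges, master_of):
    """Number of edges whose endpoints are mastered on different servers."""
    return sum(1 for u, v in edges if master_of[u] != master_of[v])


def replica_count(edges, master_of):
    """Number of (server, vertex) replicas needed so every edge is local."""
    replicas = set()
    for u, v in edges:
        if master_of[u] != master_of[v]:
            replicas.add((master_of[u], v))  # v is copied next to u
            replicas.add((master_of[v], u))  # u is copied next to v
    return len(replicas)


# Hypothetical graph, not taken from the slides.
EDGES = [(2, 5), (2, 6), (3, 5), (3, 6), (1, 3), (4, 6), (7, 6), (4, 7)]

# Partition with the smaller edge cut.
low_cut = {2: "A", 3: "A", 5: "A", 6: "A", 1: "B", 4: "B", 7: "B"}
# Partition with the smaller number of replicas.
low_replica = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B", 7: "B"}

print(edge_cut(EDGES, low_cut), replica_count(EDGES, low_cut))          # 3 5
print(edge_cut(EDGES, low_replica), replica_count(EDGES, low_replica))  # 4 4
```

A cut edge always costs two replicas unless its endpoints are already replicated for another cut edge, which is why concentrating the cut on few vertices can lower the replica count even when more links are cut.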
![Page 10: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/10.jpg)
Initialization
[Figure: initial placement of the ten-vertex graph; masters 1 to 10 spread across the servers, with replicas 1' to 10']
• Node addition
- Assign the node to the server with the fewest masters
• Edge addition
- Check whether both nodes are local
- Otherwise, create replicas to maintain locality
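The two initialization rules can be sketched as follows. This is a minimal Python illustration under stated assumptions: the class name `Placement` and its fields are hypothetical, ties between equally loaded servers are broken by server order, and fault-tolerance replicas are ignored.

```python
class Placement:
    """Tracks masters and replicas per server (illustrative only)."""

    def __init__(self, servers):
        self.masters = {s: set() for s in servers}
        self.replicas = {s: set() for s in servers}
        self.master_of = {}

    def add_node(self, v):
        # Node addition: pick the server that currently holds the fewest masters.
        server = min(self.masters, key=lambda s: len(self.masters[s]))
        self.masters[server].add(v)
        self.master_of[v] = server
        return server

    def add_edge(self, u, v):
        # Edge addition: nothing to do when both endpoints are co-located.
        su, sv = self.master_of[u], self.master_of[v]
        if su != sv:
            # Otherwise replicate each endpoint next to the other's master.
            self.replicas[su].add(v)
            self.replicas[sv].add(u)
```

For example, with servers `["A", "B"]`, adding nodes 1, 2, 3 places them on A, B, A; adding edge (1, 2) then creates replica 2' on A and 1' on B.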
![Page 11: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/11.jpg)
Gossip Phase
• Cost function
- Count the number of replicas
- For the current and the new server
• Peer selection
- Local, random, hybrid
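A hedged Python sketch of the cost function and the three peer-selection policies. The slide does not define them precisely, so the following are assumptions: "local" picks among the vertex's neighbours on other servers, "random" picks among all vertices on other servers, and "hybrid" tries the local pool first and falls back to the random one. All names are illustrative.

```python
import random


def replicas_on(server, adj, master_of):
    """Count the replicas a server needs to keep its masters' edges local."""
    needed = set()
    for v, neighbours in adj.items():
        if master_of[v] == server:
            needed.update(u for u in neighbours if master_of[u] != server)
    return len(needed)


def pair_cost(current, new, adj, master_of):
    """Cost of a placement as seen by the two servers involved in a move."""
    return replicas_on(current, adj, master_of) + replicas_on(new, adj, master_of)


def select_peer(v, adj, master_of, policy, rng):
    """Pick a candidate peer on another server, or None if there is none."""
    local = [u for u in adj[v] if master_of[u] != master_of[v]]
    anywhere = [u for u in adj if master_of[u] != master_of[v]]
    if policy == "local":
        pool = local
    elif policy == "random":
        pool = anywhere
    else:  # "hybrid": neighbours first, random sample as a fallback
        pool = local or anywhere
    return rng.choice(pool) if pool else None
```

Only the two servers involved need to be re-counted, because moving a vertex between them leaves the replica sets of every other server unchanged.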
[Figure: the ten-vertex graph at the start of the gossip phase, with masters 1 to 10 and replicas 1' to 10' across the servers]
![Page 12: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/12.jpg)
Gossip Phase
• Cost function
- Count the number of replicas
- For the existing and the new server
• Peer selection
- Local, random, hybrid
• Simulated annealing
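One common way to add simulated annealing to such a swap heuristic is the classic Metropolis acceptance rule with a linearly decreasing temperature. The Python sketch below shows that rule as an assumption; the slides do not state which acceptance function or cooling schedule the middleware uses, and the names `accept_swap` and `cool` are hypothetical.

```python
import math
import random


def accept_swap(old_cost, new_cost, temperature, rng):
    """Decide whether to apply a swap that changes the replica count."""
    if new_cost < old_cost:
        return True  # improvements are always taken
    if temperature <= 0:
        return False  # fully cooled: behave greedily
    # Worse (or equal) swaps are accepted with a probability that shrinks
    # as the cost increase grows and as the temperature drops.
    return rng.random() < math.exp((old_cost - new_cost) / temperature)


def cool(temperature, delta=0.01, floor=0.0):
    """Linear cooling step, never dropping below the floor."""
    return max(floor, temperature - delta)
```

Early cycles with a high temperature occasionally accept swaps that add replicas, which lets the search leave a local optimum; once the temperature reaches the floor only improving swaps remain.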
[Figure: the ten-vertex graph after masters 1 and 6 swap servers, with replicas 1', 3', 4', 5', 6', 8', 9', and 10']
![Page 13: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/13.jpg)
Simulated Annealing
![Page 14: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/14.jpg)
Algorithms
Algorithm Random SPAR JA-BE-JA Gossip-based
Data locality
Decentralized
Load Balancing
Fault tolerance
Avoiding Local Optima
Availability
![Page 15: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/15.jpg)
Datasets
| Dataset | Vertices | Edges |
|---|---|---|
| Synth-C | 2,000 | 20,000 |
| Synth-HC | 2,000 | 20,000 |
| Synth-PL | 2,000 | 20,000 |
| SNAP-Facebook | 4,039 | 88,234 |
| WSON-Facebook | 60,290 | 1,545,686 |
| SNAP-Twitter | 81,306 | 1,768,149 |
![Page 16: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/16.jpg)
Evaluation: Datasets
[Figure: bar chart of replication overhead (y-axis, 0 to 12) for Random, SPAR, JA-BE-JA, and Gossip-based on Synth-C, Synth-HC, Synth-PL, SNAP-Facebook, WSON-Facebook, and SNAP-Twitter]
• Number of servers = 16, replication factor = 2
• >3x gain compared to Random Partitioning
• ≈2x gain compared to SPAR
![Page 17: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/17.jpg)
Evaluation: Replication Factor
[Figure: bar chart of replication overhead (y-axis, 0 to 10) for f = 0 and f = 2 on Synth-LC, Synth-LHC, Synth-PL, Synth-C, Synth-HC, SNAP-Facebook, WSON-Facebook, and SNAP-Twitter]
• Number of servers = 16
• Random graphs generate the maximum replication overhead
• Real graphs generate the minimum replication overhead
• Data locality is achieved through the fault-tolerance replicas
![Page 18: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/18.jpg)
Evaluation: Number of Servers
[Figure: replication overhead (y-axis, 0 to 20) versus number of servers (8, 16, 32, 64) on WSON-Facebook for Random, SPAR, JA-BE-JA, and Gossip-based, with a second panel showing Gossip-based alone]
• Replication factor = 2
• Gossip-based generates the minimum replication overhead
• Replication overhead increases non-linearly
• >4x gain compared to Random Partitioning
![Page 19: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/19.jpg)
Evaluation: Dynamicity
[Figure: two panels, SNAP-Twitter and SNAP-Facebook, plotting replication overhead (y-axis, 0.2 to 0.45) against number of cycles (x-axis, 1 to about 1,600)]
• Number of servers = 16, replication factor = 2
• Spikes show bulk edge additions
• Transition state, i.e., reducing the number of replicas after new edge additions
• Algorithm stabilization
![Page 20: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/20.jpg)
Conclusion
• Random partitioning does not provide an efficient solution for online social networks
• Minimizing replicas can help to achieve better partitioning
• A gossip-based heuristic was proposed to solve the minimization problem while aiming for the global optimum
• The algorithm handles different datasets and adapts to the dynamic nature of OSNs
![Page 22: Gossip based partitioning and replication for Online Social Networks](https://reader031.vdocuments.us/reader031/viewer/2022032514/55d4a189bb61eb65618b4598/html5/thumbnails/22.jpg)
Future Work
• Execution of the algorithm on large datasets using parallel graph processing frameworks such as GraphLab and Apache Giraph
• Load balancing using both masters and replicas, while providing different consistency levels
• Smart replication to provide data locality for highly interactive nodes
• Implementing different consistency strategies based on access patterns