![Page 1: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/1.jpg)
BCI 2003 Aristotle University of Thessaloniki 1
November 22, 2003
Updating Web views distributed over wide area networks
Sidiropoulos Antonis, Katsaros Dimitrios
Aristotle Univ. of Thessaloniki, Greece
Presentation by: Katsaros Dimitrios
![Page 2: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/2.jpg)
Content Distribution Networks
[Figure: a Web client, the origin Web server, and CDN cache servers connected through the Internet, with the interaction steps numbered 1 to 4]
![Page 3: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/3.jpg)
Content Distribution Networks
• Advantages
  – prevention of the flash crowd problem
  – avoidance of network congestion
  – reduction of user-perceived latency
• e.g., Akamai
  – launched in early 1999
  – 12,000 servers
  – in 1,000 networks
![Page 4: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/4.jpg)
Disseminating Updates
![Page 5: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/5.jpg)
Outline
• Related work & Motivation
• Proposed method
• Preliminary performance evaluation
• Conclusions & Future work
![Page 7: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/7.jpg)
Best-effort cache coherency
• Lack of bandwidth to disseminate all updates
• Many caches
• Single point of update generation
![Page 8: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/8.jpg)
Related work
• Static Web object caching/prefetching (Katsaros & Manolopoulos, ACM SAC’04), (Nanopoulos, Katsaros & Manolopoulos, IEEE TKDE’03)
• Dynamic Web object caching/prefetching
  – cache plays the central role, i.e., prefetching (Cho & Garcia-Molina, SIGMOD’00) and (Gal & Eckstein, J.ACM’01)
  – minimizing the bandwidth consumption and query latency in the presence of constraints on the age or accuracy of cached objects (Bright & Raschid, VLDB’02; Cohen & Kaplan, Computer Networks’02; Olston & Widom, SIGMOD’01)
  – strong cache coherence maintenance (Challenger, Iyengar & Dantzig, INFOCOM’99)
  – update dissemination, best-effort but with a single cache (Labrinidis & Roussopoulos, VLDB’01)
  – caches and sources cooperate, best-effort caching (Olston & Widom, SIGMOD’02)
  – optimal transmission of updates, but fixed assumptions about update rates and transmission capabilities (Wang, Evans & Kwok, Information Systems Frontiers’03)
![Page 10: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/10.jpg)
Web object freshness
• Freshness of object O over period [ti, tj]
• Freshness of database D with N objects
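A sketch of how these two quantities are commonly defined, following Cho & Garcia-Molina (SIGMOD’00), which the deck cites. The notation (the indicator F(O, t) and the object index k) is assumed here rather than taken from the slide:

```latex
% Instantaneous freshness of a cached object O at time t
F(O, t) =
\begin{cases}
1 & \text{if the cached copy of } O \text{ is up to date at time } t \\
0 & \text{otherwise}
\end{cases}

% Freshness of O averaged over the period [t_i, t_j]
F(O, [t_i, t_j]) = \frac{1}{t_j - t_i} \int_{t_i}^{t_j} F(O, t)\, dt

% Freshness of a database D holding N objects O_1, ..., O_N
F(D, [t_i, t_j]) = \frac{1}{N} \sum_{k=1}^{N} F(O_k, [t_i, t_j])
```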
![Page 11: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/11.jpg)
Weighted Web object freshness
• The access pattern of Web objects is skewed
• Objects with higher access rates contribute more to what is perceived as database freshness
• For a database with N objects Oi, each with popularity fOi, the freshness is defined as:
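One plausible form of the weighted definition, written with T for the observation period so that it matches the F(D,T) used later in the deck. It assumes the popularities are normalized to sum to 1; if they are raw access counts, divide by their sum:

```latex
F(D, T) = \sum_{i=1}^{N} f_{O_i} \, F(O_i, T),
\qquad \text{with} \qquad \sum_{i=1}^{N} f_{O_i} = 1
```

Setting every popularity to 1/N gives back the unweighted average over the N objects.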
![Page 12: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/12.jpg)
Maintain best-effort coherency
• Devise a sequence of update disseminations so as to maximize F(D,T)
• Hence: the “best-effort” cache coherence maintenance is a nonpreemptive scheduling problem
![Page 13: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/13.jpg)
FIFO scheduling
• Assume that there are sufficient
  – network resources
  – processing resources
• Use FIFO scheduling (First-Come-First-Served)
• Visualize our scheduling problem with the 2-dimensional Gantt charts (Goemans & Williamson, SIAM Journal on Discrete Mathematics’00)
![Page 14: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/14.jpg)
Example of updates
• We have three pending refreshes in the server's queue, i.e., Refresh1, Refresh2 and Refresh3, which arrived in that order

| Refresh | Total cost | Popularity |
|---|---|---|
| Refresh1 | 4 | 5 |
| Refresh2 | 3 | 4 |
| Refresh3 | 1 | 2 |
![Page 15: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/15.jpg)
2-D Gantt chart for FIFO
[Figure: 2-D Gantt chart of the FIFO schedule (Refresh1, Refresh2, Refresh3); axes: cost and popularity]
Divergence = 1 - Freshness = Area under the thick polygonal line = 64
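The area of 64 can be checked by hand: in a 2-D Gantt chart the area under the staircase equals the sum, over all refreshes, of popularity times completion time. A minimal sketch using the costs and popularities from the example table; the function name `divergence` and the tuple layout are illustrative choices, not the authors' code:

```python
# Each pending refresh is (name, cost, popularity), in arrival (FIFO) order.
refreshes = [("Refresh1", 4, 5), ("Refresh2", 3, 4), ("Refresh3", 1, 2)]


def divergence(schedule):
    """Area under the 2-D Gantt staircase: sum of popularity * completion time."""
    elapsed = 0
    area = 0
    for _name, cost, popularity in schedule:
        elapsed += cost               # the refresh completes after its cost is paid
        area += popularity * elapsed  # it stays stale, weighted by popularity, until then
    return area


fifo_area = divergence(refreshes)
print(fifo_area)  # 64 = 5*4 + 4*7 + 2*8
```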
![Page 16: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/16.jpg)
Can we do better?
[Figure: 2-D Gantt chart of the three refreshes; axes: cost and popularity]
![Page 18: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/18.jpg)
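Yes! Schedule the max(pop/cost)
[Figure: 2-D Gantt chart of the schedule ordered by pop/cost (Refresh3, Refresh2, Refresh1); axes: cost and popularity]
Divergence = 1 - Freshness = Area under the thick polygonal line = 58 (a gain of about 10% even for this small example)

| Refresh | pop/cost |
|---|---|
| Refresh1 | 5/4 = 1.25 |
| Refresh2 | 4/3 = 1.33 |
| Refresh3 | 2/1 = 2 |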
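The pop/cost ratios above fix the order Refresh3, Refresh2, Refresh1. A minimal sketch that recomputes both areas and the relative gain; the helper `divergence` and the variable names are illustrative, not taken from the authors' implementation:

```python
# Each pending refresh is (name, cost, popularity), in arrival (FIFO) order.
refreshes = [("Refresh1", 4, 5), ("Refresh2", 3, 4), ("Refresh3", 1, 2)]


def divergence(schedule):
    """Area under the 2-D Gantt staircase: sum of popularity * completion time."""
    elapsed = 0
    area = 0
    for _name, cost, popularity in schedule:
        elapsed += cost
        area += popularity * elapsed
    return area


# Largest popularity/cost ratio first.
by_ratio = sorted(refreshes, key=lambda r: r[2] / r[1], reverse=True)
ratio_order = [name for name, _cost, _pop in by_ratio]

fifo_area = divergence(refreshes)
ratio_area = divergence(by_ratio)
gain = (fifo_area - ratio_area) / fifo_area

print(ratio_order)               # ['Refresh3', 'Refresh2', 'Refresh1']
print(fifo_area, ratio_area)     # 64 58
print(f"{gain:.1%}")             # 9.4%
```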
![Page 19: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/19.jpg)
Largest Slope Rule scheduling
• Select for dissemination the update with the largest popularity/cost ratio
• It can be proved that this rule is optimal
• No longer optimal in the presence of dependencies
• Very efficient heuristic even when there exist dependencies
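Both claims can be checked on tiny inputs by comparing the greedy rule against exhaustive enumeration. The sketch below makes two assumptions that the slides do not spell out: a dependency is modelled as a precedence constraint (an update may only be sent after the updates it depends on), and the three-job instance `dependent` is a made-up counterexample. All function names are illustrative.

```python
from itertools import permutations


def divergence(schedule):
    """Sum of popularity * completion time for jobs given as (name, cost, popularity)."""
    elapsed = area = 0
    for _name, cost, popularity in schedule:
        elapsed += cost
        area += popularity * elapsed
    return area


def largest_slope_first(jobs, must_follow=None):
    """Greedy rule: among the jobs whose prerequisites are done, send the largest pop/cost."""
    must_follow = must_follow or {}
    done, order, pending = set(), [], list(jobs)
    while pending:
        ready = [j for j in pending if must_follow.get(j[0], set()) <= done]
        best = max(ready, key=lambda j: j[2] / j[1])
        pending.remove(best)
        done.add(best[0])
        order.append(best)
    return order


def respects(order, must_follow):
    """True if every job appears after all of its prerequisites."""
    seen = set()
    for name, _cost, _pop in order:
        if not must_follow.get(name, set()) <= seen:
            return False
        seen.add(name)
    return True


def best_by_enumeration(jobs, must_follow=None):
    """Smallest divergence over all feasible orders (only viable for a handful of jobs)."""
    must_follow = must_follow or {}
    return min(divergence(p) for p in permutations(jobs) if respects(p, must_follow))


# Independent refreshes from the slide example: the greedy rule matches the optimum.
example = [("Refresh1", 4, 5), ("Refresh2", 3, 4), ("Refresh3", 1, 2)]
print(divergence(largest_slope_first(example)), best_by_enumeration(example))  # 58 58

# Hypothetical instance: B (very popular, cheap) may only be sent after A.
dependent = [("A", 10, 1), ("B", 1, 100), ("C", 5, 2)]
needs = {"B": {"A"}}
greedy = divergence(largest_slope_first(dependent, needs))
optimum = best_by_enumeration(dependent, needs)
print(greedy, optimum)  # 1625 1142
```

In the second instance the greedy rule sends C first because its ratio beats A's, which delays the highly popular B; sending A first unlocks B earlier and gives the lower divergence.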
![Page 21: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/21.jpg)
Simulated System Hardware
[Figure: simulated hardware. A MasterCDN node with three CPUs (CPU:0, CPU:1, CPU:2) is connected through routers/gateways to CDN server 1, CDN server 2, ..., CDN server n. Legend: Parasol node, Parasol CPU, Parasol network link]
![Page 22: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/22.jpg)
Simulated System Model
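[Figure: simulated system model. Inside the Master CDN: a Dispatcher with its scheduler algorithm, a DBMS, a ViewUpdater, and one updater per edge server (CDN1 updater, CDN2 updater, ..., CDNn updater), each feeding its edge server (CDN1, CDN2, ..., CDNn). Labelled flows: relation updates, DB updates, request for view update. The steps are numbered 1 to 6]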
![Page 23: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/23.jpg)
masterCDN components
[Figure: components of the MasterCDN node. CPU:0 runs the Dispatcher, with the relation queue, the pool of views to be updated, and the scheduler algorithm. CPU:1 runs the ViewUpdater, which works against the DBMS. CPU:2 runs the CDN1, CDN2, ..., CDNn updaters, each with its own pool of views to transmit. Input: relation updates]
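The flow through these components can be sketched as a toy, single-threaded model. Everything here is hypothetical: the class name `MasterCDN`, its methods, and the sample relations and views are invented for illustration, and the simulated system runs these stages on separate CPUs rather than in one loop.

```python
from collections import deque


class MasterCDN:
    """Toy model: relation queue -> pool of views to update -> per-CDN transmit pools."""

    def __init__(self, views_by_relation, popularity, cost, cdn_names):
        self.views_by_relation = views_by_relation  # relation -> views derived from it
        self.popularity = popularity                # view -> access popularity
        self.cost = cost                            # view -> refresh cost
        self.relation_queue = deque()               # incoming relation updates
        self.update_pool = set()                    # views waiting to be refreshed
        self.transmit_pools = {name: [] for name in cdn_names}

    def relation_updated(self, relation):
        """A relation update arrives at the master node."""
        self.relation_queue.append(relation)

    def dispatch(self):
        """Dispatcher: put every view affected by a queued relation update into the pool."""
        while self.relation_queue:
            relation = self.relation_queue.popleft()
            self.update_pool.update(self.views_by_relation.get(relation, ()))

    def refresh_next(self):
        """Scheduler + ViewUpdater: refresh the pooled view with the largest pop/cost ratio."""
        if not self.update_pool:
            return None
        # The view name is a tie-breaker so the choice is deterministic.
        view = max(self.update_pool,
                   key=lambda v: (self.popularity[v] / self.cost[v], v))
        self.update_pool.remove(view)
        for pool in self.transmit_pools.values():
            pool.append(view)  # each CDN updater would later push it to its edge server
        return view


master = MasterCDN(
    views_by_relation={"orders": ["v_top", "v_all"], "users": ["v_all"]},
    popularity={"v_top": 5, "v_all": 2},
    cost={"v_top": 4, "v_all": 1},
    cdn_names=["CDN1", "CDN2"],
)
master.relation_updated("orders")
master.dispatch()
first = master.refresh_next()
second = master.refresh_next()
print(first, second, master.transmit_pools["CDN1"])  # v_all v_top ['v_all', 'v_top']
```

Keeping a separate transmit pool per edge server mirrors the figure, where each CDN updater drains its own pool independently of the others.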
![Page 24: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/24.jpg)
Methodology
• Synthetic (sample CDN with 10 edge servers)
  – Synthetic data generator
    • Modeling network nodes, network bandwidth, size of documents, relations, views, view derivation hierarchy, update rates, popularity
• Examine the impact of:
  – update rate
  – number of relations
![Page 25: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/25.jpg)
Freshness vs. Update rate
![Page 26: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/26.jpg)
Freshness vs. Update rate
![Page 27: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/27.jpg)
Freshness vs. Update rate
![Page 28: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/28.jpg)
Freshness vs. #Relations
![Page 29: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/29.jpg)
LSR Freshness vs. update rate
![Page 30: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/30.jpg)
Freshness vs. (#Rel, dep_density)
[Figure: four panels. Top row: 100 relations; bottom row: 500 relations. Left column: sparse dependencies; right column: dense dependencies]
![Page 32: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/32.jpg)
Conclusions & Future work
• Conclusions
  – we proposed a best-effort cache coherence maintenance scheme for the edge servers of a CDN
  – it is a pure push-based dissemination method
  – the scheme is based on the LSR scheduling algorithm
  – we presented preliminary results to justify its efficiency
• Future work
  – organize the edge servers into a (possibly) deep hierarchy, so as to parallelize the update dissemination
![Page 33: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/33.jpg)
References
1. L. Bright and L. Raschid, Using Latency-Recency Profiles for Data Delivery on the Web, Proc. of the VLDB, pp. 550-561, 2002.
2. J. Challenger, A. Iyengar, and P. Dantzig, A Scalable System for Consistently Caching Dynamic Web Data, Proc. of the IEEE INFOCOM, 1999.
3. J. Cho and H. Garcia-Molina, Synchronizing a Database to Improve Freshness, Proc. of the ACM SIGMOD, pp. 117-128, 2000.
4. E. Cohen and H. Kaplan, Refreshment Policies for Web Content Caches, Computer Networks, 38(6), pp. 795-808, 2002.
5. A. Gal and J. Eckstein, Managing Periodically Updated Data in Relational Databases: A Stochastic Modeling Approach, Journal of the ACM, 48(6), pp. 1141-1183, 2001.
6. M.X. Goemans and D.P. Williamson, Two-Dimensional Gantt Charts and a Scheduling Algorithm of Lawler, SIAM Journal on Discrete Mathematics, 13(3), pp. 281-294, 2000.
7. D. Katsaros and Y. Manolopoulos, Caching in Web Memory Hierarchies, Proc. of the ACM SAC, 2004.
8. A. Labrinidis and N. Roussopoulos, Update Propagation Strategies for Improving the Quality of Data on the Web, Proc. of the VLDB, 2001.
9. A. Nanopoulos, D. Katsaros and Y. Manolopoulos, A Data Mining Algorithm for Generalized Web Prefetching, IEEE Trans. on Knowledge and Data Engineering, 15(5), pp. 1155-1169, 2003.
10. C. Olston and J. Widom, Adaptive Precision Setting for Cached Approximate Values, Proc. of the ACM SIGMOD, pp. 355-366, 2001.
11. C. Olston and J. Widom, Best-Effort Cache Synchronization with Source Cooperation, Proc. of the ACM SIGMOD, pp. 73-84, 2002.
12. J.W. Wang, D. Evans and M. Kwok, On Staleness and the Delivery of Web Pages, Information Systems Frontiers, 5(2), pp. 129-136, 2003.
![Page 34: Updating Web views distributed over wide area networks](https://reader034.vdocuments.us/reader034/viewer/2022051517/56815a7a550346895dc7e44c/html5/thumbnails/34.jpg)
Contact information

Sidiropoulos Antonis
Dept. of Informatics
Aristotle University
Thessaloniki, 54124, Greece
[email protected]
users.auth.gr/~asidirop

Katsaros Dimitrios
Dept. of Informatics
Aristotle University
Thessaloniki, 54124, Greece
[email protected]
skyblue.csd.auth.gr