Post on 20-Dec-2015
![Page 1: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University](https://reader036.vdocuments.us/reader036/viewer/2022062516/56649d455503460f94a224cd/html5/thumbnails/1.jpg)
Evaluation of Delivery Techniques for Dynamic
Web Content
Mor Naaman, Hector Garcia-Molina, Andreas Paepcke
Department of Computer Science
Stanford University
{mor, hector, paepcke}@cs.stanford.edu
http://www-db.stanford.edu/
![Page 2: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University](https://reader036.vdocuments.us/reader036/viewer/2022062516/56649d455503460f94a224cd/html5/thumbnails/2.jpg)
Dynamic Web is Ubiquitous
![Page 3: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University](https://reader036.vdocuments.us/reader036/viewer/2022062516/56649d455503460f94a224cd/html5/thumbnails/3.jpg)
Problems with Dynamic Pages
• Generation of pages is resource-intensive
• Pages are too dynamic, or too personalized, to be cached
• Higher load on servers (page generation and delivery)
• More network traffic
![Page 4: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University](https://reader036.vdocuments.us/reader036/viewer/2022062516/56649d455503460f94a224cd/html5/thumbnails/4.jpg)
We Evaluate Two Competing Solutions
(Both address at least the network load)
ESI (Oracle, Akamai)
• Enables assembly of pages from small fragments
• Fragments can be cached on specialized network caches (edge servers)
• Fragments are assembled on the edge server
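The fragment-assembly idea can be sketched roughly as follows. This is a minimal Python illustration, not ESI markup and not the Oracle or Akamai implementation: the `EdgeCache` class, the `origin` callback, the list-of-names page template, and the fragment names are all hypothetical, and fragment expiry is left out.

```python
from typing import Callable, Dict, List


class EdgeCache:
    """Toy edge server: caches fragments and assembles pages from them."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin        # hypothetical origin-fetch callback
        self._fragments: Dict[str, str] = {}   # fragment name -> cached body
        self.origin_requests = 0               # counts traffic to the main site

    def get_fragment(self, name: str) -> str:
        # Only a cache miss costs a round trip to the origin server.
        if name not in self._fragments:
            self.origin_requests += 1
            self._fragments[name] = self._fetch(name)
        return self._fragments[name]

    def assemble(self, template: List[str]) -> str:
        # The page is put together on the edge, fragment by fragment.
        return "".join(self.get_fragment(name) for name in template)


def origin(name: str) -> str:
    """Stand-in for the main site generating one fragment (assumption)."""
    return f"<div>{name}</div>"


edge = EdgeCache(origin)
page_a = edge.assemble(["header", "book-123", "footer"])
page_b = edge.assemble(["header", "book-456", "footer"])
print(edge.origin_requests)
```

In this toy run the second page re-uses the cached `header` and `footer` fragments, so only the book-specific fragment is requested from the origin.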
Class-Based Delta Encoding
• Computes delta of generated page from a chosen base file
• Base files can be cached on network caches
• Client receives delta from the server and base file from cache; applies delta to base file to get final page
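The compute-delta and apply-delta steps can be illustrated with a small sketch. It assumes a character-level delta built with Python's standard `difflib`; the slides do not say which delta algorithm is used, and the choice of a base file per class of pages is not modeled here. The function names and the sample pages are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple, Union

# A delta is a list of operations: copy a slice of the base, or insert literal text.
Op = Union[Tuple[str, int, int], Tuple[str, str]]


def make_delta(base: str, page: str) -> List[Op]:
    """Server side: describe `page` relative to `base`."""
    ops: List[Op] = []
    matcher = SequenceMatcher(None, base, page, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))          # re-use base[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("insert", page[j1:j2]))   # ship new text literally
        # "delete": the base slice is simply not copied
    return ops


def apply_delta(base: str, delta: List[Op]) -> str:
    """Client side: rebuild the page from the cached base file and the delta."""
    parts = []
    for op in delta:
        if op[0] == "copy":
            parts.append(base[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


base_file = "<html><h1>Book</h1><p>Price: $10</p></html>"
generated = "<html><h1>Book</h1><p>Price: $12</p></html>"
delta = make_delta(base_file, generated)
literal_chars = sum(len(op[1]) for op in delta if op[0] == "insert")
assert apply_delta(base_file, delta) == generated
```

Only the literal text in the `insert` operations has to travel from the server; the rest is recovered from the cached base file.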
![Page 5: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University](https://reader036.vdocuments.us/reader036/viewer/2022062516/56649d455503460f94a224cd/html5/thumbnails/5.jpg)
A Page Content Model
• Page composed from groups; groups include items.
• Page construction modeled as two-phase selection (groups, then items)
[Figure labels: Groups, Items]
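A two-phase selection of this kind might be sketched as below. The uniform random choice, the `build_page` function, and the sample catalog are assumptions made for illustration only; the slides do not specify how groups or items are drawn at this point.

```python
import random
from typing import Dict, List


def build_page(catalog: Dict[str, List[str]], n_groups: int,
               items_per_group: int, rng: random.Random) -> Dict[str, List[str]]:
    """Pick groups first, then pick items inside each chosen group."""
    # Phase 1: choose which groups appear on the page (uniformly, by assumption).
    groups = rng.sample(sorted(catalog), n_groups)
    # Phase 2: choose items within each selected group.
    return {g: rng.sample(catalog[g], items_per_group) for g in groups}


# Hypothetical catalog: group name -> items available in that group.
catalog = {
    "news": ["n1", "n2", "n3"],
    "sports": ["s1", "s2", "s3"],
    "stocks": ["q1", "q2", "q3"],
}
page = build_page(catalog, n_groups=2, items_per_group=2, rng=random.Random(0))
print(page)
```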
![Page 6: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University](https://reader036.vdocuments.us/reader036/viewer/2022062516/56649d455503460f94a224cd/html5/thumbnails/6.jpg)
Our Simulation
Test-case web pages:
• Book pages in Amazon-style website
• MyYahoo-type personalized pages
• Personalized stock portfolio pages
• A simple personalized weather page
![Page 7: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University](https://reader036.vdocuments.us/reader036/viewer/2022062516/56649d455503460f94a224cd/html5/thumbnails/7.jpg)
Simulation of ESI
• Assuming Zipf-like distribution for groups and items (popularity_i = k / i^α)
• Performance highly dependent on α (ranging from 0.7 to 1.5 in our simulations)
• Hit rate estimates for items: [formula not preserved in this transcript; its terms are the arrival rate, TTL = item time-to-live, and a constant]
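To make the Zipf-like assumption concrete, here is a small sketch that computes item popularities for a given exponent alpha (the Zipfian parameter the slides vary between 0.7 and 1.5). Choosing the constant k so that the popularities sum to 1 is an assumption; the slides do not state how k is set. The function name and the catalog size of 1000 are hypothetical.

```python
from typing import List


def zipf_popularity(n_items: int, alpha: float) -> List[float]:
    """Popularity of the item at rank i is k / i**alpha, for i = 1..n_items."""
    weights = [1.0 / (i ** alpha) for i in range(1, n_items + 1)]
    k = 1.0 / sum(weights)  # assumption: normalise so popularities sum to 1
    return [k * w for w in weights]


# Share of requests going to the 10 most popular of 1000 items.
for alpha in (0.7, 1.5):
    popularity = zipf_popularity(1000, alpha)
    print(alpha, sum(popularity[:10]))
```

A larger alpha concentrates requests on the top-ranked items, which is why cache hit rates are sensitive to this parameter.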
Sample simulation results (bookstore-type resource, with "backend" servers)
Alpha = 0.8
[Figure: Traffic vs. TTL. x-axis: Time-to-live (seconds), 0 to 8000; y-axis: Traffic (Gb per Day), 0 to 300; series: Client-Edge, Edge-backend, Backend-Main site]
[Figure: Hit-rate vs. value of Zipfian parameter. x-axis: Alpha, 0.7 to 1.5; y-axis: Hit rate, 0% to 100%; series: Edge hit rate, System hit rate]
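How a time-to-live affects the hit rate of a single cached item can be explored with a short simulation. This is a sketch under stated assumptions, not the estimate used in the slides: requests arrive as a Poisson process, the item is cached for `ttl` seconds from the moment it is fetched, and the timer is not renewed on hits. Under exactly these assumptions the expected hit rate is rate × TTL / (1 + rate × TTL). The function name and parameter values are hypothetical.

```python
import random


def simulate_ttl_hit_rate(arrival_rate: float, ttl: float,
                          n_requests: int, seed: int = 0) -> float:
    """Fraction of requests served from cache for one item with a fixed TTL."""
    rng = random.Random(seed)
    now = 0.0
    expires_at = float("-inf")   # nothing cached at the start
    hits = 0
    for _ in range(n_requests):
        now += rng.expovariate(arrival_rate)  # exponential inter-arrival time
        if now < expires_at:
            hits += 1                         # still fresh: served from cache
        else:
            expires_at = now + ttl            # miss: fetch and restart the TTL
    return hits / n_requests


short_ttl = simulate_ttl_hit_rate(arrival_rate=0.01, ttl=100.0, n_requests=200_000)
long_ttl = simulate_ttl_hit_rate(arrival_rate=0.01, ttl=1000.0, n_requests=200_000)
print(short_ttl, long_ttl)
```

Longer TTLs and higher arrival rates both push the hit rate up in this model, at the cost of serving staler content.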
![Page 8: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University](https://reader036.vdocuments.us/reader036/viewer/2022062516/56649d455503460f94a224cd/html5/thumbnails/8.jpg)
Class-Based Delta Encoding Simulation
• For some pages, client likely to be able to re-use base files
[Figure: Traffic vs. number of base files. x-axis: Number of base files, 0 to 1000; y-axis: Traffic (Gb per Day), 0 to 300; series: Traffic Without DE, Aggregate Client Traffic, Main Site Traffic]
[Figure: Traffic vs. Same-Base threshold. x-axis: Threshold (Bytes), 0 to 15000; y-axis: Traffic (Gb per Day), 0 to 300; series: Aggregate Client Traffic, Main Site Traffic]
• For other pages, client-cache link traffic is higher than before. To minimize client traffic, use same base file owned by client if delta is larger than threshold
![Page 9: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University](https://reader036.vdocuments.us/reader036/viewer/2022062516/56649d455503460f94a224cd/html5/thumbnails/9.jpg)
Sample Comparison Numbers
MyYahoo-type pages

| | Savings – client link | Savings – server link | Edge cache usage |
|---|---|---|---|
| ESI | 0% | 62% | 1.5Mb |
| DE | 66% | 87% | 3.2Mb |

Amazon-style Book pages

| | Savings – client link | Savings – server link | Edge cache usage |
|---|---|---|---|
| ESI | 0% | 30% | 1.2Mb |
| DE | -8% | 82% | 2.2Mb |
![Page 10: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University](https://reader036.vdocuments.us/reader036/viewer/2022062516/56649d455503460f94a224cd/html5/thumbnails/10.jpg)
Conclusions
Excellent *, Good +, Bad -, Sometimes ~
All the details: http://dbpubs.stanford.edu/pub/2003-7

| | ESI | DE |
|---|---|---|
| Reduces server traffic | + | * |
| Reduces client traffic | - | ~ |
| Reduces computational load on web server | * | - |
| Performance dependent on web page structure | Yes | Yes |
| Performance dependent on characteristics of data | Yes | No |
| Benefits greater when popularity rises | Yes | Less |
| Requires main site hardware/software installation | No | Yes |
| Requires web-page code changes | Yes | No |
| Requires network infrastructure (CDN services) | Yes | No |
| Can exploit information available from CDN for page construction | Yes | No |