Analysis of Web Caching Architectures: Hierarchical and Distributed Caching
Pablo Rodriguez, Christian Spanner, and Ernst W. Biersack
IEEE/ACM Transactions on Networking, Vol. 9, No. 4, August 2001
Abstract
  Caching architectures
    Hierarchical
    Distributed
    Hybrid
  Analytical models
  Performance
    Connection time
    Transmission time
    Total latency
    Bandwidth
    Cache load
Caching architectures
  Hierarchical caching
    Institutional cache
    Intermediate cache
    National cache
  Distributed caching
    Institutional cache
  Network topology
The model
  Network model
    Full O-ary tree
  Document model
    Requests: Poisson distribution
    Popularity: Zipf distribution
  Hierarchical caching
    Caches are placed at the access points between two different networks.
  Distributed caching
    Caches are placed at the institutional network.
Network model
Document model
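\[ \lambda_I = \sum_{i=1}^{N} \lambda_{I,i} \]
\[ \lambda_{I,i} = \frac{\kappa \, \lambda_I}{i^{\alpha}}, \qquad \kappa = \left( \sum_{i=1}^{N} \frac{1}{i^{\alpha}} \right)^{-1}, \qquad 1 \le i \le N \]
where
  \alpha is a constant that determines how skewed the Zipf distribution is
  \lambda_{I,i} is the request rate of the i-th most popular document
  \lambda_I is the request rate from an institutional cache for all N documents
  \kappa is the normalizing constant that makes the per-document rates sum to \lambda_I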
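As a minimal sketch of the document model, the snippet below computes per-document request rates whose popularity follows a Zipf law and which sum to the total request rate of one institutional cache. The function name and the example values (total rate, number of documents, skew) are assumptions chosen for illustration, not figures from the paper.

```python
def zipf_request_rates(total_rate, n_docs, alpha):
    """Per-document request rates for one institutional cache.

    The rate of the i-th most popular document is proportional to
    1 / i**alpha, scaled so that all rates add up to total_rate.
    """
    weights = [1.0 / (i ** alpha) for i in range(1, n_docs + 1)]
    norm = sum(weights)  # inverse of the normalizing constant
    return [total_rate * w / norm for w in weights]


# Hypothetical example values (not taken from the paper).
rates = zipf_request_rates(total_rate=100.0, n_docs=1000, alpha=0.8)
```

A larger alpha concentrates more of the total rate on the few most popular documents.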
Properties and limitations of the model
  O-ary trees are good models. Other networks can easily be modeled by modifying the height or the number of tiers of the tree.
  The model assumes homogeneous client communities, but heterogeneous client communities can be easily modeled.
  The simulation results in this paper should be considered relative results.
Connection time
  Depends on the number of network links from the client to the cache.
  d: the per-hop propagation delay
  L_i: number of links that a request for document i travels
  l: the level of the tree
  z: number of links between a server and the root node
  H: number of links between root nodes
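Connection time (cont’d)
  Hierarchical caching:
\[ E[T_c^h] = 4d \sum_{l \in \{0,\, H,\, 2H,\, 2H+z\}} P(L_i = l)\,(l + 1) \]
  Distributed caching:
\[ E[T_c^d] = 4d \sum_{l=0}^{2H} P(L_i = l)\,(2l + 1) + 4d\, P(L_i = 2H + z)\,(2H + z + 1) \]
  Notes on the terms:
    4d: TCP three-way handshake
    l + 1 and 2l + 1: distance of transmission (in distributed caching a request first travels up, then down)
    Last term of E[T_c^d]: document fetched from the server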
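The two connection-time expressions, 4d times the sum of P(L_i = l)(l + 1) for hierarchical caching and 4d times the sum of P(L_i = l)(2l + 1) plus a server term for distributed caching, can be evaluated with a short sketch like the one below. The hit distributions and the delay d are hypothetical placeholders; in the paper the probabilities P(L_i = l) come from the analytical model, which is not reproduced here.

```python
def connection_time_hierarchical(d, hit_prob):
    """E[T_c^h] = 4d * sum over l of P(L_i = l) * (l + 1)."""
    return 4 * d * sum(p * (l + 1) for l, p in hit_prob.items())


def connection_time_distributed(d, hit_prob, server_level):
    """E[T_c^d]: 2l + 1 links for a cache hit found l levels up,
    server_level + 1 links when the origin server answers."""
    total = 0.0
    for l, p in hit_prob.items():
        links = server_level + 1 if l == server_level else 2 * l + 1
        total += p * links
    return 4 * d * total


# Hypothetical inputs: H and z as in the comparison slide, d in seconds,
# and made-up probabilities that sum to 1.
H, z, d = 3, 10, 0.015
hier_prob = {0: 0.4, H: 0.2, 2 * H: 0.1, 2 * H + z: 0.3}
dist_prob = {0: 0.4, 3: 0.2, 6: 0.1, 2 * H + z: 0.3}

t_hier = connection_time_hierarchical(d, hier_prob)
t_dist = connection_time_distributed(d, dist_prob, server_level=2 * H + z)
```

With identical hit probabilities, the distributed value is larger because each cache hit costs 2l + 1 links instead of l + 1.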
Transmission time
  Caches operate in a cut-through mode.
\[ E[T_t^h] = \sum_{l \in \{0,\, H,\, 2H,\, 2H+z\}} E[T_t^h \mid L_i = l]\, P(L_i = l) \]
\[ E[T_t^d] = \sum_{l=0}^{2H+z} E[T_t^d \mid L_i = l]\, P(L_i = l) \]
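[Equations: request rate at tree level l. For hierarchical caching the rate is given piecewise over the ranges bounded by l = 0, H, 2H and 2H + z, in terms of O, the institutional request rate, and the hit rates hit_I, hit_R and hit_N; a corresponding expression is given for distributed caching.]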
Comparison
  Parameters: O = 4, H = 3, z = 10, N = 250 million
Connection time
Network traffic at every tree level
Expected transmission time
(a) Non-congested national network
(b) Congested national network
Total latency
Heterogeneous client communities
(a) Expected connection time
(b) Expected transmission time
Bandwidth usage
  The expected number of links traversed to distribute one packet to the clients.
(a) Regional network
(b) National network
Cache load
  The filtered request rate
Disk space
  The average Web document size S times the average number of copies present in the caching infrastructure.
  The average number of copies present in the caching infrastructure can be calculated using the probability that a new document copy is created at every cache level.
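The disk-space estimate can be sketched as follows, assuming the number of caches at each level and the probability that a copy is created at each level are already known. The function name, the per-level cache counts, the probabilities, and the document size are all hypothetical placeholders; the paper derives the probabilities from its model.

```python
def expected_disk_space(avg_doc_size, caches_per_level, copy_prob_per_level):
    """Average document size times the expected number of copies.

    The expected number of copies is the sum, over cache levels, of the
    number of caches at that level times the probability that a copy
    is created in a cache at that level.
    """
    expected_copies = sum(
        n * p for n, p in zip(caches_per_level, copy_prob_per_level)
    )
    return avg_doc_size * expected_copies


# Hypothetical values: sizes in KB, three cache levels.
space_kb = expected_disk_space(
    avg_doc_size=15.0,
    caches_per_level=[4096, 64, 1],
    copy_prob_per_level=[0.1, 0.5, 0.9],
)
```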
Disk space (cont’d)
A hybrid caching scheme
  A certain number of caches, k, cooperate at every network level.
  When a document cannot be found in a cache:
    The cache checks whether the document resides in any of the cooperating caches.
    If multiple caches have a copy of the document, the neighbor cache with the lowest latency is selected.
    Otherwise, the request is forwarded to the immediate parent cache or to the server.
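The lookup steps above can be sketched as a small resolution routine. This is only an illustration under stated assumptions: the `Cache` class, the `resolve` function, the latency values, and the cache names are hypothetical, and the sketch ignores how cooperating caches learn about each other's contents.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Cache:
    name: str
    documents: Set[str] = field(default_factory=set)
    # Cooperating cache name -> latency to reach it (hypothetical units).
    neighbors: Dict[str, float] = field(default_factory=dict)
    parent: Optional[str] = None  # None means the next hop is the server


def resolve(caches: Dict[str, Cache], start: str, doc: str) -> str:
    """Return the name of the cache that serves doc, or "server"."""
    current: Optional[str] = start
    while current is not None:
        cache = caches[current]
        if doc in cache.documents:
            return current
        # Cooperating caches at the same level that hold a copy.
        holders = [
            (latency, name)
            for name, latency in cache.neighbors.items()
            if doc in caches[name].documents
        ]
        if holders:
            return min(holders)[1]  # lowest-latency neighbor wins
        current = cache.parent  # otherwise go up one level
    return "server"


caches = {
    "a": Cache("a", neighbors={"b": 5.0, "c": 2.0}, parent="p"),
    "b": Cache("b", documents={"x"}),
    "c": Cache("c", documents={"x"}),
    "p": Cache("p", documents={"y"}),
}
```

Here a request for "x" at cache "a" is served by neighbor "c" (the lower-latency holder), a request for "y" climbs to parent "p", and anything else falls through to the server.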
Connection time
Connection time (cont’d)
Transmission time
Transmission time (cont’d)
Total latency
Bandwidth usage
(a) National network
(b) Regional network
Cache load
Conclusions
  Hierarchical caching architecture
    Reduces the expected distance to hit a document
    Decreases the bandwidth usage
    Reduces the administrative concerns
    Needs powerful intermediate caches or load-balancing algorithms
  Distributed caching architecture
    Large network distances
    High bandwidth usage
    Administrative issues
  The hybrid scheme is the best