Evaluating Content Management Techniques for Web Proxy Caches

Internet Server
Evaluating Content Management Techniques for Web Proxy Caches
Cho Joon-ho (CA Lab, CS department, KAIST)
2001. 11. 6
Martin Arlitt, Ludmila Cherkasova, John Dilley, Rich Friedrich and Tai Jin (Hewlett-Packard Laboratories) (in 2nd Workshop on Internet Server Performance, in conjunction with ACM SIGMETRICS 99)

Page 1: Evaluating Content Management Techniques for Web Proxy Caches

Internet Server

Evaluating Content Management Techniques for Web Proxy Caches

Cho Joon-ho (CA Lab, CS department, KAIST)

2001. 11. 6

Martin Arlitt, Ludmila Cherkasova, John Dilley, Rich Friedrich and Tai Jin (Hewlett-Packard Laboratories)

(in 2nd Workshop on Internet Server Performance, in conjunction with ACM SIGMETRICS 99)

Page 2: Evaluating Content Management Techniques for Web Proxy Caches

Evaluating Content Management Tech for Web Proxy Caches 2 / 20

Agenda

Problems (current)
Quick Tour (Summary)
Critique
Design & Design Rationale
  Data Collection and Reduction
  Key Workload Characteristics
  Experimental Design
Simulation Results
Virtual Cache

Page 3: Evaluating Content Management Techniques for Web Proxy Caches

Problems

Current Web proxy caches use simple replacement policies
  Relatively low hit rates
  Additional delays

So what?
  Developing a quantitative understanding of Web traffic
    How effective are current proxy cache replacement policies for real workloads?
    Focus on two performance metrics
      Hit rate
      Byte hit rate
  Designing new replacement policies
    That use frequency for higher performance
    That are neither susceptible to cache pollution nor dependent on parameterization
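The two metrics named above can be made concrete with a small sketch. It assumes the usual definitions: hit rate is the fraction of requests served from the cache, and byte hit rate is the fraction of requested bytes served from the cache. The function name and the sample data are hypothetical and are not taken from the slides.

```python
from typing import Iterable, Tuple


def hit_and_byte_hit_rate(outcomes: Iterable[Tuple[int, bool]]) -> Tuple[float, float]:
    """Compute (hit rate, byte hit rate).

    Each outcome is (object_size_in_bytes, served_from_cache) for one request.
    """
    requests = hits = total_bytes = hit_bytes = 0
    for size, hit in outcomes:
        requests += 1
        total_bytes += size
        if hit:
            hits += 1
            hit_bytes += size
    hit_rate = hits / requests if requests else 0.0
    byte_hit_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_rate, byte_hit_rate


# Illustrative data: three small objects hit, one large object misses.
sample = [(1_000, True), (1_000, True), (1_000, True), (97_000, False)]
hr, bhr = hit_and_byte_hit_rate(sample)
print(f"hit rate = {hr:.2f}, byte hit rate = {bhr:.2f}")
```

In this made-up sample the hit rate is high while the byte hit rate is low, which is why a policy tuned for one metric can look poor on the other.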

Page 4: Evaluating Content Management Techniques for Web Proxy Caches

Agenda

Problems
Quick Tour (Summary) (current)
Critique
Design & Design Rationale
  Data Collection and Reduction
  Key Workload Characteristics
  Experimental Design
Simulation Results
Virtual Cache

Page 5: Evaluating Content Management Techniques for Web Proxy Caches

Quick Tour (Summary) – 1/3

Problems with existing studies
  They use short-term traces of busy proxies or long-term traces of relatively inactive proxies
  Long-term traces in busy environments are needed

Trace-driven simulation
  Collected a total of 117,652,652 requests over five months
  Uses a smaller, more compact log

Points to be considered
  Object size
  Recency of reference
  Frequency of reference
  Turnover

Page 6: Evaluating Content Management Techniques for Web Proxy Caches

Quick Tour (Summary) – 2/3

Existing replacement policies
  LRU (Least-Recently-Used)
  Size – replaces the largest object
  GD-Size (GreedyDual-Size) – replaces the object with the lowest utility
    K_i = C_i / S_i + L
  LFU – replaces the least frequently used object

New replacement policies
  GDSF (GreedyDual-Size with Frequency) – GD-Size + a frequency factor
    K_i = F_i * C_i / S_i + L
  LFU-DA (Least Frequently Used with Dynamic Aging) – LFU-Aging + a dynamic mechanism (running age L)
    K_i = C_i * F_i + L

Virtual Caches
  Logically partition the cache into N virtual caches

Page 7: Evaluating Content Management Techniques for Web Proxy Caches

Quick Tour (Summary) – 3/3

Analysis of Virtual Cache Performance; VC0 using GDSF-Hits, VC1 using LFU-DA

Comparison of Proposed Policies to Existing Replacement Policies

Page 8: Evaluating Content Management Techniques for Web Proxy Caches

Agenda

Problems
Quick Tour (Summary)
Critique (current)
Design & Design Rationale
  Data Collection and Reduction
  Key Workload Characteristics
  Experimental Design
Simulation Results
Virtual Cache

Page 9: Evaluating Content Management Techniques for Web Proxy Caches

Critique

Pros
  Quantitative understanding of Web traffic
    Long-term trace-driven simulation on busy proxy servers
  Provides two new replacement algorithms that run efficiently
  Provides a new cache management method, ‘Virtual Cache’

Cons
  Not fresh
  No consideration of dynamic data
  No consideration of processing overhead for these more complex algorithms
  Performance improvements are insignificant

Page 10: Evaluating Content Management Techniques for Web Proxy Caches

Agenda

Problems
Quick Tour (Summary)
Critique
Design & Design Rationale (current)
  Data Collection and Reduction (current)
  Key Workload Characteristics (current)
  Experimental Design (current)
Simulation Results
Virtual Cache

Page 11: Evaluating Content Management Techniques for Web Proxy Caches

Data Collection and Reduction

Data collection
  Long-term trace-driven simulation
  A total of 117,652,652 requests were handled over a five-month period
  Logged data include
    Client IP address, request time, response status, the time required for the proxy to complete its response…

Data reduction
  Smaller, more compact log
    Due to storage constraints
    To ensure that analyses and simulations could be completed in a reasonable amount of time
  Reduction by
    Storing data in a more efficient manner
    Removing information of little value
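The slides do not say how the log was compacted, so the following is only one possible sketch of the two reduction ideas above. It assumes that long strings (client IPs, URLs) are replaced by small integer IDs and that only a few fields are kept, packed into fixed-width binary records. The record layout, the field choice, and the function name are all hypothetical.

```python
import struct

# Hypothetical fixed-width record: timestamp, client id, URL id, status, size.
# "<" means little-endian with no padding, so each record is 18 bytes.
RECORD = struct.Struct("<IIIHI")


def reduce_log(entries):
    """Pack (timestamp, client_ip, url, status, size) tuples into compact bytes.

    Returns the packed log and the ID tables needed to interpret it.
    """
    client_ids, url_ids = {}, {}
    packed = bytearray()
    for ts, client_ip, url, status, size in entries:
        # Assign the next free integer ID the first time a value is seen.
        cid = client_ids.setdefault(client_ip, len(client_ids))
        uid = url_ids.setdefault(url, len(url_ids))
        packed += RECORD.pack(ts, cid, uid, status, size)
    return bytes(packed), client_ids, url_ids


# Made-up entries for illustration only.
entries = [
    (1000, "10.0.0.1", "http://example.com/a.gif", 200, 4096),
    (1001, "10.0.0.2", "http://example.com/a.gif", 200, 4096),
    (1002, "10.0.0.1", "http://example.com/b.html", 304, 0),
]
packed, client_ids, url_ids = reduce_log(entries)
records = [RECORD.unpack_from(packed, i * RECORD.size) for i in range(len(entries))]
print(len(packed), "bytes for", len(records), "records")
```

Integer IDs are enough for cache simulation because a replacement policy only needs to know whether two requests refer to the same object, not what the URL text is.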

Page 12: Evaluating Content Management Techniques for Web Proxy Caches

Key Workload Characteristics

Cacheable objects
  Most client requests are for cacheable objects (96%)
Object set size
  389 GB in total
Object sizes
  Variable – median: 4 KB, maximum: 148 MB (a video clip)
Recency of reference
  1/3 of all re-references occurred within one hour
Frequency of reference
  Web referencing patterns are non-uniform
Turnover
  Objects that were once popular are no longer requested
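A recency figure such as "1/3 of all re-references occurred within one hour" can be measured from a trace roughly as follows. This sketch assumes a re-reference gap is the time since the previous request for the same object; the trace format and function names are hypothetical.

```python
def rereference_gaps(trace):
    """Return the time gaps (seconds) between successive requests to the same URL.

    trace is an iterable of (timestamp_seconds, url) in time order.
    """
    last_seen = {}
    gaps = []
    for ts, url in trace:
        if url in last_seen:
            gaps.append(ts - last_seen[url])
        last_seen[url] = ts
    return gaps


def fraction_within(gaps, window_seconds):
    """Fraction of re-references whose gap is at most window_seconds."""
    if not gaps:
        return 0.0
    return sum(g <= window_seconds for g in gaps) / len(gaps)


# Made-up trace for illustration only.
trace = [(0, "a"), (10, "b"), (1800, "a"), (7200, "b"), (9000, "a")]
gaps = rereference_gaps(trace)
within_hour = fraction_within(gaps, 3600)
print(gaps, within_hour)
```

A high fraction of short gaps is what makes recency-based policies such as LRU effective; the turnover item above is the reason recency alone is not sufficient over long traces.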

Page 13: Evaluating Content Management Techniques for Web Proxy Caches

Experimental Design – 1/2

Least-Recently-Used (LRU)
  Replaces the object requested least recently
  Considers only a single workload characteristic

Size
  Replaces the largest object
  Tries to minimize the miss ratio (that is, it targets hit rate, not byte hit rate)
  Suffers from cache pollution

GreedyDual-Size (GD-Size)
  GD-Size(1) for hit rate
  GD-Size(Packets) for byte hit rate
  K_i = C_i / S_i + L
    C_i – the cost associated with bringing object i into the cache
    S_i – the object size
    L – a running age factor
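The GD-Size key above can be turned into a small simulator sketch. It assumes the usual GreedyDual-Size bookkeeping: the object with the smallest key is evicted, L is then set to that key, and a key is recomputed with the current L on every hit. The class name is hypothetical, the default cost of 1 corresponds to GD-Size(1), object sizes are assumed to be positive, and the linear scan for the victim is chosen for brevity rather than speed.

```python
class GDSizeCache:
    """Minimal GreedyDual-Size sketch using K_i = C_i / S_i + L."""

    def __init__(self, capacity_bytes, cost=lambda size: 1.0):
        self.capacity = capacity_bytes
        self.cost = cost      # C_i as a function of size; 1.0 gives GD-Size(1)
        self.used = 0
        self.age = 0.0        # L, the running age factor
        self.key = {}         # url -> K_i
        self.size = {}        # url -> S_i (assumed > 0)

    def _priority(self, size):
        return self.cost(size) / size + self.age

    def access(self, url, size):
        """Return True on a hit, False on a miss."""
        if url in self.key:
            # Hit: refresh the key with the current age.
            self.key[url] = self._priority(self.size[url])
            return True
        if size > self.capacity:
            return False      # too large to cache at all
        while self.used + size > self.capacity:
            victim = min(self.key, key=self.key.get)
            self.age = self.key[victim]   # L becomes the evicted key
            self.used -= self.size.pop(victim)
            del self.key[victim]
        self.key[url] = self._priority(size)
        self.size[url] = size
        self.used += size
        return False


cache = GDSizeCache(100)
results = [
    cache.access("big", 80),
    cache.access("small", 10),
    cache.access("mid", 30),     # forces an eviction; "big" has the lowest key
    cache.access("small", 10),
]
print(results, sorted(cache.key))
```

With a constant cost the key is inversely proportional to size, so large objects are evicted first, which favors hit rate over byte hit rate.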

Page 14: Evaluating Content Management Techniques for Web Proxy Caches

Experimental Design – 2/2

LFU
  Replaces the least frequently used object
  LFU-Aging = LFU + Aging → avoids cache pollution
  The parameterization problem still remains

GreedyDual-Size with Frequency (GDSF)
  GD-Size does not take frequency into account
  K_i = F_i * C_i / S_i + L
    F_i – a frequency count

Least Frequently Used with Dynamic Aging (LFU-DA)
  LFU-Aging requires parameterization to perform well
  LFU-DA uses an inflation factor as well as the frequency count
  K_i = C_i * F_i + L
    L – a running age factor
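The two keys above differ only in whether size enters the formula, and a short sketch shows how that changes the eviction choice. It assumes a cost of 1 for every object, a frequency count that starts at 1 on insertion and is incremented on each hit, and the same aging rule as GD-Size (L is set to the key of the evicted object). The class and function names are hypothetical.

```python
class AgingKeyCache:
    """Evicts the object with the smallest key K_i; L is set to the evicted key."""

    def __init__(self, capacity_bytes, key_fn):
        self.capacity = capacity_bytes
        self.key_fn = key_fn      # (freq, cost, size, age) -> K_i
        self.used = 0
        self.age = 0.0            # L
        self.entries = {}         # url -> [key, size, freq]

    def access(self, url, size, cost=1.0):
        entry = self.entries.get(url)
        if entry is not None:
            entry[2] += 1         # one more reference while cached
            entry[0] = self.key_fn(entry[2], cost, entry[1], self.age)
            return True
        if size > self.capacity:
            return False
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda u: self.entries[u][0])
            self.age = self.entries[victim][0]
            self.used -= self.entries[victim][1]
            del self.entries[victim]
        self.entries[url] = [self.key_fn(1, cost, size, self.age), size, 1]
        self.used += size
        return False


def gdsf_key(freq, cost, size, age):
    return freq * cost / size + age      # K_i = F_i * C_i / S_i + L


def lfu_da_key(freq, cost, size, age):
    return cost * freq + age             # K_i = C_i * F_i + L


def residents_after(key_fn):
    cache = AgingKeyCache(100, key_fn)
    for url, size in [("big", 60), ("big", 60), ("small", 20), ("new", 30)]:
        cache.access(url, size)
    return sorted(cache.entries)


gdsf_residents = residents_after(gdsf_key)
lfu_da_residents = residents_after(lfu_da_key)
print("GDSF keeps:", gdsf_residents)
print("LFU-DA keeps:", lfu_da_residents)
```

On this made-up sequence GDSF evicts the large object despite its higher frequency, while LFU-DA evicts the less frequently used small object, which illustrates why the former tends to favor hit rate and the latter byte hit rate.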

Page 15: Evaluating Content Management Techniques for Web Proxy Caches

Agenda

Problems
Quick Tour (Summary)
Critique
Design & Design Rationale
  Data Collection and Reduction
  Key Workload Characteristics
  Experimental Design
Simulation Results (current)
Virtual Cache

Page 16: Evaluating Content Management Techniques for Web Proxy Caches

Simulation Results – 1/2

Figure 1. Comparison of Existing Replacement Policies

Page 17: Evaluating Content Management Techniques for Web Proxy Caches

Simulation Results – 2/2

Figure 2. Comparison of Proposed Policies to Existing Replacement Policies

Page 18: Evaluating Content Management Techniques for Web Proxy Caches

Agenda

Problems
Quick Tour (Summary)
Critique
Design & Design Rationale
  Data Collection and Reduction
  Key Workload Characteristics
  Experimental Design
Simulation Results
Virtual Cache (current)

Page 19: Evaluating Content Management Techniques for Web Proxy Caches

Virtual Cache – 1/2

An approach that can focus on both hit rate and byte hit rate simultaneously

Mechanism
  Logically partitions the cache into N virtual caches
  Each virtual cache (VC) is managed with its own replacement policy
  Steps
    Initially all objects are in VC0
    Replacements from VCi are moved to VCi+1
    Replacements from the last virtual cache, VCi+1, are evicted from the cache
    When reaccessed, objects are reinserted in VC0
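The steps above can be sketched as a chain of partitions in which victims cascade downward. To keep the sketch short, every virtual cache here uses plain LRU as a stand-in replacement policy; the slides instead pair policies such as GDSF-Hits and LFU-DA. The class name, the per-partition capacities, and the rule that an object too large for a partition simply leaves the cache are all assumptions.

```python
from collections import OrderedDict


class VirtualCache:
    """N virtual caches; victims of VC_i move to VC_{i+1}, the last VC evicts."""

    def __init__(self, capacities):
        self.capacities = list(capacities)                 # bytes per virtual cache
        self.levels = [OrderedDict() for _ in self.capacities]  # url -> size, oldest first
        self.used = [0] * len(self.capacities)

    def _find(self, url):
        for i, level in enumerate(self.levels):
            if url in level:
                return i
        return None

    def _insert(self, i, url, size):
        """Place an object in VC_i, demoting victims to VC_{i+1}."""
        if i >= len(self.levels) or size > self.capacities[i]:
            return  # past the last VC, or cannot fit: the object leaves the cache
        while self.used[i] + size > self.capacities[i]:
            # LRU stand-in: the oldest entry of this partition is the victim.
            victim, victim_size = self.levels[i].popitem(last=False)
            self.used[i] -= victim_size
            self._insert(i + 1, victim, victim_size)
        self.levels[i][url] = size
        self.used[i] += size

    def access(self, url, size):
        """Return True on a hit in any virtual cache, False on a miss."""
        i = self._find(url)
        if i is not None:
            size = self.levels[i].pop(url)
            self.used[i] -= size
            self._insert(0, url, size)   # reaccessed objects go back to VC0
            return True
        self._insert(0, url, size)       # new objects start in VC0
        return False


vc = VirtualCache([20, 20])
results = [vc.access(u, 10) for u in ["a", "b", "c", "a", "d", "e"]]
print(results, list(vc.levels[0]), list(vc.levels[1]))
```

In this run the second request for "a" hits in VC1 and moves it back to VC0, and "b" is the only object that falls out of the last partition.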

Page 20: Evaluating Content Management Techniques for Web Proxy Caches

Virtual Cache – 2/2

Figure 3. Analysis of Virtual Cache Performance; VC0 using GDSF-Hits, VC1 using LFU-DA

Figure 4. Analysis of Virtual Cache Performance; VC0 using LFU-DA, VC1 using GDSF-Hits