Rob Lancaster, Orbitz Worldwide: Survival Analysis & TTL Optimization
Post on 11-Jan-2016
TRANSCRIPT
Slide 1
Rob Lancaster, Orbitz Worldwide
Survival Analysis & TTL Optimization

Outline
  The Problem
  Survival Analysis
    Intro
    Key Terms
    Techniques & Models:
      Kaplan-Meier Estimates
      Parametric Models
  Optimizing Cache TTL
    Methods
    Results

The Problem
The hotel rate cache and TTL optimization.

The Hotel Rate Cache
  Key/Value Store
    Key: Search Criteria
    Value: Hotel Rate Information
  Key fields: hotel id, check-in, # people, host, check-out, # rooms
  Benefit = Reduce looks & latency
  Cost = Increased re-price errors

The Hotel Rate Cache
  Each cache entry is given a time-to-live (TTL).
  TTLs set based on intuition ages ago.
  Goal: Optimize TTL to decrease looks, control re-price errors.
  How? Ideally, find the greatest TTL value at which the probability of a rate change is below an acceptable threshold.
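
The "How?" step above can be sketched in Python, assuming a survival function S(t) (the probability that a cached rate has not yet changed by time t) is already available. The helper name `max_ttl`, the exponential rate, and the 5% threshold are hypothetical illustration values, not figures from the talk. Because S(t) is non-increasing, a simple bisection is enough to find the largest acceptable TTL.

```python
import math
from typing import Callable


def max_ttl(survival: Callable[[float], float],
            threshold: float,
            upper: float,
            tol: float = 1e-6) -> float:
    """Largest t in [0, upper] with P(rate change by t) = 1 - S(t) <= threshold.

    Relies on S(t) being non-increasing, which makes bisection valid.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be a probability")
    if 1.0 - survival(upper) <= threshold:
        return upper  # even the longest candidate TTL is acceptable
    lo, hi = 0.0, upper  # invariant: lo is acceptable, hi is not
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if 1.0 - survival(mid) <= threshold:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical example: exponential survival with a made-up rate per minute.
RATE = 0.01
ttl = max_ttl(lambda t: math.exp(-RATE * t), threshold=0.05, upper=24 * 60)
print(f"TTL (minutes): {ttl:.2f}")
```

Any fitted survival curve, whether a Kaplan-Meier step function or a parametric model, could be passed in place of the lambda as long as it is non-increasing.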

Survival Analysis
A brief? introduction.

What is Survival Analysis?
  Statistical procedures for predicting time until an event occurs.
  Event: death, relapse, recovery, failure.
  Examples:
    Heart transplant patients: Time until death.
    Leukemia patients in remission: Time until relapse.
    Prison parolees: Re-arrest.

Key Terms
  Survival Time, T vs. t
  Failure
  Censoring
  Survival Function

Censoring
  Period of no information
    Left-censored.
    Right-censored.
  Causes:
    Individual is lost to follow-up
    Death from cause unrelated to event of interest
    Study ends
  Models assume either failure or censoring.

Survival Function
  Survival Function: S(t)
  Probability of survival greater than t, i.e. that T > t
  Properties:
    Non-increasing
    S(t) = 1, for t = 0.
    S(t) = 0, for t = ∞.

Kaplan-Meier Estimates

  tj   mj   qj   nj
   0    0    0   14
   1    1    0   14
   2    1    1   13
   4    2    1   11
   6    0    2    8
   7    1    0    6
   9    1    0    5
  10    2    2    4

  tj: observation time
  mj: number of failures
  qj: number of censored observations
  nj: number at risk
Kaplan-Meier Estimates
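
As a minimal sketch of how the table above turns into a survival curve, the Kaplan-Meier (product-limit) estimate multiplies the conditional survival fractions at each observation time: S(tj) = S(tj-1) * (1 - mj / nj). Censored observations (qj) do not count as failures; they only shrink the number at risk for later rows, which the nj column already reflects. The rows below are the counts as read from the slide, and the function name `kaplan_meier` is an illustrative choice.

```python
from typing import List, Tuple

# (t_j, m_j, q_j, n_j) rows as read from the slide's table.
ROWS: List[Tuple[int, int, int, int]] = [
    (0, 0, 0, 14),
    (1, 1, 0, 14),
    (2, 1, 1, 13),
    (4, 2, 1, 11),
    (6, 0, 2, 8),
    (7, 1, 0, 6),
    (9, 1, 0, 5),
    (10, 2, 2, 4),
]


def kaplan_meier(rows: List[Tuple[int, int, int, int]]) -> List[Tuple[int, float]]:
    """Return [(t_j, S(t_j))] using S(t_j) = S(t_{j-1}) * (1 - m_j / n_j)."""
    estimates = []
    s = 1.0
    for t, m, _q, n in rows:
        s *= 1.0 - m / n  # fraction of the at-risk group surviving past t_j
        estimates.append((t, s))
    return estimates


km = kaplan_meier(ROWS)
for t, s in km:
    print(f"t={t:>2}  S(t)={s:.4f}")
```

The estimate only drops at times with at least one failure, so the row with mj = 0 leaves S(t) unchanged.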

Parametric Models
  Accelerated Failure Time
  Assume distribution
  Use regression to fit parameters.
  The distribution parameter is expressed in terms of predictor variables and regression parameters.

Distribution and S(t)
  Exponential
  Weibull
  Log-logistic
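
For reference, one common textbook parameterization of these three survival functions is given below. The symbols \lambda (rate or scale) and p (shape) are assumed notation and may differ from what the slide used.

```latex
\begin{aligned}
\text{Exponential:}\quad  & S(t) = \exp(-\lambda t) \\
\text{Weibull:}\quad      & S(t) = \exp(-\lambda t^{p}) \\
\text{Log-logistic:}\quad & S(t) = \frac{1}{1 + \lambda t^{p}}
\end{aligned}
```

Each form satisfies S(0) = 1 and decreases toward 0 as t grows. In an accelerated failure time version of the exponential model, for instance, the rate is commonly tied to the predictors as 1/\lambda = \exp(\beta_0 + \beta_1 x_1 + \dots), with the \beta coefficients fit by regression.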

Optimizing Cache TTL
Methods and early results.

Data Collection
  Data is collected from service hosts in our hotel stack.
  Includes every live rate search (aka burst) performed by our hotel stack.
  Raw data: ~200 GB, compressed, 10^8 records.
  Extraction: