Statistical Anomaly Detection for Database Monitoring
Optimization, Backups, Replication, and more
Baron Schwartz, Peter Zaitsev &
Vadim Tkachenko
High Performance MySQL
3rd Edition
Covers Version 5.5
www.vividcortex.com | [email protected]
THE ONLY TOOL YOU NEED for MySQL Performance Management
Statistical Anomaly Detection
The Problem
• Measure all the things!
• Oops. That’s a lot of data.
• Alert spam!
• Now what?
• Find the signal in the noise... but how?
“Maybe Anomaly Detection”
• “Show me anomalies, not all the data!”
• (There’s a certain logic to this; thresholds are a crude anomaly detection method.)
Typical Train Of Thought
• Anomalies are a subset
• Likely to be the important subset
• Anomalies thus likely to be bad and rare
• Anomalies will restore me to sanity
Techniques
• Statistics
• Machine Learning
• Artificial Intelligence
• Physical Models
• More...
Deflating Some Of The Things
• Anomalies aren’t rare
• Anomalies aren’t bad
• Anomalies aren’t objective truths
More Deflation
• Predicting isn’t judging
• You’re probably applying a model, frame of reference, and value judgment unconsciously
The Usual Process
• Define “normal”
• Predict
• Compare (quantify prediction error)
• Flag as anomalous if error too large
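The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a method from the talk: the function name `flag_anomalies`, the sample values, and the choice of a plain historical mean as the "prediction" are all assumptions made for the example.

```python
from statistics import mean, stdev

def flag_anomalies(history, new_points, max_sigmas=3.0):
    """Hypothetical helper: define normal, predict, compare, flag."""
    baseline = mean(history)      # 1. define "normal" (assumed: historical mean)
    spread = stdev(history)       #    and its typical variation
    flagged = []
    for x in new_points:
        predicted = baseline              # 2. predict
        error = abs(x - predicted)        # 3. compare (prediction error)
        if error > max_sigmas * spread:   # 4. flag if the error is too large
            flagged.append(x)
    return flagged

# Made-up sample data: mean 100, sample standard deviation 2.
history = [100, 102, 98, 101, 99, 100, 103, 97]
result = flag_anomalies(history, [101, 250, 96])
print(result)  # [250]
```

Every step hides a choice: what counts as history, what the predictor is, and how large "too large" is.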
MORE Assumptions!
• “Data is normally distributed”
• “Normal distribution and sigmas is the model”
• “Everyone knows 3 sigmas is the standard”
• “All models result in normally distributed errors”
In Reality...
• Gaussian models are oft-used because they’re convenient, not because they’re the sole truth
• Sigmas are just a proxy for probabilities
As Simple As...?
I’d phrase it this way: if you can find a meaningful model that non-destructively transforms the data such that the mean is stable and the prediction errors are normally distributed, and you define an anomaly as an event whose prediction error is larger than 99.7% of prediction errors, then anomaly detection is simple 3-sigma math. That’s a lot of assumptions, but at least they’re stated.
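To make the link between sigmas and probabilities concrete, here is a small sketch using Python's standard-library `statistics.NormalDist`. The helper names `coverage` and `sigmas_for` are invented for this example, and the numbers only hold if the prediction errors really are normally distributed.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(sigmas):
    """Fraction of a normal distribution within +/- `sigmas` of the mean."""
    return std_normal.cdf(sigmas) - std_normal.cdf(-sigmas)

def sigmas_for(coverage_fraction):
    """Two-sided sigma multiple that encloses `coverage_fraction` of the mass."""
    tail = (1.0 - coverage_fraction) / 2.0
    return std_normal.inv_cdf(1.0 - tail)

print(round(coverage(3.0), 4))    # 0.9973
print(round(sigmas_for(0.997), 2))  # 2.97
```

Picking "3 sigmas" is therefore the same as picking a probability of roughly 0.3% for an event to be flagged, under the normality assumption.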
Control Charts
Is the process within normal limits?
Problem
Control charts assume a stationary mean.
Systems are “less normal” than we assume, in both senses.
Recency
What is a system’s “recent” normal?
Moving Average
Average over a window of recent data
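A windowed moving average might look like the sketch below. The class name `MovingAverage` and the window size are illustrative assumptions. Note that the whole window has to be kept in memory so the oldest sample can be dropped.

```python
from collections import deque

class MovingAverage:
    """Hypothetical sliding-window mean over the last `window` samples."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        self.values = deque(maxlen=window)
        self.total = 0.0

    def update(self, x):
        """Add a sample and return the mean of the current window."""
        if len(self.values) == self.values.maxlen:
            self.total -= self.values[0]   # oldest sample is about to fall out
        self.values.append(x)
        self.total += x
        return self.total / len(self.values)

ma = MovingAverage(window=3)
averages = [ma.update(x) for x in [1, 2, 3, 4, 5]]
print(averages)  # [1.0, 1.5, 2.0, 3.0, 4.0]
```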
Moving Control Charts
Problems
Moving average is “more expensive” to compute
Moving average is influenced by “distant” past
These days should be remembered and kept throughout every generation
- Esther 9:28
Remember All The Things
Exponential Moving Averages
• Infinite memory, biased towards recent history (past data trails off to nothing)
• Cheap to compute
• Choose a decay factor α
• S_t = α x_t + (1 - α) S_{t-1}
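The recurrence above translates almost directly into code. In this sketch the function name `ewma` is made up, and seeding the average with the first sample is an assumption (the slide does not say how to initialize it).

```python
def ewma(samples, alpha):
    """Exponentially weighted moving average: s = alpha*x + (1 - alpha)*s_prev."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    s = None
    for x in samples:
        # Assumption: the first sample seeds the average.
        s = x if s is None else alpha * x + (1.0 - alpha) * s
        smoothed.append(s)
    return smoothed

print(ewma([10, 20, 20], alpha=0.5))  # [10, 15.0, 17.5]
```

Only one number of state is carried between samples, which is what makes it cheap compared with a windowed average.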
EWMA = Low-Pass Filter
Choosing Decay
α = 2/(N+1), where N is the window length of a simple moving average with the same average sample age
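One way to read N: with α = 2/(N+1), the weighted average age of the samples in the EWMA works out to (N-1)/2, the same as for an N-sample simple moving average. The sketch below checks this numerically. The function names and the choice of N = 9 are illustrative assumptions.

```python
def alpha_for_span(n):
    """Decay factor matching an n-sample simple moving average."""
    return 2.0 / (n + 1.0)

def mean_age(alpha, terms=10_000):
    """Weighted average age of samples in an EWMA.

    A sample of age k (k = 0 is the newest) carries weight alpha * (1 - alpha)**k.
    The infinite sum is truncated at `terms`, which is ample for this alpha.
    """
    return sum(k * alpha * (1.0 - alpha) ** k for k in range(terms))

alpha = alpha_for_span(9)          # 0.2
print(round(mean_age(alpha), 6))   # 4.0, i.e. (9 - 1) / 2
```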
Exponential Moving Control Charts
• Need exponential moving average - easy
• Need exponential moving standard deviation - hmm.
• Standard deviation = square root of variance
• Variance = “mean of square minus square of mean”
• MVP solution: exponential moving avg of squared values
• (see also http://en.wikipedia.org/wiki/EWMA_chart)
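The "MVP solution" in the list above can be sketched as two EWMAs, one of the values and one of their squares, combined into a moving variance. This is an illustration under assumptions, not VividCortex's implementation: the class name, the first-sample seeding, the 3-sigma default, and the sample readings are all made up for the example.

```python
import math

class EwmaControlChart:
    """Hypothetical exponential moving control chart."""

    def __init__(self, alpha, sigmas=3.0):
        self.alpha = alpha
        self.sigmas = sigmas
        self.mean = None      # EWMA of x
        self.mean_sq = None   # EWMA of x squared

    def update(self, x):
        """Return True if x is outside the current limits, then learn from x."""
        if self.mean is None:
            # Assumption: seed both averages with the first sample.
            self.mean, self.mean_sq = float(x), float(x) * x
            return False
        # Variance = mean of square minus square of mean (clamped at zero).
        variance = max(self.mean_sq - self.mean ** 2, 0.0)
        limit = self.sigmas * math.sqrt(variance)
        anomalous = limit > 0.0 and abs(x - self.mean) > limit
        a = self.alpha
        self.mean = a * x + (1.0 - a) * self.mean
        self.mean_sq = a * x * x + (1.0 - a) * self.mean_sq
        return anomalous

chart = EwmaControlChart(alpha=0.2)
readings = [10, 12, 9, 11, 10, 12, 9, 11, 10, 50]
flags = [chart.update(x) for x in readings]
print(flags)  # only the final reading (50) is flagged
```

Two design choices are worth questioning in real use: this sketch lets flagged points update the baseline, and the "mean of square minus square of mean" form can lose precision when the values are large relative to their spread.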
EWMA Chart
Shortcomings
• Works well when data is approximately normally distributed
• Non-Gaussian data throws “standard deviation” for a loop; false positives ensue
• Requires more advanced techniques
• STILL USEFUL ANYWAY.
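To see how non-Gaussian data produces false positives, the sketch below applies a 3-sigma upper limit to skewed data. The exponential distribution and the evenly spaced (deterministic) sampling are assumptions chosen to keep the example reproducible. They are not data from the talk.

```python
import math
from statistics import NormalDist, mean, pstdev

def exponential_quantile_sample(n):
    """Deterministic, evenly spaced quantiles of an Exponential(1) distribution."""
    return [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]

def fraction_above_three_sigma(samples):
    """Share of samples above mean + 3 standard deviations."""
    threshold = mean(samples) + 3.0 * pstdev(samples)
    return sum(1 for x in samples if x > threshold) / len(samples)

skewed = exponential_quantile_sample(10_000)
observed = fraction_above_three_sigma(skewed)
expected_if_normal = 1.0 - NormalDist().cdf(3.0)

print(f"observed:           {observed:.4f}")
print(f"expected if normal: {expected_if_normal:.4f}")
```

On this skewed data, well over ten times as many points exceed the limit as the normal model predicts, and none of them is unusual for the distribution that actually generated them.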
False Positives Considered Harmful
• We’re spending all our time looking for failures.
• False-positive failure detections are BAD.
• See http://danslimmon.wordpress.com/2012/11/02/car-alarms-and-smoke-alarms-the-tradeoff-between-sensitivity-and-specificity/ for more on this important topic.
Solutions (?)
• Non-parametric tests
• Non-statistical methods
• Throw out anomaly detection?
Other Disciplines
• Finance
• Weather Forecasting
• Signal Processing
• Statistics, More Broadly
• Physics
Companies / References
• VividCortex doesn’t have a dog in this fight...
• If you’re interested, look into:
• Metafor Software
• Numenta Grok
• Etsy’s Kale
• Ted Dunning at MapR
• Anton Lebedevich (mabrek.github.io)
• Disclaimer: I haven’t yet seen results good enough to alert on.
What Does VividCortex Do?
• We use Adaptive Fault Detection
• Related, but NOT anomaly detection
• Uses is-work-getting-done model
• Detects small stalls / unavailability
• Not an all-purpose tool, but it’s useful
Conclusions
• Anomaly detection won’t fix alert spam IMO.
• It can be a good assistive technology.
• It can be a good component of a larger system.
• IT/monitoring is hoping for a silver bullet.
• I suggest focusing on meaning, not metrics.
Need Performance Management?
• Performance Management, not Monitoring
• Unifying Principle: Measure Work Getting Done
• 1-Second Resolution, Deep Insight
• Super-Simple Install, Zero Disruption/Config
• Fully Hosted, Low-Cost