metrics simplified
![Page 2: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/2.jpg)
why?
"If you can not measure it, you can not improve it" -Lord Kelvin
99.999% ("five nines") availability = 5.26 minutes of downtime per year
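The five-nines figure can be checked with a little arithmetic. This is a minimal sketch, assuming an average year of 365.25 days; the function name `downtime_minutes_per_year` is made up for illustration.

```python
# Average year length in minutes, including leap days (an assumption of this sketch).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% -> {minutes:.2f} minutes/year")
```

Each extra nine cuts the downtime budget by a factor of ten, which is why measuring availability precisely matters at this level.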
![Page 3: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/3.jpg)
previously ...
Sending/collecting is complicated.
Single collection server.
Tedious to configure new metric collection or creation.
Calculating metrics from files is expensive.
![Page 4: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/4.jpg)
bottlenecks ...
Poll-based collection server
Not easy (!fun) to configure new metric collection or creation.
= grunt work for ops engineers
uhhhh....
![Page 5: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/5.jpg)
enabling technology
Graphite
RabbitMQ
Graphite Local Proxy
Rocksteady (w/ Esper)
![Page 6: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/6.jpg)
path to graph
1min.juicer.output.apple.sc1.jcr1 20 1276822626
echo "1min.juicer.output.apple.sc1.jcr1 20 1276822626" | nc localhost 3400
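The same one-line send can be done from application code. This is a minimal sketch, assuming the local proxy accepts the plain `path value timestamp` line over TCP on `localhost:3400` as in the `nc` example; the helper names `format_metric` and `send_metric` are hypothetical.

```python
import socket
import time
from typing import Optional


def format_metric(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Build one 'path value timestamp' line, newline-terminated like echo's output."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_metric(line: str, host: str = "localhost", port: int = 3400,
                timeout: float = 2.0) -> None:
    """Open a TCP connection, send the line, and close (what `nc` does above)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    line = format_metric("1min.juicer.output.apple.sc1.jcr1", 20, 1276822626)
    try:
        send_metric(line)
    except OSError as exc:
        # No proxy listening locally; report instead of crashing.
        print(f"could not send metric: {exc}")
```

Keeping the wire format to a single text line is what makes sending from any language or shell script trivial.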
![Page 7: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/7.jpg)
![Page 8: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/8.jpg)
graph
![Page 9: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/9.jpg)
graph
![Page 10: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/10.jpg)
graph
![Page 11: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/11.jpg)
graph = post-event forensics
![Page 12: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/12.jpg)
Rocksteady, metric as event
1min.juicer.common.version.sc1.jcr1 100 1276822626
INSERT INTO Deploy
SELECT * FROM Metric(name='common.revision')
MATCH_RECOGNIZE (
  partition by colo, hostname
  measures A.value as revision, A.colo as colo, A.hostname as hostname, A.app as app, A.timestamp as timestamp
  pattern (A)
  define A as A.value > prev(A.value))
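To make the rule concrete, here is a rough Python equivalent of what the statement does: treat each metric line as an event and flag a deploy whenever the version value rises for a given colo and host. It is a sketch, not Rocksteady's implementation. The path layout (`retention.app.name.colo.hostname`) is inferred from the fields the query uses and is an assumption, and since the slide's metric path says `common.version` while the query filters on `common.revision`, the name is left as a parameter.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Metric(NamedTuple):
    retention: str
    app: str
    name: str
    colo: str
    hostname: str
    value: float
    timestamp: int


def parse_metric(line: str) -> Metric:
    """Split 'path value timestamp' into event fields (path layout is assumed)."""
    path, value, timestamp = line.split()
    parts = path.split(".")
    if len(parts) < 5:
        raise ValueError(f"unexpected metric path: {path}")
    # First two segments: retention and app; last two: colo and hostname;
    # everything in between is the metric name.
    return Metric(parts[0], parts[1], ".".join(parts[2:-2]),
                  parts[-2], parts[-1], float(value), int(timestamp))


def detect_deploys(metrics: Iterable[Metric],
                   name: str = "common.version") -> List[Metric]:
    """Return events whose value exceeds the previous one for the same colo/host."""
    last_value: Dict[Tuple[str, str], float] = {}
    deploys: List[Metric] = []
    for metric in metrics:
        if metric.name != name:
            continue
        key = (metric.colo, metric.hostname)  # mirrors 'partition by colo, hostname'
        previous = last_value.get(key)
        if previous is not None and metric.value > previous:
            deploys.append(metric)
        last_value[key] = metric.value
    return deploys
```

As with `prev(A.value)` in the query, the first event seen for a host has nothing to compare against, so it never counts as a deploy.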
![Page 13: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/13.jpg)
![Page 14: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/14.jpg)
auto threshold, prediction
![Page 15: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/15.jpg)
correlation
Deployment-related problems.
Capture sets of metrics when important ones cross a threshold.
Determine dependencies, such as CPU to requests per second or response time.
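One simple way to quantify a dependency between two metric series is the Pearson correlation coefficient. The slides do not say which method was used, so this is only an illustrative sketch; the sample numbers in the demo are made up.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equally long series (between -1 and 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(spread_x * spread_y)


if __name__ == "__main__":
    # Made-up samples captured at the same timestamps.
    cpu_percent = [20.0, 35.0, 50.0, 65.0, 90.0]
    requests_per_second = [100.0, 180.0, 240.0, 330.0, 450.0]
    print(f"cpu vs. req/s correlation: {pearson(cpu_percent, requests_per_second):.3f}")
```

A value near 1 suggests CPU rises with request rate; a value near 0 suggests the two move independently.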
![Page 16: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/16.jpg)
![Page 17: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/17.jpg)
revelation
![Page 18: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/18.jpg)
beyond simple metric
Timing info per request.
Actual time spent in each component of an application.
Map out dependencies, find the exact problem area.
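Per-request timing can be collected with very little code. This is a minimal sketch of the idea, not the tooling from the talk; the `RequestTimer` class and the component names in the demo are hypothetical.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


class RequestTimer:
    """Accumulates wall-clock time spent in each named component of one request."""

    def __init__(self) -> None:
        self.timings: Dict[str, float] = {}

    @contextmanager
    def component(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the component raised.
            elapsed = time.perf_counter() - start
            self.timings[name] = self.timings.get(name, 0.0) + elapsed

    def slowest(self) -> str:
        """Name of the component that took the most time."""
        return max(self.timings, key=lambda name: self.timings[name])


if __name__ == "__main__":
    timer = RequestTimer()
    with timer.component("cache"):
        time.sleep(0.005)  # stand-in for a cache lookup
    with timer.component("database"):
        time.sleep(0.02)   # stand-in for a query
    for name, seconds in timer.timings.items():
        print(f"{name}: {seconds * 1000:.1f} ms")
    print("slowest:", timer.slowest())
```

Each component's time could then be sent as its own metric line, so the slow part of a request shows up directly on a graph.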
![Page 19: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/19.jpg)
![Page 20: Metrics simplified](https://reader033.vdocuments.us/reader033/viewer/2022052622/559704ec1a28ab5d4f8b4807/html5/thumbnails/20.jpg)
what we learned?
1. Make metric sending simple.
2. Nice UI to make sense of data.
3. Real-time processing of metrics rocks.