Life of a PromQL query Percona Live, Santa Clara, CA – 2017-04-27 Björn “Beorn” Rabenstein, Production Engineer, SoundCloud Ltd.


Life of a PromQL query

Percona Live, Santa Clara, CA – 2017-04-27

Björn “Beorn” Rabenstein, Production Engineer, SoundCloud Ltd.


The fundamental problem of TSDBs: Vertical writes, horizontal(-ish) reads.

[Figure: time series (~millions) plotted against time (~weeks), with the directions of writes and reads marked.]


External storage with BigTable semantics

cf. https://cloud.google.com/bigtable/docs/schema-design-time-series

...
http_requests_total{status="200",method="GET"}@1434317560938 ⇒ 94355
http_requests_total{status="200",method="GET"}@1434317561287 ⇒ 94934
http_requests_total{status="200",method="GET"}@1434317562344 ⇒ 96483
http_requests_total{status="404",method="GET"}@1434317560938 ⇒ 38473
http_requests_total{status="404",method="GET"}@1434317561249 ⇒ 38544
http_requests_total{status="404",method="GET"}@1434317562588 ⇒ 38663
http_requests_total{status="200",method="POST"}@1434317560885 ⇒ 4748
http_requests_total{status="200",method="POST"}@1434317561483 ⇒ 4795
http_requests_total{status="200",method="POST"}@1434317562589 ⇒ 4833
http_requests_total{status="404",method="POST"}@1434317560939 ⇒ 122
...

KEY: metric name, dimensions (aka labels), timestamp

VALUE: sample value


Labels are the new hierarchies.

api_http_requests_total

method=POST, method=GET, method=...

path=/tracks, path=/users, path=...

status=200, status=404, status=...

job=api-server, job=node, job=...

instance=1.2.3.4:80, instance=1.2.3.4:81, instance=...

[Figure: the same dimensions arranged as a fixed hierarchy]

api-server
    1.2.3.4:80
        /tracks
            GET
                200
                404
                […]
            POST
                […]
        /users
            […]
    1.2.3.4:81
        /tracks
            GET
                200
                […]
            [...]
        /users
            [...]
    [...]

Shamelessly stolen from Julius Volz.


api_http_requests_total{method="post",code=~"2.."}

vs.

api-server.*.*.post.2*
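To make the label-based selector concrete, here is a minimal Go sketch of how the four matcher operators (=, !=, =~, !~) could be evaluated against one series' label set. The names `Matcher`, `MatchType`, and `Matches` are hypothetical and chosen for illustration; this is not the Prometheus API. The sketch assumes that regular expressions are fully anchored and that a missing label behaves like an empty value.

```go
package main

import (
	"fmt"
	"regexp"
)

// MatchType enumerates the four matcher operators: =, !=, =~, !~.
type MatchType int

const (
	Equal MatchType = iota
	NotEqual
	RegexMatch
	RegexNoMatch
)

// Matcher is a hypothetical, simplified label matcher.
type Matcher struct {
	Type  MatchType
	Name  string
	Value string
}

// matchOne reports whether a single label value satisfies the matcher.
func (m Matcher) matchOne(v string) bool {
	switch m.Type {
	case Equal:
		return v == m.Value
	case NotEqual:
		return v != m.Value
	}
	// Anchor the pattern so that "2.." does not match "2040".
	matched := regexp.MustCompile("^(?:" + m.Value + ")$").MatchString(v)
	return matched == (m.Type == RegexMatch)
}

// Matches reports whether the label set satisfies all matchers.
// A label that is absent is treated as the empty string.
func Matches(labels map[string]string, ms ...Matcher) bool {
	for _, m := range ms {
		if !m.matchOne(labels[m.Name]) {
			return false
		}
	}
	return true
}

func main() {
	series := map[string]string{
		"__name__": "api_http_requests_total",
		"method":   "post",
		"code":     "204",
		"path":     "/tracks",
	}
	// Equivalent of api_http_requests_total{method="post",code=~"2.."}.
	selector := []Matcher{
		{Equal, "__name__", "api_http_requests_total"},
		{Equal, "method", "post"},
		{RegexMatch, "code", "2.."},
	}
	fmt.Println(Matches(series, selector...))
}
```

Unlike the glob on a dotted path, the selector does not depend on the position of a dimension, so adding a new label to the series does not break existing queries.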


1st generation

Just LevelDB, using the well-known approaches to implementing a TSDB on top of BigTable semantics, with some tweaks…

Indices are LevelDB, too.

(Used in the prototype.)


2nd generation

LevelDB only for indices.

Custom chunked storage for raw sample data, heavily (ab-)using the file system. More details at https://promcon.io/2016-berlin/talks/the-prometheus-time-series-database/

(Used in Prometheus as we know it.)


3rd generation

Completely custom TSDB.

Sophisticated custom indexing.

Fully integrated raw sample storage (no abuse of the file system anymore).

Heavy use of mmap.

Details: https://fabxc.org/blog/2017-04-10-writing-a-tsdb/

(Used in upcoming Prometheus 2.)


PromQL


Even though Borgmon remains internal to Google, the idea of treating time-series data as a data source for generating alerts is now accessible to everyone through those open source tools like Prometheus, Riemann, Heka, and Bosun [...]

Site Reliability Engineering: How Google Runs Production Systems (O'Reilly Media)

Chapter 10: Practical Alerting from Time-Series Data


PromQL is read-only


rate(incoming_http_requests_total[5m])


SELECT job, instance, method, status, path, client, version, […] rate(value, 5m) FROM incoming_http_requests_total
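As a rough illustration of what rate() computes for each individual series, the following Go sketch derives a per-second increase from counter samples and treats any decrease as a counter reset. It is a deliberate simplification: the `Sample` type and `simpleRate` function are hypothetical names, and the sketch divides by the time between the first and last sample rather than reproducing the way Prometheus handles the boundaries of the range.

```go
package main

import "fmt"

// Sample is one counter observation (hypothetical type for this sketch).
type Sample struct {
	T int64   // timestamp in milliseconds
	V float64 // counter value
}

// simpleRate returns the average per-second increase over the samples.
// It returns false if fewer than two samples are available or if the
// samples do not span a positive amount of time.
func simpleRate(samples []Sample) (float64, bool) {
	if len(samples) < 2 {
		return 0, false
	}
	var increase float64
	for i := 1; i < len(samples); i++ {
		delta := samples[i].V - samples[i-1].V
		if delta < 0 {
			// Counter reset: assume the counter restarted from zero.
			delta = samples[i].V
		}
		increase += delta
	}
	seconds := float64(samples[len(samples)-1].T-samples[0].T) / 1000
	if seconds <= 0 {
		return 0, false
	}
	return increase / seconds, true
}

func main() {
	window := []Sample{{0, 0}, {10000, 50}, {20000, 100}}
	r, ok := simpleRate(window)
	fmt.Println(r, ok)
}
```

In the query above, this computation would run once per label combination, which is why the SQL analogy has to list every label in the SELECT clause.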


avg by(city) (temperature_celsius{country="germany"})

SELECT city, AVG(value) FROM temperature_celsius WHERE country="germany" GROUP BY city
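The grouping semantics of the query above can be sketched in a few lines of Go: filter by the label matcher, group by the `by` label, and average each group. The `Sample` type and the `avgBy` and `demoSamples` functions are hypothetical names, and the temperature values are made up for illustration.

```go
package main

import "fmt"

// Sample is one series' current value together with its labels
// (hypothetical type for this sketch).
type Sample struct {
	Labels map[string]string
	Value  float64
}

// hasAll reports whether labels contains every pair in filter.
func hasAll(labels, filter map[string]string) bool {
	for k, v := range filter {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// avgBy keeps the samples that match filter, groups them by the value
// of groupLabel, and returns the mean of each group.
func avgBy(groupLabel string, filter map[string]string, in []Sample) map[string]float64 {
	sums := map[string]float64{}
	counts := map[string]int{}
	for _, s := range in {
		if !hasAll(s.Labels, filter) {
			continue
		}
		g := s.Labels[groupLabel]
		sums[g] += s.Value
		counts[g]++
	}
	out := make(map[string]float64, len(sums))
	for g, sum := range sums {
		out[g] = sum / float64(counts[g])
	}
	return out
}

// demoSamples returns made-up readings for illustration only.
func demoSamples() []Sample {
	return []Sample{
		{map[string]string{"country": "germany", "city": "berlin", "sensor": "a"}, 10},
		{map[string]string{"country": "germany", "city": "berlin", "sensor": "b"}, 20},
		{map[string]string{"country": "germany", "city": "hamburg", "sensor": "a"}, 8},
		{map[string]string{"country": "france", "city": "paris", "sensor": "a"}, 12},
	}
}

func main() {
	// Equivalent of avg by(city) (temperature_celsius{country="germany"}).
	result := avgBy("city", map[string]string{"country": "germany"}, demoSamples())
	fmt.Println(result)
}
```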


errors{job="foo"} / total{job="foo"}

SELECT errors.job, errors.instance, […more labels…], errors.value / total.value FROM errors, total WHERE errors.job="foo" AND total.job="foo" JOIN […some more complicated stuff here…]


some_metric + 1

SELECT 1 + "value" FROM "some_metric"

offset(some_metric.*, 1)


log10(some_metric)

n/a

logarithm(some_metric.*)


my_a - my_b

SELECT "a" - "b" FROM "table"

(doesn't work for measurements)

reduceSeries(my.*, "diffSeries", 1, "a", "b")


temperature_celsius > ignoring(instance) group_left 2 * stddev without(instance) (temperature_celsius) + avg without(instance) (temperature_celsius)

n/a

n/a


HTTP API

https://prometheus.io/docs/querying/api/#expression-queries

Used by

● internal expression browser
● Grafana
● custom clients
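A custom client mostly has to build a correctly escaped request URL. The Go sketch below does only that for the instant-query endpoint /api/v1/query; it sends nothing over the network. The function name `instantQueryURL` is hypothetical, and the base address http://localhost:9090 is an assumption about where a Prometheus server might be listening.

```go
package main

import (
	"fmt"
	"net/url"
)

// instantQueryURL builds the URL for an instant query against the
// HTTP API. base is the server address, expr the PromQL expression.
func instantQueryURL(base, expr string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/query"
	q := url.Values{}
	q.Set("query", expr) // Encode() below takes care of escaping
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// The server address is an assumption for this example.
	expr := `sum(rate(errors_total{job="foo"}[5m]))`
	target, err := instantQueryURL("http://localhost:9090", expr)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(target)
}
```

Escaping matters here because PromQL expressions routinely contain braces, quotes, and brackets that are not safe to place in a query string verbatim.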


[Diagram: HTTP API → PromQL engine → storage → disk]

The PromQL engine talks to the storage through the Querier interface:

type Querier interface {
    QueryRange(
        ctx context.Context,
        from, through model.Time,
        matchers ...*metric.LabelMatcher,
    ) ([]SeriesIterator, error)
    QueryInstant(
        ctx context.Context,
        ts model.Time,
        stalenessDelta time.Duration,
        matchers ...*metric.LabelMatcher,
    ) ([]SeriesIterator, error)
    // ...
}

Matcher types:

{code="404"}
{code!="404"}
{code=~"2.."}
{code!~"2.."}


Indices (simplified, Prometheus 1.x)

1. labelname→labelvalues. Key: label name → Value: all existing label values for that name.

2. labelpair→series. Key: label pair ({name="value"}) → Value: all series with that pair.

(N.B.: Metric name xxx becomes {__name__="xxx"}.)
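The two indices can be pictured as plain maps. The following Go sketch builds both while series are added; it is an in-memory illustration only, whereas Prometheus 1.x keeps these indices in LevelDB. The names `Index`, `LabelPair`, `NewIndex`, and `Add`, and the use of integer series IDs, are assumptions made for this example.

```go
package main

import "fmt"

// LabelPair is one name="value" pair.
type LabelPair struct{ Name, Value string }

// Index holds the two simplified indices described above.
// Series are identified by hypothetical integer IDs.
type Index struct {
	LabelValues map[string]map[string]bool // labelname → labelvalues
	PairSeries  map[LabelPair][]int        // labelpair → series
}

// NewIndex returns an empty index.
func NewIndex() *Index {
	return &Index{
		LabelValues: map[string]map[string]bool{},
		PairSeries:  map[LabelPair][]int{},
	}
}

// Add indexes one series. The metric name is stored as the
// __name__ label, like any other label.
func (ix *Index) Add(id int, metric string, labels map[string]string) {
	ix.addPair(id, "__name__", metric)
	for name, value := range labels {
		ix.addPair(id, name, value)
	}
}

// addPair records one label pair of a series in both indices.
func (ix *Index) addPair(id int, name, value string) {
	if ix.LabelValues[name] == nil {
		ix.LabelValues[name] = map[string]bool{}
	}
	ix.LabelValues[name][value] = true
	p := LabelPair{name, value}
	ix.PairSeries[p] = append(ix.PairSeries[p], id)
}

func main() {
	ix := NewIndex()
	ix.Add(1, "errors_total", map[string]string{"job": "foo", "code": "500"})
	ix.Add(2, "errors_total", map[string]string{"job": "foo", "code": "503"})
	// Distinct values seen for the label "code".
	fmt.Println(len(ix.LabelValues["code"]))
	// IDs of all series carrying {__name__="errors_total"}.
	fmt.Println(ix.PairSeries[LabelPair{"__name__", "errors_total"}])
}
```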


Life of a QueryRange / QueryInstant call (simplified)

1. Resolve negative and regexp matchers into a set of possible simple matchers using labelname→labelvalues.

2. Look up possible series for each simple matcher using labelpair→series and intersect/union the results.

3. For each remaining series, find the chunks for the requested time (instant or range), load them from disk if needed, and pin them into memory.

4. Return iterators for the series.

5. PromQL engine does its thing with them and then closes them.

6. Closing the iterators will unpin the chunks from memory (releasing them into an LRU cache).
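Steps 1 and 2 can be sketched in Go as follows: each matcher is expanded into the label values it accepts, the series for those label pairs are unioned, and the per-matcher sets are intersected. This is a sketch under stated assumptions, not the Prometheus implementation: `Matcher`, `Index`, `Resolve`, and `demoIndex` are hypothetical names, series are plain integer IDs, and series that lack a label entirely are not considered for negative matchers. Steps 3 to 6 (chunk loading, pinning, iterators) are not modeled.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// LabelPair is one name="value" pair.
type LabelPair struct{ Name, Value string }

// Matcher is a simplified matcher; Op is one of "=", "!=", "=~", "!~".
type Matcher struct{ Name, Op, Value string }

// Index holds the two simplified indices.
type Index struct {
	LabelValues map[string][]string // labelname → labelvalues
	PairSeries  map[LabelPair][]int // labelpair → series
}

// accepts reports whether label value v satisfies the matcher.
func (m Matcher) accepts(v string) bool {
	switch m.Op {
	case "=":
		return v == m.Value
	case "!=":
		return v != m.Value
	}
	matched := regexp.MustCompile("^(?:" + m.Value + ")$").MatchString(v)
	return matched == (m.Op == "=~")
}

// Resolve returns the sorted IDs of all series satisfying every matcher.
func (ix Index) Resolve(ms ...Matcher) []int {
	var result map[int]bool
	for _, m := range ms {
		candidates := map[int]bool{}
		// Step 1: expand the matcher into the existing values it accepts.
		for _, v := range ix.LabelValues[m.Name] {
			if !m.accepts(v) {
				continue
			}
			// Step 2a: union the series of all accepted label pairs.
			for _, id := range ix.PairSeries[LabelPair{m.Name, v}] {
				candidates[id] = true
			}
		}
		if result == nil {
			result = candidates
			continue
		}
		// Step 2b: intersect with the result of the previous matchers.
		for id := range result {
			if !candidates[id] {
				delete(result, id)
			}
		}
	}
	ids := make([]int, 0, len(result))
	for id := range result {
		ids = append(ids, id)
	}
	sort.Ints(ids)
	return ids
}

// demoIndex returns a small hand-built index for illustration.
func demoIndex() Index {
	return Index{
		LabelValues: map[string][]string{
			"__name__": {"errors_total", "requests_total"},
			"job":      {"foo", "bar"},
			"code":     {"200", "500", "503"},
		},
		PairSeries: map[LabelPair][]int{
			{"__name__", "errors_total"}:   {1, 2, 3},
			{"__name__", "requests_total"}: {4},
			{"job", "foo"}:                 {1, 2, 4},
			{"job", "bar"}:                 {3},
			{"code", "200"}:                {4},
			{"code", "500"}:                {1, 3},
			{"code", "503"}:                {2},
		},
	}
}

func main() {
	ix := demoIndex()
	fmt.Println(ix.Resolve(Matcher{"__name__", "=", "errors_total"}, Matcher{"job", "=", "foo"}))
	fmt.Println(ix.Resolve(Matcher{"job", "=", "foo"}, Matcher{"code", "!~", "5.."}))
}
```

The expansion in step 1 is why regexp and negative matchers cost more than equality matchers: every existing value of the label has to be tested before any series can be looked up.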


GET /api/v1/query?query=sum(rate(errors_total{job="foo"}[5m]))/sum(rate(requests_total{job="foo"}[5m]))

HTTP API PromQL engine

storage disk

QueryRange(ctx, now-5m, now, {__name__="errors_total", job="foo"})

QueryRange(ctx, now-5m, now, {__name__="requests_total", job="foo"})

chunks [now-5m, now] for: {__name__="errors_total", job="foo", code="503", instance="1.2.3.4:80"}, {__name__="errors_total", job="foo", code="500", instance="4.5.3.1:80"}, …


Credits

PromQL example queries and the comparison to other query languages are taken from:

● Julius Volz’s talk Prometheus Design and Philosophy https://promcon.io/2016-berlin/talks/prometheus-design-and-philosophy

● Brian Brazil’s blog post Translating between monitoring languages https://www.robustperception.io/translating-between-monitoring-languages


prometheus.io

Find the slides at https://github.com/beorn7/talks