Description
What did you do?
Deployed postgres-exporter and ran a large number of queries against the database.
What did you expect to see?
Everything works and nothing crashes.
What did you see instead? Under which circumstances?
The number of time series created by postgres-exporter increased rapidly. Prometheus was OOM killed soon after.
Additional comment
I understand that it's commonly agreed that Prometheus metrics should have reasonable cardinality and avoid ID-like labels such as trace ID or query ID. This best practice has been discussed in this post and various community issues.
Of course, the user can always disable or drop these metrics, as I already have. But they still carry relevant information, and they can confuse anyone who hasn't investigated closely (they did in my team). This data should be organized into more reasonable metrics that either sum over different queries or put them into histograms.
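As a rough illustration of the two options (summing over queries, or bucketing into a histogram), here is a minimal sketch in plain Python. It is not postgres_exporter code: the row shape `(query_id, calls, total_seconds)`, the `aggregate` function, and the bucket bounds are all hypothetical, chosen only to show that the number of output series stays fixed no matter how many distinct queries exist. Because only per-query totals are assumed to be available, each query's calls are placed in the bucket of its mean latency, which is an approximation.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical histogram upper bounds in seconds ("le" values); not taken
# from postgres_exporter.
BUCKETS = (0.01, 0.1, 1.0, 10.0, float("inf"))


def aggregate(rows):
    """Collapse per-query rows into a fixed, small set of series.

    rows: iterable of (query_id, calls, total_seconds) tuples. This shape is
    an assumption made for the sketch.
    """
    total_calls = 0
    total_seconds = 0.0
    per_bucket = Counter()

    for _query_id, calls, seconds in rows:
        if calls == 0:
            continue  # nothing to record, and avoids division by zero
        total_calls += calls
        total_seconds += seconds
        # Approximation: every call of this query is counted at the query's
        # mean latency, since individual call timings are not available.
        mean = seconds / calls
        upper_bound = BUCKETS[bisect_left(BUCKETS, mean)]
        per_bucket[upper_bound] += calls

    # Prometheus-style histograms are cumulative: each bucket counts all
    # observations less than or equal to its upper bound.
    cumulative = {}
    running = 0
    for upper_bound in BUCKETS:
        running += per_bucket[upper_bound]
        cumulative[upper_bound] = running

    return {
        "calls_total": total_calls,
        "seconds_total": total_seconds,
        "bucket": cumulative,
    }


if __name__ == "__main__":
    sample = [("q1", 10, 0.05), ("q2", 2, 4.0), ("q3", 1, 0.5)]
    print(aggregate(sample))
```

With this shape, the exporter would expose two counters plus one series per bucket, and the query ID never appears as a label.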
Environment
- System information: insert output of uname -srm here
- postgres_exporter version: 0.8.0
- postgres_exporter flags: insert list of flags used here
- PostgreSQL version: insert PostgreSQL version here
- Logs: insert logs relevant to the issue here