Describe the bug
Running a standalone postgres with monitoring enabled causes the memory usage to grow continuously until the configured limit is reached.
The affected metrics are ccp_nodemx_mem_active_anon and ccp_nodemx_mem_rss.
Restarting the postgres-exporter frees up the used memory immediately.
Further investigation showed that a specific query consistently took around 600 ms:
postgres=# explain analyze
WITH all_backups AS (
    SELECT config_file
         , jsonb_array_elements(data) AS stanza_data
    FROM monitor.pgbackrest_info(10)
)
, per_stanza AS (
    SELECT config_file
         , stanza_data->>'name' AS stanza
         , jsonb_array_elements(stanza_data->'backup') AS backup_data
    FROM all_backups
)
SELECT a.config_file
     , a.stanza
     , a.backup_data->>'type' AS backup_type
     , a.backup_data->'info'->'repository'->>'delta' AS repo_backup_size_bytes
     , a.backup_data->'info'->'repository'->>'size' AS repo_total_size_bytes
     , (a.backup_data->'timestamp'->>'stop')::bigint
       - (a.backup_data->'timestamp'->>'start')::bigint AS backup_runtime_seconds
FROM per_stanza a
JOIN (
    SELECT config_file
         , stanza
         , backup_data->>'type' AS backup_type
         , max(backup_data->'timestamp'->>'start') AS max_backup_start
         , max(backup_data->'timestamp'->>'stop') AS max_backup_stop
    FROM per_stanza
    GROUP BY 1,2,3
) b
  ON a.config_file = b.config_file
 AND a.stanza = b.stanza
 AND a.backup_data->>'type' = b.backup_type
 AND a.backup_data->'timestamp'->>'start' = b.max_backup_start
 AND a.backup_data->'timestamp'->>'stop' = b.max_backup_stop;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Hash Join (cost=2626697.50..3279629.74 rows=31 width=168) (actual time=630.971..631.104 rows=2 loops=1)
Hash Cond: ((a.config_file = per_stanza.config_file) AND (a.stanza = per_stanza.stanza) AND ((a.backup_data ->> 'type'::text) = ((per_stanza.backup_data ->> 'type'::text))) AND (((a.backup_data -> 'timestamp'::text) ->> 'start'::text) = (max(((per_stanza.backup_data -> 'timestamp'::text) ->> 'start'::text)))) AND (((a.backup_data -> 'timestamp'::text) ->> 'stop'::text) = (max(((per_stanza.backup_data -> 'timestamp'::text) ->> 'stop'::text)))))
CTE all_backups
-> ProjectSet (cost=0.25..517.75 rows=100000 width=64) (actual time=630.327..630.328 rows=1 loops=1)
-> Function Scan on pgbackrest_info (cost=0.25..10.25 rows=1000 width=64) (actual time=630.308..630.308 rows=1 loops=1)
CTE per_stanza
-> ProjectSet (cost=0.00..53250.00 rows=10000000 width=96) (actual time=630.337..630.340 rows=6 loops=1)
-> CTE Scan on all_backups (cost=0.00..2000.00 rows=100000 width=64) (actual time=630.329..630.330 rows=1 loops=1)
-> CTE Scan on per_stanza a (cost=0.00..200000.00 rows=10000000 width=96) (actual time=630.339..630.340 rows=6 loops=1)
-> Hash (cost=2527968.75..2527968.75 rows=1000000 width=160) (actual time=0.208..0.208 rows=2 loops=1)
Buckets: 65536 Batches: 32 Memory Usage: 512kB
-> HashAggregate (cost=2134375.00..2517968.75 rows=1000000 width=160) (actual time=0.072..0.104 rows=2 loops=1)
Group Key: per_stanza.config_file, per_stanza.stanza, (per_stanza.backup_data ->> 'type'::text)
Planned Partitions: 64 Batches: 1 Memory Usage: 409kB
-> CTE Scan on per_stanza (cost=0.00..225000.00 rows=10000000 width=128) (actual time=0.002..0.009 rows=6 loops=1)
Planning Time: 0.267 ms
JIT:
Functions: 38
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 7.328 ms, Inlining 16.180 ms, Optimization 379.258 ms, Emission 234.166 ms, Total 636.932 ms
Execution Time: 638.649 ms
(21 rows)
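The plan's total cost is estimated at roughly 3.3 million because the planner falls back to generic row estimates for the set-returning jsonb_array_elements() calls (100,000 and 10,000,000 rows versus the 1 and 6 rows actually returned). That is far above the default JIT cost thresholds, so the server compiles 38 functions for a query that handles only a handful of rows. As a quick sketch, the thresholds in play can be checked like this (the commented values are the stock PostgreSQL defaults):

-- Cost thresholds that decide whether JIT (and its expensive inlining/optimization
-- passes) kicks in; the ~3.3M estimated plan cost exceeds all of them.
SHOW jit_above_cost;           -- default: 100000
SHOW jit_inline_above_cost;    -- default: 500000
SHOW jit_optimize_above_cost;  -- default: 500000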
Looking further at the plan above, nearly all of the runtime is spent in JIT compilation (Optimization 379 ms + Emission 234 ms), and disabling jit massively improved the execution time:
postgres=# set jit=off;
SET
postgres=# explain analyze
WITH all_backups AS (
    SELECT config_file
         , jsonb_array_elements(data) AS stanza_data
    FROM monitor.pgbackrest_info(10)
)
, per_stanza AS (
    SELECT config_file
         , stanza_data->>'name' AS stanza
         , jsonb_array_elements(stanza_data->'backup') AS backup_data
    FROM all_backups
)
SELECT a.config_file
     , a.stanza
     , a.backup_data->>'type' AS backup_type
     , a.backup_data->'info'->'repository'->>'delta' AS repo_backup_size_bytes
     , a.backup_data->'info'->'repository'->>'size' AS repo_total_size_bytes
     , (a.backup_data->'timestamp'->>'stop')::bigint
       - (a.backup_data->'timestamp'->>'start')::bigint AS backup_runtime_seconds
FROM per_stanza a
JOIN (
    SELECT config_file
         , stanza
         , backup_data->>'type' AS backup_type
         , max(backup_data->'timestamp'->>'start') AS max_backup_start
         , max(backup_data->'timestamp'->>'stop') AS max_backup_stop
    FROM per_stanza
    GROUP BY 1,2,3
) b
  ON a.config_file = b.config_file
 AND a.stanza = b.stanza
 AND a.backup_data->>'type' = b.backup_type
 AND a.backup_data->'timestamp'->>'start' = b.max_backup_start
 AND a.backup_data->'timestamp'->>'stop' = b.max_backup_stop;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Hash Join (cost=2626697.50..3279629.74 rows=31 width=168) (actual time=1.019..1.232 rows=2 loops=1)
Hash Cond: ((a.config_file = per_stanza.config_file) AND (a.stanza = per_stanza.stanza) AND ((a.backup_data ->> 'type'::text) = ((per_stanza.backup_data ->> 'type'::text))) AND (((a.backup_data -> 'timestamp'::text) ->> 'start'::text) = (max(((per_stanza.backup_data -> 'timestamp'::text) ->> 'start'::text)))) AND (((a.backup_data -> 'timestamp'::text) ->> 'stop'::text) = (max(((per_stanza.backup_data -> 'timestamp'::text) ->> 'stop'::text)))))
CTE all_backups
-> ProjectSet (cost=0.25..517.75 rows=100000 width=64) (actual time=0.405..0.406 rows=1 loops=1)
-> Function Scan on pgbackrest_info (cost=0.25..10.25 rows=1000 width=64) (actual time=0.381..0.382 rows=1 loops=1)
CTE per_stanza
-> ProjectSet (cost=0.00..53250.00 rows=10000000 width=96) (actual time=0.415..0.419 rows=6 loops=1)
-> CTE Scan on all_backups (cost=0.00..2000.00 rows=100000 width=64) (actual time=0.407..0.408 rows=1 loops=1)
-> CTE Scan on per_stanza a (cost=0.00..200000.00 rows=10000000 width=96) (actual time=0.417..0.419 rows=6 loops=1)
-> Hash (cost=2527968.75..2527968.75 rows=1000000 width=160) (actual time=0.246..0.247 rows=2 loops=1)
Buckets: 65536 Batches: 32 Memory Usage: 512kB
-> HashAggregate (cost=2134375.00..2517968.75 rows=1000000 width=160) (actual time=0.104..0.128 rows=2 loops=1)
Group Key: per_stanza.config_file, per_stanza.stanza, (per_stanza.backup_data ->> 'type'::text)
Planned Partitions: 64 Batches: 1 Memory Usage: 409kB
-> CTE Scan on per_stanza (cost=0.00..225000.00 rows=10000000 width=128) (actual time=0.002..0.012 rows=6 loops=1)
Planning Time: 0.237 ms
Execution Time: 1.374 ms
(17 rows)
In light of these findings, it would be good to also disable jit in the postgresql.conf templates.
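Until the templates change, a minimal workaround sketch would be to disable JIT only for the monitoring connections, or for the whole instance without editing postgresql.conf. The role name ccp_monitoring is an assumption about what the exporter connects as; adjust it to your setup.

-- Disable JIT only for the role the exporter uses (role name is an assumption;
-- takes effect for new sessions of that role):
ALTER ROLE ccp_monitoring SET jit = off;

-- Or disable it instance-wide without touching postgresql.conf:
ALTER SYSTEM SET jit = off;
SELECT pg_reload_conf();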
I am currently testing the setup with jit disabled and will know within the next day whether the memory consumption stays stable.
To Reproduce
Steps to reproduce the behavior:
- start a postgres container with monitoring enabled
- start an exporter container with the database configured
- request the metrics endpoint, or have Prometheus do that for you
- Watch the ccp_nodemx_mem_active_anon and ccp_nodemx_mem_rss metrics rise (see the pg_stat_activity sketch below to confirm the exporter is repeatedly running the monitoring queries)
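From the database side, a sketch like the following can confirm that each scrape runs the slow pgBackRest query; the application_name filter is an assumption, so match it to however your exporter identifies itself:

-- List exporter backends and how long their current query has been running.
SELECT pid,
       application_name,
       state,
       now() - query_start AS running_for,
       left(query, 80)     AS query
FROM pg_stat_activity
WHERE query ILIKE '%pgbackrest_info%'
   OR application_name ILIKE '%exporter%';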
Expected behavior
Memory consumption should remain at a stable level.
Screenshots
will follow
Please tell us about your environment:
- Operating System:
- Where is this running: Local
- Container Image Tag:
- crunchydata/crunchy-postgres:centos8-13.3-4.7.1
- crunchydata/crunchy-postgres-exporter:centos8-4.7.1
- Platform: Docker