Description
What did you do?
I run postgres_exporter in an environment where three vmagents scrape the exporter.
They happen to scrape almost simultaneously every time: all three HTTP requests arrive before the first response starts to be returned (I can see that in tcpdump).
On the exporter side I see multiple 'collector failed' errors every scrape round, on random collector modules.
On the Postgres side I see the following:
On the first round of scrapes, 3 new connections appear in Postgres; two of them have 'select version()' as their last query and stay idle, one is functional.
On every subsequent round of scrapes, 2 more connections appear (the previous idle ones remain), which are also idle; the first functional connection continues to be used.
I tried running exporter version 0.13.2 in the same vmagent setup and it was fine: there are two connections on the Postgres side, and they are reused.
Also, there are no leaks when I make HTTP requests one by one against version 0.14.0.
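For completeness, the overlapping scrapes can be reproduced without vmagent by firing several requests at the exporter concurrently; a minimal sketch, assuming the exporter listens on the default localhost:9187:

```go
// repro.go: fire three concurrent scrapes, mimicking the three vmagents.
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
)

func main() {
	const target = "http://localhost:9187/metrics" // assumed default listen address

	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			resp, err := http.Get(target)
			if err != nil {
				fmt.Printf("request %d: %v\n", n, err)
				return
			}
			defer resp.Body.Close()
			body, _ := io.ReadAll(resp.Body)
			fmt.Printf("request %d: status %d, %d bytes\n", n, resp.StatusCode, len(body))
		}(i)
	}
	wg.Wait()
}
```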
I guess it might be related to the sql.Open call in the instance.setup method, which is called on every incoming request in 0.14.0, but only once at collector initialization in 0.13.2.
https://github.com/prometheus-community/postgres_exporter/blob/v0.14.0/collector/instance.go#L46
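To illustrate what I mean, here is a simplified sketch of the two patterns (not the exporter's actual code; the driver and DSN are placeholders):

```go
// sketch.go: "open per scrape" vs. "open once and reuse" (simplified, not exporter code).
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // driver choice is only for this sketch
)

// openPerScrape opens a fresh *sql.DB for a single scrape and closes it
// afterwards. Each handle carries its own connection pool, so overlapping
// scrapes multiply the connections Postgres sees.
func openPerScrape(dsn string) error {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return err
	}
	defer db.Close()
	return db.Ping()
}

// reusedHandle holds one *sql.DB created at initialization; every scrape
// reuses its pooled connections, so the connection count stays stable.
type reusedHandle struct {
	db *sql.DB
}

func newReusedHandle(dsn string) (*reusedHandle, error) {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return nil, err
	}
	return &reusedHandle{db: db}, nil
}

func (r *reusedHandle) scrape() error {
	return r.db.Ping()
}

func main() {
	dsn := "postgresql://postgres@localhost:5432/postgres?sslmode=disable" // placeholder DSN
	if err := openPerScrape(dsn); err != nil {
		log.Println("per-scrape handle:", err)
	}
	h, err := newReusedHandle(dsn)
	if err != nil {
		log.Fatal(err)
	}
	if err := h.scrape(); err != nil {
		log.Println("reused handle:", err)
	}
}
```

If a handle opened per request is also shared between concurrent scrapes and closed while one of them is still running, the remaining scrapes would fail with "sql: database is closed", which would be consistent with the errors in the logs below.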
What did you expect to see?
Postgres connections handled correctly and reused across scrapes.
What did you see instead? Under which circumstances?
Idle Postgres connections accumulate on every scrape round until the server's connection limit is reached.
Environment
- System information:
Linux 5.10.0-25-amd64 x86_64
- postgres_exporter version:
postgres_exporter, version 0.14.0 (branch: HEAD, revision: c06e57db4e502696ab4e8b8898bb2a59b7b33a59)
build user: root@f2337de13240
build date: 20230920-01:43:49
go version: go1.20.8
platform: linux/amd64
tags: netgo static_build
- postgres_exporter flags:
- PostgreSQL version:
PostgreSQL 16.0 (Debian 16.0-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
- Logs:
ts=2023-09-21T15:09:19.429Z caller=collector.go:199 level=error msg="collector failed" name=database duration_seconds=0.072067468 err="sql: database is closed"
ts=2023-09-21T15:09:19.429Z caller=collector.go:199 level=error msg="collector failed" name=wal duration_seconds=0.055904477 err="sql: database is closed"
ts=2023-09-21T15:09:19.431Z caller=collector.go:199 level=error msg="collector failed" name=database duration_seconds=0.057682597 err="sql: database is closed"
ts=2023-09-21T15:09:29.426Z caller=collector.go:199 level=error msg="collector failed" name=replication_slot duration_seconds=0.067115499 err="sql: database is closed"
ts=2023-09-21T15:09:29.426Z caller=collector.go:199 level=error msg="collector failed" name=locks duration_seconds=0.066662661 err="sql: database is closed"
ts=2023-09-21T15:09:29.429Z caller=collector.go:199 level=error msg="collector failed" name=database duration_seconds=0.069763998 err="sql: database is closed"