Description
We've been running into an issue with pg_pathman. Versions in use:
pg_pathman: 1.4.1
Postgresql: 9.6.3
The fundamental issue is that for the first DELETE query against partitioned tables run on a connection, the planning process is extraordinarily slow, often taking several seconds, yet is much faster on subsequent executions for the same Postgres connection.
So when we first interact with a Postgres connection, and run EXPLAIN
against the child table, everything happens quite quickly:
postgres=# \c production_db
You are now connected to database "production_db" as user "postgres".
production_db=# EXPLAIN DELETE FROM "schema_name"."redacted_table_2605" WHERE "schema_name"."redacted_table_2605"."customer_id" = 31104246 AND "schema_name"."redacted_table_2605"."account_id" = 1931;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Delete on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
-> Index Scan using redacted_table_2605_pkey on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
Index Cond: ((account_id = 1931) AND (customer_id = 31104246))
(3 rows)
Time: 12.970 ms
production_db=# EXPLAIN DELETE FROM "schema_name"."redacted_table_2605" WHERE "schema_name"."redacted_table_2605"."customer_id" = 31104246 AND "schema_name"."redacted_table_2605"."account_id" = 1931;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Delete on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
-> Index Scan using redacted_table_2605_pkey on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
Index Cond: ((account_id = 1931) AND (customer_id = 31104246))
(3 rows)
Time: 4.848 ms
production_db=# EXPLAIN DELETE FROM "schema_name"."redacted_table_2605" WHERE "schema_name"."redacted_table_2605"."customer_id" = 31104246 AND "schema_name"."redacted_table_2605"."account_id" = 1931;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Delete on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
-> Index Scan using redacted_table_2605_pkey on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
Index Cond: ((account_id = 1931) AND (customer_id = 31104246))
(3 rows)
Time: 0.745 ms
production_db=# EXPLAIN DELETE FROM "schema_name"."redacted_table_2605" WHERE "schema_name"."redacted_table_2605"."customer_id" = 31104246 AND "schema_name"."redacted_table_2605"."account_id" = 1931;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Delete on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
-> Index Scan using redacted_table_2605_pkey on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
Index Cond: ((account_id = 1931) AND (customer_id = 31104246))
(3 rows)
Time: 0.958 ms
However, when do the EXPLAIN
for the query against the parent table, the first explain (and actual queries too) takes several seconds. This is the same psql session as above, using the exactly same connection, I've just split it to put this explanation in:
production_db=# EXPLAIN DELETE FROM "schema_name"."redacted_table" WHERE "schema_name"."redacted_table"."customer_id" = 31104246 AND "schema_name"."redacted_table"."account_id" = 1931;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Delete on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
-> Index Scan using redacted_table_2605_pkey on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
Index Cond: ((account_id = 1931) AND (customer_id = 31104246))
(3 rows)
Time: 2097.566 ms
production_db=# EXPLAIN DELETE FROM "schema_name"."redacted_table" WHERE "schema_name"."redacted_table"."customer_id" = 31104246 AND "schema_name"."redacted_table"."account_id" = 1931;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Delete on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
-> Index Scan using redacted_table_2605_pkey on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
Index Cond: ((account_id = 1931) AND (customer_id = 31104246))
(3 rows)
Time: 1.095 ms
production_db=# EXPLAIN DELETE FROM "schema_name"."redacted_table" WHERE "schema_name"."redacted_table"."customer_id" = 31104246 AND "schema_name"."redacted_table"."account_id" = 1931;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Delete on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
-> Index Scan using redacted_table_2605_pkey on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
Index Cond: ((account_id = 1931) AND (customer_id = 31104246))
(3 rows)
Time: 0.993 ms
production_db=# EXPLAIN DELETE FROM "schema_name"."redacted_table" WHERE "schema_name"."redacted_table"."customer_id" = 31104246 AND "schema_name"."redacted_table"."account_id" = 1931;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Delete on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
-> Index Scan using redacted_table_2605_pkey on redacted_table_2605 (cost=0.29..4.59 rows=6 width=6)
Index Cond: ((account_id = 1931) AND (customer_id = 31104246))
(3 rows)
Time: 0.928 ms
The table we're querying has the following check constraint for pg_pathman:
"pathman_redacted_table_2605_check" CHECK (account_id >= 1931 AND account_id < 1932)
Notice the timings: In both cases, where we run DELETE queries against the parent table, and directly against the child table, the Postgres query planner produces the correct query plan. But the first time it does so when pg_pathman is involved, it takes several seconds.
We've also noticed where these DELETE conditions are triggered, Postgres memory use spikes and it starts accruing tens of thousands of RowExclusiveAccess locks. This only happens with DELETE statements, and does not affect INSERT
or UPDATE
statements so far as we can tell.