Fix inefficient sql for restore statistics #130
Conversation
Statistics are restored in 3 stages. In the first stage, we update reltuples in pg_class. In the second stage, we delete the statistics for a specific attribute from pg_statistic. And in the last stage, we insert the new statistics for this attribute. In the second stage, the attribute number is determined by looking it up in pg_attribute. A subquery with the IN operator was used for this, which led to an inefficient plan containing a seq scan on pg_statistic. This, in turn, can significantly affect the speed of the statistics restore. Fix this by replacing the IN operator with = in the attribute statistics deletion query. We can be sure that the subquery returns no more than one row, since pg_attribute has a unique index on the attrelid and attname attributes. Since this fix only affects new backups, a fix for gprestore has also been added: for backups created by problematic versions of gpbackup, nested loop is enabled before restoring statistics. This should lead to a more effective plan with an index scan.
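A schematic sketch of the query change (the table and column names are illustrative; this is not the literal query text generated by gpbackup): the attribute number is resolved through a subquery against pg_attribute, and switching the comparison from IN to = turns the subquery into a scalar expression, so the index on pg_statistic can be used.

-- before the fix: IN with a subquery; the resulting plan contains a seq scan on pg_statistic
DELETE FROM pg_statistic
WHERE starelid = 'public.my_table'::regclass
  AND staattnum IN (
      SELECT attnum FROM pg_attribute
      WHERE attrelid = 'public.my_table'::regclass AND attname = 'my_column');

-- after the fix: = is safe because pg_attribute has a unique index on (attrelid, attname),
-- so the subquery returns at most one row, and the DELETE can use an index scan on pg_statistic
DELETE FROM pg_statistic
WHERE starelid = 'public.my_table'::regclass
  AND staattnum = (
      SELECT attnum FROM pg_attribute
      WHERE attrelid = 'public.my_table'::regclass AND attname = 'my_column');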
What about opening a separate ticket to improve the planner (for example, try to replace 'IN' with '=' during preprocessing if the set is guaranteed to contain no more than one element)?
What is the quantitative improvement from this change? For example, restore time reduced from M sec to N sec on K data...
In my tests on a database with 6800 tables and 27200 rows in pg_statistic, the statistics restore time decreased from 30 to 20 minutes.
reworded |
When I run the linters (make lint) I see many messages that are not related to the current patch, but maybe we should fix them in the future?
messages
...
filepath/filepath.go:1: /usr/local/go/src/path/match.go:221:15: DecodeRuneInString not declared by package utf8 (typecheck)
package filepath
utils/io.go:78:51: cannot use v (variable of type []interface{}) as []invalid type value in argument to fmt.Fprintf (typecheck)
bytesWritten, err := fmt.Fprintf(file.Writer, s, v...)
^
history/history.go:1: /usr/local/go/pkg/mod/github.com/mattn/go-sqlite3@v1.14.22/convert.go:26:30: undeclared name: any (typecheck)
package history
toc/toc.go:1: /usr/local/go/src/fmt/errors.go:22:33: undeclared name: any (typecheck)
package toc
arenadata/arenadata_helper.go:1: /usr/local/go/src/strconv/quote.go:439:18: ValidString not declared by package utf8 (typecheck)
package arenadata
options/flag.go:1: /usr/local/go/pkg/mod/github.com/spf13/pflag@v1.0.5/flag.go:910:28: cannot use a (variable of type []interface{}) as []invalid type value in argument to fmt.Errorf (typecheck)
package options
report/report.go:1: /usr/local/go/src/fmt/errors.go:22:33: undeclared name: any (typecheck)
package report
backup/queries_globals.go:227:23: expected '(', found '[' (typecheck)
func GetResourceGroups[T ResourceGroupBefore7 | ResourceGroupAtLeast7](connectionPool *dbconn.DBConn) []T {
^
backup/queries_globals.go:297:20: T (variable of type ResourceGroupBefore7) is not a type (typecheck)
results := make([]T, 0)
^
backup/predata_relations.go:335:23: MaxInt64 not declared by package math (typecheck)
maxVal := int64(math.MaxInt64)
^
backup/predata_relations.go:336:23: MinInt64 not declared by package math (typecheck)
minVal := int64(math.MinInt64)
^
backup/predata_relations.go:344:24: MaxInt16 not declared by package math (typecheck)
maxVal = int64(math.MaxInt16)
^
backup/data.go:181:10: oidMap.Store undefined (type sync.Map has no field or method Store) (typecheck)
oidMap.Store(table.Oid, Unknown)
^
backup/data.go:320:24: oidMap.Load undefined (type sync.Map has no field or method Load) (typecheck)
state, _ := oidMap.Load(table.Oid)
^
helper/backup_helper.go:1: /usr/local/go/src/sync/map.go:47:21: expected ';', found '[' (typecheck)
package helper
restore/data.go:1: /usr/local/go/pkg/mod/github.com/spf13/pflag@v1.0.5/flag.go:910:28: cannot use a (variable of type []interface{}) as []invalid type value in argument to fmt.Errorf (typecheck)
package restore
testutils/functions.go:1: /usr/local/go/pkg/mod/github.com/!greengage!d!b/gp-common-go-libs@v1.0.23/gplog/gplog.go:477:33: cannot use output[1:] (value of type []interface{}) as []invalid type value in argument to fmt.Sprintf (typecheck)
package testutils
options/options.go:395:4: ineffectual assignment to `includeOids` (ineffassign)
includeOids = childOids
^
backup/predata_relations.go:479:3: ineffectual assignment to `section` (ineffassign)
section, entry := view.GetMetadataEntry()
^
backup/queries_acl.go:323:2: ineffectual assignment to `aclCols` (ineffassign)
aclCols := "''"
^
backup/queries_table_defs.go:378:2: ineffectual assignment to `query` (ineffassign)
query := ``
^
helper/restore_helper.go:532:6: ineffectual assignment to `err` (ineffassign)
var err error = nil
^
restore/restore.go:708:2: ineffectual assignment to `err` (ineffassign)
err = errorWriter.Flush()
^
testutils/functions.go:83:4: ineffectual assignment to `host` (ineffassign)
host, _ = operating.System.Hostname()
^
make: *** [Makefile:60: lint] Error 1
make: Leaving directory '/home/gpadmin/src/gpbackup'
I performed some experiments. I created test data (200 tables with 200 attributes and 200 rows each) and ran ANALYZE to fill the stats with the help of this script (thanks chatGPT):
DO $$
DECLARE
schema_name text := 'public';
num_tables int := 200;
num_columns int := 200;
rows_per_table int := 200;
t int;
c int;
tbl_name text;
col_defs text;
sel_list text;
BEGIN
FOR t IN 1..num_tables LOOP
tbl_name := format('wide_%s', t);
-- Build column definitions: c1 int, c2 int, ... cN int
col_defs := '';
FOR c IN 1..num_columns LOOP
col_defs := col_defs || format('c%s int%s', c, CASE WHEN c < num_columns THEN ', ' ELSE '' END);
END LOOP;
-- Drop/create with proper schema qualification
EXECUTE format('DROP TABLE IF EXISTS %I.%I', schema_name, tbl_name);
-- Use explicit distribution to avoid NOTICE; RANDOMLY is fine for tiny data
EXECUTE format('CREATE TABLE %I.%I (%s) DISTRIBUTED RANDOMLY', schema_name, tbl_name, col_defs);
-- Lightweight INSERT using generate_series
sel_list := '';
FOR c IN 1..num_columns LOOP
sel_list := sel_list || format('(gs %% 1000) AS c%s%s', c, CASE WHEN c < num_columns THEN ', ' ELSE '' END);
END LOOP;
EXECUTE format(
'INSERT INTO %I.%I SELECT %s FROM generate_series(1, %s) AS gs',
schema_name, tbl_name, sel_list, rows_per_table
);
-- Collect stats to populate pg_statistic
EXECUTE format('ANALYZE %I.%I', schema_name, tbl_name);
END LOOP;
END$$;

Experiment log:
time psql -d test_db -a -f tasks/ADBDEV-8064/test_tables.sql \
> tasks/ADBDEV-8064/test_tables_output.sql 2>&1
real 0m38.811s
user 0m0.011s
sys 0m0.014s
# SELECT count(*) FROM pg_statistic;
# count
# -------
# 40455
# (1 row)
####################################
# After patch backups and restores
time gpbackup --backup-dir=/home/gpadmin/.data/backup \
--dbname=test_db --with-stats
# ...
20250824:14:01:07 gpbackup:gpadmin:gpdb6u:579944-[INFO]:-Backup Timestamp = 20250824140107
# ...
real 0m7.163s
user 0m3.303s
sys 0m0.671s
time gpbackup --backup-dir=/home/gpadmin/.data/backup \
--dbname=test_db --with-stats --metadata-only
# ...
20250824:14:04:04 gpbackup:gpadmin:gpdb6u:581236-[INFO]:-Backup Timestamp = 20250824140404
# ...
real 0m5.997s
user 0m3.236s
sys 0m0.641s
# full backup restore
time gprestore --timestamp=20250824140107 --backup-dir=/home/gpadmin/.data/backup \
--with-stats --redirect-db=restored_db --create-db
real 9m10.405s
user 0m5.703s
sys 0m4.788s
# metadata only restore
time gprestore --timestamp=20250824140404 --backup-dir=/home/gpadmin/.data/backup \
--with-stats --redirect-db=restored_db --create-db
real 9m23.476s
user 0m4.890s
sys 0m4.906s
###################################################
# Before patch backups and restores
time gpbackup --backup-dir=/home/gpadmin/.data/backup \
--dbname=test_db --with-stats --metadata-only
20250824:14:37:36 gpbackup:gpadmin:gpdb6u:584317-[INFO]:-Backup Timestamp = 20250824143736
real 0m5.912s
user 0m3.222s
sys 0m0.623s
psql -c "drop database restored_db;"
time gprestore --timestamp=20250824143736 --backup-dir=/home/gpadmin/.data/backup \
--with-stats --redirect-db=restored_db --create-db
real 12m9.813s
user 0m5.310s
sys 0m5.067s
###################################################
# After patch restore with before patch backup
psql -c "drop database restored_db;"
time gprestore --timestamp=20250824143736 --backup-dir=/home/gpadmin/.data/backup \
--with-stats --redirect-db=restored_db --create-db
real 9m29.430s
user 0m4.836s
sys 0m4.876s

The speedup is approximately 25% (9m10s, or 9m30s with nested loop, vs 12m10s). Not bad, but can you explain why it takes 9 minutes instead of e.g. 40 seconds?
It looks like there is big potential for speedup. Do we understand why it's still so long? And what can we do about this in the future?
We spend much time on the planning side, as we have a lot of queries to execute. What if we rework the statistics backup to eliminate the separate query for each statistics entry, for example something like the sketch below?
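(A hypothetical illustration only - the file path and the exact column set are assumptions, not something from this patch.) The idea is to back up every statistics row in a single pass, carrying the schema, relation and attribute names so that the restore side can re-resolve the OIDs on the target cluster, and then rewrite pg_statistic with a constant number of statements instead of a DELETE/INSERT pair per attribute:

-- one COPY backs up all statistics rows together with the names needed to
-- re-resolve starelid/staattnum on the restored cluster
COPY (
    SELECT n.nspname AS schema_name,
           c.relname AS table_name,
           a.attname AS attribute_name,
           s.*
    FROM pg_statistic s
    JOIN pg_class     c ON c.oid = s.starelid
    JOIN pg_namespace n ON n.oid = c.relnamespace
    JOIN pg_attribute a ON a.attrelid = s.starelid AND a.attnum = s.staattnum
) TO '/tmp/stat_dump.csv' WITH (FORMAT 'csv', HEADER);
-- the restore side would load this dump, join it back to pg_class/pg_attribute
-- by name to get the new OIDs/attnums, and update pg_statistic in bulk; that
-- part is omitted here, since it needs careful handling of the anyarray columns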
In terms of planning, that should reduce the complexity from O(N) to constant time. Do you see any drawbacks here? What do you think? Of course, that sounds like a big rework of the statistics backup, and maybe it is better to do it as a separate feature...
@whitehawk |
…ested loop guc after statistic restore.
I've spent some time on a quick and dirty experiment - refer to the steps below (the commands use the files funcs.sql and restore_new.sql):
## create test tables (100 tables with 50 columns, and drop the first 3 columns)
psql postgres -f funcs.sql
psql postgres -c "select create_and_analyze_tables_with_drop(100, 50, 3);"
## do backup CASE 1 (current gpbackup approach)
psql postgres -c "select generate_statistic_sql(100, 50, 3);" -t -A -o /tmp/stat_backup_old.sql
## do backup CASE 2 (approach with const number of queries)
rm /tmp/pg_statistic_with_names.csv
psql postgres -c "
COPY (
SELECT
s.*,
a.attname AS attribute_name,
c.relname AS table_name
FROM
pg_statistic s
JOIN
pg_class c ON s.starelid = c.oid
JOIN
pg_attribute a ON s.starelid = a.attrelid AND s.staattnum = a.attnum
WHERE
c.relkind = 'r'
ORDER BY
c.relname,
a.attname
) TO '/tmp/pg_statistic_with_names.csv' WITH (FORMAT 'csv', HEADER, ENCODING 'UTF8');"
##### RESTORE 1
## Recreate cluster
psql postgres -f funcs.sql
psql postgres -c "select create_and_analyze_tables_without_drop(100, 50, 3);"
time psql postgres -f /tmp/stat_backup_old.sql
##### RESTORE 2
## Recreate cluster
psql postgres -f funcs.sql
psql postgres -c "select create_and_analyze_tables_without_drop(100, 50, 3);"
time psql postgres -f restore_new.sql

Results for restore 1 and for restore 2: the improvement looks promising.
hilltracer
left a comment
I agree with the current changes; let's investigate how we can speed up further in a separate task. Also, let's decide what to do with the linter messages next week.
@whitehawk |
@whitehawk |
Is it expected, or is it a separate gpbackup bug?
Yes, this is expected - see lines 181 to 182 in e984e80. There was a comment there that got lost at some point: https://github.com/arenadata/gpbackup/blame/c6e3bc4d235866d11ed13bd0384e6a2205f243df/backup/statistics.go#L69-L73 But perhaps this limitation was relevant for 5X or even an earlier version, since my backup created via COPY was restored without errors.
This reverts commit bbbd801.
This patch fixed inefficient sql when restoring statistics by replacing the IN operator with =. To fix existing backups, the nested loop was enabled when restoring statistics. This was done only for a certain range of patchsets. But it turned out that the binaries that are supplied to customers do not have a patchset number, which is why this optimization is not activated. Therefore, this patch has been reworked in the next commits. This reverts commit bbbd801.
Statistics are restored in 3 stages. In the first step, we update reltuples in
pg_class. In the second step, we delete statistics for a specific attribute in
pg_statistic. And at the last stage, we insert new statistics for this
attribute. At the second stage, to determine the attribute number, it is
searched in pg_attribute. For this, a subquery and the IN operator were used.
This led to an inefficient plan that contains a seq scan on pg_statistic. This,
in turn, can significantly affect the speed of statistical restore.
Fix this by replacing the IN operator with = in the attribute statistics
deletion query. We can be sure that the subquery will return no more than one
row, since pg_attribute has a unique index on the attrelid and attname
attributes.
Since this fix will only affect new backups, a fix for gprestore has also been
added. For backups created by problematic versions of gpbackup, nested loop is
enabled before restoring statistics. This should lead to a plan that uses an
index scan and touches fewer rows.
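For the gprestore side, the workaround for backups produced by affected gpbackup versions boils down to enabling the nested-loop GUC for the session around the statistics statements, roughly as sketched below (a simplified sketch, not the exact code path in gprestore):

-- enable nested loop joins so the old IN-subquery DELETE statements can be
-- planned as a nested loop with an index scan on pg_statistic
SET enable_nestloop TO on;
-- ... execute the per-attribute statistics DELETE/INSERT statements from the backup ...
RESET enable_nestloop;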