This repository was archived by the owner on Oct 2, 2025. It is now read-only.

Fix inefficient sql for restore statistics #130

Merged
KnightMurloc merged 3 commits into master from ADBDEV-8064 on Aug 27, 2025

Conversation


@KnightMurloc KnightMurloc commented Aug 21, 2025

Statistics are restored in 3 stages. In the first step, we update reltuples in
pg_class. In the second step, we delete statistics for a specific attribute in
pg_statistic. And at the last stage, we insert new statistics for this
attribute. At the second stage, to determine the attribute number, it is
searched in pg_attribute. For this, a subquery and the IN operator were used.
This led to an inefficient plan that contains a seq scan on pg_statistic. This,
in turn, can significantly affect the speed of statistical restore.

Fix this by replacing the IN operator with = in the attribute statistics
deletion query. We can be sure that the subquery will return no more than one
row, since pg_attribute has a unique index on the attrelid and attname
attributes.

Since this fix will only affect new backups, a fix for gprestore has also been
added. For backups created by problematic versions of gpbackup, nested loop is
enabled before restoring statistics. This should lead to the use of index scan
with fewer rows.
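
As a rough illustration of the change (the exact query text used by gpbackup may differ, and 'public.foo' / 'bar' are example names), the deletion step goes from an IN subquery to a scalar = comparison; the = form is safe because pg_attribute has a unique index on (attrelid, attname), and it lets the planner pick an index scan on pg_statistic instead of a seq scan:

-- Illustrative only; table and column names are examples
-- Before: IN over a subquery, which produced a seq scan on pg_statistic
DELETE FROM pg_statistic
WHERE starelid = 'public.foo'::regclass::oid
  AND staattnum IN (SELECT attnum FROM pg_attribute
                    WHERE attrelid = 'public.foo'::regclass::oid AND attname = 'bar');

-- After: scalar subquery with =, allowing an index scan on pg_statistic
DELETE FROM pg_statistic
WHERE starelid = 'public.foo'::regclass::oid
  AND staattnum = (SELECT attnum FROM pg_attribute
                   WHERE attrelid = 'public.foo'::regclass::oid AND attname = 'bar');

For backups made by the older gpbackup versions, the gprestore-side workaround presumably amounts to something like SET enable_nestloop = on; issued before the statistics section, so the old IN form still gets an index-friendly plan.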

@whitehawk

This comment was marked as resolved.

@whitehawk

What about opening a separate ticket to improve the planner (for example, try to replace 'IN' with '=' during preprocessing if the set is guaranteed to contain no more than one element)?

@whitehawk

What is the quantitative improvement from the change? For example, restore time reduced from M sec. to N sec. on K data...

@KnightMurloc
Author

What is the quantitative improvement from the change? For example, restore time reduced from M sec. to N sec. on K data...

In my tests on a database with 6800 tables and 27200 rows in pg_statistic, the statistics restore time decreased from 30 to 20 minutes.

@KnightMurloc
Author

Therefore I think this part of the description should be adjusted.

reworded

@hilltracer hilltracer self-requested a review August 22, 2025 11:19

@hilltracer hilltracer left a comment


When I run the linters (make lint) I see many messages not related to the current patch, but maybe we should fix them in the future?

messages
...
filepath/filepath.go:1: /usr/local/go/src/path/match.go:221:15: DecodeRuneInString not declared by package utf8 (typecheck)
package filepath
utils/io.go:78:51: cannot use v (variable of type []interface{}) as []invalid type value in argument to fmt.Fprintf (typecheck)
	bytesWritten, err := fmt.Fprintf(file.Writer, s, v...)
	                                                 ^
history/history.go:1: /usr/local/go/pkg/mod/github.com/mattn/go-sqlite3@v1.14.22/convert.go:26:30: undeclared name: any (typecheck)
package history
toc/toc.go:1: /usr/local/go/src/fmt/errors.go:22:33: undeclared name: any (typecheck)
package toc
arenadata/arenadata_helper.go:1: /usr/local/go/src/strconv/quote.go:439:18: ValidString not declared by package utf8 (typecheck)
package arenadata
options/flag.go:1: /usr/local/go/pkg/mod/github.com/spf13/pflag@v1.0.5/flag.go:910:28: cannot use a (variable of type []interface{}) as []invalid type value in argument to fmt.Errorf (typecheck)
package options
report/report.go:1: /usr/local/go/src/fmt/errors.go:22:33: undeclared name: any (typecheck)
package report
backup/queries_globals.go:227:23: expected '(', found '[' (typecheck)
func GetResourceGroups[T ResourceGroupBefore7 | ResourceGroupAtLeast7](connectionPool *dbconn.DBConn) []T {
                      ^
backup/queries_globals.go:297:20: T (variable of type ResourceGroupBefore7) is not a type (typecheck)
	results := make([]T, 0)
	                  ^
backup/predata_relations.go:335:23: MaxInt64 not declared by package math (typecheck)
	maxVal := int64(math.MaxInt64)
	                     ^
backup/predata_relations.go:336:23: MinInt64 not declared by package math (typecheck)
	minVal := int64(math.MinInt64)
	                     ^
backup/predata_relations.go:344:24: MaxInt16 not declared by package math (typecheck)
			maxVal = int64(math.MaxInt16)
			                    ^
backup/data.go:181:10: oidMap.Store undefined (type sync.Map has no field or method Store) (typecheck)
		oidMap.Store(table.Oid, Unknown)
		       ^
backup/data.go:320:24: oidMap.Load undefined (type sync.Map has no field or method Load) (typecheck)
				state, _ := oidMap.Load(table.Oid)
				                   ^
helper/backup_helper.go:1: /usr/local/go/src/sync/map.go:47:21: expected ';', found '[' (typecheck)
package helper
restore/data.go:1: /usr/local/go/pkg/mod/github.com/spf13/pflag@v1.0.5/flag.go:910:28: cannot use a (variable of type []interface{}) as []invalid type value in argument to fmt.Errorf (typecheck)
package restore
testutils/functions.go:1: /usr/local/go/pkg/mod/github.com/!greengage!d!b/gp-common-go-libs@v1.0.23/gplog/gplog.go:477:33: cannot use output[1:] (value of type []interface{}) as []invalid type value in argument to fmt.Sprintf (typecheck)
package testutils
options/options.go:395:4: ineffectual assignment to `includeOids` (ineffassign)
			includeOids = childOids
			^
backup/predata_relations.go:479:3: ineffectual assignment to `section` (ineffassign)
		section, entry := view.GetMetadataEntry()
		^
backup/queries_acl.go:323:2: ineffectual assignment to `aclCols` (ineffassign)
	aclCols := "''"
	^
backup/queries_table_defs.go:378:2: ineffectual assignment to `query` (ineffassign)
	query := ``
	^
helper/restore_helper.go:532:6: ineffectual assignment to `err` (ineffassign)
	var err error = nil
	    ^
restore/restore.go:708:2: ineffectual assignment to `err` (ineffassign)
	err = errorWriter.Flush()
	^
testutils/functions.go:83:4: ineffectual assignment to `host` (ineffassign)
			host, _ = operating.System.Hostname()
			^
make: *** [Makefile:60: lint] Error 1
make: Leaving directory '/home/gpadmin/src/gpbackup'

@hilltracer

hilltracer commented Aug 24, 2025

I performed some experiments.

I created test data (200 tables with 200 attributes and 200 rows each) and ran ANALYZE to fill the stats, with the help of this script (thanks ChatGPT).

spoiler
DO $$
DECLARE
    schema_name     text := 'public';
    num_tables      int  := 200; 
    num_columns     int  := 200;
    rows_per_table  int  := 200;
    t               int;
    c               int;
    tbl_name        text;
    col_defs        text;
    sel_list        text;
BEGIN
    FOR t IN 1..num_tables LOOP
        tbl_name := format('wide_%s', t);

        -- Build column definitions: c1 int, c2 int, ... cN int
        col_defs := '';
        FOR c IN 1..num_columns LOOP
            col_defs := col_defs || format('c%s int%s', c, CASE WHEN c < num_columns THEN ', ' ELSE '' END);
        END LOOP;

        -- Drop/create with proper schema qualification
        EXECUTE format('DROP TABLE IF EXISTS %I.%I', schema_name, tbl_name);

        -- Use explicit distribution to avoid NOTICE; RANDOMLY is fine for tiny data
        EXECUTE format('CREATE TABLE %I.%I (%s) DISTRIBUTED RANDOMLY', schema_name, tbl_name, col_defs);

        -- Lightweight INSERT using generate_series
        sel_list := '';
        FOR c IN 1..num_columns LOOP
            sel_list := sel_list || format('(gs %% 1000) AS c%s%s', c, CASE WHEN c < num_columns THEN ', ' ELSE '' END);
        END LOOP;

        EXECUTE format(
            'INSERT INTO %I.%I SELECT %s FROM generate_series(1, %s) AS gs',
            schema_name, tbl_name, sel_list, rows_per_table
        );

        -- Collect stats to populate pg_statistic
        EXECUTE format('ANALYZE %I.%I', schema_name, tbl_name);
    END LOOP;
END$$;

Experiment log:

spoiler
time psql -d test_db -a -f tasks/ADBDEV-8064/test_tables.sql \
> tasks/ADBDEV-8064/test_tables_output.sql 2>&1

real	0m38.811s
user	0m0.011s
sys	0m0.014s

# SELECT count(*) FROM pg_statistic;
# count 
# -------
#  40455
# (1 row)

####################################
# After patch backups and restores

time gpbackup --backup-dir=/home/gpadmin/.data/backup \
--dbname=test_db --with-stats
# ...
20250824:14:01:07 gpbackup:gpadmin:gpdb6u:579944-[INFO]:-Backup Timestamp = 20250824140107
# ...
real	0m7.163s
user	0m3.303s
sys	0m0.671s

time gpbackup --backup-dir=/home/gpadmin/.data/backup \
--dbname=test_db --with-stats --metadata-only
# ...
20250824:14:04:04 gpbackup:gpadmin:gpdb6u:581236-[INFO]:-Backup Timestamp = 20250824140404
# ...
real	0m5.997s
user	0m3.236s
sys	0m0.641s

# full backup restore
time gprestore --timestamp=20250824140107 --backup-dir=/home/gpadmin/.data/backup \
--with-stats --redirect-db=restored_db --create-db

real	9m10.405s
user	0m5.703s
sys	0m4.788s

# metadata only restore
time gprestore --timestamp=20250824140404 --backup-dir=/home/gpadmin/.data/backup \
--with-stats --redirect-db=restored_db --create-db

real	9m23.476s
user	0m4.890s
sys	0m4.906s

###################################################
# Before patch backups and restores
time gpbackup --backup-dir=/home/gpadmin/.data/backup \
--dbname=test_db --with-stats --metadata-only
20250824:14:37:36 gpbackup:gpadmin:gpdb6u:584317-[INFO]:-Backup Timestamp = 20250824143736

real	0m5.912s
user	0m3.222s
sys	0m0.623s

psql -c "drop database restored_db;"
time gprestore --timestamp=20250824143736 --backup-dir=/home/gpadmin/.data/backup \
--with-stats --redirect-db=restored_db --create-db

real	12m9.813s
user	0m5.310s
sys	0m5.067s

###################################################
# After patch restore with before patch backup
psql -c "drop database restored_db;"
time gprestore --timestamp=20250824143736 --backup-dir=/home/gpadmin/.data/backup \
--with-stats --redirect-db=restored_db --create-db

real	9m29.430s
user	0m4.836s
sys	0m4.876s

The speedup is approximately 25% (9m10s, or 9m30s with Nested Loop, vs 12m10s).

Not bad, but can you explain why it takes 9 minutes instead of, e.g., 40 seconds?

  • create tables, add data, and ANALYZE - ~40 sec
  • backup data and stats - 7 sec
  • restore stats - 9 min (before the patch it was 12 min)

It looks like there is a big potential for speedup. Do we understand why it still takes so long? And what can we do about this in the future?

@whitehawk

@KnightMurloc @hilltracer

We spend a lot of time on the planning side, as we have a lot of queries to execute.

What if we rework the statistics backup in order to eliminate the individual query for each statistics entry, for example something like:

  1. Dump statistics data into a separate data file ('.csv' for example). It should contain all fields from pg_statistic plus attname (as we will need to recalculate staattnum during restore). For restore, use the following steps:
  2. Create a temp table (let's call it temp_pg_statistic) populated by COPY FROM the csv file.
  3. Update staattnum in temp_pg_statistic by extracting the right value of attnum from pg_attribute by attrelid and attname (see the sketch after this list).
  4. Delete all values from pg_statistic:
DELETE FROM pg_statistic
WHERE (starelid, staattnum) IN (
    SELECT starelid, staattnum
    FROM temp_pg_statistic
);
  5. Insert new values into pg_statistic from temp_pg_statistic:
INSERT INTO pg_statistic (column1, column2, ...)
SELECT column1, column2, ...
FROM temp_pg_statistic;
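
A minimal sketch of step 3, assuming temp_pg_statistic already contains the copied rows plus an attname column (names here are illustrative):

-- Recompute staattnum from the restored catalog, matching by relation OID and attribute name
UPDATE temp_pg_statistic t
SET staattnum = a.attnum
FROM pg_attribute a
WHERE a.attrelid = t.starelid
  AND a.attname = t.attname;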

In terms of planning, that should reduce the complexity from O(N) to constant time.

Do you see any drawbacks here? What do you think?

Of course, that sounds like a big rework of the statistics backup, and maybe it is better to do it as a separate feature...

@KnightMurloc
Author

@whitehawk
I tried to implement statistics restore using a prepared statement to remove planning. But in my tests, it gave only about a 10% increase in speed. The insert becomes the largest bottleneck in this case. Implementation via COPY is likely to improve the situation, but it requires significant reworking of the backup and restore process of statistics.
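
For reference, a minimal sketch of what a prepared form of the deletion step could look like (names are illustrative; this is not the exact code that was tested):

PREPARE delete_attr_stats (oid, name) AS
    DELETE FROM pg_statistic
    WHERE starelid = $1
      AND staattnum = (SELECT attnum FROM pg_attribute
                       WHERE attrelid = $1 AND attname = $2);

EXECUTE delete_attr_stats('public.wide_1'::regclass::oid, 'c1');

Each EXECUTE then reuses the plan prepared once per session, which removes the per-query planning but, as noted above, leaves the insert as the dominant cost.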

@whitehawk

whitehawk commented Aug 26, 2025

@whitehawk I tried to implement statistics restore using a prepared statement to remove planning. But in my tests, it gave only about a 10% increase in speed. The insert becomes the largest bottleneck in this case. Implementation via COPY is likely to improve the situation, but it requires significant reworking of the backup and restore process of statistics.

I've spent some time on a quick and dirty experiment - refer to the steps below (the commands use the files funcs.sql and restore_new.sql):

## create test tables (100 tables with 50 columns, and drop first 3 columns)
psql postgres -f funcs.sql

psql postgres -c "select create_and_analyze_tables_with_drop(100, 50, 3);"

## do backup CASE 1 (current gpbackup approach)
psql postgres -c "select generate_statistic_sql(100, 50, 3);" -t -A -o /tmp/stat_backup_old.sql


## do backup CASE 2 (approach with const number of queries)
rm /tmp/pg_statistic_with_names.csv
psql postgres -c "
COPY (
    SELECT
        s.*,
        a.attname AS attribute_name,
        c.relname AS table_name
    FROM
        pg_statistic s
    JOIN
        pg_class c ON s.starelid = c.oid
    JOIN
        pg_attribute a ON s.starelid = a.attrelid AND s.staattnum = a.attnum
    WHERE
        c.relkind = 'r'
    ORDER BY
        c.relname,
        a.attname
) TO '/tmp/pg_statistic_with_names.csv' WITH (FORMAT 'csv', HEADER, ENCODING 'UTF8');"


##### RESTORE 1
## Recreate cluster
psql postgres -f funcs.sql
psql postgres -c "select create_and_analyze_tables_without_drop(100, 50, 3);"
time psql postgres -f /tmp/stat_backup_old.sql

##### RESTORE 2
## Recreate cluster
psql postgres -f funcs.sql
psql postgres -c "select create_and_analyze_tables_without_drop(100, 50, 3);"
time psql postgres -f restore_new.sql

results for restore 1:

real    2m46,507s
user    0m0,798s
sys     0m0,902s

and for restore 2:

real    0m1,018s
user    0m0,004s
sys     0m0,007s

Improvement looks promising.


@hilltracer hilltracer left a comment


I agree with the current changes; let's investigate how we can speed things up further in a separate task. Also, let's decide what we will do with the linter messages next week.

@KnightMurloc
Author

@whitehawk
In my database, the second method did not show such a large increase in speed. It took 27 minutes to restore the statistics, of which 24 were spent on the COPY from the file. The size of the statistics backup also increased significantly, from 951M created by gpbackup to 11G created via COPY.

@KnightMurloc
Author

@whitehawk
It seems that gpbackup does not back up statistics for array columns. One column in my table is an array, and its statistics accounted for most of the volume in the case of the COPY backup.

@whitehawk

@whitehawk It seems that gpbackup does not back up statistics for array columns. One column in my table is an array, and its statistics accounted for most of the volume in the case of the COPY backup.

Is it expected, or is it a separate gpbackup bug?

@KnightMurloc
Author

KnightMurloc commented Aug 27, 2025

Is it expected, or is it a separate gpbackup bug?

Yes, this is expected.

if len(attStat.Type) > 1 && attStat.Type[0] == '_' && attStat.Type[1] != '_' {
attributeQuery = `0::smallint,

There was a comment that got lost at some point.
https://github.com/arenadata/gpbackup/blame/c6e3bc4d235866d11ed13bd0384e6a2205f243df/backup/statistics.go#L69-L73
But perhaps this limitation was relevant for 5X or even an earlier version, since my backup created via COPY was restored without errors.
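
For context, array types in pg_type are conventionally named with a single leading underscore (for example, _int4 is the array type of int4), which is what the quoted check keys on; an illustrative query:

-- Show the '_'-prefixed array types for a couple of element types
SELECT t.typname AS elem_type, a.typname AS array_type
FROM pg_type t
JOIN pg_type a ON a.oid = t.typarray
WHERE t.typname IN ('int4', 'text');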

UPD:
It seems that even though the data for this column was correctly loaded from the file into the temporary table, when it is inserted into pg_statistic, the data for that column is still not restored.

@KnightMurloc KnightMurloc merged commit bbbd801 into master Aug 27, 2025
2 checks passed
@KnightMurloc KnightMurloc deleted the ADBDEV-8064 branch August 27, 2025 12:04
KnightMurloc added a commit that referenced this pull request Sep 18, 2025
This patch fixed inefficient sql when restoring statistics by replacing the IN
operator with =. To fix existing backups, the nested loop was enabled when
restoring statistics. This was done only for a certain range of patchsets. But
it turned out that the binaries that are supplied to customers do not have a
patchset number, which is why this optimization is not activated. Therefore,
this patch has been reworked in the next commits.

This reverts commit bbbd801.
KnightMurloc added a commit that referenced this pull request Sep 22, 2025
This patch fixed inefficient sql when restoring statistics by replacing the IN
operator with =. To fix existing backups, the nested loop was enabled when
restoring statistics. This was done only for a certain range of patchsets. But
it turned out that the binaries that are supplied to customers do not have a
patchset number, which is why this optimization is not activated. Therefore,
this patch has been reworked in the next commits.

This reverts commit bbbd801.