Rebase #1 (Merged)

ankitml merged 22 commits into ankitml:master from reorg:master on Feb 28, 2025

Conversation

@ankitml (Owner) commented Feb 28, 2025

No description provided.

Melkij and others added 22 commits on April 8, 2024, 20:42
No actual code changes. Just a small tweak in the tests to support NULL attstattarget: https://www.postgresql.org/message-id/flat/4da8d211-d54d-44b9-9847-f2a9f1184c76@eisentraut.org
Currently pg_repack calls `PQconnectdb()` to connect to a database. The drawback is that a full connection string must be assembled, and all unexpected characters in it must be escaped.
pg_repack already has separate options for the connection parameters (host, port, dbname, etc.), so it isn't necessary to use `PQconnectdb()`. A more robust approach is `PQconnectdbParams()`, which accepts the connection parameters as two arrays (keywords and values) and doesn't require escaping the value strings.
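
For illustration, a minimal libpq sketch of the two-array calling convention (the connection values here are hypothetical, not pg_repack's actual connection code):

```c
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
	/* Parallel keyword/value arrays, NULL-terminated; a value such as a
	 * dbname containing spaces needs no escaping here, unlike in a
	 * PQconnectdb() connection string. */
	const char *keywords[] = {"host", "port", "dbname", "user", NULL};
	const char *values[]   = {"localhost", "5432", "my db", "repack", NULL};

	PGconn *conn = PQconnectdbParams(keywords, values, 0 /* expand_dbname */);

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		PQfinish(conn);
		return EXIT_FAILURE;
	}
	PQfinish(conn);
	return EXIT_SUCCESS;
}
```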
- Add a new apply-count command line parameter to allow tuning of the REDO apply_count variable.
- Check on startup that switch_threshold is not greater than or equal to apply_count, which would cause REDO to stop after the first apply_log call (fwiw, this bug also exists in 1.5.0, since the switch_threshold parameter was added but was not checked against the default APPLY_COUNT of 1000); a sketch of this check appears after this list.
- Tuning apply_count might be useful for improving REDO performance on tables with small tuples that receive lots of updates.
- Fixes #393
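
For concreteness, a hypothetical sketch of that startup validation (the function and variable names are illustrative, not necessarily pg_repack's):

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: reject settings where REDO would stop after the
 * first apply_log() call, i.e. switch_threshold >= apply_count. */
static void
validate_apply_options(int switch_threshold, int apply_count)
{
	if (switch_threshold >= apply_count)
	{
		fprintf(stderr,
				"ERROR: switch_threshold (%d) must be less than apply_count (%d)\n",
				switch_threshold, apply_count);
		exit(1);
	}
}
```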
…ion `--only-indexes` (#398)

When the option --only-indexes is used, it doesn't make sense to repack indexes of a declaratively partitioned table, since such parent tables don't store data themselves. Therefore we can just ignore parent tables within the function repack_all_indexes().

Issue: #389
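
A minimal sketch of the filter (the helper name is hypothetical; pg_repack's actual check may be shaped differently):

```c
#include <stdbool.h>

/* Illustrative sketch: in repack_all_indexes(), skip declaratively
 * partitioned parent tables, which store no data themselves.
 * 'p' is the pg_class.relkind value for a partitioned table. */
static bool
is_repackable_index_parent(char relkind)
{
	return relkind != 'p';		/* RELKIND_PARTITIONED_TABLE */
}
```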
Previously, while swapping the target table and its temp table, we
used to acquire an AccessExclusiveLock on only the target table but
not on its temp table. So the following scenario could happen:

1. Run (auto) vacuum on the temp table.
2. Swap the target table's relfilenode and the temp table's
   relfilenode, and commit.
3. Run (auto) vacuum on the target table.

Steps 1 and 3 run concurrently on different tables, but these tables
use the same relfilenode. In the reported case, since concurrent
changes (INSERT/UPDATE/DELETE) can also run at the same time, the
second vacuum ended up removing newly added tuples.

To fix the issue, this commit changes the pg_repack command so that it
acquires an AccessExclusiveLock on the temp table as well before
swapping these tables.

Issue: #399
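
Conceptually the fix is to lock both relations up front; a hedged libpq-level sketch (the function name and SQL construction are illustrative, not pg_repack's exact code):

```c
#include <stdio.h>
#include <libpq-fe.h>

/* Illustrative sketch: acquire AccessExclusiveLock on both the target
 * table and its temp table in a single statement, so no (auto)vacuum can
 * process either relation while their relfilenodes are being swapped. */
static int
lock_tables_for_swap(PGconn *conn, const char *target, const char *temp)
{
	char		sql[1024];
	PGresult   *res;
	int			ok;

	snprintf(sql, sizeof(sql),
			 "LOCK TABLE %s, %s IN ACCESS EXCLUSIVE MODE", target, temp);
	res = PQexec(conn, sql);
	ok = (PQresultStatus(res) == PGRES_COMMAND_OK);
	PQclear(res);
	return ok;
}
```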
In lock_access_share() function, before acquiring an AccessShareLock
on the target table, concurrent DDL commands are terminated
unconditionally. However, there is a window between this termination
and the actual "LOCK TABLE ... IN ACCESS SHARE MODE" operation,
allowing other DDL commands to run on the target table. If the LOCK
TABLE failed due to a lock conflict, the transaction was rolled back
and LOCK TABLE was retried, but the command could not be executed
again because the transaction had already been closed.

This commit fixes the issue by using SAVEPOINT and ROLLBACK TO
SAVEPOINT so that we can retry taking the lock in the same top-level
transaction.

Issue #383
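
A hedged libpq-level sketch of that savepoint-based retry (names are illustrative; per the description above, the real code also terminates conflicting DDL before locking):

```c
#include <stdio.h>
#include <libpq-fe.h>

/* Illustrative sketch: wrap LOCK TABLE in a savepoint so a lock-conflict
 * failure can be rolled back to the savepoint and retried within the
 * same top-level transaction. */
static int
lock_access_share_with_retry(PGconn *conn, const char *table, int max_retries)
{
	int			i;

	for (i = 0; i < max_retries; i++)
	{
		PGresult   *res;
		char		sql[512];

		res = PQexec(conn, "SAVEPOINT repack_sp");
		PQclear(res);

		snprintf(sql, sizeof(sql),
				 "LOCK TABLE %s IN ACCESS SHARE MODE", table);
		res = PQexec(conn, sql);
		if (PQresultStatus(res) == PGRES_COMMAND_OK)
		{
			PQclear(res);
			return 1;			/* lock acquired */
		}
		PQclear(res);

		/* Undo only the failed LOCK TABLE; the top-level transaction
		 * (and any work already done in it) stays open. */
		res = PQexec(conn, "ROLLBACK TO SAVEPOINT repack_sp");
		PQclear(res);
	}
	return 0;
}
```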
This PR is a follow-up of #157.
- Fixed the typo: #157 (comment)
- Removed the `PG_VERSION_NUM >= 90300` check since we don't support such old versions anyway
- Added the check `relform1->relkind != RELKIND_INDEX`, since `relfrozenxid` and `relminmxid` are non-zero only for tables

---------

Co-authored-by: Alexey Bashtanov <alexey@brandwatch.com>
Fix the possibility of two vacuums concurrently processing the same relfilenode.
…ss_share

Use savepoints when retrying to take AccessShareLock.
Mostly a list of changes; the new version number was already committed earlier. I also noticed another place (in META.json) where the minimum version of PostgreSQL needed to be updated.
prepare changes for new release 1.5.1
Commit 0cecc908e9749101b5e93ba58d76a62c9f226f9e in PostgreSQL changed the signature of the LockHeldByMe function, which caused the pg_repack build to fail.

Actually, the orstronger argument is not important for us, since we need the strictest lock, AccessExclusiveLock. But the code can be simplified a little by using the more appropriate new CheckRelationOidLockedByMe function.
fix LockHeldByMe compilation error against pg17
Co-authored-by: Pavel Seleznev <PNSeleznev@sberbank.ru>
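
For illustration, a version-portable check might look like this (a sketch under the assumption, stated above, that only AccessExclusiveLock matters; not necessarily pg_repack's exact code):

```c
#include "postgres.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
#include "storage/lock.h"

/* Illustrative sketch: "do I already hold AccessExclusiveLock on this
 * relation?". Since AccessExclusiveLock is the strictest mode, the new
 * orstronger argument is irrelevant here. */
static bool
held_access_exclusive(Oid relid)
{
#if PG_VERSION_NUM >= 170000
	/* PG 17+: use the purpose-built helper. */
	return CheckRelationOidLockedByMe(relid, AccessExclusiveLock, false);
#else
	LOCKTAG		tag;

	SET_LOCKTAG_RELATION(tag, MyDatabaseId, relid);
	return LockHeldByMe(&tag, AccessExclusiveLock);
#endif
}
```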
The option --no-superuser-check allows bypassing the check that the user is a superuser. That was done for users who run pg_repack on Amazon, where they cannot run it as a superuser.
The problem is that the option --no-superuser-check works only at the CLI level, skipping the check only in the pg_repack client. But there are also checks done by the extension functions exported to SQL.

The PR removes the superuser check from the functions repack_trigger() and repack_apply(). That check is redundant, since the queries executed by those functions can be executed by a user manually. Moreover, repack_trigger() is a SECURITY DEFINER function, which means it is executed with the privileges of its owner, a superuser (the pg_repack extension can be created only by a superuser).

The PR changes the privilege check in the functions repack_swap(), repack_drop() and repack_index_swap(). Now those functions can be run by the owner of a table. That check is necessary, since the _swap functions swap relfilenodes in the pg_class system catalog. repack_drop() acquires an ACCESS EXCLUSIVE lock and therefore also requires a privilege check.
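
A sketch of what such an owner-level privilege check can look like in a C extension function (the helper is hypothetical, not pg_repack's exact code; PostgreSQL's ownercheck functions already treat superusers as passing):

```c
#include "postgres.h"
#include "miscadmin.h"
#include "utils/acl.h"
#if PG_VERSION_NUM >= 160000
#include "catalog/pg_class_d.h"
#endif

/* Illustrative sketch: error out unless the current user owns the table
 * (or is a superuser, which the ownercheck functions allow implicitly). */
static void
must_be_table_owner(Oid relid, const char *relname)
{
#if PG_VERSION_NUM >= 160000
	if (!object_ownercheck(RelationRelationId, relid, GetUserId()))
#else
	if (!pg_class_ownercheck(relid, GetUserId()))
#endif
		ereport(ERROR,
				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
				 errmsg("must be owner of table %s", relname)));
}
```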
Skipping repacking of invalid indexes could lead to corrupting those
invalid indexes. In some rare cases, this might cause inserts to
tables to fail, as changes to a table are continuously applied to its
invalid indexes, and updates to corrupted invalid indexes might fail
the checks done during insertion.
Therefore, by default, do not repack a table with invalid indexes;
repack it only if the user explicitly specifies --no-error-on-invalid-index.
Just add release notes. The version was already bumped within META.json
before.
…440)

When using --only-indexes, pg_repack uses a different approach to repack
indexes: it runs a CREATE INDEX CONCURRENTLY command to create a
temporary index and swaps the target index and the temporary index
afterwards.
That approach didn't use any cleanup callbacks, and therefore in case of
an error the temporary indexes wouldn't be cleaned up.

The commit adds the callback repack_cleanup_index.
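
As a sketch of the pattern (the pgut_atexit_push() registration API is an assumption based on pg_repack's pgut utility layer; the callback body here is illustrative only):

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed callback shape for pg_repack's pgut exit-handler layer. */
typedef void (*pgut_atexit_callback) (bool fatal, void *userdata);

extern void pgut_atexit_push(pgut_atexit_callback callback, void *userdata);

/* Illustrative sketch: on an error exit, the temporary index created by
 * CREATE INDEX CONCURRENTLY should be dropped so it does not linger. */
static void
repack_cleanup_index(bool fatal, void *userdata)
{
	const char *temp_index = (const char *) userdata;

	fprintf(stderr, "dropping leftover temporary index %s (fatal=%d)\n",
			temp_index, (int) fatal);
}

/* Registered before the concurrent index build starts, e.g.:
 *     pgut_atexit_push(repack_cleanup_index, temp_index_name);
 */
```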
@ankitml ankitml merged commit 563cdb1 into ankitml:master on Feb 28, 2025