Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ERROR in repack.repack_swap() if CIC creates a new index on the original table after rebuild_indexes() on new table #420

Open
JohnHien opened this issue Sep 1, 2024 · 2 comments
Milestone

Comments

@JohnHien
Copy link

JohnHien commented Sep 1, 2024

In repack_one_table(), the rebuild_indexes() function is called to rebuild the indexes on the new table and the CREATE INDEX commands are from the index definition on the original table.
What if a new index is created on the original table after the rebuild_indexes() function called? I think this new index on the original table will not be created in the new table.

To prove it, I modified the code in repack_one_table() to sleep for 20s after calling rebuild_indexes(). When the pg_repack client was sleeping, I attached to it with gdb, so the client paused and I had time to create the new index.
I executed the CIC command and a new index test_a_idx was created on the original table. This is possible because only an ACCESS SHARE lock is held on the table now.

postgres=# create index concurrently on test(a);
CREATE INDEX
postgres=# \d+ test
                                          Table "public.test"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           | not null |         | plain   |             |              |
Indexes:
    "test_pkey" PRIMARY KEY, btree (a)
    "test_a_idx" btree (a)
Triggers firing always:
    repack_trigger AFTER INSERT OR DELETE OR UPDATE ON test FOR EACH ROW EXECUTE FUNCTION repack.repack_trigger('a')
Access method: heap

Then I executed 'c' to continue in gdb and I got the error in repack_swap():

ERROR: query failed: ERROR:  relation "repack.index_16577" does not exist
CONTEXT:  SQL statement "SELECT X.oid, Y.oid  FROM pg_catalog.pg_index I,       pg_catalog.pg_class X,       pg_catalog.pg_class Y WHERE I.indrelid = $1   AND I.indexrelid = X.oid   AND I.indisvalid   AND Y.oid = ('repack.index_' || X.oid)::regclass"
DETAIL: query was: SELECT repack.repack_swap($1)

repack_swap() assumes that all the indexes on the original table are already created on the new table with name index_, but in this case the new index test_a_idx with oid 16577 was not created on the new table because it was created after rebuild_indexes().

postgres=# select relname from pg_class where oid = 16577;
  relname
------------
 test_a_idx
(1 row)

I know this is a corner case and I am not sure whether we should fix it.

@JohnHien JohnHien changed the title ERROR in repack.repack_swap() if CIC creates a new index on the original table after rebuild_indexes() in new table ERROR in repack.repack_swap() if CIC creates a new index on the original table after rebuild_indexes() on new table Sep 2, 2024
@za-arthur
Copy link
Collaborator

I think this would require refactoring of repack_one_table(). And this issue looks related to the issue #421.

New algorithm can look like this:
1 - get list of tables
2 - call repack_one_table()

  • lock the table
  • get information about the table, like the list of columns, indexes, etc.
  • do the rest

Now the issue is that ACCESS SHARE lock (as currently) won't prevent from creating new indexes. CREATE INDEX acquires SHARE lock and CREATE INDEX CONCURRENTLY acquires SHARE UPDATE EXCLUSIVE.
If pg_repack will acquire SHARE lock then it will also lock all other writes to the table (like INSERT, UPDATE), which is not acceptible
If pg_repack will acquire SHARE UPDATE EXCLUSIVE lock then it seems it will prevent from both CREATE INDEX and CIC (according to the documentation):

Conflicts with the SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE lock modes. This mode protects a table against concurrent schema changes and VACUUM runs.

So SHARE UPDATE EXCLUSIVE lock might be suitable lock. But it will also lock VACUUM and ANALYZE on the table.

Thoughts?

@andreasscherbaum
Copy link
Collaborator

SHARE UPDATE EXCLUSIVE is a rather big change. On first glance I don't see that it breaks other DML operations, but it certainly can break other workflows.

If we implement this, this deserves a version number change, and a longer beta test where we hear back from users if it works for them.

@za-arthur za-arthur added this to the 1.6.0 milestone Sep 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants