Perform unlogged index reinitialization in index_open #264
I think we need to step back and think harder how we can fix unlogged tables. Currently, we're trying to make it OK for the unlogged relation to not exist, and silence errors caused by that. That feels like an unnatural thing to do and an uphill battle, because that's not how unlogged tables work in PostgreSQL. In PostgreSQL, at server startup we scan the data directory, and initialize every unlogged relation from the "init fork". In other words, we scan the data directory searching for files with the |
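The startup behavior described above can be sketched roughly as follows. This is a simplified illustration in Python, not PostgreSQL's actual C code: the function name `reset_unlogged_relations` is hypothetical, and it assumes the usual on-disk layout in which a relation's init fork is a file named `<relfilenode>_init` and its other forks share the same numeric prefix.

```python
import os
import re
import shutil

# Assumption: init forks are files named "<relfilenode>_init".
INIT_FORK_RE = re.compile(r"^(\d+)_init$")
# Any fork or segment of a relation: "<rel>", "<rel>.1", "<rel>_fsm", "<rel>_vm.1", ...
REL_FILE_RE = re.compile(r"^(\d+)(?:_([a-z]+))?(?:\.\d+)?$")


def reset_unlogged_relations(db_dir):
    """Reinitialize every relation in db_dir that has an init fork.

    Hypothetical helper: removes all non-init forks of such relations,
    then copies the init fork over the main fork.
    """
    names = os.listdir(db_dir)

    # Pass 1: find every relation that has an init fork.
    relnodes = set()
    for name in names:
        m = INIT_FORK_RE.match(name)
        if m:
            relnodes.add(m.group(1))

    # Pass 2: remove stale non-init forks of those relations.
    for name in names:
        m = REL_FILE_RE.match(name)
        if m and m.group(1) in relnodes and m.group(2) != "init":
            os.remove(os.path.join(db_dir, name))

    # Pass 3: recreate the main fork from the init fork.
    for rel in relnodes:
        shutil.copyfile(os.path.join(db_dir, rel + "_init"),
                        os.path.join(db_dir, rel))
    return sorted(relnodes)
```

Relations without an init fork are left untouched, which is why only unlogged relations are affected.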
Also, we need a test for this |
First of all, that is not exactly true. Postgres does not reset unlogged relations on every startup: it does so only in case of recovery. I think we need to take even more steps back and try to understand why we (actually not we, but Neon users) may want to use unlogged tables at all. Certainly the key point is performance: no WAL overhead. In Neon the performance advantage will be even bigger, because unlogged tables are actually local tables and their access time is the same as with standalone Postgres. There is one main drawback of unlogged tables in vanilla Postgres: they do not survive failures. This leads to two main use cases of unlogged tables in vanilla Postgres:
In case of Neon there are two more drawbacks of unlogged tables:
Both items effectively mean that it is almost impossible to use unlogged tables for more or less large data sources. You cannot load data for several hours just to lose it in 5 minutes. Using an unlogged table for efficient transformation into a normal table may make some sense, because we can build indexes without any access to the pageserver. But taking into account that the data has to fit in local storage, it seems easier and more efficient to use this space for the local file cache, which can help reach the same goal (eliminating get_page overhead) without these tricks with unlogged->logged transformation. My opinion is that unlogged tables in their current state in Neon are almost useless. I do not suggest just to ignore |
It should be fine. The whole relation directory is stored as one key-value pair in the pageserver storage. It's quick to scan through. Would you like to write a performance test for that, to verify? (If you have a large number of relations, the way we store the relation directory as one giant key-value pair might be a performance issue, but that's a separate issue.) |
Here's a proof-of-concept of a fix with that approach: https://github.com/neondatabase/neon/pull/new/fix-unlogged-in-basebackup. Need to add tests and #262. But you can use it for performance testing already |
@arssher just opened a PR for that: neondatabase/neon#3706 :-) |
Instead of trying to create missing files on the way, send init fork contents as main fork from pageserver during basebackup. Add test for that. Call put_rel_drop for init forks; previously they weren't removed. Bump vendor/postgres to revert previous approach on Postgres side. Co-authored-by: Arseny Sher <sher-ars@yandex.ru> ref neondatabase/postgres#264 ref neondatabase/postgres#259 ref #1222
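The idea in this commit message, sending init fork contents as the main fork during basebackup, can be illustrated with a small sketch. It is written in Python for brevity and is not the pageserver's actual code: `basebackup_files`, `fork_file_name`, and the dictionary standing in for pageserver storage are all hypothetical. Keeping the init fork file itself in the output, and skipping any other stored fork of an unlogged relation, are assumptions of this sketch rather than details confirmed by the commit.

```python
MAIN_FORK = "main"
INIT_FORK = "init"


def fork_file_name(relnode, fork):
    """File name for a relation fork: main fork has no suffix."""
    return str(relnode) if fork == MAIN_FORK else f"{relnode}_{fork}"


def basebackup_files(stored_forks):
    """Build the file set for a basebackup.

    stored_forks: dict mapping (relnode, fork) -> bytes, a hypothetical
    stand-in for what the pageserver has stored.
    """
    # A relation with an init fork is treated as unlogged.
    unlogged = {rel for (rel, fork) in stored_forks if fork == INIT_FORK}

    out = {}
    for (rel, fork), data in stored_forks.items():
        if rel in unlogged and fork != INIT_FORK:
            # Assumption: other forks of an unlogged relation are not sent.
            continue
        out[fork_file_name(rel, fork)] = data
        if fork == INIT_FORK:
            # Send the init fork contents as the main fork too, so the
            # compute node never sees a missing main file.
            out[fork_file_name(rel, MAIN_FORK)] = data
    return out
```

With this approach the compute node starts with a valid, empty main fork for each unlogged relation, so nothing has to be created lazily when the relation is opened.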