Restore indexes from backup with the original partitioning #7589

Merged: 8 commits, Aug 21, 2024

jepett0 (Collaborator) commented Aug 8, 2024

#7608

This PR enables restoring indices from backups with the original partition split boundaries.

This feature is mainly needed to speed up restoration from backups. In my tests (TPC-H lineitem table at 156x scale on an 8-node cluster with the cpu80_soc2_mem512G_net25G_4ssd preset), restoring a 150 GiB table with a single 100 GiB index, it cut the duration of the BuildIndexes stage of the import/s3 operation from 829 seconds to 568 seconds, a 31% reduction in BuildIndexes time! 🎉 Total restore time went from 1259 seconds to 995 seconds, a 21% reduction.
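
For readers who want to re-derive the percentages, here is a minimal arithmetic check in Python. It only uses the four durations quoted above; the function and variable names are made up for illustration and are not part of YDB.

```python
def reduction_percent(before_s: float, after_s: float) -> float:
    """Relative reduction of a duration, in percent of the original value."""
    return (before_s - after_s) / before_s * 100.0


# Durations in seconds, taken from the measurements quoted in this PR.
build_indexes = reduction_percent(829, 568)   # BuildIndexes stage only
total_restore = reduction_percent(1259, 995)  # whole restore from backup

print(f"BuildIndexes: {build_indexes:.1f}% faster")
print(f"Total restore: {total_restore:.1f}% faster")
```

The absolute savings are 261 seconds for BuildIndexes and 264 seconds overall, so nearly all of the total improvement comes from the BuildIndexes stage.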

The C++ SDK is also changed slightly so that users can create a table with an index that has specific partitioning settings and either a uniform partition count or explicit split boundaries. Letting TTableBuilder add an index from its description makes it as capable as session.AlterTable already was. The added test shows how this can be useful.
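
As a conceptual aid only, the sketch below shows what "explicit split boundaries" mean for a range-partitioned table. It is plain Python, not the YDB SDK API. The function name, the sample boundaries, and the convention that a boundary key is the first key of the next partition are all assumptions made for illustration.

```python
from bisect import bisect_right


def partition_for_key(key: int, split_boundaries: list[int]) -> int:
    """Return the index of the partition that would hold `key`.

    Assumption for this sketch: boundaries are sorted, and each boundary
    is the first key of the following partition, so N boundaries define
    N + 1 partitions.
    """
    return bisect_right(split_boundaries, key)


# Hypothetical boundaries: 3 split points give 4 partitions.
boundaries = [100, 200, 300]

for key in (50, 100, 250, 999):
    print(key, "->", partition_for_key(key, boundaries))
```

A uniform partition count, by contrast, only states how many partitions to create and leaves the placement of the boundaries to the system.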

This comment was marked as outdated.

jepett0 force-pushed the IndexBackupRestore.1 branch from 7abf265 to 781ca4f on August 9, 2024 09:56

jepett0 requested review from MBkkt and ijon on August 9, 2024 10:57
jepett0 marked this pull request as ready for review on August 9, 2024 10:57
jepett0 force-pushed the IndexBackupRestore.1 branch from 781ca4f to 0196b02 on August 13, 2024 12:14

This method was never called, so the bug was hidden.
+ Add a dedicated option to control whether the indexImplTable boundaries are included in the main table description. This should help minimize network traffic and IO CPU pool usage.
These are needed for a good-looking test. However, these functions may also be helpful to users: this is currently the only way (other than direct gRPC calls) to create a table with an index that has predefined split boundaries.
jepett0 force-pushed the IndexBackupRestore.1 branch from 33900bb to 5798b1c on August 16, 2024 20:36

github-actions bot commented Aug 20, 2024

2024-08-20 15:06:11 UTC Pre-commit check for 4b6555c has started.
2024-08-20 15:09:25 UTC Check linux-x86_64-release-clang14 is running...
🟢 2024-08-20 15:55:31 UTC Build successful.

jepett0 merged commit e56caba into ydb-platform:main on Aug 21, 2024
11 of 13 checks passed
stanislav-shchetinin pushed a commit to stanislav-shchetinin/ydb that referenced this pull request Aug 30, 2024
4 participants