release-23.1.9-rc: backupccl: during restore, do not .Next() any keys in makeSimpleImportSpans #109939

Merged

@blathers-crl blathers-crl bot commented Sep 1, 2023

Backport 1/1 commits from #109750 on behalf of @rhu713.

/cc @cockroachdb/release


Previously, if bulkio.restore.use_simple_import_spans was true during
restore, makeSimpleImportSpans called .Next() on all end keys of its input
file spans in order to handle the fact that these spans are end-key
inclusive. This resulted in some spans having start or end keys that are
not valid for splitting. This patch removes all .Next() calls in
makeSimpleImportSpans and instead handles the end-key-inclusive file spans
by tracking, as the covering is created, all files whose end keys are not
yet covered, and immediately adding these files to the next cover entry.

This fixes an issue where a split can be called on an invalid key that's in the
form of someValidKey.Next() during restore. These invalid keys will generally
end in a null byte (0x00), which will result in an error when calling
EnsureSafeSplits on this split key. Currently, errors from EnsureSafeSplits
are ignored, and thus a split will always be attempted on this type of invalid
split key. This split key can land in the middle of a row with column families,
and thus result in failing SQL queries when querying the restored table.

This patch adds some additional testing for backup manifest file entries with
zero-sized spans. Because .Next() was previously called on the end keys of all
file spans, there were no zero-sized spans, so backups with these types of
files were undertested.
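
The following sketch shows why a zero-sized span is meaningful when the end key is inclusive, and why widening the end key hid this case. The `span` type and both helper functions are hypothetical and use strings as keys.

```go
package main

import "fmt"

// span is a hypothetical key span; whether end is inclusive depends on which
// helper below is used to interpret it.
type span struct{ start, end string }

// containsInclusive treats the end key as part of the span.
func containsInclusive(s span, k string) bool { return s.start <= k && k <= s.end }

// containsExclusive treats the span as half-open: [start, end).
func containsExclusive(s span, k string) bool { return s.start <= k && k < s.end }

func main() {
	zero := span{"k", "k"}        // zero-sized file span holding the single key "k"
	widened := span{"k", "k\x00"} // the same span after appending 0x00 to its end key

	fmt.Println(containsInclusive(zero, "k"))    // true: the file still holds key "k"
	fmt.Println(containsExclusive(zero, "k"))    // false: as a half-open span it is empty
	fmt.Println(containsExclusive(widened, "k")) // true: widening made it non-empty
}
```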

Informs: #109483

Release note (bug fix): Fixes an issue where a split can be called on an
invalid key that's in the form of someValidKey.Next() during restore
with bulkio.restore.use_simple_import_spans=true. This
split key can land in the middle of a row with column families, and thus result
in failing SQL queries when querying the restored table.


Release justification: fixes a severe bug in a default-disabled codepath

@blathers-crl blathers-crl bot requested a review from a team as a code owner September 1, 2023 21:54
@blathers-crl blathers-crl bot force-pushed the blathers/backport-release-23.1.9-rc-109750 branch from 235b47f to cea3c5a Compare September 1, 2023 21:54
@blathers-crl blathers-crl bot requested review from rhu713 and removed request for a team September 1, 2023 21:54
@blathers-crl blathers-crl bot force-pushed the blathers/backport-release-23.1.9-rc-109750 branch from 4b39fb1 to 925dfd8 Compare September 1, 2023 21:54
@blathers-crl blathers-crl bot added blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot. labels Sep 1, 2023
blathers-crl bot commented Sep 1, 2023

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@blathers-crl blathers-crl bot requested a review from dt September 1, 2023 21:54
@blathers-crl blathers-crl bot added the backport Label PR's that are backports to older release branches label Sep 1, 2023
@blathers-crl blathers-crl bot requested a review from msbutler September 1, 2023 21:54
@cockroach-teamcity

This change is Reviewable

@rhu713 rhu713 merged commit 28b2e76 into release-23.1.9-rc Sep 1, 2023
@rhu713 rhu713 deleted the blathers/backport-release-23.1.9-rc-109750 branch September 1, 2023 22:46
Labels
backport Label PR's that are backports to older release branches blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot.

3 participants