
pageserver: refine how we delete timelines after shard split #8436

Merged
merged 3 commits into from
Aug 2, 2024

Conversation

jcsp
Collaborator

@jcsp jcsp commented Jul 19, 2024

Problem

Previously, when we performed a timeline deletion, shards would delete layers that belong to an ancestor shard. That is not a correctness issue, because when we delete a timeline we always delete it from all shards, and destroying data for that timeline is clearly fine.

However, there exists a race where one shard might start this deletion while another shard has not yet received the deletion request and might still try to access an ancestral layer. This creates ambiguity around the "all layers referenced by my index should always exist" invariant, which is important for detecting and reporting corruption.

Now that we have a GC mode for cleaning up ancestral layers, we can rely on it to remove such layers rather than deleting them right away. This makes things easier to reason about: there are now no cases where a shard will delete a layer that belongs to a ShardIndex other than its own.
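The ownership rule described above can be sketched as a simple filter: when deleting, a shard keeps only the layers whose shard index matches its own and leaves ancestral layers for the scrubber's GC. The types `ShardIndex` and `LayerRef` below are illustrative stand-ins, not the actual pageserver types.

```rust
// Hypothetical sketch: `ShardIndex` and `LayerRef` are simplified
// stand-ins for the pageserver's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ShardIndex {
    shard_number: u8,
    shard_count: u8,
}

struct LayerRef {
    name: String,
    owner: ShardIndex, // the shard whose prefix this layer lives under
}

/// With this change, deletion only touches layers owned by the deleting
/// shard; layers owned by an ancestor ShardIndex are left for GC.
fn layers_to_delete(my_shard: ShardIndex, index_layers: &[LayerRef]) -> Vec<&LayerRef> {
    index_layers
        .iter()
        .filter(|layer| layer.owner == my_shard)
        .collect()
}
```

Because every shard applies the same filter, no shard ever removes a layer that another shard's index might still reference mid-deletion.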

Summary of changes

  • Modify behavior of RemoteTimelineClient::delete_all
  • Add test_scrubber_physical_gc_timeline_deletion to exercise this case
  • Tweak AWS SDK config in the scrubber to enable retries. Motivated by seeing the test for this feature encounter some transient "service error" S3 errors (which are probably nothing to do with the changes in this PR)
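The retry tweak in the last bullet can be sketched with the `aws-config` crate's standard retry mode. This is an assumption-laden fragment, not the PR's actual diff: the scrubber's real builder chain and attempt count may differ.

```rust
// Sketch only: assumes the aws-config crate's ConfigLoader API.
use aws_config::retry::RetryConfig;

async fn load_sdk_config() -> aws_config::SdkConfig {
    aws_config::from_env()
        // Enable the SDK's standard retry strategy so transient
        // "service error" responses from S3 are retried instead of
        // failing the scrubber run outright.
        .retry_config(RetryConfig::standard().with_max_attempts(3))
        .load()
        .await
}
```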

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@jcsp jcsp added c/storage/pageserver Component: storage: pageserver a/tech_debt Area: related to tech debt labels Jul 19, 2024

github-actions bot commented Jul 19, 2024

2104 tests run: 2037 passed, 0 failed, 67 skipped (full report)


Flaky tests (1)

Postgres 14

Code coverage* (full report)

  • functions: 32.8% (7122 of 21684 functions)
  • lines: 50.4% (57360 of 113737 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
b0fd1eb at 2024-08-01T18:32:13.796Z :recycle:

@jcsp jcsp force-pushed the jcsp/ancestral-gc-timeline-deletion branch 2 times, most recently from 789ada5 to 33b70dc Compare July 19, 2024 19:27
@jcsp jcsp force-pushed the jcsp/ancestral-gc-timeline-deletion branch from 33b70dc to b0fd1eb Compare August 1, 2024 17:37
@jcsp jcsp requested a review from skyzh August 1, 2024 19:37
@jcsp jcsp marked this pull request as ready for review August 1, 2024 19:37
@jcsp jcsp requested a review from a team as a code owner August 1, 2024 19:37
@jcsp jcsp merged commit c537990 into main Aug 2, 2024
64 checks passed
@jcsp jcsp deleted the jcsp/ancestral-gc-timeline-deletion branch August 2, 2024 07:00
arpad-m pushed a commit that referenced this pull request Aug 5, 2024