-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
catalog/lease: purge old leases after getting the initial version #139589
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @fqazi)
pkg/sql/catalog/lease/lease.go
line 1517 at r1 (raw file):
if durationUntilExpiry < m.storage.leaseRenewalTimeout() { if err := t.maybeQueueLeaseRenewal(ctx, m, id, desc.GetName()); err != nil { desc.Release(ctx)
nit: could we have this be a step in maybeQueueLeaseRenewal
instead? something like this:
func (t *descriptorState) maybeQueueLeaseRenewal(
ctx context.Context, m *Manager, desc *descriptorVersionState,
) (retErr error) {
if !atomic.CompareAndSwapInt32(&t.renewalInProgress, 0, 1) {
return nil
}
defer func() {
if retErr != nil {
desc.Release(ctx)
}
}()
// Start the renewal. When it finishes, it will reset t.renewalInProgress.
newCtx := m.ambientCtx.AnnotateCtx(context.Background())
// AddTags and not WithTags, so that we combine the tags with those
// filled by AnnotateCtx.
newCtx = logtags.AddTags(newCtx, logtags.FromContext(ctx))
return t.stopper.RunAsyncTask(newCtx,
"lease renewal", func(ctx context.Context) {
t.startLeaseRenewal(ctx, m, desc.GetID(), desc.GetName())
})
}
pkg/sql/catalog/lease/lease.go
line 1703 at r1 (raw file):
// descriptor versions, which could have been acquired concurrently. // For example the range feed sees version 2 and a query concurrently // acquires version 1.
nice catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)
pkg/sql/catalog/lease/lease.go
line 1517 at r1 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: could we have this be a step in
maybeQueueLeaseRenewal
instead? something like this:func (t *descriptorState) maybeQueueLeaseRenewal( ctx context.Context, m *Manager, desc *descriptorVersionState, ) (retErr error) { if !atomic.CompareAndSwapInt32(&t.renewalInProgress, 0, 1) { return nil } defer func() { if retErr != nil { desc.Release(ctx) } }() // Start the renewal. When it finishes, it will reset t.renewalInProgress. newCtx := m.ambientCtx.AnnotateCtx(context.Background()) // AddTags and not WithTags, so that we combine the tags with those // filled by AnnotateCtx. newCtx = logtags.AddTags(newCtx, logtags.FromContext(ctx)) return t.stopper.RunAsyncTask(newCtx, "lease renewal", func(ctx context.Context) { t.startLeaseRenewal(ctx, m, desc.GetID(), desc.GetName()) }) }
Done.
Great idea!
pkg/sql/catalog/lease/lease.go
line 1703 at r1 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nice catch!
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @fqazi)
pkg/sql/catalog/lease/lease.go
line 1349 at r2 (raw file):
if err := t.maybeQueueLeaseRenewal( ctx, m, descVersion); err != nil { descVersion.Release(ctx)
nit: remove the Release
here now
pkg/sql/catalog/lease/lease.go
line 1517 at r2 (raw file):
if durationUntilExpiry < m.storage.leaseRenewalTimeout() { if err := t.maybeQueueLeaseRenewal(ctx, m, desc); err != nil { desc.Release(ctx)
nit: remove the Release
here now
Previously, when the initial version of a descriptor was being acquired by the range feed, that an executing query could fetch an older version. The initial version fetched will normally be the latest observed version by the range feed as a minimum. When this occurred the queries previous version could linger, since the initial version acquisition never purges old versions. This could cause WaitForOneVersion checks to be stuck forever, since the old version could only be purged on access. To address this, this patch modifies the initial version logic to always issue a purge after, which will guarantee that any previous version is released. This patch also fixes a few other minor cases where leases could leak, but these should eventually clean up based on the lease expiry when a new version arrives. Fixes: cockroachdb#139168 Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)
pkg/sql/catalog/lease/lease.go
line 1349 at r2 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: remove the
Release
here now
Done.
@rafiss TFTR! bors r+ |
@fqazi thanks for making this change! I've got 2 questions:
|
blathers backport 25.1 |
Sadly this was more a race condition with concurrent lease acquisitions, which makes it tricky to add unit test coverage. I'll add a concurrent acquisition test for creation scenarios, but it will still have a one in N chance of catching it.
Yes, done. |
Previously, when the initial version of a descriptor was being acquired by the range feed, that an executing query could fetch an older version. The initial version fetched will normally be the latest observed version by the range feed as a minimum. When this occurred the queries previous version could linger, since the initial version acquisition never purges old versions. This could cause WaitForOneVersion checks to be stuck forever, since the old version could only be purged on access. To address this, this patch modifies the initial version logic to always issue a purge after, which will guarantee that any previous version is released. This patch also fixes a few other minor cases where leases could leak, but these should eventually clean up based on the lease expiry when a new version arrives.
Fixes: #139168
Release note: None