-
Notifications
You must be signed in to change notification settings - Fork 140
CBG-3690 don't re-read document for resync #7892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request optimizes the document resync operation by eliminating unnecessary document re-reads from the bucket. The key changes improve performance by passing the document data directly from DCP feed events to the resync function, and add support for creating HLV (Hybrid Logical Vector) metadata during resync operations.
Key Changes
- Modified
resyncDocumentto accept a*sgbucket.BucketDocumentparameter instead of re-reading from the bucket - Added HLV creation logic to the resync path for documents that don't have HLV
- Removed the non-xattr code pathway from resync operations
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| db/database.go | Updated resyncDocument signature to accept bucket document directly; added HLV update logic during resync |
| db/background_mgr_resync_dcp.go | Modified DCP callback to convert feed events to bucket documents and pass to resync |
| db/crud.go | Extended HLV update logic to handle Resync case alongside Import |
| db/document.go | Added bucketDocumentFromFeed and getBucketDocument helper functions; removed unused UnmarshalDocumentFromFeed |
| db/document_test.go | Added test for getBucketDocument function |
| db/database_test.go | Enhanced resync tests to verify HLV creation and metadata-only updates |
| base/constants.go | Added VirtualExpiry constant for fetching document expiry |
|
To consider: Is the latest set of changes better than https://github.com/couchbase/sync_gateway/pull/7892/files/eb809bf36ded144a008de3741c629a11a999d827 ? Does it add more risk to the ticket to make it not worthwhile? The idea behind the latest set of changes is to avoid having to set The refactor exposes something that is tricky which is that I thought about if the right thing to do is to add |
5081439 to
b85e507
Compare
This entire comment is obviated since it was determined to always update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
8f27379 to
f54a069
Compare
adamcfraser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good, a few questions on the details.
db/document.go
Outdated
| }, nil | ||
| } | ||
|
|
||
| func getBucketDocument(ctx context.Context, collection *DatabaseCollection, docID string) (*sgbucket.BucketDocument, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a test-only function at this point in time? I was going to suggest that import could rely on preserveExpiry at this point, but it looks like this function doesn't have any non-test usage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I moved this to test only code, I was going to just leave as is.
db/database.go
Outdated
| } | ||
| }) | ||
| if rawGlobalXattr != nil { | ||
| updatedDoc.Xattrs[base.GlobalXattrName] = rawGlobalXattr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be removed, since resync won't be adding/removing attachments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup
| Xattrs: xattrs, | ||
| Cas: event.Cas, | ||
| Expiry: event.Expiry, | ||
| IsTombstone: event.Opcode == sgbucket.FeedOpDeletion, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this handle all tombstone cases? e.g. if we modify a system xattr on a document that's already a tombstone, will that be sent as FeedOpDeletion or FeedOpMutation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, any tombstone mutation will show up as FeedOpDeletion.
db/database.go
Outdated
| if updatedExpiry != nil { | ||
| updatedDoc.UpdateExpiry(*updatedExpiry) | ||
| } | ||
| updatedDoc, shouldUpdate, updatedExpiry, _, updatedUnusedSequences, err = db.getResyncedDocument(ctx, doc, regenerateSequences, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider removing highSeq from getResyncedDocument since it doesn't appear to be used? Or maybe a follow-up PR to remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did decide to remove this.
db/database.go
Outdated
| }, | ||
| Expiry: updatedExpiry, | ||
| } | ||
| if doc.MetadataOnlyUpdate != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should line 1861 (adding the mou xattr) be inside this if clause? Or is there a scenario where rawMouXattr is non-nil, but doc.MetadataOnlyUpdate is nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This if statement doesn't need to exist anymore, it did for <4.0 since Sync Gateway didn't write _mou unless CBS was >=7.6
CBG-3690 don't re-read document for resync
Pre-review checklist
fmt.Print,log.Print, ...)base.UD(docID),base.MD(dbName))docs/apiIntegration Tests
GSI=true,xattrs=truehttps://jenkins.sgwdev.com/job/SyncGatewayIntegration/171/ (unrelated flake in TestRemoveIndexesUseViewsTrueAndFalse)