CBG-5153: Implement resync using CBGT for a single node#8057
CBG-5153: Implement resync using CBGT for a single node#8057RIT3shSapata wants to merge 7 commits intomainfrom
Conversation
torcolvin
left a comment
There was a problem hiding this comment.
A few comments before I look more, I think the test is passing right now because it starts the non distributed resync.
torcolvin
left a comment
There was a problem hiding this comment.
I gave a non exhaustive review for any cases where import was used for code, but I don't think I found all cases. Definitely give a look through the code pathways to see if you can find other locations.
base/dcp_sharded.go
Outdated
| // StartShardedDCPFeed initializes and starts a CBGT Manager targeting the provided bucket. | ||
| // dbName is used to define a unique path name for local file storage of pindex files | ||
| func StartShardedDCPFeed(ctx context.Context, dbName string, configGroup string, uuid string, heartbeater Heartbeater, bucket Bucket, spec BucketSpec, scope string, collections []string, numPartitions uint16, cfg cbgt.Cfg) (*CbgtContext, error) { | ||
| func StartShardedDCPFeed(ctx context.Context, dbName string, configGroup string, uuid string, heartbeater Heartbeater, bucket Bucket, spec BucketSpec, scope string, collections []string, numPartitions uint16, cfg cbgt.Cfg, resyncIndex bool) (*CbgtContext, error) { |
There was a problem hiding this comment.
This could take DestType instead of a boolean parameter, which would be generally preferred since it would be more reasonable. The name DestType might not be appropriate, but it could be ShardedDCPFeedType instead?
It might be worth figuring out below what is actually needed to do to split on the feed type and whether to pass these arguments into this function or pass a type parameter. Review this comment after looking through the rest of the code and the comments.
|
|
||
| // Given a dbName, generate a unique and length-constrained index name for CBGT to use as part of their DCP name. | ||
| func GenerateIndexName(dbName string) string { | ||
| func GenerateIndexName(dbName string, feedID string) string { |
There was a problem hiding this comment.
We can pull this into a separate ticket but I think that this value needs to be <200 characters and we need to shorten this.
We can't break code that might work for import but as we make the index name longer, we want to make sure that this code works if there is a long database name.
sync_gateway/rest/config_manager.go
Line 832 in aa83a21
I think creating a separate ticket is a good plan for this.
There was a problem hiding this comment.
Like I mentioned in my comment earlier, I don't think it will ever exceed 34:
#8057 (comment)
we can pair up and discuss it further
| // the only index defined, and the name is safe. In that case, continue using legacy index name | ||
| // to avoid restarting the import processing from zero | ||
| func dcpSafeIndexName(ctx context.Context, c *CbgtContext, dbName string) (safeIndexName, previousUUID string) { | ||
| func dcpSafeIndexName(ctx context.Context, c *CbgtContext, dbName string, feedType ShardedDCPFeedType, feedID string) (safeIndexName, previousUUID string) { |
There was a problem hiding this comment.
I think we can pass feedType to both functions here, and through the callstack.
There was a problem hiding this comment.
I did not understand this, could you please elaborate more?
There was a problem hiding this comment.
Can you pass just feedType ShardedDCPFeedType? I think that the feedID should actually always be the same for resync, and doesn't need to be unique per run.
That was my mistake in the earlier specification.
|
|
||
| base.StoreDestFactory(loggingCtx, resyncDestKey, resyncDestFunc) | ||
|
|
||
| base.InfofCtx(loggingCtx, base.KeyJavascript, "ResyncID: %s Starting DCP resync for bucket: %q ", resyncLoggingID, base.UD(bucket.GetName())) |
There was a problem hiding this comment.
| base.InfofCtx(loggingCtx, base.KeyJavascript, "ResyncID: %s Starting DCP resync for bucket: %q ", resyncLoggingID, base.UD(bucket.GetName())) | |
| base.InfofCtx(loggingCtx, base.KeyAll, "Resync: Starting DCP resync for bucket: %q ", base.UD(bucket.GetName())) |
I believe that the loggingCtx should already contain the CorrelationID and so it doesn't need to be duplicated. You should make sure that is true.
There was a problem hiding this comment.
loggingCtx does have correlationID field but it is empty in this scenario. On checking the code, UserLogCtx does not populate the correlationID. So I think this is necessary here?
There was a problem hiding this comment.
Agree, this is out of scope for fix in this PR, let me see about fixing this separately. We should still change the log key to be KeyAll and Resync: is redundant.
| // format: _sync:{m_$}:cfg[groupID:] (collections) | ||
| // format: _sync:cfg:[groupID:] (default) |
There was a problem hiding this comment.
| // format: _sync:{m_$}:cfg[groupID:] (collections) | |
| // format: _sync:cfg:[groupID:] (default) | |
| // format: _sync:{m_$}:resync_cfg[groupID:] (collections) | |
| // format: _sync:resync_cfg:[groupID:] (default) |
CBG-5153
DRAFT PR
Describe your PR here...
Pre-review checklist
fmt.Print,log.Print, ...)base.UD(docID),base.MD(dbName))docs/apiDependencies (if applicable)
Integration Tests
GSI=true,xattrs=truehttps://jenkins.sgwdev.com/job/SyncGatewayIntegration/0000/