[BUG][Segment Replication] With Segment Replication enabled new Replica shards are falling behind Primary until an operation happens on index #5313

Closed
Rishikesh1159 opened this issue Nov 20, 2022 · 2 comments · Fixed by #5332
Labels: bug, distributed framework

Describe the bug
With segment replication enabled, when a new replica shard is recovered, created, or added to an existing cluster, it does not receive a checkpoint (the latest segments) from the primary until an operation is performed on the index. As a result, the replica falls behind until a new operation happens on the index.

Explanation:
-> In the ideal segment replication scenario, when a refresh happens on the index and a new reference is opened (which happens only after some operation on the index), the primary shard publishes a checkpoint to the replicas and sends segment files so that the replicas can catch up.

-> When new replica shards are added to an existing cluster, however, the replicas do not receive any checkpoint from the primary until an operation (index/update/delete) happens on the index. Even if we manually refresh the index, a new reference is not opened until such an operation happens, so a checkpoint is never published from the primary to the replica and the replica falls behind.
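
The behavior described above can be illustrated with a small toy model. This is only a sketch: ToyPrimary, ToyReplica, and the version counter are hypothetical names made up for illustration and are not OpenSearch classes. It assumes that a refresh publishes a checkpoint only when there were changes since the previous refresh.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: these classes are hypothetical and are not OpenSearch APIs. */
class ToyReplica {
    long segmentVersion = 0; // version of the segments this replica can search

    void onNewCheckpoint(long primaryVersion) {
        // In a real system the replica would fetch segment files here.
        segmentVersion = primaryVersion;
    }
}

class ToyPrimary {
    private final List<ToyReplica> replicas = new ArrayList<>();
    private long segmentVersion = 0;
    private boolean pendingChanges = false;

    void index(String doc) {
        pendingChanges = true; // an index/update/delete makes the next refresh meaningful
    }

    void addReplica(ToyReplica replica) {
        replicas.add(replica);
    }

    /** Returns true if a checkpoint was published to the replicas. */
    boolean refresh() {
        if (!pendingChanges) {
            return false; // nothing changed, so no new reference and no checkpoint
        }
        pendingChanges = false;
        segmentVersion++;
        for (ToyReplica replica : replicas) {
            replica.onNewCheckpoint(segmentVersion);
        }
        return true;
    }

    long segmentVersion() {
        return segmentVersion;
    }
}

class CheckpointGapDemo {
    public static void main(String[] args) {
        ToyPrimary primary = new ToyPrimary();
        primary.index("doc-1");
        primary.refresh(); // the primary advances before any replica exists

        ToyReplica replica = new ToyReplica();
        primary.addReplica(replica); // the replica joins after that refresh
        primary.refresh();           // manual refresh: no changes, no checkpoint
        System.out.println("replica behind: " + (replica.segmentVersion < primary.segmentVersion()));

        primary.index("doc-2");
        primary.refresh();           // an operation plus a refresh publishes a checkpoint
        System.out.println("replica caught up: " + (replica.segmentVersion == primary.segmentVersion()));
    }
}
```

In this model the replica stays behind after the manual refresh and only catches up once another document is indexed, which mirrors the reported symptom.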

To Reproduce
Steps to reproduce the behavior:

  1. Start a cluster and create a new index with a primary shard.
  2. Insert some documents into the index.
  3. Add a new replica shard to the existing cluster.
  4. Search on the new replica for the documents inserted in step 2.
  5. The search on the new replica returns no results, even though the documents were inserted successfully and are present on the primary.

Expected behavior
-> A search on the replica should return the documents that were successfully inserted earlier.

Expected Solution
-> With segment replication, when a new replica shard is added to an existing cluster, it goes through peer recovery and is finally marked as STARTED.
-> After peer recovery completes and before the shard is marked as STARTED, we have to force the new replica shard to start a round of segment replication to fetch the latest segment files from the primary shard. Only after this replication event completes should we mark the shard as STARTED.
-> This way the replica shard will have all the latest segment files before it is STARTED and ready to be searched.
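
The proposed ordering can be sketched as follows. All names here (SketchPrimaryShard, SketchReplicaShard, recoverThenStart) are hypothetical and exist only for this illustration; this is not the OpenSearch recovery code, and it assumes that peer recovery alone copies only the segments of the primary's safe commit.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical names throughout; this is not the OpenSearch recovery code. */
class SketchPrimaryShard {
    final Set<String> safeCommitSegments = new HashSet<>(); // what peer recovery copies
    final Set<String> latestSegments = new HashSet<>();     // what searches on the primary see
}

class SketchReplicaShard {
    enum State { RECOVERING, STARTED }

    final Set<String> segments = new HashSet<>();
    State state = State.RECOVERING;

    /** Peer recovery in this model copies only the primary's safe commit. */
    void peerRecover(SketchPrimaryShard primary) {
        segments.addAll(primary.safeCommitSegments);
    }

    /** One forced round of segment replication: copy whatever is still missing. */
    void forceReplication(SketchPrimaryShard primary) {
        segments.addAll(primary.latestSegments);
    }

    /** Proposed ordering: recover, force a replication round, then mark STARTED. */
    void recoverThenStart(SketchPrimaryShard primary, boolean segRepEnabled) {
        peerRecover(primary);
        if (segRepEnabled) {
            forceReplication(primary);
        }
        state = State.STARTED; // only now is the shard searchable
    }
}

class ForcedReplicationDemo {
    public static void main(String[] args) {
        SketchPrimaryShard primary = new SketchPrimaryShard();
        primary.safeCommitSegments.add("_0");
        primary.latestSegments.add("_0");
        primary.latestSegments.add("_1"); // created after the safe commit

        SketchReplicaShard replica = new SketchReplicaShard();
        replica.recoverThenStart(primary, true);
        System.out.println("replica has latest segments: "
            + replica.segments.equals(primary.latestSegments));
    }
}
```

Calling recoverThenStart with segRepEnabled set to false reproduces the bug in this model: the shard is STARTED while still missing the segment created after the safe commit.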


mch2 commented Nov 21, 2022

@Rishikesh1159 This is because phase 1 of recovery copies from the primary's latest safe commit. Any segments created after that safe commit will not be copied with recovery & will be copied on the first replication event after the replica is started.

I think it's reasonable to force a round of replication here so we are not dependent on the primary receiving consistent index load and refreshing. I think we could do this by triggering a round of segrep when RecoveryListener resolves, before IndicesClusterStateService marks the shard as active.

Rishikesh1159 commented Nov 22, 2022

Thanks @mch2, yes, what you said is correct. Forcing a round of replication while recovering/creating a new replica shard makes sense and would solve this bug. I see two possible solutions to force segment replication during recovery:

  1. As you mentioned, we can use recoveryListener to trigger a replication event. I implemented this solution in PR: [Segment Replication] Trigger a round of replication for replica shards during peer recovery when segment replication is enabled #5332
  2. Another possible way is to trigger a publish checkpoint from the primary when the finalize recovery step of the replica shard is completed. We can add the code block below after this line in RecoverySourceHandler:
if (shard.indexSettings().isSegRepEnabled() && request.isPrimaryRelocation() == false) {
    shard.sendCheckpoint(shard);
}

and add the following method to IndexShard:

public void sendCheckpoint(IndexShard recoverySource) {
    this.checkpointPublisher.publish(recoverySource);
}

For now I am going with solution 1 which you suggested. If needed we can discuss solution 2 and use it instead.
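
For comparison, the guard in solution 2 can be tried in isolation with a standalone sketch. DemoCheckpointPublisher, DemoRecoverySource, and afterFinalizeRecovery are hypothetical stand-ins invented for this example; they are not the OpenSearch classes, and the shard is represented by a plain string ID.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a checkpoint publisher; not the OpenSearch class. */
interface DemoCheckpointPublisher {
    void publish(String shardId);
}

/** Hypothetical stand-in for the recovery source side on the primary. */
class DemoRecoverySource {
    private final boolean segRepEnabled;
    private final DemoCheckpointPublisher publisher;

    DemoRecoverySource(boolean segRepEnabled, DemoCheckpointPublisher publisher) {
        this.segRepEnabled = segRepEnabled;
        this.publisher = publisher;
    }

    /** Called once the finalize-recovery step has completed for a target shard. */
    void afterFinalizeRecovery(String shardId, boolean isPrimaryRelocation) {
        // Same guard as in solution 2: only with segment replication enabled,
        // and not when the recovery is a primary relocation.
        if (segRepEnabled && !isPrimaryRelocation) {
            publisher.publish(shardId);
        }
    }
}

class DemoRecoverySourceMain {
    public static void main(String[] args) {
        List<String> published = new ArrayList<>();
        DemoRecoverySource source = new DemoRecoverySource(true, published::add);

        source.afterFinalizeRecovery("shard-0", false); // new replica: publishes
        source.afterFinalizeRecovery("shard-0", true);  // primary relocation: skipped
        System.out.println("checkpoints published: " + published.size());
    }
}
```

One difference worth noting is where the work starts: here the primary pushes a checkpoint at the end of recovery, whereas solution 1 has the recovering replica trigger the replication round itself.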
