Check Failing() before serving random node #1825

Merged: 4 commits, Jul 17, 2021

@AnatolyRugalev AnatolyRugalev commented Jul 16, 2021

We encounter errors during Redis failover when RouteRandomly is set to true.

After some debugging, we found that Failing() is never checked when picking a random node to serve a slot.

This should be pretty easy to reproduce:

  1. Start a cluster using this docker-compose service:

       redis-cluster:
         image: grokzen/redis-cluster:6.0.9
         container_name: redis-cluster.go-redis-test
         environment:
           - "IP=0.0.0.0"
         ports:
           - 7000-7005:7000-7005

  2. Connect to the cluster and perform read/write operations.
  3. Stop one of the nodes: docker-compose exec redis-cluster supervisorctl stop redis-2

cluster.go Outdated
            return nodes[idx], nil
        }
    }
    return c.nodes.Random()
Collaborator

Nice catch - thanks 👍

Does it make sense to try a random node here? It will result in a MOVED error, and the command will then be retried on the same failing nodes.

I think we should:

  • remove c.nodes.Random() and just use the first (or a random) failing node, as we did previously
  • add an optimization for len(nodes) == 1, since rand.Perm is somewhat expensive:

if len(nodes) == 1 {
    return nodes[0], nil
}
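
The two suggestions above can be sketched as a small standalone program. This is only an illustration, not the code that was merged: the `node` type and the `pickSlotNode` function are hypothetical stand-ins for the client's internal types, and returning an error for an empty node list is an assumption made to keep the example self-contained.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// node is a hypothetical stand-in for the client's cluster node type.
type node struct {
	addr    string
	failing bool
}

// Failing reports whether the node is currently marked as failing.
func (n *node) Failing() bool { return n.failing }

// pickSlotNode returns a random healthy node from the nodes serving a slot.
// If every node is failing, it falls back to a random node from the same
// list instead of an unrelated cluster node.
func pickSlotNode(nodes []*node) (*node, error) {
	switch len(nodes) {
	case 0:
		// Assumption for this sketch: report an error when no node serves the slot.
		return nil, errors.New("no nodes serve this slot")
	case 1:
		// Fast path: skip rand.Perm when there is nothing to choose from.
		return nodes[0], nil
	}

	perm := rand.Perm(len(nodes))
	for _, idx := range perm {
		if n := nodes[idx]; !n.Failing() {
			return n, nil
		}
	}
	// All nodes are failing: stay within the slot's own nodes, since a node
	// outside the slot would only answer with MOVED.
	return nodes[perm[0]], nil
}

func main() {
	nodes := []*node{
		{addr: "127.0.0.1:7000", failing: true},
		{addr: "127.0.0.1:7003", failing: false},
	}
	n, err := pickSlotNode(nodes)
	if err != nil {
		panic(err)
	}
	fmt.Println("selected:", n.addr)
}
```

With one failing and one healthy node, the healthy node is selected regardless of the permutation order; the failing node is returned only when no healthy candidate exists.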

cluster.go Outdated
- n := rand.Intn(len(nodes))
- return nodes[n], nil
+ for _, idx := range rand.Perm(len(nodes)) {
+     if !nodes[idx].Failing() {
Collaborator

Slightly better:

if node := nodes[idx]; !node.Failing() {
    return node, nil
}

@monkey92t (Collaborator)

Thanks!

@monkey92t monkey92t merged commit 62fc2c8 into redis:master Jul 17, 2021
@monkey92t (Collaborator)

There are still some other issues that need to be resolved, we will release a new version later.
