JedisCluster retry caused java.lang.StackOverflowError #2173

@yangbodong22011

Description

Steps to reproduce:

  1. I connect to a Redis Cluster using the default JedisCluster configuration, and then the Redis Cluster machine goes down. After 5 attempts, JedisCluster fails with the error "No more cluster attempts left."
// JedisCluster code
JedisCluster jc = new JedisCluster(new HostAndPort("127.0.0.1", 30001));

// Exception stack
redis.clients.jedis.exceptions.JedisClusterMaxAttemptsException: No more cluster attempts left.
	at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:86)
	at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:124)
	at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:124)
	at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:124)
        ...
  2. I then increased maxAttempts, but got a java.lang.StackOverflowError because every retry is another recursive call to runWithRetries.
// JedisCluster code
JedisCluster jc = new JedisCluster(new HostAndPort("127.0.0.1", 30001), 2000, 999999999);

// Exception stack
Exception in thread "main" java.lang.StackOverflowError
	at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:153)
	at java.lang.StringCoding.decode(StringCoding.java:193)
	at java.lang.String.<init>(String.java:426)
	at java.lang.String.<init>(String.java:491)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at redis.clients.jedis.Connection.connect(Connection.java:181)
	at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:100)
	at redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:1894)
	at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:117)
	at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:889)
	at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:424)
	at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:349)
	at redis.clients.jedis.util.Pool.getResource(Pool.java:50)
	at redis.clients.jedis.JedisPool.getResource(JedisPool.java:234)
	at redis.clients.jedis.JedisSlotBasedConnectionHandler.getConnectionFromSlot(JedisSlotBasedConnectionHandler.java:78)
	at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:102)
	at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:124)
        ...
  3. I suggest adding a retryWaitTimeMillis option to JedisCluster that sets the interval between retries, for example:
public JedisCluster(HostAndPort node, int timeout, int maxAttempts, int retryWaitTimeMillis) {
  // constructor body omitted
}

} catch (JedisConnectionException jce) {
      // release current connection before recursion
      releaseConnection(connection);
      connection = null;

      if (attempts <= 1) {
        //We need this because if node is not reachable anymore - we need to finally initiate slots
        //renewing, or we can stuck with cluster state without one node in opposite case.
        //But now if maxAttempts = [1 or 2] we will do it too often.
        //TODO make tracking of successful/unsuccessful operations for node - do renewing only
        //if there were no successful responses from this node last few seconds
        this.connectionHandler.renewSlotCache();
      }
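      if (retryWaitTimeMillis > 0) {
        try {
          // wait before the next attempt
          Thread.sleep(retryWaitTimeMillis);
        } catch (InterruptedException ie) {
          // Thread.sleep throws a checked exception: restore the interrupt flag and stop retrying
          Thread.currentThread().interrupt();
          throw new JedisConnectionException("Interrupted while waiting to retry", ie);
        }
      }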
      return runWithRetries(slot, attempts - 1, tryRandomNode, redirect);
    }
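
To illustrate the difference between retrying by recursion and retrying in a loop, here is a minimal, self-contained sketch. It does not use Jedis at all: RetrySketch, ConnectionFailure, retryRecursively and retryIteratively are hypothetical names made up for this example, and ConnectionFailure only stands in for JedisConnectionException. As far as I can tell, a wait between retries slows the retries down but does not by itself reduce the recursion depth, so a loop may be the safer shape when maxAttempts is large.

```java
import java.util.function.Supplier;

// Hypothetical sketch, not Jedis code: compares recursive and iterative retries.
class RetrySketch {

    /** Hypothetical stand-in for JedisConnectionException. */
    static class ConnectionFailure extends RuntimeException {
        ConnectionFailure(String message) {
            super(message);
        }
    }

    /** Recursive retry: every failed attempt adds one more stack frame. */
    static <T> T retryRecursively(Supplier<T> action, int attempts) {
        if (attempts <= 0) {
            throw new IllegalStateException("No more attempts left.");
        }
        try {
            return action.get();
        } catch (ConnectionFailure e) {
            return retryRecursively(action, attempts - 1);
        }
    }

    /** Iterative retry: stack depth stays constant, with an optional wait between attempts. */
    static <T> T retryIteratively(Supplier<T> action, int maxAttempts, long retryWaitTimeMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (ConnectionFailure e) {
                // Only wait if another attempt will follow.
                if (attempt < maxAttempts && retryWaitTimeMillis > 0) {
                    try {
                        Thread.sleep(retryWaitTimeMillis);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and give up.
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("Interrupted while waiting to retry", ie);
                    }
                }
            }
        }
        throw new IllegalStateException("No more attempts left.");
    }

    public static void main(String[] args) {
        // Simulates a node that is always unreachable.
        Supplier<String> alwaysDown = () -> {
            throw new ConnectionFailure("node unreachable");
        };

        try {
            retryRecursively(alwaysDown, 999_999_999);
        } catch (StackOverflowError e) {
            System.out.println("recursive retry: " + e);
        }

        try {
            retryIteratively(alwaysDown, 100_000, 0);
        } catch (IllegalStateException e) {
            System.out.println("iterative retry: " + e.getMessage());
        }
    }
}
```

In this sketch the wait is skipped after the last attempt, so a caller with maxAttempts = 1 never sleeps.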

Any suggestions are welcome.
