Retry Mechanism Fails When Redis Container is Paused #3555

### Expected behavior

When the Redis container is **paused** (not stopped), the connection attempt should fail, triggering the retry mechanism. The retry number should increase monotonically until the specified maximum number of retries is reached.

#### Example logs of expected behavior:

```logs
INFO - Attempt 1/5. Backing off for 0.5 seconds
INFO - Attempt 2/5. Backing off for 1.0 seconds
INFO - Attempt 3/5. Backing off for 2.0 seconds
INFO - Attempt 4/5. Backing off for 4.0 seconds
```

The log statement that produces these lines was added at the end of the following `except` block:
https://github.com/redis/redis-py/blob/ea01a303ab54e9698689b796dc5c67644425ba51/redis/retry.py#L60-L70
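
As a rough illustration of that instrumentation, the self-contained sketch below mirrors the shape of the linked retry loop and logs each attempt. The function name `call_with_retry_logged`, the injectable `sleep` parameter, and the doubling backoff starting at 0.5 seconds are assumptions chosen to match the example logs; this is not redis-py's actual `Retry` or backoff implementation.

```python
import logging
import time

logger = logging.getLogger("retry_sketch")


def call_with_retry_logged(do, retries, base=0.5, sleep=time.sleep,
                           supported_errors=(ConnectionError, TimeoutError)):
    """Call do() until it succeeds or more than `retries` failures occur.

    Hypothetical stand-in for the linked loop; the backoff doubles from `base`.
    """
    failures = 0
    while True:
        try:
            return do()
        except supported_errors as error:
            failures += 1
            if retries >= 0 and failures > retries:
                raise error
            # Assumed backoff schedule: 0.5, 1.0, 2.0, 4.0, ...
            backoff = base * 2 ** (failures - 1)
            if backoff > 0:
                sleep(backoff)
            # Log at the end of the except block, as in the report.
            logger.info("Attempt %d/%d. Backing off for %.1f seconds",
                        failures, retries, backoff)
```

With an operation that always fails, the failure counter should climb on every pass through the loop, which is the behavior the expected logs describe.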


### Actual behavior

Instead of progressing through the retry attempts, the retry mechanism gets stuck at the first attempt, repeating indefinitely.

#### Example logs of actual behavior:
```logs
INFO - Attempt 1/5. Backing off for 0.5 seconds
INFO - Attempt 1/5. Backing off for 0.5 seconds
INFO - Attempt 1/5. Backing off for 0.5 seconds
INFO - Attempt 1/5. Backing off for 0.5 seconds
```

### Root Cause

The issue occurs because `sock.connect` (line 575 of the `_connect` method) succeeds even when the container is paused: the TCP connection is established, but nothing on the other side answers. Subsequent read operations then fail with `Timeout`.
https://github.com/redis/redis-py/blob/ea01a303ab54e9698689b796dc5c67644425ba51/redis/connection.py#L728-L763 
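
The effect can be approximated without Docker. In the sketch below, a local listening socket that never calls `accept()` stands in for the paused container; this is only an analogy, and the helper name `probe_unresponsive_listener` is hypothetical. On typical platforms the kernel completes the TCP handshake from the listen backlog, so `connect()` returns normally, while the read that follows times out.

```python
import socket


def probe_unresponsive_listener(read_timeout=0.2):
    """Return (connected, read_timed_out) against a listener that never replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)               # listening, but accept() is never called

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    connected = False
    read_timed_out = False
    try:
        client.settimeout(1.0)
        client.connect(server.getsockname())  # succeeds via the listen backlog
        connected = True

        client.settimeout(read_timeout)
        client.sendall(b"*1\r\n$4\r\nPING\r\n")  # RESP-encoded PING
        try:
            client.recv(7)
        except socket.timeout:
            # Nobody is processing the request, so the read times out.
            read_timed_out = True
    finally:
        client.close()
        server.close()
    return connected, read_timed_out
```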

### Possible solution

To properly detect when the connection is truly established, we can send a `PING` command immediately after `connect()` and verify the response.
Add the following after `sock.connect()` to ensure the connection is functional:
```python
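# Send a RESP-encoded PING on the freshly connected socket.
ping_parts = self._command_packer.pack("PING")
for part in ping_parts:
    sock.sendall(part)

# Read the reply once, after all parts are sent ("+PONG\r\n" is 7 bytes).
response = sock.recv(7)

if not str_if_bytes(response).startswith("+PONG"):
    raise OSError(f"Redis handshake failed: unexpected response {response!r}")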
```
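
For experimenting outside of redis-py, the same check can be written as a standalone helper. This is a minimal sketch under stated assumptions: `verify_ping` is a hypothetical name, the PING is hand-encoded in RESP instead of going through `_command_packer`, and the reply is read until the line terminator because a single `recv(7)` may return fewer bytes. A server that requires authentication would answer with an error instead of `+PONG`, which this sketch treats as a failure.

```python
import socket

PING = b"*1\r\n$4\r\nPING\r\n"  # RESP encoding of the PING command


def verify_ping(sock, timeout=1.0):
    """Send PING and raise OSError unless the peer answers +PONG in time."""
    previous_timeout = sock.gettimeout()
    sock.settimeout(timeout)
    try:
        sock.sendall(PING)
        response = b""
        # Read until the CRLF that ends a simple RESP reply.
        while not response.endswith(b"\r\n"):
            chunk = sock.recv(64)  # raises socket.timeout if the peer is silent
            if not chunk:
                raise OSError("connection closed before PING reply")
            response += chunk
        if not response.startswith(b"+PONG"):
            raise OSError(
                f"Redis handshake failed: unexpected response {response!r}"
            )
    finally:
        # Restore whatever timeout the caller had configured.
        sock.settimeout(previous_timeout)
```

Because `socket.timeout` is a subclass of `OSError`, a silent peer surfaces as the same exception family that `_connect` already catches.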

### Additional Comments

- There may be a better way to handle the read operation for the `PING` response using existing methods, but calling `_send_ping` directly does not work in this case.

- This issue also affects the **asynchronous version** of `redis-py`.
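
For the asynchronous case, an equivalent check might look like the sketch below. It uses plain `asyncio` streams and not redis-py's async connection classes; `verify_ping_async` is a hypothetical name, and the timeout value is an assumption.

```python
import asyncio


async def verify_ping_async(reader, writer, timeout=1.0):
    """Send PING over asyncio streams and require a +PONG reply in time."""
    writer.write(b"*1\r\n$4\r\nPING\r\n")  # RESP-encoded PING
    await writer.drain()
    # wait_for raises asyncio.TimeoutError when the peer stays silent.
    response = await asyncio.wait_for(reader.readline(), timeout)
    if not response.startswith(b"+PONG"):
        raise OSError(
            f"Redis handshake failed: unexpected response {response!r}"
        )
```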

-----

Let me know if you'd like a clearer example to reproduce the behavior.

Referenced `redis/retry.py` snippet (lines 60-70 at the linked commit):

    while True:
        try:
            return do()
        except self._supported_errors as error:
            failures += 1
            fail(error)
            if self._retries >= 0 and failures > self._retries:
                raise error
            backoff = self._backoff.compute(failures)
            if backoff > 0:
                sleep(backoff)

Referenced `_connect` method from `redis/connection.py` (lines 728-763 at the linked commit):

    def _connect(self):
        "Create a TCP socket connection"
        # we want to mimic what socket.create_connection does to support
        # ipv4/ipv6, but we want to set options prior to calling
        # socket.connect()
        err = None
        for res in socket.getaddrinfo(
            self.host, self.port, self.socket_type, socket.SOCK_STREAM
        ):
            family, socktype, proto, canonname, socket_address = res
            sock = None
            try:
                sock = socket.socket(family, socktype, proto)
                # TCP_NODELAY
                sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

                # TCP_KEEPALIVE
                if self.socket_keepalive:
                    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
                    for k, v in self.socket_keepalive_options.items():
                        sock.setsockopt(socket.IPPROTO_TCP, k, v)

                # set the socket_connect_timeout before we connect
                sock.settimeout(self.socket_connect_timeout)

                # connect
                sock.connect(socket_address)

                # set the socket_timeout now that we're connected
                sock.settimeout(self.socket_timeout)
                return sock

            except OSError as _:
                err = _
                if sock is not None:
                    sock.close()
