Fix possible endless wait in stop() after AUTH_FAILED error
On AUTH_FAILED, the zk-loop thread calls client._session_callback(),
which resets the queue.

However, another thread can add a CloseInstance event to this queue, and
if the reset in _session_callback() happens after CloseInstance was
added, the event is discarded and stop() never returns (while zk-loop
spins endlessly).

Here is what it looks like with additional logging:

    39: [ Thread-3 (zk_loop) ] INFO: client.py:568: _session_callback: Zookeeper session closed, state: AUTH_FAILED
    39: [ MainThread ] Level 5: client.py:721: stop: Sending CloseInstance
    39: [ Thread-3 (zk_loop) ] Level 5: client.py:403: _reset: Reseting the client
    39: [ Thread-3 (zk_loop) ] Level 5: connection.py:625: _connect_attempt: Connecting
    39: [ Thread-3 (zk_loop) ] Level 5: connection.py:625: _connect_attempt: Connecting

You can find details in this gist [1].

  [1]: https://gist.github.com/azat/bc7aaea1c32a4f1ea75ad646d26280e9
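
The ordering problem described in the commit message can be reproduced with a small toy model. This is only a sketch: `ToyClient`, `CLOSE`, and the two method names are hypothetical stand-ins, not kazoo's real classes, and the two threads are replaced by an explicit call order so the outcome is deterministic.

```python
from collections import deque

# Hypothetical stand-in for the CloseInstance marker that stop() enqueues.
CLOSE = object()


class ToyClient:
    """Toy model of the client's request queue (not kazoo's real class)."""

    def __init__(self):
        self.queue = deque()

    def stop_request(self):
        # Caller's thread: stop() enqueues the close marker.
        self.queue.append(CLOSE)

    def session_callback_auth_failed(self):
        # zk-loop thread: on AUTH_FAILED the pending queue is reset.
        self.queue.clear()


def close_survives(order):
    """Run the steps in the given order and report whether CLOSE is still queued."""
    client = ToyClient()
    for step in order:
        getattr(client, step)()
    return CLOSE in client.queue


# Reset first, then stop(): the close marker is still there to be processed.
safe = close_survives(["session_callback_auth_failed", "stop_request"])
# stop() first, then reset: the close marker is lost.
racy = close_survives(["stop_request", "session_callback_auth_failed"])
print(safe, racy)  # prints: True False
```

In the second ordering nothing is left in the queue to tell the loop to exit, which matches the repeated "Connecting" lines in the log above.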
azat committed Feb 8, 2023
1 parent 92b071d commit cd4a2b2
Showing 2 changed files with 3 additions and 1 deletion.
2 changes: 1 addition & 1 deletion kazoo/protocol/connection.py
@@ -619,7 +619,7 @@ def _connect_attempt(self, host, hostip, port, retry):
             self.ping_outstanding.clear()
             last_send = time.time()
             with self._socket_error_handling():
-                while True:
+                while not self.client._stopped.is_set():
                     # Watch for something to read or send
                     jitter_time = random.randint(1, 40) / 100.0
                     deadline = last_send + read_timeout / 2.0 - jitter_time
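
The effect of the new loop condition can be sketched in isolation. The example assumes, as the patch suggests, that the stop flag behaves like a `threading.Event`; `connection_loop` and `poll_interval` are hypothetical names, and the sleep stands in for the real socket wait.

```python
import threading
import time


def connection_loop(stopped, poll_interval=0.01):
    """Hypothetical stand-in for the loop in _connect_attempt."""
    iterations = 0
    # Re-check the stop flag on every pass instead of looping forever,
    # so the loop exits even if no close marker ever arrives in the queue.
    while not stopped.is_set():
        time.sleep(poll_interval)  # placeholder for waiting on the socket
        iterations += 1
    return iterations


stopped = threading.Event()
result = []
worker = threading.Thread(
    target=lambda: result.append(connection_loop(stopped)),
    daemon=True,  # never keep the interpreter alive if something goes wrong
)
worker.start()

stopped.set()           # what a stop() call would do in this sketch
worker.join(timeout=5)  # bounded wait, so a regression cannot hang the caller
loop_exited = not worker.is_alive()
print(loop_exited)
```

With `while True:` in place of the flag check, the same worker would only exit if the loop body itself broke out, which is the case the lost CloseInstance event prevents.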
2 changes: 2 additions & 0 deletions kazoo/tests/test_client.py
@@ -256,6 +256,8 @@ def test_async_auth_failure(self):
         with pytest.raises(AuthFailedError):
             client.add_auth("unknown-scheme", digest_auth)

+        client.stop()
+
     def test_add_auth_on_reconnect(self):
         client = self._get_client()
         client.start()