Skip to content

Using Redis sentinel, some commands 'send' hangs #92

Open
@jgwinner

Description

@jgwinner

Describe the bug
When using Redis sentinel - 1 master, 2 duplicates, 3 sentinels - I can always connect to the (curent) writeable master.

However, if I 'stop' that service, the next command hangs.

This may not be a bug, maybe it's proper use of Sentinel, but if this is the case the examples and/or docs should probably be updated. I can do a PR if that's the case.

To Reproduce
Steps to reproduce the behavior:

  1. Implement Redis Sentinel
  2. Connect to the current master
  3. Issue an 'lpop' command:
	redisClient->lpop(GetQueue().toStdString(), [&webReply,&bError](cpp_redis::reply& charRequest) {
                      .... stuff ... 
		});
	redisClient->sync_commit(std::chrono::milliseconds(5000)); 
  1. It works fine.
  2. stop the current master. The slaves will fail over and promote a new master.
  3. The next 'lpop' command hangs down inside this routine:
client::send(const std::vector<std::string> &redis_cmd, const reply_callback_t &callback) {
	std::lock_guard<std::mutex> lock_callback(m_callbacks_mutex);
	__CPP_REDIS_LOG(info, "cpp_redis::client attempts to store new command in the send buffer");
	unprotected_send(redis_cmd, callback);
	__CPP_REDIS_LOG(info, "cpp_redis::client stored new command in the send buffer");

Expected behavior
I would think issuing any serial commands to a connection that can't take it should result in a tacopie or cpp_redis error being thrown.

Possible workaround
My routine that does the connect DOES immediately exits out:

	LOG_INFO << "Connecting to HA Redis servers with " << iSentinels << "sentinels";
	bool bDisconnect = false;
	redisClient->connect("ourcluster", [&bDisconnect](const std::string& host, std::size_t port, cpp_redis::connect_state status) {
		if (status == cpp_redis::connect_state::dropped) {
			LOG_INFO << "Client disconnected from " << host.c_str() << ":" << port;

The 'dropped' fires right off, so I could assign a bool value, and then in the outer loop, before every command check to see if we're disconnected - but that seems like a lot of potential boilerplate code.

I guess I could also just try to reconnect inside the 'connect' callback? Is that the right way to do it? The code example didn't show this

Desktop (please complete the following information):

  • OS: [Windows 10]
  • Browser [n/a]
  • Version [Current]

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions