
Question: replication redundancy #121

Closed
ridingrails opened this issue Jan 11, 2017 · 12 comments

@ridingrails

Great tool! I have a question about redundancy. When you set a follower for a master, what happens if the master host terminates? Is there any type of leader election among followers currently? Or do all the followers need to be programmed to follow a new master?

ridingrails changed the title from "Replication redundancy" to "Question: replication redundancy" on Jan 11, 2017
@tidwall (Owner) commented Jan 11, 2017

Hi @ridingrails,

In short, it would need to be programmed.

Replication in Tile38 is very simple. A follower is just a readonly mirror of the leader. When new write commands are sent to a leader, the leader will forward the commands to each follower. The followers can serve readonly commands like GET and SEARCH, but cannot handle write commands like SET and DEL.
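
For illustration, that read/write split looks roughly like this across two tile38-cli sessions (the ports, key, and ID here are just examples, and the responses are abbreviated):

127.0.0.1:9851> SET fleet truck1 POINT 33.462 -112.268
{"ok":true,"elapsed":"..."}
127.0.0.1:9852> GET fleet truck1 POINT
{"ok":true,"point":{"lat":33.462,"lon":-112.268},"elapsed":"..."}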

When a leader terminates with active followers, the followers simply wait until the leader returns. There are no elections, and it's not a true cluster in the way Raft provides.

To make it more redundant, there would need to be a sentinel-like service that watches the leader and knows about the followers. Then, when the leader fails, the service would switch a follower to the leader state by issuing a FOLLOW no one command.
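
For illustration only, the failover such a service performs could be sketched as two tile38-cli commands (the hosts and ports here are hypothetical):

127.0.0.1:9852> FOLLOW no one
127.0.0.1:9853> FOLLOW 127.0.0.1 9852

The first command promotes one follower to leader; the second repoints any remaining follower at it.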

I hope this helps answer your question.

@ridingrails (Author)

Yes it does, thanks Josh!

@octete commented Aug 4, 2017

Hi @tidwall

Regarding this, I have noticed that when the master dies, the followers don't respond at all. Is this by design?
i.e.

127.0.0.1:9852> NEARBY fleet4 POINT 33.462 -112.268 3000
{"ok":true,"objects":[],"count":0,"cursor":0,"elapsed":"83.569µs"}
127.0.0.1:9852> NEARBY fleet4 POINT 33.462 -112.268 3000
(error) catching up to leader
127.0.0.1:9852> NEARBY fleet4 POINT 33.462 -112.268 3000
(error) catching up to leader
127.0.0.1:9852> NEARBY fleet4 POINT 33.462 -112.268 3000
(error) catching up to leader
127.0.0.1:9852> NEARBY fleet4 POINT 33.462 -112.268 3000
(error) catching up to leader

This is what happens when the master dies.
I was thinking that it would be desirable to have the followers reply to queries whilst the master is offline.

@tidwall (Owner) commented Aug 4, 2017

Hi @octete,

I agree and I just pushed an update to the master branch which changes the behavior. Now a follower only needs to catch up with the leader one time to begin accepting reads.

This should address the case where a leader goes down: the follower will continue responding to reads.

If the leader dies and the follower server is restarted, the catching up to leader error will occur until the follower resyncs with the leader, or until the follower stops following via a FOLLOW no one command.
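
As a sketch, the recovery for that restarted-follower case looks like this (reusing the session above; responses abbreviated):

127.0.0.1:9852> NEARBY fleet4 POINT 33.462 -112.268 3000
(error) catching up to leader
127.0.0.1:9852> FOLLOW no one
{"ok":true,"elapsed":"..."}
127.0.0.1:9852> NEARBY fleet4 POINT 33.462 -112.268 3000
{"ok":true,"objects":[],"count":0,"cursor":0,"elapsed":"..."}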

Let me know if this fixes the issue or if you have further questions.

@octete commented Aug 4, 2017

Hi @tidwall

That's enough for now! Thanks for the prompt response. 😁

@jbfarez commented Jun 14, 2018

Hi @tidwall, sorry for "reopening" this case, but I was just wondering if there is a way to "transform/promote" a follower to a leader?
If not, do you plan to design a high-availability mechanism for Tile38?

@tidwall (Owner) commented Jun 14, 2018

Hi @jbfarez,

It's possible to promote the follower to leader by sending the follower a FOLLOW no one command, and then to demote the old leader by sending it FOLLOW host port, pointing at the new leader.

Right now this is a manual step (or perhaps one handled by a custom-made script). I have plans to add automated failover in the future.
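
A minimal sketch of that manual swap, assuming the old leader was on port 9851 and the follower on port 9852:

127.0.0.1:9852> FOLLOW no one
127.0.0.1:9851> FOLLOW 127.0.0.1 9852

(The second command applies once the old leader is reachable again and should now act as a follower.)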

@mudit3774

@tidwall we are planning to use Tile38 in production, and we would like to check whether there is a proposal around this so that we can implement it.

Do we plan to use etcd or any other similar coordinator to achieve this? Do we plan to give some sort of "cluster-mode" and a simple cluster formation strategy?

Or do we plan to monitor the leader from the follower, so that on failover the follower assumes leadership and sends a FOLLOW host port command to the failed leader? In that case we would have to change the DNS to point at the new node, so Tile38 should either publish an event or allow a callback registry.

The issue with a separate health check monitor would probably be monitoring and HA of the monitor itself.

Thoughts?

@tidwall (Owner) commented Jul 2, 2019

@mudit3774 There's no official built-in support for HA.

I've heard that some people have had success using Redis Sentinel. It's possible to run the redis-server with the --sentinel flag and configure it to point to the Tile38 Leader instead of a Redis Master.
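
As a minimal sketch (the master name and quorum value here are placeholders; a fuller walkthrough appears later in this thread):

# sentinel.conf: monitor the Tile38 leader instead of a Redis master
sentinel monitor tile38-leader 127.0.0.1 9851 2

$ redis-server sentinel.conf --sentinel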

@RashadAnsari

> @mudit3774 There's no official built-in support for HA.
>
> I've heard that some people have had success using Redis Sentinel. It's possible to run the redis-server with the --sentinel flag and configure it to point to the Tile38 Leader instead of a Redis Master.

Hi @tidwall

I created a docker image for Tile38 HA using Redis Sentinel.

https://github.com/RashadAnsari/tile38-ha

@Mukund2900 commented Apr 26, 2023

@tidwall @iwpnd
For anyone who wants a step-by-step guide, here it is (for now the setup is local; change the host and IP based on your needs):

  • spin up a Tile38 server -> tile38-server -d data1 (this will start on port 9851)
  • spin up another Tile38 server -> tile38-server -p 9010 -d data2 (this will start on port 9010)
  • open tile38-cli on port 9010 and enter -> FOLLOW 127.0.0.1 9851
  • in another terminal, start Redis Sentinel with redis-sentinel sentinel.conf
    Here is the sentinel.conf:
sentinel monitor mymaster 127.0.0.1 9851 1
sentinel down-after-milliseconds mymaster 300
sentinel failover-timeout mymaster 1800
protected-mode no

Change the variables based on your needs; this configuration directly makes the slave the master once the master goes down.
Also, as I have only 2 instances (1 slave and 1 master/leader), the quorum value after monitor mymaster is 1.
Now, in your client implementation, just change the way you connect to Tile38. For example, in Java (using Lettuce), connect as follows:

    import io.lettuce.core.RedisClient;
    import io.lettuce.core.RedisURI;

    // Build a Sentinel-aware client; Lettuce asks Sentinel for the current leader address.
    private RedisClient createRedisClient(String sentinelHost, int sentinelPort, String password, String masterId) {
        RedisURI redisURI = RedisURI.Builder.sentinel(sentinelHost, sentinelPort, masterId, password).withDatabase(0).build();
        return RedisClient.create(redisURI);
    }
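
For instance, a hypothetical caller could look like this ("mymaster" must match the name in sentinel.conf, 26379 is the default Sentinel port, and the password is a placeholder for your own setup):

    import io.lettuce.core.api.StatefulRedisConnection;

    RedisClient client = createRedisClient("127.0.0.1", 26379, "yourpassword", "mymaster");
    StatefulRedisConnection<String, String> conn = client.connect();
    conn.sync().ping(); // answered by whichever node Sentinel currently reports as leader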

You can also check the Tile38 Java client for a client-side implementation. It automatically discovers the new leader at any point and waits for reconnection.

@Kilowhisky (Contributor) commented May 10, 2023

FYI for those in C# land: StackExchange.Redis does not work with Sentinel (in Tile38), as it appears to try to fetch replication information from the Tile38 server using the ROLE command.

I will update this post once I find a workaround...

EDIT:
Looks like the ROLE command is a large part of how Sentinel clients work: https://redis.io/docs/reference/sentinel-clients/
EDIT 2:
Filed a bug: #686
