Skip to content

Add master-replica switching support #165

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
May 19, 2022
Merged

Conversation

0x501D
Copy link
Member

@0x501D 0x501D commented Mar 5, 2022

This patch adds ability to use queue in a master-replica scheme.
The queue will monitor the operation mode of the tarantool and
perform the necessary actions accordingly.

This patch adds five states for queue: INIT, STARTUP, RUNNING, ENDING
and WAITING. When the tarantool is launched for the first time,
the state of the queue is always INIT until box.info.ro is false.

States switching scheme:

stateDiagram-v2
direction LR
w: Waiting
note right of w
  monitor rw status
end note
s: Startup
note left of s
  wait for maximum
  upstream lag * 2
  release all tasks
  start driver
end note
Init --> s
r: Running
note right of r
  queue is ready
end note
e: Ending
note left of e
  stop driver
end note
w --> s: ro -> rw
s --> r
r --> e: rw -> ro
e --> w
Loading

In the STARTUP state, the queue is waiting for possible data synchronization
with other cluster members by the time of the largest upstream lag multiplied
by two. After that, all taken tasks are released, except for tasks with
session uuid matching inactive sessions uuids. This makes possible to take
a task, switch roles on the cluster, and release the task within the timeout
specified by the queue.cfg({ttr = N}) parameter. Note: all clients that take()
and do not ack()/release() tasks must be disconnected before changing the role.
And the last step in the STARTUP state is starting tube driver using new
method called start(). Each tube driver must implement start() and stop()
methods. The start() method should start the driver fibers, if any, in other
words, initialize all asynchronous work with the tubes space. The stop()
methods shuld stop the driver fibers, if any, respectively.

In the RUNNING state, the queue is working as usually. The ENDING state calls
stop() method. in the WAITING state, the queue listens for a change in the
read_only flag.

All states except INIT is controlled by new fiber called 'queue_state_fiber'.

A new release_all() method has also been added, which forcibly returns all
taken tasks to a ready state. This method can be called per tube.

Closes #120

@0x501D 0x501D marked this pull request as draft March 5, 2022 12:39
@0x501D 0x501D requested a review from LeonidVas March 5, 2022 12:39
@0x501D 0x501D force-pushed the 0x501D/gh-120-queue-states branch from 8e03331 to c88e15d Compare March 15, 2022 20:33
@0x501D 0x501D marked this pull request as ready for review March 16, 2022 08:23
@0x501D 0x501D force-pushed the 0x501D/gh-120-queue-states branch 2 times, most recently from c29a8be to eb163ac Compare March 16, 2022 08:46
Copy link
Contributor

@LeonidVas LeonidVas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Thank you for the patchset. I left comments. I haven't reviewed it all yet, but I plan to keep reviewing and commenting.

@0x501D 0x501D force-pushed the 0x501D/gh-120-queue-states branch 2 times, most recently from 5ec056e to 54f27e7 Compare March 24, 2022 10:10
Copy link
Contributor

@LeonidVas LeonidVas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Thank you for the patchset. It looks good.
About the commit message title:

  • add subsystem (like "test: add master replica switching", "sessions: move inactive...")

About the commit message body:

  • add "Follows up #xxx" or "Part of #xxx"

Please check the effect of the changes on the queue performance (#139 (comment))

See comments bellow.

Queue states needs inactive_sessions on both master and replica nodes.

Part of #120
@0x501D 0x501D force-pushed the 0x501D/gh-120-queue-states branch 4 times, most recently from b6d7d1b to 3fd8d24 Compare April 18, 2022 10:01
local queue_state = require('queue.abstract.queue_state')
rawset(_G, 'queue', require('queue'))

-- Replica connection handler
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

No.)

Copy link
Contributor

@LeonidVas LeonidVas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have some test for "switching replicaset mode" otherwise it will be break periodically.

@@ -601,6 +601,61 @@ function method.create_tube(tube_name, tube_type, opts)
return self
end

-- Replicaset mode switch.
local function migrate_queue_taken_space(mode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mode -> replicaset_mode (or replicaset_mode_on).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@LeonidVas
Copy link
Contributor

Mark all commits after Closes as Follows up.

@0x501D 0x501D force-pushed the 0x501D/gh-120-queue-states branch from 3fd8d24 to fa612a0 Compare April 27, 2022 13:46
local function max_lag()
local max_lag = 0

for i=1, #box.info.replication do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tarantool> box.info.replication
---
- 1:
    id: 1
    uuid: 84560a97-1cba-4e24-8814-d49fa1f9d6a5
    lsn: 1001048
  3:
    id: 3
    uuid: 2d82e625-3a6c-427a-9075-93bfc5c07dd7
    lsn: 0
...

tarantool> #box.info.replication
---
- 1
...

tarantool> table.maxn(box.info.replication)
---
- 3
...

Use table.maxn() for possibly holed arrays.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@0x501D 0x501D force-pushed the 0x501D/gh-120-queue-states branch from fa612a0 to 7181e7d Compare May 3, 2022 16:26
Copy link
Contributor

@LeonidVas LeonidVas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OMG, I think this set of patches has gone too far). It will be great if it doesn't crash some existing installations.
I leave a few comments and would like to clarify for @Totktonada :In the current implementation, after switching from master to replica, the user will need to re-identify on the replica.

@0x501D great job.

README.md Outdated
@@ -846,9 +856,6 @@ Driver class must implement the following API:
1. `create_space` - creates the supporting space. The arguments are:
* space name
* space options

Driver class should implement the following API:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must be removed in readme: add queue states description commit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

-- When the queue is running in single mode,
-- the space is converted to temporary mode to increase performance.
--
local function switch_in_replicaset(mode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to use the argument name replicaset_mode.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -657,11 +719,11 @@ function method.start()
box.space._queue_taken:drop()
end

local _taken = box.space._queue_taken_2
if _taken == nil then
local _mode = queue.cfg['in_replicaset'] or false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to use name replicaset_mode.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

-- When the queue is running in single mode,
-- the space is converted to temporary mode to increase performance.
--
local function switch_in_replicaset(mode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to use the argument name replicaset_mode.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

})
end
else
box.space._queue_shared_sessions:truncate()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't queue_session._on_session_remove(...) be called for each deleted session?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, fixed.

@@ -158,6 +185,10 @@ local function identify(conn_id, session_uuid)
-- Generate new UUID for the session.
cur_uuid = uuid.bin()
queue_session_ids:insert{conn_id, cur_uuid}
local ttr = queue_session.cfg['ttr'] or 0
if ttr > 0 then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIU we must insert the session into the table _queue_shared_sessions in any case, regardless of the value of the ttr.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -207,11 +238,21 @@ local function disconnect(conn_id)
{ limit = 1 })[1]

-- If a queue session doesn't have any active connections it should be
-- removed (if ttr is absent) or moved to the "inactive sessions" list.
-- removed (if ttr is absent) or moved to the "shared sessions" list.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the session is not moved to the table, it just changes the active flag from true to false.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

local function exist_inactive(session_uuid)
if box.space._queue_inactive_sessions:get{session_uuid} then
local function exist_shared(session_uuid)
if not box.space._queue_shared_sessions then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We create this space ourselves, why can't it exist?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@0x501D 0x501D force-pushed the 0x501D/gh-120-queue-states branch 4 times, most recently from 31e85eb to b005b27 Compare May 18, 2022 08:33
@0x501D 0x501D force-pushed the 0x501D/gh-120-queue-states branch from b005b27 to 4ca6584 Compare May 19, 2022 08:23
0x501D added 7 commits May 19, 2022 12:49
This patch adds ability to use queue in a master-replica scheme.
The queue will monitor the operation mode of the tarantool and
perform the necessary actions accordingly.

This patch adds five states for queue: INIT, STARTUP, RUNNING, ENDING
and WAITING. When the tarantool is launched for the first time,
the state of the queue is always INIT until box.info.ro is false.

States switching scheme:

                       +-----------+
                       |  RUNNING  |
                       +-----------+
                        ^         | (rw -> ro)
                        |         v
+------+       +---------+       +--------+
| INIT | --->  | STARTUP |       | ENDING |
+------+       +---------+       +--------+
                        ^         |
             (ro -> rw) |         v
                       +-----------+
                       |  WAITING  |
                       +-----------+

In the STARTUP state, the queue is waiting for possible data synchronization
with other cluster members by the time of the largest upstream lag multiplied
by two. After that, all taken tasks are released, except for tasks with
session uuid matching inactive sessions uuids. This makes possible to take
a task, switch roles on the cluster, and release the task within the timeout
specified by the queue.cfg({ttr = N}) parameter. Note: all clients that take()
and do not ack()/release() tasks must be disconnected before changing the role.
And the last step in the STARTUP state is starting tube driver using new
method called start(). Each tube driver must implement start() and stop()
methods. The start() method should start the driver fibers, if any, in other
words, initialize all asynchronous work with the tubes space. The stop()
methods shuld stop the driver fibers, if any, respectively.

In the RUNNING state, the queue is working as usually. The ENDING state calls
stop() method. in the WAITING state, the queue listens for a change in the
read_only flag.

All states except INIT is controlled by new fiber called 'queue_state_fiber'.

A new release_all() method has also been added, which forcibly returns all
taken tasks to a ready state. This method can be called per tube.

Closes #120
Since version 2.4.1, Tarantool has the popen built-in module
that supports execution of external programs.
popen() uses in tests for creating replica instance.

Follows up #120
Running a queue in master-replica mode requires that
_queue_inactive_sessions and _queue_taken_2 were not
temporary spases. This results in a 20 percent performance loss.
The `in_replicaset` option allows you to get the same performance
in single mode.

Follows up #120
In order for the user to be able to finish servicing the task after
the queue switched from replica to master, the `_queue_inactive_session`
space was renamed to `_queue_shared_sessions` and had an `active` field
in it. When the `take` method is called, the session uuid is automatically
added to the `_queue_shared_sessions` space with `exp_time` set to zero
and `active` field set to true. When the client closes the session, this
entry becomes inactive (`active` = false) and `exp_time` is set.

At the start of the queue, we make all active sessions inactive, and set
an expiration time for them. If the `ttr` setting is not set, then we delete
all sessions.

If `ttr` is not set shared sessions is disabled.

Follows up #120
In replicaset mode we cannot use temporary tubes.
Data on them will not be replicated and the correct
operation of the queue will be disrupted.

This patch will not allow creation of the temporary tube
if `in_replicaset == true` or will not allow to set
`in_replicaset = true` if temporary tubes exists.

Follows up #120
@0x501D 0x501D force-pushed the 0x501D/gh-120-queue-states branch from 4ca6584 to 3e2efcd Compare May 19, 2022 09:49
@LeonidVas LeonidVas merged commit bff76e2 into master May 19, 2022
@LeonidVas LeonidVas deleted the 0x501D/gh-120-queue-states branch May 19, 2022 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How to avoid SPoF?
3 participants