-
Notifications
You must be signed in to change notification settings - Fork 7
Potential yield in connection from pool GC #68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
af2a642
to
c4087bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! Thank you for the patchset.
bug: prevent potential fiber yield in gc
bug
is not a subsystem and looks strange. I can prepare something like: gc: prevent fiber yield
LGTM.
Why do you think that it is good? An incorrect behavior is worse than a fair error. Sure, stop everything is not nice. We can mark the pool as broken and decline all operations with it. But we should give a fair error. Instead, you hide it. I don't see any changes, which would help to find a root of the problem if it'll actually occur. |
It's not a fair error, it's plain process exit. Using fiber.yield in gc is forbidden either way. This patch removes the usage of a forbidden thing. Regarding processing possible errors in gc... is there any valid way supported by Tarantool? As far as I remember, throwing errors is forbidden same as yielding. |
c4087bc
to
760174e
Compare
Changed |
Log as much as possible (error message, connection address/identifier, pool address/identifier, Lua backtrace), block the entire pool for future operations. |
Connection pools allow user to get connections and explicitly return them back. But it is allowed to return connection to any pool. This could easily lead to pool queue overflow. Each pool allocates fiber channel of connection count size to store connections. If user would put several extra connections to a pool, next puts will lead to `pool:put()` hangup since channel will be full and `channel:put()` is used without timeout and possible error processing, relying on get/put+gc balance. Part of #67
3809280
to
76eaca1
Compare
9c8eca2
to
aebe7a0
Compare
@Totktonada I think we may test only that pool became unavailable and do not bother with reading logs. |
I don't mind. |
@LeonidVas , @Totktonada , I've updated the PR after our discussion. Please, update your reviews |
'pool.queue manually. Closing the pool...') | ||
log.error(debug.traceback) | ||
|
||
pool:close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't know, TBH, whether we should try to close all the connections or just mark the pool as unusable and don't touch anything inside. How you decided to prefer pool:close()
over pool.usable = false
?
(I really don't know what is better, it is not some indirect push toward one of variants.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How you decided to prefer
pool:close()
overpool.usable = false
?
Well, the main cause was that the first thing pool:close()
do is setting pool.usable = false
, so even if anything is wrong with pool contents, pool still will be unusable. The connections in pool are the connections not used by anyone else (if user's got a connection with pool:get()
, it won't be in queue), so I think it's better to close them explicitly if possible rather than waiting for garbage collect.
mysql/init.lua
Outdated
log.error('pool %s internal queue unexpected state: there are no ' .. | ||
'empty slots. It is likely that someone had messed with ' .. | ||
'pool.queue manually. Closing the pool...') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I think there should be word 'mysql' somewhere.
- Can we add something about the connection object or raw connection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
log.error('mysql pool %s internal queue unexpected state: there are no ' ..
'empty slots, connection %s cannot be put back. It is likely ' ..
'that someone had messed with pool.queue manually. Closing ' ..
'the pool...', pool, conn_id)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a brief glance the changes looks meaningful.
The code is tricky and I can't say that I'm able to spot a problem here even if there is one, so that's all that I can say.
BTW, nice description! |
Yielding in gc is prohibited since Tarantool 2.6.0-138-gd3f1dd720, 2.5.1-105-gc690b3337, 2.4.2-89-g83037df15, 1.10.7-47-g8099cb053. If fiber yield happens in object gc, Tarantool process exits with code 1. Gc hook of a connection from a connection pool calls `pool.queue:put(...)`. Fiber `channel:put()` may yield if the channel is full. `channel:put()`/`channel:get()` in mysql driver connection pool is balanced: * the channel capacity is equal to the connection count; * pool create fills the channel with multiple `channel:put()`s all the way through; * `pool:get()` causes single `channel:get()` and creates a connection object with gc hook; * `pool:put()` causes single `channel:put()` and removes gc hook; * gc hook causes single `channel:put()`. There are no other cases of `channel:put()`. `pool:put()` and gc hook cannot be executed both in the same time: either a connection is used inside `pool:put()` and won't be garbage collected or a connection is no longer used anywhere else (including `pool:put()`) and it will be garbage collected. So now there are no valid cases when gc hook may yield unless someone messes up with pool queue. Messing up with pool queue anyway breaks a connection pool internal logic, so we make the pool unavailable in that case. Delayed gc should not break existing logic. If `pool:get()` is called, we expect to get a connection after gc and it's not yet returned, `channel:get()` will yield and trigger all planned returns. Closes #67
dbb50d5
to
353a738
Compare
api: forbid putting to a wrong pool
Connection pools allow user to get connections and explicitly return them back. But it is allowed to return connection to any pool. This could easily lead to pool queue overflow. Each pool allocates fiber channel of connection count size to store connections. If user would put several extra connections to a pool, next puts will lead to
pool:put()
hangup since channel will be full andchannel:put()
is used without timeout and possible error processing, relying on get/put+gc balance.Part of #67
gc: prevent potential fiber yield
Yielding in gc is prohibited since Tarantool 2.6.0-138-gd3f1dd720, 2.5.1-105-gc690b3337, 2.4.2-89-g83037df15, 1.10.7-47-g8099cb053. If fiber yield happens in object gc, Tarantool process exits with code 1.
Gc hook of a connection from a connection pool calls
pool.queue:put(...)
. Fiberchannel:put()
may yield if the channelis full.
channel:put()
/channel:get()
in mysql driver connection pool is balanced:channel:put()
s all the way through;pool:get()
causes singlechannel:get()
and creates a connection object with gc hook;pool:put()
causes singlechannel:put()
and removes gc hook;channel:put()
.There are no other cases of
channel:put()
.pool:put()
and gc hook cannot be executed both in the same time: either a connection is used insidepool:put()
and won't be garbage collected or a connection is no longer used anywhere else (includingpool:put()
) and it will be garbage collected. So now there are no valid cases when gc hook may yield unless someone messes up with pool queue. Messing up with pool queue anyway breaks a connection pool internal logic, so we make the pool unavailable in that case.Delayed gc should not break existing logic. If
pool:get()
is called, we expect to get a connection after gc and it's not yet returned,channel:get()
will yield and trigger all planned returns.Closes #67