-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(ipc) avoid replaying all events when spawning new workers #88
Conversation
Prior to this fix, new workers spawned to replace existing, long running ones would attempt to replay all events (from index 0 up to the current index). This could cause new workers to never catch up with the current pace of events being broadcast by older workers. Thanks Robert Paprocki for the report. Fix #87
@p0pr0ck5 Would you have a look at this fix? Thanks! |
I think this will actually break things, because calling |
(Unless that's a horrendous oversight in the documentation, in which case, yeah, this is what i had in mind 👍 ) |
Yes, that is a documentation oversight, the shm API is definitely available in both of the |
Right, that's what I thought. Okay. This looks solid at first glance. A bit busy currently but will have a close look and run through the lab environment where i reproduced the original behavior soon- hopefully late tonight or tomorrow. |
Awesome! Thanks! |
Ran through my lab environment where I reported the bug and I no longer see the behavior noted. Nice fix! 👍 |
Great, thanks for testing it! |
Fixed: - The IPC module now avoids replaying all events when spawning new workers, and gets initialized with the latest event index instead. [#88](thibaultcha/lua-resty-mlcache#88) Changelog: https://github.com/thibaultcha/lua-resty-mlcache/blob/master/CHANGELOG.md#241
Fixed: - The IPC module now avoids replaying all events when spawning new workers, and gets initialized with the latest event index instead. [#88](thibaultcha/lua-resty-mlcache#88) Changelog: https://github.com/thibaultcha/lua-resty-mlcache/blob/master/CHANGELOG.md#241
A follow-up to #88. Described in #93 and reported a few months ago is the likely possibility that mlcache and ipc instances are created in the `init` phase. If a worker is to be started much later in the master process' lifetime, the newly forked ipc instance will have an `idx` attribute set to the shm_idx value at the time of `init` (likely 0). These workers will resume polling evicted events, and `poll()` will likely timeout indefinitely from there. For the fix in #88 to work, the mlcache instance has to be instantiated during `init_worker` or later. This patch proposes an approach which works for instances created in both `init` and `init_worker`: as soon as the ipc shm has started evicting items, we guarantee that future workers will resume polling at the current index, without having to call any method but `poll()`. Fix #93
A follow-up to #88. Described in #93 and reported a few months ago is the likely possibility that mlcache and ipc instances are created in the `init` phase. If a worker is to be started much later in the master process' lifetime, the newly forked ipc instance will have an `idx` attribute set to the shm_idx value at the time of `init` (likely 0). These workers will resume polling evicted events, and `poll()` will likely timeout indefinitely from there. For the fix in #88 to work, the mlcache instance has to be instantiated during `init_worker` or later. This patch proposes an approach which works for instances created in both `init` and `init_worker`: as soon as the ipc shm has started evicting items, we guarantee that future workers will resume polling at the current index, without having to call any method but `poll()`. Fix #93
Prior to this fix, new workers spawned to replace existing, long running
ones would attempt to replay all events (from index 0 up to the current
index). This could cause new workers to never catch up with the current
pace of events being broadcast by older workers.
Thanks Robert Paprocki for the report.
Fix #87