
fix(ipc) avoid replaying all events when spawning new workers #88

Merged
merged 1 commit into master from fix/ipc-no-replay-all on Jan 17, 2020

Conversation

thibaultcha
Owner

Prior to this fix, new workers spawned to replace existing, long running
ones would attempt to replay all events (from index 0 up to the current
index). This could cause new workers to never catch up with the current
pace of events being broadcast by older workers.

Thanks Robert Paprocki for the report.

Fix #87
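
For illustration, here is a minimal sketch of the idea behind the fix, using hypothetical names (`SHM_KEY_IDX`, `new_ipc`) rather than the actual mlcache internals: a freshly spawned worker seeds its event index from the shm's current broadcast index instead of 0, so it only polls events published after it was forked.

```lua
-- Hypothetical sketch: seed a new worker's event index from the shm's
-- current index so late-spawned workers do not replay the whole history.
local SHM_KEY_IDX = "ipc:index" -- hypothetical key holding the latest index

local function new_ipc(shm_name)
    local shm = ngx.shared[shm_name]

    -- Start at the latest broadcast index rather than 0.
    local idx = shm:get(SHM_KEY_IDX) or 0

    return { shm = shm, idx = idx }
end
```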

@thibaultcha
Owner Author

@p0pr0ck5 Would you have a look at this fix? Thanks!

@p0pr0ck5
Contributor

I think this will actually break things, because calling :get (https://github.com/openresty/lua-nginx-module#ngxshareddictget) is not supported inside of init_(worker)_by_lua, which is generally where .new() gets called.

@p0pr0ck5
Contributor

(Unless that's a horrendous oversight in the documentation, in which case, yeah, this is what I had in mind 👍)

@thibaultcha
Owner Author

Yes, that is a documentation oversight; the shm API is definitely available in both the init and init_worker contexts (the init phase is even delayed by ngx_lua when at least one lua_shared_dict is defined, for this exact reason).
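
For reference, a minimal configuration exercising the shm API in both phases (the dict name and values are arbitrary, for illustration only):

```nginx
http {
    lua_shared_dict my_ipc_shm 1m;

    init_by_lua_block {
        -- shdict get/set are available here; ngx_lua delays the init
        -- phase when at least one lua_shared_dict is defined, so the
        -- dict is ready by the time this block runs.
        ngx.shared.my_ipc_shm:set("booted", true)
    }

    init_worker_by_lua_block {
        local booted = ngx.shared.my_ipc_shm:get("booted")
    }
}
```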

@p0pr0ck5
Contributor

Right, that's what I thought. Okay.

This looks solid at first glance. A bit busy currently, but I will have a close look and run it through the lab environment where I reproduced the original behavior soon, hopefully late tonight or tomorrow.

@thibaultcha
Owner Author

Awesome! Thanks!

@p0pr0ck5
Contributor

Ran through my lab environment where I reported the bug and I no longer see the behavior noted. Nice fix! 👍

@thibaultcha
Owner Author

Great, thanks for testing it!

thibaultcha merged commit f3e07ab into master on Jan 17, 2020
thibaultcha deleted the fix/ipc-no-replay-all branch on January 17, 2020 at 22:26
thibaultcha added a commit to Kong/kong that referenced this pull request Jan 20, 2020
Fixed:

- The IPC module now avoids replaying all events when spawning new workers, and
  gets initialized with the latest event index instead.
  [#88](thibaultcha/lua-resty-mlcache#88)

Changelog:

    https://github.com/thibaultcha/lua-resty-mlcache/blob/master/CHANGELOG.md#241
hishamhm pushed a commit to Kong/kong that referenced this pull request Jan 20, 2020
thibaultcha added a commit that referenced this pull request Sep 25, 2020
A follow-up to #88.

As described in #93 and reported a few months ago, mlcache and ipc instances are likely to be created in the `init` phase. If a
worker is started much later in the master process' lifetime, its
newly forked ipc instance will have an `idx` attribute set to the
shm_idx value at the time of `init` (likely 0). Such workers will
resume polling events that have already been evicted, and `poll()` will
likely time out indefinitely from there.

For the fix in #88 to work, the mlcache instance has to be instantiated
during `init_worker` or later.

This patch proposes an approach which works for instances created in
both `init` and `init_worker`: as soon as the ipc shm has started
evicting items, we guarantee that future workers will resume polling at
the current index, without having to call any method but `poll()`.

Fix #93
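
For illustration, a rough sketch of that approach, with hypothetical key names rather than the actual implementation: `poll()` notices that the next expected event has been evicted from the shm and fast-forwards to the current index instead of waiting on it indefinitely.

```lua
-- Hypothetical sketch: fast-forward past evicted events inside poll().
local function poll(ipc)
    local cur_idx = ipc.shm:get("ipc:index") or 0

    while ipc.idx < cur_idx do
        local next_idx = ipc.idx + 1
        local event = ipc.shm:get("ipc:event:" .. next_idx)

        if event == nil then
            -- The event was evicted by the shm's LRU; a worker forked
            -- too late can never replay it. Resume at the current
            -- index rather than polling for data that is gone.
            ipc.idx = cur_idx
            break
        end

        -- handle `event` here...
        ipc.idx = next_idx
    end
end
```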
thibaultcha added a commit that referenced this pull request Nov 18, 2020
Successfully merging this pull request may close these issues.

Workers launched after IPC shm LRU fail to catch up