kill -WINCH does not appear to gracefully shut down workers #11

Open
matthewlein opened this issue Jan 19, 2024 · 0 comments

When I start a job, then call

kill -WINCH `cat tmp/pids/delayed_job_master.pid`

my workers terminate immediately and my job fails with a SignalException (SIGTERM).

This is the key feature I'm looking for in this gem: deployments that won't interrupt running jobs.

ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0  16824  3044 ?        Ss   20:48   0:00 /bin/sh -c bundle exec rake assets:precompile && bundle exec bin/delayed_job_master -c config/delayed_job_master.rb -
root          77  0.0  0.3 102664 26116 ?        S    20:48   0:00 bin/sleep
root        1682  0.0  0.0  18588  5976 pts/0    Ss   21:06   0:00 bash
root        2608  0.5  3.2 457040 258168 ?       Sl   21:17   0:00 ruby bin/delayed_job_master -c config/delayed_job_master.rb -D
root        2689 39.3  4.1 525028 327704 ?       Sl   21:17   0:08 delayed_job: worker[0] (default, urgent, mailer) @primary [BUSY]
root        2837  6.7  3.3 462672 264852 ?       Rl   21:18   0:00 delayed_job: worker[0] (default, urgent, mailer) @primary
root        2845 12.0  3.3 462672 264820 ?       Sl   21:18   0:00 delayed_job: worker[0] (default, urgent, mailer) @primary
root        2847 10.6  3.3 462672 264712 ?       Sl   21:18   0:00 delayed_job: worker[0] (default, urgent, mailer) @primary
root        2873  0.0  0.0  23792  6436 pts/0    R+   21:18   0:00 ps aux

kill -WINCH `cat tmp/pids/delayed_job_master.pid`

ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0  16824  3044 ?        Ss   20:48   0:00 /bin/sh -c bundle exec rake assets:precompile && bundle exec bin/delayed_job_master -c config/delayed_job_master.rb -
root          77  0.0  0.3 102664 26116 ?        S    20:48   0:00 bin/sleep
root        1682  0.0  0.0  18588  5976 pts/0    Ss   21:06   0:00 bash
root        2934  0.0  0.0  23792  6524 pts/0    R+   21:18   0:00 ps aux

The log shows

I, [2024-01-19T21:18:30.507015 #2608]  INFO -- : received WINCH signal
I, [2024-01-19T21:18:30.507329 #2608]  INFO -- : sent TERM signal to worker 2689
I, [2024-01-19T21:18:30.647926 #2689]  INFO -- : performed UpdateInventoryLevelsJob [2e837034-35df-4573-b638-0fc7815bb547] from DelayedJob(urgent) with arguments: [], memory: 322 MB
I, [2024-01-19T21:18:30.650728 #2689]  INFO -- : shut down worker 2689
D, [2024-01-19T21:18:30.940508 #2608] DEBUG -- : found terminated pid: 2689

Looking at the code, it seems like the signaler calls graceful_stop

def graceful_stop
  @signaler.dispatch(:TERM)
  @stop = true
end

which dispatches TERM

def dispatch(signal)
  @master.workers.each do |worker|
    next unless worker.pid
    dispatch_to(signal, worker.pid)
  end
end

and that kills the process

def dispatch_to(signal, pid)
  Process.kill(signal, pid)
  @master.logger.info { "sent #{signal} signal to worker #{pid}" }
rescue
  @master.logger.error { "failed to send #{signal} signal to worker #{pid}" }
end

I might not be tracing this right.

My understanding of the graceful stop is that it would allow the job to complete before killing the process.
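
To make that expectation concrete, here is a minimal sketch of the behavior I have in mind. It is not taken from this gem: `run_graceful_worker` and `job_seconds` are hypothetical names, and the `sleep` only stands in for a running job. The child traps TERM, records the request in a flag, and only checks the flag between jobs, so the job in progress is allowed to finish before the process exits.

```ruby
# Hypothetical sketch, not delayed_job_master code: a forked "worker" that
# treats TERM as a request to stop after the current job.
def run_graceful_worker(job_seconds: 0.5)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    stop = false
    # The handler only sets a flag; it does not raise inside the job.
    Signal.trap("TERM") { stop = true }

    writer.puts "started"          # tells the parent the trap is installed
    writer.flush
    until stop
      sleep job_seconds            # stands in for one job
      writer.puts "job finished"
      writer.flush
    end
    writer.puts "shut down"
    writer.close
    exit!(0)
  end

  writer.close
  reader.gets                      # wait for "started" before signaling
  Process.kill("TERM", pid)        # same call that dispatch_to makes per worker
  Process.wait(pid)
  reader.read.lines.map(&:chomp)   # everything the child reported after "started"
end
```

Calling `run_graceful_worker` returns the lines the child wrote after it was signaled. With this pattern the list includes a "job finished" entry and ends with "shut down", which is what I expected to see for my own job instead of a SignalException.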

Is this working as intended?
