Make GunicornWebWorker exit with APP_LOAD_ERROR in case of a startup error #6968

Open: wants to merge 6 commits into base: master

HallBregg

Make GunicornWebWorker exit with WORKER_BOOT_ERROR (3) in case of a boot error.

I've found out that GunicornWebWorker exits with exit code 0 even when an error occurs during startup. This makes Gunicorn endlessly spawn new workers.

With this pull request I would like to show the problem and ask if there is any other solution. I am aware of the comment # ignore all finalization problems and would like to ask about the context of this comment.

What do these changes do?

These changes set GunicornWebWorker.exit_code to the value expected by Gunicorn, which is 3, and set GunicornWebWorker.booted to False.

class Arbiter(object):
    """
    Arbiter maintain the workers processes alive. It launches or
    kills them if needed. It also manages application reloading
    via SIGHUP/USR2.
    """

    # A flag indicating if a worker failed to
    # to boot. If a worker process exist with
    # this error code, the arbiter will terminate.
    WORKER_BOOT_ERROR = 3
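
To illustrate why the exit code matters, here is a minimal sketch of a supervisor's respawn decision. It is not Gunicorn's actual implementation: `should_respawn`, `simulate`, and the `max_spawns` cap are hypothetical names introduced only for this example, and the only value taken from the quoted source is `WORKER_BOOT_ERROR = 3`.

```python
# Value copied from the Arbiter snippet quoted above.
WORKER_BOOT_ERROR = 3


def should_respawn(exit_code: int) -> bool:
    """Hypothetical helper: does the supervisor replace a worker that exited?"""
    # A boot-error exit code tells the supervisor to stop instead of respawning.
    return exit_code != WORKER_BOOT_ERROR


def simulate(exit_codes, max_spawns=5):
    """Count how many workers are spawned for a sequence of worker exit codes.

    max_spawns is an artificial cap so the "endless respawn" case terminates.
    """
    spawns = 0
    for code in exit_codes:
        spawns += 1
        if not should_respawn(code) or spawns >= max_spawns:
            break
    return spawns
```

Under these assumptions, a worker that always dies with exit code 0 is respawned until the artificial cap is hit, while a worker that dies with exit code 3 is spawned only once.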

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

@@ -55,6 +56,9 @@ def run(self) -> None:
             self.loop.run_until_complete(self._task)
         except Exception:
             self.log.exception("Exception in gunicorn worker")
+            self.booted = False
+            self.exit_code = Arbiter.WORKER_BOOT_ERROR
Member

@Dreamsorcerer Dreamsorcerer Sep 25, 2022

Hmm, the gunicorn code seems to set self.booted = True before it calls .run(). This suggests to me that it is about recording whether the process was created, rather than whether the application startup succeeded. This seems to me like it should probably exit 1 instead.

If this was how gunicorn was supposed to work, then I'd expect booted to still be False and then we would set it to True in _run() after the setup is complete (line 95).

Looking at some of the other workers, I'm also noticing gtornado using alive and server_alive, not sure if something like that is more appropriate:
https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/gtornado.py#L91-L92

Also wondering if we should actually be using the AsyncWorker as our base class:
https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/base_async.py

Member

In fact there is an APP_LOAD_ERROR, which sounds a lot more accurate for this situation:
https://github.com/benoitc/gunicorn/blob/master/gunicorn/arbiter.py#L34

Author

I had thought about using APP_LOAD_ERROR before, but decided to stay with WORKER_BOOT_ERROR (I didn't know when we can say the app is loaded, but the same problem I have got with a worker). Nevertheless, you are probably right about setting booted to False after it has already been set to True, so APP_LOAD_ERROR seems to be correct.

I will take a look at the other workers to have a bit more knowledge.

Member

@Dreamsorcerer Dreamsorcerer Sep 25, 2022

(I didn't know when we can say the app is loaded, but the same problem I have got with a worker).

When the setup has completed on line 95, the application has been initialised. If we go this way, then I'd suggest adding a self.started = False and setting it to True after site.start() on line 105. Then only set this exit status if not self.started, otherwise a 1 is probably a reasonable exit code (likely indicating that an exception happened in the app's cleanup).
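
The suggestion above could look roughly like the following sketch. It is a simplified stand-in, not the aiohttp worker: `SketchWorker` and its constructor flags are hypothetical, `run()` returns the exit code instead of calling `sys.exit()` so it can be checked easily, and `APP_LOAD_ERROR = 4` is assumed to match the constant in gunicorn/arbiter.py and should be verified there.

```python
import asyncio

# Assumed to match gunicorn's Arbiter.APP_LOAD_ERROR; verify against arbiter.py.
APP_LOAD_ERROR = 4


class SketchWorker:
    """Hypothetical stand-in for GunicornWebWorker, reduced to exit-code logic."""

    def __init__(self, fail_during_startup: bool, fail_during_cleanup: bool = False):
        self.started = False
        self.exit_code = 0
        self._fail_startup = fail_during_startup
        self._fail_cleanup = fail_during_cleanup

    async def _run(self) -> None:
        if self._fail_startup:
            raise RuntimeError("startup failed")
        # In the real worker this point would follow site.start().
        self.started = True
        if self._fail_cleanup:
            raise RuntimeError("cleanup failed")

    def run(self) -> int:
        try:
            asyncio.run(self._run())
        except Exception:
            # Failure before startup finished: report a load error.
            # Failure afterwards (e.g. in cleanup): generic exit code 1.
            self.exit_code = 1 if self.started else APP_LOAD_ERROR
        return self.exit_code
```

With this shape, only a failure before `started` is set produces the special exit status; a later exception falls back to 1, and a clean run keeps 0.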

aiohttp/worker.py (outdated, resolved)
Member

@Dreamsorcerer Dreamsorcerer left a comment

I'm hoping one of the gunicorn maintainers might provide some feedback, but if we don't hear anything back in a few days, then this looks reasonable to me.

codecov bot commented Sep 26, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.31%. Comparing base (2be9318) to head (7e27987).
Report is 1788 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6968   +/-   ##
=======================================
  Coverage   97.31%   97.31%           
=======================================
  Files         107      107           
  Lines       31485    31489    +4     
  Branches     3937     3937           
=======================================
+ Hits        30640    30644    +4     
  Misses        641      641           
  Partials      204      204           
Flag Coverage Δ
CI-GHA 97.22% <100.00%> (+<0.01%) ⬆️
OS-Linux 96.88% <100.00%> (+<0.01%) ⬆️
OS-Windows 95.28% <25.00%> (-0.02%) ⬇️
OS-macOS 96.46% <100.00%> (+<0.01%) ⬆️
Py-3.10.11 96.98% <100.00%> (+<0.01%) ⬆️
Py-3.11.0 96.41% <100.00%> (+<0.01%) ⬆️
Py-3.7.16 96.70% <100.00%> (+<0.01%) ⬆️
Py-3.7.9 95.16% <25.00%> (-0.02%) ⬇️
Py-3.8.10 95.07% <25.00%> (-0.02%) ⬇️
Py-3.8.16 96.61% <100.00%> (+<0.01%) ⬆️
Py-3.9.13 95.07% <25.00%> (-0.02%) ⬇️
Py-3.9.16 96.63% <100.00%> (+<0.01%) ⬆️
Py-pypy7.3.11 94.13% <100.00%> (+<0.01%) ⬆️
VM-macos 96.46% <100.00%> (+<0.01%) ⬆️
VM-ubuntu 96.88% <100.00%> (+<0.01%) ⬆️
VM-windows 95.28% <25.00%> (-0.02%) ⬇️

@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label May 7, 2023
@Dreamsorcerer Dreamsorcerer changed the title Make GunicornWebWorker exit with WORKER_BOOT_ERROR (3) in case of an boot error. Make GunicornWebWorker exit with APP_LOAD_ERROR in case of a startup error May 7, 2023
@Dreamsorcerer
Member

OK, they have provided feedback now. Sounds like this is not the desired approach:
benoitc/gunicorn#2867 (comment)

@webknjaz webknjaz added the backport-3.10 Trigger automatic backporting to the 3.10 release branch by Patchback robot label Jan 28, 2024

4 participants