@vlieven (Contributor) commented Nov 3, 2025

We noticed the child process death issue, also reported in #52270, on our Kubernetes-based deployment. The issue was partly mitigated by #55707, but the root cause in uvicorn was not fixed at that time (related: Kludex/uvicorn#2397, stackabletech/airflow-operator#641).

Essentially, uvicorn had a worker healthcheck timeout that was set too low (5 seconds) and could not be configured.
Fortunately, as of release 0.37.0, uvicorn allows configuring this timeout value (PR: Kludex/uvicorn#2711).

This PR does two things:

  1. Set the minimum version for uvicorn to 0.37.0, so that this setting can be provided.
  2. Add "timeout_worker_healthcheck" to the uvicorn_kwargs, and set it to the worker_timeout value that is already used to configure the "timeout_keep_alive" and "timeout_graceful_shutdown" settings. This increases the timeout value from 5 seconds to 120 seconds by default. This increase is the same as for the "timeout_keep_alive" setting.
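For illustration, the second change can be sketched as below. This is a minimal sketch, not the actual Airflow code: the helper name `build_uvicorn_kwargs` is hypothetical, and the default of 120 seconds is taken from the description above. Only the three keyword names come from this PR.

```python
from __future__ import annotations


def build_uvicorn_kwargs(worker_timeout: int = 120) -> dict[str, int]:
    """Return the timeout-related keyword arguments for uvicorn.

    Hypothetical helper for illustration only; not the actual Airflow code.
    """
    return {
        # Both of these were already derived from worker_timeout before this PR.
        "timeout_keep_alive": worker_timeout,
        "timeout_graceful_shutdown": worker_timeout,
        # Added by this PR; needs uvicorn >= 0.37.0. Without it, uvicorn
        # falls back to its own default of 5 seconds.
        "timeout_worker_healthcheck": worker_timeout,
    }


if __name__ == "__main__":
    # Show the kwargs that would be handed to uvicorn for the default timeout.
    print(build_uvicorn_kwargs())
```

The resulting dictionary would then be unpacked into the uvicorn call, for example `uvicorn.run(app, **build_uvicorn_kwargs())`, assuming uvicorn 0.37.0 or later is installed. Reusing one value keeps the three timeouts in step with the existing worker timeout option.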

@vlieven vlieven force-pushed the api/uvicorn-healthcheck branch from d828d10 to 9aa71c0 Compare November 4, 2025 14:34
@vlieven (Contributor, Author) commented Nov 5, 2025

@potiuk You confirmed this issue in September (#52270 (comment)). Any chance of getting a review for this PR?

@potiuk (Member) commented Nov 5, 2025

LGTM, but maybe @pierrejeambrun and @ashb can also take a look. I only worked tangentially on the API server and on how we integrate with uvicorn. It does look promising, though.

@ashb (Member) left a comment

This seems roughly okay, but I'm not sure that using the same value is right.

(#55707 should have addressed this for most people already)

@pierrejeambrun (Member) left a comment

LGTM. I guess we can refine later or add a specific value if needed.

@ashb added the backport-to-v3-1-test label Nov 5, 2025
@ashb added this to the Airflow 3.1.3 milestone Nov 5, 2025
@ashb merged commit e98dd9f into apache:main Nov 5, 2025
96 checks passed
github-actions bot pushed a commit that referenced this pull request Nov 5, 2025
…r-timeout CLI option (#57731)

- Set the minimum version for uvicorn to 0.37.0, so that this setting can
   be provided.
- Add "timeout_worker_healthcheck" to the uvicorn_kwargs, and set it
   to the worker_timeout value that is already used to configure the
   "timeout_keep_alive" and "timeout_graceful_shutdown" settings. This
   increases the timeout value from 5 seconds to 120 seconds by default.
   This increase is the same as for the "timeout_keep_alive" setting.
(cherry picked from commit e98dd9f)

Co-authored-by: vlieven <verswyvel.lieven@gmail.com>
github-actions bot commented Nov 5, 2025

Backport successfully created: v3-1-test

Status Branch Result
v3-1-test PR Link

ashb pushed a commit that referenced this pull request Nov 5, 2025
…r-timeout CLI option (#57731) (#57854)

@ephraimbuddy added the type:bug-fix label Nov 10, 2025
ephraimbuddy pushed a commit that referenced this pull request Nov 10, 2025
…r-timeout CLI option (#57731) (#57854)

Copilot AI pushed a commit to jason810496/airflow that referenced this pull request Dec 5, 2025
…I option (apache#57731)


Labels

area:CLI, backport-to-v3-1-test, type:bug-fix

5 participants