@arkadiuszbach
Contributor

What:

Use a pre-install Helm hook to generate the jwt-secret only once during installation, instead of regenerating it on each upgrade.

Why:
Currently, the jwt-secret is regenerated on every Helm upgrade. This can lead to inconsistent JWT secrets across Airflow components and result in authentication or communication failures.

Related discussion: #54178

Problem Scenarios

  1. Multiple Airflow API Server Replicas

    • Example: Modify Helm values (e.g., change StatsD resource allocation) — this triggers an upgrade but does not redeploy the API server pods.
    • If one API server pod is manually restarted, it receives a new jwt-secret, while others still use the previous one.
    • Result: UI becomes inaccessible due to an infinite redirect loop with InvalidSignatureError: Signature verification failed.
  2. Scheduler and Worker Using Different JWT Secrets

    • Similar scenario — Helm upgrade without redeployment of Scheduler/Worker pods.
    • Restarting either component causes a mismatch in JWT secrets.
    • Result: Tasks fail with InvalidSignatureError: Signature verification failed.
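
To make the proposed approach concrete, here is a minimal sketch of a Secret rendered as a pre-install hook. The three annotations are the ones quoted in the review excerpt of this PR; the resource name, the `jwt-secret` data key, and the 32-character length are assumptions for illustration, not the PR's actual diff.

```yaml
# Hypothetical template sketch. The name, data key, and length are assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: {{ .Release.Name }}-jwt-secret
  annotations:
    # Render and apply this resource only during `helm install`.
    "helm.sh/hook": "pre-install"
    # Delete any previous copy of the hook resource before creating a new one.
    "helm.sh/hook-delete-policy": "before-hook-creation"
    "helm.sh/hook-weight": "0"
type: Opaque
data:
  # Generated once at install time; upgrades do not re-run this hook.
  jwt-secret: {{ randAlphaNum 32 | b64enc | quote }}
```

Two properties of hooks matter here. A `pre-install` hook does not run on `helm upgrade`, so a release that already exists would not get this Secret created by an upgrade. Helm also does not manage hook resources as part of the release, which is why the Secret would survive later upgrades unchanged.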


@arkadiuszbach arkadiuszbach force-pushed the fix/helm-create-jwt-with-preinstall-hook branch from 0f4b8f7 to 2574363 Compare October 28, 2025 10:18
@arkadiuszbach arkadiuszbach force-pushed the fix/helm-create-jwt-with-preinstall-hook branch from 2574363 to ee54ff8 Compare October 28, 2025 11:00
@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions bot added the stale label (Stale PRs per the .github/workflows/stale.yml policy file) on Jan 10, 2026
Comment on lines +39 to +41
"helm.sh/hook": "pre-install"
"helm.sh/hook-delete-policy": "before-hook-creation"
"helm.sh/hook-weight": "0"
Contributor

On reflection, I think we could use the Helm lookup function to check whether the secret already exists.

Member

I don't recall exactly why, but I have in my mind that there is some funkiness with helm lookup that makes it not great for our needs. But maybe in some of these cases, it's better than nothing?
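
For readers unfamiliar with `lookup`, the idea discussed in these two comments might look roughly like the sketch below: query the cluster for the Secret and reuse its value if it is already there, otherwise generate a new one. The secret name and the `jwt-secret` key are assumptions for illustration; this is not code from the PR or from the chart.

```yaml
{{- /* Hypothetical sketch: reuse the stored value when the Secret already exists. */ -}}
{{- $name := printf "%s-jwt-secret" .Release.Name }}
{{- $existing := lookup "v1" "Secret" .Release.Namespace $name }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ $name }}
type: Opaque
data:
  {{- if $existing }}
  # Secret found in the cluster: copy its (already base64-encoded) value.
  jwt-secret: {{ index $existing.data "jwt-secret" | quote }}
  {{- else }}
  # Nothing found: generate a fresh value.
  jwt-secret: {{ randAlphaNum 32 | b64enc | quote }}
  {{- end }}
```

One known limitation is that `lookup` returns an empty result whenever Helm renders without contacting the cluster, for example with `helm template`. Such renders would always take the generate branch, so the value would still change on every render in those setups.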

@github-actions bot removed the stale label (Stale PRs per the .github/workflows/stale.yml policy file) on Jan 11, 2026
Member

@jedcunningham jedcunningham left a comment

It was done this way in #51799 so that upgrades from Airflow 2 to Airflow 3 work. And it's best practice to set this explicitly, which avoids the problem completely.
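
As a sketch of what "set this explicitly" could mean in practice: create the Secret once outside the chart and point the chart at it. The `jwtSecretName` value and the `jwt-secret` key are assumed from memory of the chart's values and should be checked against the values reference for the chart version in use; the Secret name is hypothetical.

```yaml
# values.yaml override (sketch; verify the key name for your chart version).
# Assumes a Secret named "my-airflow-jwt-secret" was created beforehand,
# holding one long random string under the key "jwt-secret".
jwtSecretName: my-airflow-jwt-secret
```

With a user-managed Secret, the chart has nothing to regenerate on upgrade, so all components keep reading the same value.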

@jedcunningham
Member

These checksums don't force restarts?

@arkadiuszbach
Contributor Author

arkadiuszbach commented Jan 20, 2026

These checksums don't force restarts?

I was using the 1.18.0 Helm chart, and from what I can see the checksums were not there yet. But even if the pods are restarted:

  • Before all instances are restarted, we may get random failures, I think.
  • What about the worker? It may have tasks running for an hour, so it won't be able to restart. The API servers will have the new JWT secret, but the worker will still use the old one.
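
For context, the checksums mentioned in this exchange most likely refer to the common Helm pattern of hashing a rendered template into a pod annotation, so that a change in the rendered Secret changes the pod template and triggers a rollout. A generic sketch follows; the template path is an assumption and is not taken from the chart.

```yaml
# Hypothetical Deployment excerpt; the template path is an assumption.
spec:
  template:
    metadata:
      annotations:
        # Changes whenever the rendered Secret changes, which rolls the pods.
        checksum/jwt-secret: {{ include (print $.Template.BasePath "/secrets/jwt-secret.yaml") . | sha256sum }}
```

A rollout triggered this way replaces pods gradually, so old and new pods can coexist for a while. That is the window the two bullets above are concerned about.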
