
Airflow core overwrite KRB5CCNAME, will make run_as_user not being able to use the run_as_user ticket cache #53967

@adrian-edbert


Apache Airflow version

3.0.3

If "Other Airflow 2 version" selected, which one?

No response

What happened?

Airflow version: 3.0.3

When core security is set to kerberos, Airflow core overwrites KRB5CCNAME (and KRB5_KTNAME) in the process environment:

airflow-core/src/airflow/__main__.py

if conf.get("core", "security") == "kerberos":
    os.environ["KRB5CCNAME"] = conf.get("kerberos", "ccache")
    os.environ["KRB5_KTNAME"] = conf.get("kerberos", "keytab")

When run_as_user is set, the task runner builds the following sudo command:
task-sdk/src/airflow/sdk/execution_time/task_runner.py

cmd = ["sudo", "-E", "-H", "-u", run_as_user, sys.executable, "-c", rexec_python_code]

Because sudo is called with -E, the environment, including KRB5CCNAME, is carried over to the run_as_user process, so a task like this fails:

BashOperator(task_id='test_klist', bash_command='klist', run_as_user='other_user')

even when other_user has a valid ticket cache and a valid krb5.conf.

This happens because:

  • KRB5CCNAME is passed to the run_as_user process and is used even though it is not valid there: run_as_user has no permission to read the ticket cache of the user that invoked sudo.
  • KRB5CCNAME takes precedence over the ticket cache location configured in krb5.conf.
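One possible way to avoid the leak is to drop KRB5CCNAME from the environment before the sudo command is spawned, so that sudo -E has nothing to carry over. The following is only a sketch under that assumption; the helper name env_for_run_as_user and the strip tuple are hypothetical and are not part of Airflow.

```python
import os

# Hypothetical default: the only variable this issue is about.
DEFAULT_STRIP = ("KRB5CCNAME",)


def env_for_run_as_user(base_env=None, strip=DEFAULT_STRIP):
    """Return a copy of the environment without the listed variables.

    base_env defaults to the current process environment. The original
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    for name in strip:
        env.pop(name, None)  # ignore variables that are not set
    return env


# Usage idea (not executed here): pass the cleaned mapping to the child,
# for example subprocess.run(cmd, env=env_for_run_as_user()), where cmd is
# the sudo command built by the task runner.
```

With the variable absent, the child process would fall back to whatever default cache its own Kerberos configuration resolves to.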

What you think should happen instead?

run_as_user should use its own KRB5CCNAME, or fall back to the default ticket cache of the run_as_user.

Airflow core should not set os.environ["KRB5CCNAME"] in __init__.py; it should set it only where the context needs it. As far as I can tell, only the Spark Submit Operator specifically needs it, since Airflow's Kerberos code reads this value from the config rather than from the KRB5CCNAME environment variable.
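Setting the variable only where it is needed could look roughly like the context manager below. This is a minimal sketch, not Airflow code: scoped_krb5ccname is a hypothetical name, and the caller is assumed to pass in the configured ccache path.

```python
import os
from contextlib import contextmanager


@contextmanager
def scoped_krb5ccname(ccache_path):
    """Set KRB5CCNAME for the duration of the block, then restore it."""
    previous = os.environ.get("KRB5CCNAME")
    os.environ["KRB5CCNAME"] = ccache_path
    try:
        yield
    finally:
        # Restore the earlier state, including "not set at all".
        if previous is None:
            os.environ.pop("KRB5CCNAME", None)
        else:
            os.environ["KRB5CCNAME"] = previous
```

Code that needs the Airflow ticket cache would wrap only the relevant call in "with scoped_krb5ccname(path):", leaving the rest of the process, and any run_as_user child started outside the block, unaffected.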

How to reproduce

Set up Airflow with Kerberos installed:

apt-get update && apt-get install -y build-essential \
        krb5-user \
        libkrb5-dev

Set the environment variable:

AIRFLOW__KERBEROS__CCACHE = /tmp/krb5cc_{uid of airflow user}

Set up sudoers on the worker so the airflow user can run commands as another user:

airflow ALL=(other_user) NOPASSWD: ALL

Run a DAG that calls klist to check the ticket cache location:

BashOperator(task_id='test_klist', bash_command='klist', run_as_user='other_user')

Set up Kerberos for other_user with a valid ticket cache:

sudo su - other_user
kinit
# assuming krb5.conf stores the ticket cache in /tmp/krb5cc_{uid of other_user}

Expected:

  • The klist command succeeds and shows the other_user ticket cache.

Actual:

  • The klist command fails with: klist: No credentials cache found (filename: /tmp/krb5cc_{uid of airflow user})
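The difference between the expected and actual results can be modelled with a small lookup. It is a simplified sketch that assumes the default cache is /tmp/krb5cc_<uid>, as in the reproduction steps, and ignores any other krb5.conf settings; effective_ccache and both uids are hypothetical.

```python
def effective_ccache(env, uid):
    """Return the cache path a Kerberos client would look at.

    Simplified model: KRB5CCNAME wins when present, otherwise the
    default /tmp/krb5cc_<uid> is assumed.
    """
    override = env.get("KRB5CCNAME")
    if override is not None:
        return override
    return f"/tmp/krb5cc_{uid}"


# Hypothetical uids: 50000 for the airflow user, 1001 for other_user.
inherited = {"KRB5CCNAME": "/tmp/krb5cc_50000"}
print(effective_ccache(inherited, 1001))  # /tmp/krb5cc_50000 (actual)
print(effective_ccache({}, 1001))         # /tmp/krb5cc_1001 (expected)
```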

Operating System

Debian GNU/Linux 12 (bookworm)

Versions of Apache Airflow Providers

No response

Deployment

Docker-Compose

Deployment details

image: apache/airflow:3.0.3-python3.12

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
