Add configs to set bundle file & folder permissions #60270

Dev-iL · 2026-01-08T15:44:38Z

- This PR was created with the help of AI

Context

When using DAG bundles (particularly Git bundles) with user impersonation (run_as_user), tasks can fail due to permission conflicts. The root cause is that bundles are initialized by the main airflow user BEFORE impersonation happens, but the impersonated user later needs access to bundle resources like lock files and tracking directories.

Specific issues encountered:

Lock files created with restrictive permissions block impersonated users
Tracking directories inaccessible to impersonated users
Git's "dubious ownership" error when different users access the same repository

Changes

This PR provides the foundation for fixing impersonation + bundle permission issues:

Configuration options allow users to customize permissions based on their environment
Default group-writable permissions (775/664) work out-of-the-box for environments where the airflow user and impersonated users share a common group
Lock and tracking files are now created with appropriate permissions, eliminating the most common permission errors
Mirrors existing patterns - similar to [logging] file_task_handler_new_file_permissions which solves the same problem for log files

Related Changes (follow-up PRs)

This is part 1 of 3 PRs to fully address the impersonation + bundles issue:

PR 1 (this one): airflow-core
PR 2: providers/git - Apply configured bundle permissions to cloned repos for impersonation #60280
PR 3: task-sdk - Add warning when bundle path isn't accessible to impersonated user #60278

Architecture Note

This PR implements a permission-based solution where the bundle repository
is made group-writable so that multiple users (Airflow user + impersonated
users) can all perform git operations.

Long-term improvement: As discussed in comments, a better solution would
be to add bundle fetching to the Execution API, so only the supervisor needs
write access. This PR provides a working solution in the meantime. See:
#60270 (comment)

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

airflow-core/src/airflow/dag_processing/bundles/base.py

potiuk

Nice. I would love though others (@ephraimbuddy @amoghrajesh ) to take a look

amoghrajesh · 2026-01-09T07:16:54Z

I think the problem is basically that:

When running as AIRFLOW user, bundle init happens as part of parse and all locks are created as AIRFLOW user
If run_as_user is set, current process exits and is replaced by new one with impersonated user
When running with that user, when parse is retried, bundle was inited earlier, so it is just loaded, and while trying to load it, permission conflict comes in, and lock cannot be acquired.

Why can we not consider initing the bundle after we know "who" will run the task? Maybe an early pass to detect "run_as_user" and then actually init the bundle with that user?

ephraimbuddy · 2026-01-09T07:30:06Z

Since run_as_user is not always global(from config) and can be per dag, will it not be an issue with this approach?

Dev-iL · 2026-01-09T11:23:41Z

Why can we not consider initing the bundle after we know "who" will run the task? Maybe an early pass to detect "run_as_user" and then actually init the bundle with that user?

I think this is a suitable suggestion if you know in advance who the impersonated user will be and it will never change (incompatibly). Otherwise, I'd say that to support changing the impersonation the final git location should be cleaned by the airflow user and let it be recreated by the impersonation.

(BTW, this is very similar to one of the approaches suggested by the AI)

Dev-iL · 2026-01-09T11:50:57Z

Since run_as_user is not always global(from config) and can be per dag, will it not be an issue with this approach?

What sort of issue do you foresee?
How do you suggest to deal with it?

potiuk · 2026-01-09T14:24:04Z

Since run_as_user is not always global(from config) and can be per dag, will it not be an issue with this approach?

Yeah. @ephraimbuddy is very right about it.

What sort of issue do you foresee?

I think that was mostly a comment to @amoghrajesh 's suggestion.

Why can we not consider initing the bundle after we know "who" will run the task? Maybe an early pass to detect "run_as_user" and then actually init the bundle with that user?

The thing is that we can have celery or edge workers wher you have multiple tasks run on the same machine and sharing (supervisor) processes that are long running. And in this case, you can have the same bare repository used by multiple tasks using different users - including those using airflow user. And in this case what happens is that all those multiple users should be able to execute fetch - in case they want to use commit that had not yet been fetched. And it will fail if they have no write access to the repo.

How do you suggest to deal with it?

I thought initially that the "recommended" solution should be the one I suggested in slack initially as a workaround - i.e. umask with writable access for group + all users used for impersonation (and airlfow user) have to share the same group. That might be of course relaxed if you you have a filesystem that allows to implement those permissions in a different way - like NFS - but generallly our recommendation should be "make sure airfow and all the users used for impersonation can have write access to the bare git repo - this can be done for example by ... <umask + group>.

However after a bit of thinking - this one leaks a bit from airlfow into the deployment and likely makes things more complex.

I think, however, there is another possibility - and maybe we could consider it - instead of running git fetch by the task, we could add extra command to the executin API - so that task can ask supervisor to fetch the commit. This could solve all the problems and only require read permission to the bare repository in order to checkout the commit by task after it's been fetched by the supervisor. That has another advantage as well, we could potentially serialize those requests or batch them in supervisor and tha would avoid another quite likely potential issue where multiple tasks of the same dag run are running in parallel (which will pretty much always happen if we have mapped tasks) and asking to fetch the repository at the same time - this will work but it is not optimal - potentially many git conections opened, rate limiting might start playing a role etc. So leaving all the git fetch operations to the supervisor seems like a good idea.

That however I think will require a bit more coupling between the bundle interface and execution api - because this is not exclusively a problem of GitBundle only - other bundles might have very similar problems and we should solve it in a generic way so that there is a generic task -> supervisor ("get bundle version NNN") has to be added to the execution
API.

But I see that as the best long-term option.

Dev-iL · 2026-01-09T19:41:07Z

But I see that as the best long-term option.

How does this sound for now:

Update the PR's description with an architectural note.
Add a follow-up parent issue to discuss the need and possible approaches for a long term solution, and track its implementation: DAG Bundle/User Impersonation Causes Permission Issues in Resource Preparation #60387
Add the following to the docs in this PR:

### Using DAG Bundles with User Impersonation

When using `run_as_user` with DAG bundles, ensure: 
1. All impersonated users and the Airflow user are in the same group
2. Configure appropriate umask settings
3. Set `dag_bundle_new_folder_permissions = 0o775` (default)
4. Set `dag_bundle_new_file_permissions = 0o664` (default)

**Note:** This is a temporary solution. Future versions will handle
multi-user access through supervisor-based bundle operations.

Add the following in-code explanation for why the fallback is needed:

# Fallback needed for backward compatibility with old config files

Dev-iL · 2026-01-13T10:19:14Z

I'm thinking about changing the default permissions to 755/600 so as not to silently enforce weaker access controls by simply updating airflow. However, I don't know how this might affect various environments. Comments will be very welcome!

potiuk · 2026-01-13T17:08:42Z

Question., If we really think it's a temporary solution only, I think it will add quite some pollution to config space - we will have two new config values that will become obsolete with the new solution - and they do not solve the concurrency overhead problem..

Why not simply make into a documentation for GitBundle impersonation:

a) set group-writeable umask (002)
b) make sure all your impersonated users have the same group as the user that runs DagFileProcessor

That should (?) have the same effect as adding our code to manage the permissions, it would still require deployment level modifications, but at least it would not add new config params that will become useless in next version.

Or maybe I missed something ?

Dev-iL · 2026-01-14T02:18:03Z

Closing because we rather have a permanent solution, and a workaround requiring no code changes was proposed.

Dev-iL requested review from ephraimbuddy and jedcunningham as code owners January 8, 2026 15:44

boring-cyborg bot added area:ConfigTemplates area:DAG-processing labels Jan 8, 2026

This was referenced Jan 8, 2026

Add warning when bundle path isn't accessible to impersonated user #60278

Merged

Apply configured bundle permissions to cloned repos for impersonation #60280

Closed

Dev-iL force-pushed the 2601/bundle_impersonation branch from c940c5a to f9e7869 Compare January 8, 2026 17:05

potiuk reviewed Jan 8, 2026

View reviewed changes

airflow-core/src/airflow/dag_processing/bundles/base.py Show resolved Hide resolved

potiuk reviewed Jan 8, 2026

View reviewed changes

airflow-core/src/airflow/dag_processing/bundles/base.py Show resolved Hide resolved

potiuk approved these changes Jan 8, 2026

View reviewed changes

Dev-iL force-pushed the 2601/bundle_impersonation branch from f9e7869 to 1f7477d Compare January 8, 2026 21:06

Dev-iL requested a review from amoghrajesh January 8, 2026 21:19

Dev-iL force-pushed the 2601/bundle_impersonation branch from 1f7477d to 0df3557 Compare January 11, 2026 07:24

Dev-iL mentioned this pull request Jan 11, 2026

DAG Bundle/User Impersonation Causes Permission Issues in Resource Preparation #60387

Open

2 tasks

Dev-iL force-pushed the 2601/bundle_impersonation branch from 0df3557 to 1345690 Compare January 11, 2026 09:23

Add configs to set bundle file & folder permissions

0cb45e2

Dev-iL force-pushed the 2601/bundle_impersonation branch from 1345690 to 0cb45e2 Compare January 12, 2026 21:16

Dev-iL closed this Jan 14, 2026

ephraimbuddy mentioned this pull request Jan 30, 2026

Status of testing of Apache Airflow 3.1.7rc1 and Task SDK 1.1.7rc1 #61255

Open

85 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add configs to set bundle file & folder permissions #60270

Add configs to set bundle file & folder permissions #60270

Uh oh!

Dev-iL commented Jan 8, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

potiuk left a comment

Uh oh!

amoghrajesh commented Jan 9, 2026

Uh oh!

ephraimbuddy commented Jan 9, 2026

Uh oh!

Dev-iL commented Jan 9, 2026

Uh oh!

Dev-iL commented Jan 9, 2026

Uh oh!

potiuk commented Jan 9, 2026 •

edited

Loading

Uh oh!

Dev-iL commented Jan 9, 2026 •

edited

Loading

Uh oh!

Dev-iL commented Jan 13, 2026

Uh oh!

potiuk commented Jan 13, 2026

Uh oh!

Dev-iL commented Jan 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Add configs to set bundle file & folder permissions #60270

Add configs to set bundle file & folder permissions #60270

Uh oh!

Conversation

Dev-iL commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Context

Specific issues encountered:

Changes

Related Changes (follow-up PRs)

Architecture Note

Uh oh!

Uh oh!

Uh oh!

potiuk left a comment

Choose a reason for hiding this comment

Uh oh!

amoghrajesh commented Jan 9, 2026

Uh oh!

ephraimbuddy commented Jan 9, 2026

Uh oh!

Dev-iL commented Jan 9, 2026

Uh oh!

Dev-iL commented Jan 9, 2026

Uh oh!

potiuk commented Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Dev-iL commented Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Dev-iL commented Jan 13, 2026

Uh oh!

potiuk commented Jan 13, 2026

Uh oh!

Dev-iL commented Jan 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Dev-iL commented Jan 8, 2026 •

edited

Loading

potiuk commented Jan 9, 2026 •

edited

Loading

Dev-iL commented Jan 9, 2026 •

edited

Loading