-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Apply configured bundle permissions to cloned repos for impersonation #60280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| ("dag_processor", "dag_bundle_new_folder_permissions"): "0o775", | ||
| ("dag_processor", "dag_bundle_new_file_permissions"): "0o664", | ||
| } | ||
| ): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should have included the other PR and mark this as dependent on that one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured it would be easier to review in isolation.
Is there any specific way to mark a dependency? I did note it at the top of the PR description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rewrote the changes to be standalone via common.compat.
| Git 2.35.2+ refuses to operate on repositories owned by different users | ||
| without explicit safe directory configuration. This is needed when using | ||
| user impersonation (run_as_user) where the repository is created by one | ||
| user but accessed by another. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the run_as_user is configured in only one task?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is perhaps slightly more permissive than strictly necessary. However, even if only one task uses run_as_user, the repository still needs to be accessible during DAG parsing, and having consistent permissions simplifies the security model.
How about I add the following docstring explaining this design decision?
def _apply_permissions_recursively(path: Path) -> None:
"""
Apply configured bundle permissions to a directory tree.
This ensures that when user impersonation is used, the impersonated user
can access the cloned repository files. Permissions are applied at clone
time regardless of whether all or only some tasks use run_as_user, because:
1. DAG parsing needs access before task execution
2. Bundles may serve multiple DAGs with different impersonation settings
3. Applying permissions upfront provides a consistent security model
:param path: The root path to apply permissions to recursively
"""803e112 to
8ec3539
Compare
|
|
||
| try: | ||
| subprocess.run( | ||
| ["git", "config", "--global", "--add", "safe.directory", str(path)], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m not sure if this is a good idea… I’d not expect my global Git config to be silently changed. Can this be repo-local instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why does this need to use a subprocess instead of the GitPython API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Why global: I believe repository-local config is ignored for security reasons (otherwise malicious repos could whitelist themselves). Airflow typically runs in controlled environments (containers/VMs), modifying the global git config for the airflow user should be acceptable.
- Why not GitPython:
- We need to configure
safe.directorybefore creating theRepoobject (L234), becauseGitPythoninternally runs git commands duringRepoinitialization. If the ownership check fails at that point, we get an exception before we can configure anything. - While
GitPythonhasGitConfigParserthat can write to any config file, using it for~/.gitconfigwould essentially reimplementgit config --globalwith more code.
- We need to configure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another approach to consider:
If cloning a bare repository or a repository intended for multiple users (using
git clone --shared), Git can be told to ignore the umask and use a specific scheme via thecore.sharedRepositorysetting:
group(ortrue): Makes the repository group-writable (ignores umask for the group bits).all(orworld): Makes the repository readable by everyone.0xxx: An explicit octal mode (e.g.,0660).
bf76aed to
41b5007
Compare
41b5007 to
a22d5cc
Compare
|
Closing because a broader solution is preferred. See also: #60270 (comment) |
- This PR was created with the help of AIDepends on: #60270
Specific issues addressed:
Changes
_apply_permissions_recursively(path)New helper function that walks a directory tree and applies the configured dag_bundle_new_folder_permissions and dag_bundle_new_file_permissions to all directories and files.
_configure_git_safe_directory(path)New helper function that adds the repository path to git's global
safe.directoryconfiguration, allowing git operations by users who don't own the repository.^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in airflow-core/newsfragments.