Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested output symlinks behavior for RE #6547

Closed
ola-rozenfeld opened this issue Oct 29, 2018 · 4 comments
Closed

Nested output symlinks behavior for RE #6547

ola-rozenfeld opened this issue Oct 29, 2018 · 4 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: process

Comments

@ola-rozenfeld
Copy link
Contributor

Let's discuss cornercase issues with symbolic links as outputs here, especially pertaining to remote execution. Examples:

  1. Action declares outputs foo/bar and foo/. Expected: remote execution returns both the output directory digest and the output file digest. The file is not downloaded twice (as part of the output directory).

  2. Action declares only bar output, but actually creates bar as a symlink to foo/bar. Expected behavior: remote execution returns a dangling symlink as output, because foo was not declared, therefore not uploaded to the CAS or returned.

  3. Similarly, if action declared both foo/ and bar->foo/bar, then everything should work -- both the directory and the symlink pointing inside it should be returned.

  4. Action declares outputs foo/bar and baz/. baz/ is a directory, and foo is a symlink pointing to baz. In the current API, there is no way for RE to tell Bazel that foo is actually a symlink. So the current behavior would be for the action to return foo/bar as a regular file, and the directory foo will then be created on the client as a regular directory, copying the bar file, which will be different than the local execution of the same action. If we think this edge-case is important enough to amend the API for, we can discuss options to fix it.

@buchgr buchgr added P2 We'll consider working on this in future. (Assignee optional) P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed P2 We'll consider working on this in future. (Assignee optional) labels Oct 30, 2018
@buchgr
Copy link
Contributor

buchgr commented Oct 30, 2018

Thanks for summarizing this!

Action declares outputs foo/bar and foo/. Expected: remote execution returns both the output directory digest and the output file digest. The file is not downloaded twice (as part of the output directory).

Seems like a minor optimization.

Action declares only bar output, but actually creates bar as a symlink to foo/bar. Expected behavior: remote execution returns a dangling symlink as output, because foo was not declared, therefore not uploaded to the CAS or returned.

Bazel doesn't support dangling symlinks neither with local nor remote execution. So I think this is handled correctly as is.

Action declares outputs foo/bar and baz/. baz/ is a directory, and foo is a symlink pointing to baz.

I am not aware of any tools behaving differently if any part of a path is a symlink. I don't think the API should support this.

@buchgr buchgr added this to the Remote Execution milestone Oct 30, 2018
@ola-rozenfeld
Copy link
Contributor Author

Seems like a minor optimization.

Yes, the not downloading twice bit is minor -- however, I have a suspicion that currently it will just crash in this case, because it always expects the output file to not exist, so we need to fix it and add a test case.

I am not aware of any tools behaving differently if any part of a path is a symlink. I don't think the API should support this.

The problem is not just that the tool behaves differently if part of the path is a symlink -- which, btw, according to George is a very common case with virtually all tools -- see this discussion. An additional problem is that two actions in the same build might create inconsistent output trees:

  • action A declares dir/ as output with foo/ being a directory symlink under it
  • action B declares dir/foo/bar as output.

The order of these actions matters, and when B runs after A, B will attempt to call createDirectoryAndParents on dir/foo/bar, when dir/foo is an existing symbolic link. I don't know what will happen in this case -- I think we at least need to make sure it will not crash.

Same with all other corner cases -- I agree they are minor, we just need to make sure Bazel behaves somewhat reasonably and doesn't crash.

@github-actions
Copy link

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 2.5 years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Mar 16, 2023
@github-actions
Copy link

This issue has been automatically closed due to inactivity. If you're still interested in pursuing this, please reach out to the triage team (@bazelbuild/triage). Thanks!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: process
Projects
None yet
Development

No branches or pull requests

5 participants