TreeArtifactValue collects a symlink to a directory as a TreeFileArtifact with a DirectoryArtifactValue #20418
Description
Consider a tree artifact containing a symlink to a directory, as e.g. would be produced by the following action:
d = ctx.actions.declare_directory(...)
ctx.actions.run_shell(
outputs = [d],
command = "cd $1 && mkdir dir && touch dir/file.txt && ln -s dir sym",
arguments = [d.path],
)
Currently, this will result in ActionOutputMetadataStore#constructTreeArtifactValueFromFilesystem
collecting a TreeArtifactValue
with two entries: dir/file.txt
, mapping to a FileArtifactValue
, and sym
, mapping to a DirectoryArtifactValue
.
In addition to DirectoryArtifactValue
being unsound, the mismatch between the "artifact type" and "metadata type" is a source of bugs (#20415 is a recent example). I think it would be better for the symlink to be transparently followed (i.e., collect a TreeFileArtifact
+FileArtifactValue
for sym/file.txt
).
One thing we need to be careful about is that the symlink could point outside of the tree artifact (which is explicitly permitted), and in this case, while we still want to collect the metadata, we should avoid calling chmod
on the subtree. (Alternatively, we could disallow that form of symlink, but I suspect our hands might already be tied by existing use cases.)