Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow TreeArtifacts with any name in Starlark C++ API #8269

Conversation

kkiningh
Copy link

@kkiningh kkiningh commented May 9, 2019

In the Starlark C++ API, artifacts are checked for an extension that matches their use (e.g. C++ source files must have a ".cpp" extension or something similar). However, the typical use case for TreeArtifacts results in multiple types of files inside a single directory (e.g. generated C++ source and headers being placed in a single directory.) Since, the current check means a single directory can only be used as either source (with a ".cpp" extension) or header (with a ".h" extension), the user is required to filter out the sources/headers into their own separately named directories, which seems cumbersome.

In practice, I don't know of a good reason why there needs to be a restriction on TreeArtifact filenames in the first place (really the restriction should be on the files contained by the TreeArtifact), so I eliminated the check in that case. This is just a proposed solution though; if there's a better fix (or you think this isn't an issue in the first place) let me know.

@kkiningh kkiningh requested a review from lberki as a code owner May 9, 2019 00:27
@laurentlb laurentlb removed their assignment May 9, 2019
@laurentlb laurentlb requested a review from oquenchil May 9, 2019 13:31
@laurentlb laurentlb added team-Rules-CPP Issues for C++ rules and removed team-Starlark labels May 9, 2019
@kkiningh
Copy link
Author

I found a potential issue with this pull request, although I'm not sure how to resolve it. When compiling C++ source from a TreeArtifact, arguments passed as cxxopts on the command line are not passed to the underlying compile command (although options passed as copts are). I've created a repo here that reproduces the issue.

The way the repo works is it tries to pass --std=c++17, and uses a static assert to verify that the source is being compiled with that option (by default bazel uses c++11).

Using cxxopt:

kkiningh:~/Workspace/bazel_issue_tree_noflags$ ../bazel/bazel-bin/src/bazel build :cpp_tree --cxxopt='--std=c++17'
INFO: Analyzed target //:cpp_tree (1 packages loaded, 1 target configured).
INFO: Found 1 target...
ERROR: /Users/kkiningh/Workspace/Research/bazel_issue_tree_noflags/BUILD:5:1: C++ compilation of rule '//:cpp_tree' failed (Exit 1) wrapped_clang failed: error executing command external/local_config_cc/wrapped_clang '-D_FORTIFY_SOURCE=1' -fstack-protector -fcolor-diagnostics -Wall -Wthread-safety -Wself-assign -fno-omit-frame-pointer -O0 -DDEBUG '-std=c++11' -iquote . -iquote ... (remaining 17 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
bazel-out/darwin-fastbuild/bin/cpp_tree/test.cpp:2:1: error: static_assert failed "Compiled as C++11"
static_assert(__cplusplus >= 201701L, "Compiled as C++11");
^             ~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Target //:cpp_tree failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.210s, Critical Path: 0.06s
INFO: 0 processes.
FAILED: Build did NOT complete successfully

Using copt:

kkiningh:~/Workspace/bazel_issue_tree_noflags$ ../bazel/bazel-bin/src/bazel build :cpp_tree --copt='--std=c++17'
INFO: Build options --copt and --cxxopt have changed, discarding analysis cache.
INFO: Analyzed target //:cpp_tree (0 packages loaded, 117 targets configured).
INFO: Found 1 target...
Target //:cpp_tree up-to-date:
  bazel-bin/_objs/cpp_tree/cpp_tree
INFO: Elapsed time: 0.339s, Critical Path: 0.06s
INFO: 1 process: 1 darwin-sandbox.
INFO: Build completed successfully, 2 total actions

Note that compiling a single file or renaming the directory to have a ".cpp" extension still passes the correct options.

kkiningh:~/Workspace/Research/bazel_issue_tree_noflags$ ../bazel/bazel-bin/src/bazel build :cpp --cxxopt='--std=c++17'
Starting local Bazel server and connecting to it...
INFO: Analyzed target //:cpp (9 packages loaded, 117 targets configured).
INFO: Found 1 target...
INFO: Deleting stale sandbox base /private/var/tmp/_bazel_kkiningh/eee402ac3dd0a6a158af086e00b07baf/sandbox
Target //:cpp up-to-date:
  bazel-bin/_objs/cpp/cpp.o
INFO: Elapsed time: 4.221s, Critical Path: 0.17s
INFO: 2 processes: 2 darwin-sandbox.
INFO: Build completed successfully, 3 total actions

Please let me know what the best way to address this is.

@lberki lberki requested review from hlopko and removed request for lberki May 15, 2019 06:42
@hlopko hlopko removed their request for review May 16, 2019 11:46
@kkiningh
Copy link
Author

@oquenchil friendly bump.

@oquenchil
Copy link
Contributor

Hi Kevin,

Thank you for the pull request. As you state the API should definitely allow tree artifacts. The problem is that we cannot separate the sources from the headers inside the tree artifact during analysis. I need to look at this more carefully. I will be OOO for a week and a half. The earliest I can take a look at this is on Friday next week.

@kkiningh
Copy link
Author

Thanks for the response. Yes, that makes sense. I'm not familiar with the Bazel internals, but my understanding is that different actions will be generated for each file in a TreeArtifact, dependent on it's filetype (e.g in generateActionForInputArtifacts). In that case, is there a need to separate the source and headers?

@oquenchil
Copy link
Contributor

That was introduced for J2ObjcAspect.java. I'm not sure how well it works for C++ rules and even there if I understand correctly each tree artifact will contain either only headers or only sources. This was introduced because outputs were not predictable.

Have you tried using tree artifacts as you propose here with srcs and hdrs in the cc_library rule? If it doesn't work there then it won't work for the C++ Starlark API.

If we start supporting this properly it won't be a simple change. We will have to take a lot more into consideration than what the code you linked to does.

@kkiningh
Copy link
Author

I'm not sure how well it works for C++ rules and even there if I understand correctly each tree artifact will contain either only headers or only sources.

At least for my usecase it worked well enough. The restriction to only headers/source files is annoying, but IMO reasonable.

Have you tried using tree artifacts as you propose here with srcs and hdrs in the cc_library rule?

Haven't tested with HEAD, but at least at one point it did work with cc_library (although using TreeArtifacts with cc_library is broken for me in other ways.) It also worked in pervious versions of the Starlark C++ API.

If we start supporting this properly it won't be a simple change.

That's disappointing, but if that's the case I understand. What other things need to be taken into consideration?

@oquenchil
Copy link
Contributor

Tree artifacts were added for a specific use case. Of course they are useful for normal C++ rules but they were only an afterthought, so some things weren't taken into account:

  • include scanning (not available in Bazel yet but will be eventually)

  • header parsing (this can be turned on in Bazel already with the supports_header_parsing attribute in the cc_toolchain rule and the feature parse_headers). The current implementation of compile action templates does not take compilation of headers into account

  • modules (not available in Bazel but will be eventually). Headers in tree artifacts won't work with modules as they are implemented right now

That's if we allow tree artifacts in the API where there would be a separate tree artifact for sources and headers. If we allow a tree artifact with mixed files then we get into other problems. As the code is right now I believe we have no other choice but to pass the .cc files together with the headers for a compilation action downstream when all we want are the headers. The .cc files could get filtered during execution but this feels wrong. There are more implementation related problems.

All in all, tree artifact support for C++ needs be revisited carefully because as they are right now they are supported poorly. I don't want to open the gates in the C++ Starlark API to another source of problems for something that is not properly supported. Be sure that we will work on this in the future though.

Meanwhile, I don't encourage you to use it but there is a workaround for your specific use case that my colleague describes here:
https://stackoverflow.com/questions/48417712/how-to-build-static-library-from-the-generated-source-files-using-bazel-build/48524539#48524539

You could have a separate rule that writes out 2 tree artifacts, each with the extensions .h and .cc to fool Bazel to trigger the compile action template code path.

@kkiningh
Copy link
Author

kkiningh commented Jun 6, 2019

I'm a bit confused. Currently, there is de facto support for TreeArtifacts in the both the cc_* rules and the Starlark API in the sense that I can pass a TreeArtifact and it works; this pull request isn't intending to change anything other than what names are allowed. It sounds like you're saying that there's a broader problem of some features not supporting TreeArtifacts, in which case either those features should be fixed, or there should be somewhere in the Bazel documentation that explicitly says what isn't supported.

The .cc files could get filtered during execution but this feels wrong.

Headers/source files are commonly generated together, so there will have to be a filtering step at some point. As I pointed out in my original request, having this be a separate action is cumbersome and largely unnecessary. It's also already supported by the builtin cc_* rules.

You could have a separate rule that writes out 2 tree artifacts, each with the extensions .h and .cc to fool Bazel to trigger the compile action template code path.

I am aware of this workaround. In fact, I specifically call it out as the thing I was trying to address in my pull request ("the user is required to filter out the sources/headers into their own separately named directories".)

Anyways, it sounds like this is being tied up in a larger question of C++ and TreeArtifacts support generally. If you want to deny this pull request on the grounds that extending the current support is a bad idea since it's going to be revisited, that's fine, but I do think additional documentation needs to be added to clearly delineate what is and isn't supported.

@oquenchil
Copy link
Contributor

Yes, that's right, they are supported. They started being supported by accident but indeed they do work now with the naming workaround.

You are right that we don't want to extend the support without the proper consideration and design that this feature should have had from the very beginning. As I mentioned tree artifacts are buggy and do not work properly with include scanning, modules and header parsing.

I hear you and understand that the lack of proper support is a pain. Rest assured we will revisit this eventually.

@meisterT
Copy link
Member

meisterT commented Mar 4, 2020

What's the status here?

@kkiningh
Copy link
Author

kkiningh commented Mar 4, 2020

@oquenchil has there been any progress in support for TreeArtifacts?

@sventiffe
Copy link
Contributor

@kkiningh can you rebase please to resolve the conflicts, we will try to merge then

@kkiningh
Copy link
Author

@sventiffe should be done. Let me know if you need anything else to merge this.

@oquenchil
Copy link
Contributor

oquenchil commented Oct 26, 2020

Sven if you are going to merge this, I think this PR should at least come with an extensive list of tests, including:

And any other test checking ways in which this might introduce bugs.

@konste
Copy link

konste commented Apr 15, 2021

I wonder if this MR is abandoned or there still is a chance to have it merged?

@kkiningh
Copy link
Author

Looks like this was more or less added in 596653d

@konste
Copy link

konste commented Apr 15, 2021

Ah-h, great to know! Thank you!
Now let's hope it will show up in 4.1 when it is finally released.

@kkiningh
Copy link
Author

@konste I don't think it will make 4.1 - I don't see it in the list of cherry picks in #13099 and it was authored well past the baseline of 6b33bdb.

@konste
Copy link

konste commented Apr 15, 2021

Oh, I see... Let me try to ask for it. Who knows...

@konste
Copy link

konste commented Apr 29, 2021

@kkiningh Do I get it right that with 596653d TreeArtifacts are supported for sources and headers in cc_common.compile but still do not supported in cc_common.link? compilation_outputs returned by cc_common.compile is also a TreeArtifact and when it is fed to cc_common.link it does not understand it. How it supposed to be used then?

@aiuto
Copy link
Contributor

aiuto commented Jun 24, 2021

Friendly ping to author and reviewers.

@oquenchil
Copy link
Contributor

Since ac5aafa no more validation of header extensions is done.

@oquenchil oquenchil closed this Jul 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes team-Rules-CPP Issues for C++ rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants