Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicit outputs for wasm_cc_binary #1047

Merged
merged 7 commits into from
May 20, 2022

Conversation

zaucy
Copy link
Contributor

@zaucy zaucy commented May 6, 2022

I originally started this PR because I was running into something similar to #986 and found out that my archive was libfoo.so and not the expected foo.so wasm_binary.py required. This PR doesn't just solve that problem for me, but it also allows me to specify the name of the outputs as well as which outputs I would like.

wasm_cc_binary(
    name = "foo",
    cc_target = ":foo_cc",
    # NEW
    outputs = [
        "foo.wasm",
        "foo.js",
    ],
)

I think this is a good change for 2 reasons:

  1. rule outputs atttribute is deprecated
  2. Outputs are more strict. If I want a sepcific output and my configuration doesn't allow that output, I get an error.

Let me know what you think!

@zaucy zaucy force-pushed the feat/explicit-outputs branch 4 times, most recently from b9f16b9 to eb0aea8 Compare May 7, 2022 01:55
@zaucy zaucy marked this pull request as ready for review May 7, 2022 02:02
Copy link
Collaborator

@walkingeyerobot walkingeyerobot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I'm all for moving away from deprecated bazel APIs :)

I'd like to learn more about what exactly is allowed by the new outputs parameter. What happens if I specify ['foo.wasm', 'bar.js']? I imagine this would fail during link in wasm_binary.py because bar.js does not exist? If this is the case, I'd prefer to fail earlier, i.e. in the rule, so we don't waste time doing a full build that's not going to work.

It's also noteworthy that I believe we can't just specify extensions in an outputs attribute because those would not be known to bazel ahead of time and therefore wouldn't be able to be used by rules that consume wasm_cc_binary files (although it would work if users were using wasm_cc_binary as their final rule).

@@ -134,6 +123,10 @@ wasm_cc_binary = rule(
"simd": attr.bool(
default = False,
),
"outputs": attr.output_list(
allow_empty = False,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to set this to True instead? We'd rather not break existing users.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok we might be able to do something with a macro that sits in front of wasm_cc_binary and sets outputs to a default list if it's not set, effectively keeping the behavior. I am happy to help with this part as it might be a bit tricky.

The big thing here is I don't want to break an unknown number of projects depending on this rule with a significant breaking API change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a default outputs when the outputs attribute is empty. That does mean that #986 would still be broken (maybe?) when the outputs are empty.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly this is still broken. To reproduce, create a new folder next to hello-world called "foo", add a BUILD file with these contents:

filegroup(
    name = "foo",
    srcs = ["//hello-world:hello-world.wasm"],
)

Because hello-world.wasm is not an explicit output, the build fails. If you add it under the outputs attribute, this works (although you might need to mess with visibility or pass --nocheck_visibility).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Unfortunately simply wrapping the rule in a macro for default outputs wouldn't work either since currently on the main branch the outputs depend on the cc_target's name, not just the wasm_cc_binary's name.

If the rule was wrapped in a macro and it's okay to break the default output paths not considering cc_targets name then that's great! Otherwise there could be two internal rules that the macro chooses from.

I'm capable of doing either option. If you have a preference let me know and I'll make the necessary changes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this is more tricky than I initially thought.

I think we should parse the cc_target label in the macro to keep existing behavior as much as possible. Inside the macro, you should just be able to treat the cc_target as a string. I acknowledge that if someone is select()ing on their cc_target this will break as the select()s are not resolved when the macro is resolving. I don't see a way to avoid breaking that, but I don't believe the number of users affected would be very high.

I still believe this change is quite valuable overall. Thank you for your patience and effort here.

Copy link
Contributor Author

@zaucy zaucy May 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and created a macro that decides to use either a "legacy" rule or the new rule. I thought that might be better than parsing the cc_target label in the macro. ce1cea7

@zaucy
Copy link
Contributor Author

zaucy commented May 11, 2022

I'd like to learn more about what exactly is allowed by the new outputs parameter. What happens if I specify ['foo.wasm', 'bar.js']? I imagine this would fail during link in wasm_binary.py because bar.js does not exist?

Ah yes that would fail. My main usecase has been standalone wasm so I'm a little out of touch with the other output files. Would checking for certain output filenames matching be scalable? Or is that a painful maintenance burden? I don't know the details for which files are required with which settings.

It's also noteworthy that I believe we can't just specify extensions in an outputs attribute because those would not be known to bazel ahead of time and therefore wouldn't be able to be used by rules that consume wasm_cc_binary files

The output paths (and extname) are known in the rule with ctx.outputs.outputs. Would creating a Provider for each type of output (or maybe groups of outputs?) be flexible enough for rules consuming the wasm_cc_binary files?

Alternatively since the outputs are explicit they could also explicitly be passed into another rule, correct?

@walkingeyerobot
Copy link
Collaborator

I'd like to learn more about what exactly is allowed by the new outputs parameter. What happens if I specify ['foo.wasm', 'bar.js']? I imagine this would fail during link in wasm_binary.py because bar.js does not exist?

Ah yes that would fail. My main usecase has been standalone wasm so I'm a little out of touch with the other output files. Would checking for certain output filenames matching be scalable? Or is that a painful maintenance burden? I don't know the details for which files are required with which settings.

It's also noteworthy that I believe we can't just specify extensions in an outputs attribute because those would not be known to bazel ahead of time and therefore wouldn't be able to be used by rules that consume wasm_cc_binary files

The output paths (and extname) are known in the rule with ctx.outputs.outputs. Would creating a Provider for each type of output (or maybe groups of outputs?) be flexible enough for rules consuming the wasm_cc_binary files?

Alternatively since the outputs are explicit they could also explicitly be passed into another rule, correct?

I'm not sure exactly how providers would work here. There is a solution that I'm confident will work. We can go back to disallowing outputs to be empty and add a macro in front of wasm_cc_binary that detects that outputs is empty and passes the full list there. That would make the outputs explicit if the user doesn't provide any.

I'll ask some people (@trybka) to see if they know more about how providers might help us here.

@trybka
Copy link

trybka commented May 17, 2022

The output paths (and extname) are known in the rule with ctx.outputs.outputs. Would creating a Provider for each type of output (or maybe groups of outputs?) be flexible enough for rules consuming the wasm_cc_binary files?

They are only known if specifying the attribute (i.e. non-empty)--this doesn't help with supporting pre-existing legacy usecase(s) where folks are just depending on these implicit outputs existing.

e.g. you could do the following (and people do this):

genrule(
    name = "compile_js",
    srcs = ["//mywasm:foo.js", "mycustomfooter.js"],
    outs = ["compiled.js"],
    cmd = ["cat $(SRCS) > $(OUTS)"]
)

Providers for specific use-cases are a fine extension to add for, e.g. js_library, but I don't think they help here.

OutputGroupInfo is a provider specifically for doing different file outputs, but still doesn't solve our pre-declared output file problem, I'm afraid.

Alternatively since the outputs are explicit they could also explicitly be passed into another rule, correct?

Yes. See above.

@@ -105,6 +191,12 @@ def _wasm_binary_outputs(name, cc_target):

return outputs

_wasm_cc_binary_legacy = rule(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you set name = "wasm_cc_binary" for both of these?

that way the target type matches for things that care about the rule type.

Copy link
Contributor Author

@zaucy zaucy May 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! I didn't know that existed. Thanks!

I read the doc about it, but I have two questions if you don't mind answering:

  1. Is it not a problem that both _wasm_cc_binary_legacy and _wasm_cc_binary have the same name?
  2. Is it recommended that the rule name attribute is set for any rule that is wrapped in this style? I'd like to update my internal rules if that's the case. I'm unsure what the consequences of not doing it is.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This is basically the only use for the name parameter AFAIK--when you need to do something spicy like this.
    See also: cc_test.bzl's usage

  2. I dunno if it's "recommended" or not, but when you care about the rule type being the same (as, in this case, we do -- for things like bazel query "kind(wasm_cc_binary...)" or native.existing_rules), you have two options:
    a) either use name or
    b) do some file import dance like cc_binary does. cc_binary does this thing where it defines the rules in separate files so they can both be named cc_binary, but then imports them with a different name to do this macro wrap.

I personally prefer using name, but I suppose both are correct so it might just be a matter of style.

@zaucy
Copy link
Contributor Author

zaucy commented May 18, 2022

I've cherry picked #1049 to get past the macOS issue (I can remove if this PR gets approved) and I also rebased onto 891b449 because the latest on main (93f21c9) linux bazel is failing.

@walkingeyerobot
Copy link
Collaborator

This looks good to me! I think I'd like to get the bazelisk changed merged before merging this, but other than that it looks good to go.

Thanks very much for your patience and hard work on this!

@walkingeyerobot walkingeyerobot merged commit 71f5fb0 into emscripten-core:main May 20, 2022
@zaucy zaucy deleted the feat/explicit-outputs branch May 20, 2022 05:22
radekdoulik pushed a commit to radekdoulik/emsdk that referenced this pull request May 31, 2022
* Explicit outputs for wasm_cc_binary

* Backwards compatibility

* data_runfiles restore

* restore test_bazel.sh

* Using wrong path on accident

* two separate rules for legacy support

* Added name attribute to wasm_cc_binary rule
lewing pushed a commit to dotnet/emsdk that referenced this pull request Feb 7, 2023
* [bazel] Set CLOSURE_COMPILER to workaround RBE+symlinks issue (emscripten-core#1037)

* [bazel] Set CLOSURE_COMPILER to workaround RBE+symlinks issue

* space

* specify node_js

* 3.1.10 (emscripten-core#1046)

* Optimize sandbox performance (emscripten-core#1045)

* Optimize sandbox performance

Link just the files needed to compile, link, or archive, rather than the entire directory tree. This drastically improves build times on macOS, where the sandbox is known to be slow (bazelbuild/bazel#8230).

* Linux wants clang to link

* all_files not needed?

* Linux wants llc

* And llvm-ar

* Templated build_file_content

* include node modules glob with linker files. also some minor formatting fixes. (emscripten-core#1052)

* Using bazelisk on macOS CI (emscripten-core#1049)

* Explicit outputs for wasm_cc_binary (emscripten-core#1047)

* Explicit outputs for wasm_cc_binary

* Backwards compatibility

* data_runfiles restore

* restore test_bazel.sh

* Using wrong path on accident

* two separate rules for legacy support

* Added name attribute to wasm_cc_binary rule

* 3.1.11 (emscripten-core#1053)

* 3.1.12 (emscripten-core#1054)

* 3.1.13 (emscripten-core#1055)

* [bazel] Add additional files necessary for building with closure and on RBE (emscripten-core#1057)

* 3.1.14 (emscripten-core#1061)

* 3.1.15 (emscripten-core#1066)

* Pin `latest` to a specific version for arm64-linux (emscripten-core#1065)

Fixes: emscripten-core#1040

* 3.1.16 (emscripten-core#1071)

* 3.1.17 (emscripten-core#1076)

* Exclude msys from path fix function. (emscripten-core#1078)

Fixes: #911

* 3.1.18 (emscripten-core#1081)

* 3.1.18

* Update LLVM include path in Bazel files

* Version 3.1.18-2 (emscripten-core#1083)

3.1.18 had a bad release binary on ARM64 Mac so push an updated version of the release.

* 3.1.19 (emscripten-core#1090)

* Add EMSDK_QUIET to make emsdk_env less chatting (emscripten-core#1091)

Without this the recommended way to silence emsdk_env was to pipe its
stderr to /dev/null.. but then you also loose potentially useful error
message.

Fixes: #946

* 3.1.20 (emscripten-core#1095)

* Add double-quotes to allow spaces in path (emscripten-core#1097)

* 3.1.21 (emscripten-core#1101)

* Update latest-arm64-linux to 3.1.21 (emscripten-core#1102)

* Update XCode version on CircleCI (emscripten-core#1103)

12.2 is being deprecated

* 3.1.22 (emscripten-core#1107)

* 3.1.23 (emscripten-core#1111)

* Avoid exporting EM_CONFIG for modern SDK versions (emscripten-core#1110)

Newer versions of emscipten, starting all the way back in 1.39.13, can
automatically locate the `.emscripten` config file that emsdk creates so
there is no need for the explicit EM_CONFIG environment variable.  Its
redundant and adds unnessary noisce/complexity.

Really, adding emcc to the PATH is all the is needed these days.

One nice thing about this change is that it allows folks to run
whichever emcc they want to and have it just work, even if they have
configured emsdk.   Without this change, if I activate emsdk and I run
`some/other/emcc` then emsdk's `EM_CONFIG` will still be present and
override the configuration embedded in `some/other/emcc`.

e.g. in the same shell, with emsdk activated, I can run both these
commands and have them both just work as expected.

```
$ emcc --version
$ /path/to/my/emcc --version
```

* Use x64 version for Windows on Arm (emscripten-core#1115)

* 3.1.24 (emscripten-core#1122)

* 3.1.25 (emscripten-core#1130)

* [bazel] Switch to platforms-based toolchain resolution (emscripten-core#1036)

* remove "name" attribute from bazel rules (emscripten-core#1131)

* 3.1.26 (emscripten-core#1134)

* Update remote_docker version in CircleCI config (emscripten-core#1117)

20.10.17 is the current default.

* docker image: Change base to Ubuntu 22.04 LTS (jammy) (emscripten-core#1135)

Done to upgrade from CMake 3.16.3 to 3.22.1. CMake 3.21 or newer is needed to build the Qt 6.4.1 sources with emscripten.

Also update to libidn12 to resolve an "Unable to locate package libidn11" error.

* 3.1.27 (emscripten-core#1139)

* Use constants for fixed paths. NFC (emscripten-core#1140)

* Add standalone_wasm feature to bazel emscripten_toolchain (emscripten-core#1145)

* 3.1.28 (emscripten-core#1149)

* Upgrade to rules_nodejs 5.8.0 (emscripten-core#1150)

Fixes emscripten-core#1020

* 3.1.29 (emscripten-core#1160)

* Pin Windows CI to Bazel 5.4.0 (emscripten-core#1163)

* Remove reference to fastcomp-latest. NFC (emscripten-core#1164)

fastcomp can only be install using explicit versions names so this name
doesn't work.

* Remove fastcomp SDK and fastcomp build rules. NFC (emscripten-core#1165)

Folks that want to work with fastcomp will now need to use an older
checkout of emsdk.

* 3.1.30 (emscripten-core#1167)

* Bump emscripten to 3.1.30

* Bump clang version

* Do not include test directory

* Update eng/emsdk.proj

Co-authored-by: Alexander Köplinger <alex.koeplinger@outlook.com>

---------

Co-authored-by: Kevin Lubick <kjlubick@users.noreply.github.com>
Co-authored-by: Sam Clegg <sbc@chromium.org>
Co-authored-by: John Firebaugh <john.firebaugh@gmail.com>
Co-authored-by: walkingeyerobot <mitch@thefoley.net>
Co-authored-by: Ezekiel Warren <zaucy@users.noreply.github.com>
Co-authored-by: Heejin Ahn <aheejin@gmail.com>
Co-authored-by: Tim Ebbeke <Tim06TR@gmail.com>
Co-authored-by: Derek Schuff <dschuff@chromium.org>
Co-authored-by: Joel Van Eenwyk <joel.vaneenwyk@gmail.com>
Co-authored-by: Pierrick Bouvier <101587250+pbo-linaro@users.noreply.github.com>
Co-authored-by: Trevor Hickey <TrevorJamesHickey@gmail.com>
Co-authored-by: Fredrik Orderud <forderud@users.noreply.github.com>
Co-authored-by: Robbert van Ginkel <570934+robbertvanginkel@users.noreply.github.com>
Co-authored-by: Alexander Köplinger <alex.koeplinger@outlook.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants