[Bazel] Support for feature debug fission in emsdk-bazel-toolchain #1479 #1531

Open. Wants to merge 6 commits into base: main.

trzeciak
Contributor

This PR adds support for debug fission. In short, debug fission should:

  • reduce linking time, RAM usage and disk usage
  • speed up incremental builds
  • speed up debugger work (reduce startup and breakpoint time)

To use this, pass the --fission=yes flag.
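
As a rough illustration of the flag in use, here is a minimal sketch of a helper that wraps the build command. The function name and the example target label are hypothetical, and using a debug compilation mode is an assumption based on the wasm-dbg output paths that appear in the log later in this thread.

```shell
# Sketch only: build_with_fission is a made-up helper name.
build_with_fission() {
  # --compilation_mode=dbg requests a debug build (assumed here);
  # --fission=yes asks Bazel to split debug info out of the object files.
  bazel build --compilation_mode=dbg --fission=yes "$@"
}

# Example invocation (not run here; the label is a placeholder):
#   build_with_fission //:main
```

Wrapping the flags in a function keeps them in one place; a line in .bazelrc would serve the same purpose.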

References:

@trzeciak
Contributor Author

This fixes issue #1479.

@trzeciak
Contributor Author

trzeciak commented Feb 19, 2025

Can I update Bazel for Windows (from the old 5.4.0) so that it is in sync with Linux/macOS?
https://github.com/emscripten-core/emsdk/pull/1531/checks?check_run_id=37473055285

I'm trying this in #1532, but no luck so far.

(edited)
Okay, I managed to make Bazel 5.4.1 supported too, so it doesn't block this PR.

"emdwp-emscripten_bin_linux.sh",
"emdwp-emscripten_bin_mac_arm64.sh",
"emdwp-emscripten_bin_mac.sh",
"emdwp-emscripten_bin_win.bat",
Collaborator

Why does this tool require 5 files where the above tools all just use two (one .sh and one .bat)?

Contributor Author

I don't know why the dwp/emdwp_script tool doesn't have access to the environment variables, or how to pass these variables to it.

I've seen how this is done for emcc and emcc_link in bazel/emscripten_toolchain/toolchain.bzl: an env_set is attached to specific actions. I tried extending the action list, but it didn't help.
So I worked around it by using five dedicated files.

I also asked about this on the Bazel Slack (Fri, Feb 14th); no answer yet: https://bazelbuild.slack.com/archives/CGA9QFQ8H/p1739488236571719

Contributor Author

printenv from dwp tool:

...
SUBCOMMAND: # //:main [action 'CcGenerateDwp main.dwp', configuration: 77bdbcfa4ce6db4cf8539adb5465d4c7ee5d7734a59bd2c6e99a60f4e1c191e4, execution platform: @@platforms//host:host, mnemonic: CcGenerateDwp]
(cd /private/var/tmp/_bazel_damian/1c7f831e8ac7ba8a4146146efff137a8/execroot/_main && \
  exec env - \
  external/emsdk/emscripten_toolchain/emdwp-emscripten_bin_mac_arm64.sh bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/sdk/bar.dwo bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/sdk/foo.dwo bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/main/main.dwo -o bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/main.dwp)
# Configuration: 77bdbcfa4ce6db4cf8539adb5465d4c7ee5d7734a59bd2c6e99a60f4e1c191e4
# Execution platform: @@platforms//host:host
INFO: From CcGenerateDwp main.dwp:
+ printenv
TMPDIR=/tmp
__CF_USER_TEXT_ENCODING=0x1F5:0x0:0x0
PWD=/private/var/tmp/_bazel_damian/1c7f831e8ac7ba8a4146146efff137a8/sandbox/darwin-sandbox/30/execroot/_main
SHLVL=1
_=/usr/bin/printenv
+ exec external/emscripten_bin_mac_arm64/bin/llvm-dwp bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/sdk/bar.dwo bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/sdk/foo.dwo bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/main/main.dwo -o bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/main.dwp
…
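
The `exec env - \` line in the log is what strips the variables: the tool is started with an emptied environment. A minimal sketch of that effect outside Bazel follows. The variable value is hypothetical, and `env -i` is used as the portable spelling of `env -`.

```shell
# Hypothetical value, only used to show whether the variable survives.
export EM_BIN_PATH=/hypothetical/emscripten_bin

# A normal child process inherits the exported variable.
inherited=$(/bin/sh -c 'echo "${EM_BIN_PATH:-unset}"')

# A child started through `env -i` gets an emptied environment,
# so the variable is no longer visible to it.
scrubbed=$(env -i /bin/sh -c 'echo "${EM_BIN_PATH:-unset}"')

echo "inherited: $inherited"
echo "scrubbed:  $scrubbed"
```

This is why a wrapper that relies on exported variables cannot locate its tools in this action, while a wrapper with a hardcoded relative path can.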

Contributor Author

I think the reason the environment variables are unavailable lies in Bazel itself.
The action named CcGenerateDwp is declared here: LINK

ctx.actions.run(
   mnemonic = "CcGenerateDwp",
   tools = packager["tools"],
   executable = packager["executable"],
   toolchain = cc_helper.CPP_TOOLCHAIN_TYPE,
   arguments = [packager["arguments"]],
   inputs = packager["inputs"],
   outputs = packager["outputs"],
)

As you can see, no env argument is passed to this action.

For comparison, I found the CppArchive action, where env is set (LINK):

   env = cc_common.get_environment_variables(
       feature_configuration = feature_configuration,
       action_name = CPP_LINK_STATIC_LIBRARY_ACTION_NAME,
       variables = archiver_variables,
   )

   # TODO(bazel-team): PWD=/proc/self/cwd env var is missing, but it is present when an analogous archiving
   # action is created by cc_library
   ctx.actions.run(
       executable = archiver_path,
       toolchain = cc_helper.CPP_TOOLCHAIN_TYPE,
       arguments = [args],
       env = env,
       inputs = depset(
           direct = object_files,
           transitive = [
               cc_toolchain.all_files,
           ],
       ),
       use_default_shell_env = True,
       outputs = [output_file],
       mnemonic = "CppArchive",
   )

So, as you can see, even on the master version (>8.1.0) these variables will still not be available.
I don't know whether it's worth waiting for a fix at all, or whether we should use what I suggested or improve it somehow.

Collaborator

If we do end up landing this alternative method of having 5 different scripts, I think we should at least document why it's needed, and perhaps link to a Bazel bug, so we can potentially revert to the more common pattern in the future.

Contributor Author

Okay, I can report this to Bazel and add a link to the report here.
The question is: where in the source code should I add a comment/TODO with the link so that it's clear enough?

A comment in each of the five scripts? Or in one shared place? Maybe the comment should link to this PR as well?

Collaborator

Yes, a comment in each of the 5 scripts sounds good.

Something like "This script is needed because we cannot use the emscripten emdwp.py entry point because of a limitation/bug in bazel: "

Contributor Author

Okay, I'll prepare something for tomorrow.

Contributor Author

@trzeciak trzeciak Feb 19, 2025

If you think about it, it's probably possible to do it like this:

  • add an emdwp-template.py file with the following content:
#!/bin/bash
#
#  This script…
#

export EMCC_WASM_BACKEND=__EMCC_WASM_BACKEND__
export EM_BIN_PATH=__EM_BIN_PATH__
export EM_CONFIG_PATH=__EM_CONFIG_PATH__
export NODE_JS_PATH=__NODE_JS_PATH__

# Load the toolchain environment from env.sh next to this script.
source "$(dirname "$0")/env.sh"

# Hand over to the emscripten dwp entry point, forwarding all arguments.
exec python3 "$EMSCRIPTEN/tools/emdwp.py" "$@"
  • add some Bazel logic that uses select to extract those variables into something like TemplateVariableInfo
  • add a genrule that fills in emdwp-template.py based on TemplateVariableInfo
  • connect the output of this genrule as an input of cc_toolchain.dwp_files

I think this should work. Unfortunately it's a bit beyond the time I wanted to spend on it.
On the other hand, I think it will be less readable than it is now (although I may be biased/lazy).
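
The placeholder substitution that such a genrule would perform can be sketched in plain shell. This is only an illustration of the idea, not the toolchain's implementation: the placeholder names come from the template above, while both substituted values are hypothetical stand-ins for whatever the select would provide.

```shell
# Shortened template text using placeholder names from the proposal above.
template='export EM_BIN_PATH=__EM_BIN_PATH__
export EM_CONFIG_PATH=__EM_CONFIG_PATH__'

# Hypothetical values; in the proposed setup they would come from a select().
em_bin_path='external/emscripten_bin_linux'
em_config_path='external/emsdk/emscripten_toolchain/emscripten_config'

# Replace each placeholder with its per-platform value.
rendered=$(printf '%s\n' "$template" \
  | sed -e "s|__EM_BIN_PATH__|$em_bin_path|" \
        -e "s|__EM_CONFIG_PATH__|$em_config_path|")

printf '%s\n' "$rendered"
```

Using `|` as the sed delimiter avoids escaping the slashes in the paths.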

Contributor Author

@trzeciak trzeciak Feb 19, 2025

I reported it to bazel: bazelbuild/bazel#25336

I also updated all scripts to add this comment:

This script differs in form from emcc.{py,bat}/…, because bazel are limited/bugged in the way of executing dwp tool.
Bazel dwp action configuration does not pass environment variables, so we cannot use them in this script.
For more info, see PR discussion and bazel issue:
- https://github.com/emscripten-core/emsdk/pull/1531#discussion_r1962090650
- https://github.com/bazelbuild/bazel/issues/25336

filegroup(
name = "dwp_files",
srcs = [
"bin/llvm-dwp{bin_extension}",
Collaborator

Doesn't this need emdwp.py like the above uses emar.py?

Contributor Author

This is also related to the problem of passing environment variables to the dwp tool:
the emdwp.py script relies on some environment variables to locate the LLVM binaries.
See #1531 (comment)

@@ -0,0 +1,3 @@
#!/bin/bash

exec external/emscripten_bin_linux/bin/llvm-dwp "$@"
Collaborator

The emdwp.py wrapper should be able to find these files based on LLVM_ROOT, so I don't think you need 5 different wrappers here.

Contributor Author

I agree, but either it doesn't work or I don't know how to do it; see #1531 (comment)
