-
Notifications
You must be signed in to change notification settings - Fork 741
[Bazel] Support for feature debug fission in emsdk-bazel-toolchain #1479 #1531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Bazel] Support for feature debug fission in emsdk-bazel-toolchain #1479 #1531
Conversation
This fixes issue 1479 |
Can I sync bazel for windows (from old ones 5.4.0), to be sync with linux/mac? I'll try this, but so far no luck: #1532 (edited) |
"emdwp-emscripten_bin_linux.sh", | ||
"emdwp-emscripten_bin_mac_arm64.sh", | ||
"emdwp-emscripten_bin_mac.sh", | ||
"emdwp-emscripten_bin_win.bat", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this tool require 5 files where the above tools all just use two (one .sh and one .bat)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why the dwp
/emdwp_script
tool doesn't have access to the environment variables, OR
I don't know how to make it pass these variables to this tool.
I've seen how to do it for emcc and emcc_link in bazel/emscripten_toolchain/toolchain.bzl
,
that this is done by means of env_set
for specific actions
, I tried to complete the action list but it didn't help.
So I got around this by using five dedicated files.
I also ask about this on bazel-slack (Fri, Feb 14th), no answer yet: https://bazelbuild.slack.com/archives/CGA9QFQ8H/p1739488236571719
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
printenv
from dwp
tool:
...
SUBCOMMAND: # //:main [action 'CcGenerateDwp main.dwp', configuration: 77bdbcfa4ce6db4cf8539adb5465d4c7ee5d7734a59bd2c6e99a60f4e1c191e4, execution platform: @@platforms//host:host, mnemonic: CcGenerateDwp]
(cd /private/var/tmp/_bazel_damian/1c7f831e8ac7ba8a4146146efff137a8/execroot/_main && \
exec env - \
external/emsdk/emscripten_toolchain/emdwp-emscripten_bin_mac_arm64.sh bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/sdk/bar.dwo bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/sdk/foo.dwo bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/main/main.dwo -o bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/main.dwp)
# Configuration: 77bdbcfa4ce6db4cf8539adb5465d4c7ee5d7734a59bd2c6e99a60f4e1c191e4
# Execution platform: @@platforms//host:host
INFO: From CcGenerateDwp main.dwp:
+ printenv
TMPDIR=/tmp
__CF_USER_TEXT_ENCODING=0x1F5:0x0:0x0
PWD=/private/var/tmp/_bazel_damian/1c7f831e8ac7ba8a4146146efff137a8/sandbox/darwin-sandbox/30/execroot/_main
SHLVL=1
_=/usr/bin/printenv
+ exec external/emscripten_bin_mac_arm64/bin/llvm-dwp bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/sdk/bar.dwo bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/sdk/foo.dwo bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/_objs/main/main.dwo -o bazel-out/wasm-dbg-ST-6bab6f2a10f7/bin/main.dwp
…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the reason why environment variables are unavailable is in bazel itself.
The action named CcGenerateDwp is declared here: LINK
ctx.actions.run( mnemonic = "CcGenerateDwp", tools = packager["tools"], executable = packager["executable"], toolchain = cc_helper.CPP_TOOLCHAIN_TYPE, arguments = [packager["arguments"]], inputs = packager["inputs"], outputs = packager["outputs"], )
As you can see, the env
variable is not declared for this action.
For comparison, I found the CppArchive
action where the env
variable is set LINK:
env = cc_common.get_environment_variables( feature_configuration = feature_configuration, action_name = CPP_LINK_STATIC_LIBRARY_ACTION_NAME, variables = archiver_variables, ) # TODO(bazel-team): PWD=/proc/self/cwd env var is missing, but it is present when an analogous archiving # action is created by cc_library ctx.actions.run( executable = archiver_path, toolchain = cc_helper.CPP_TOOLCHAIN_TYPE, arguments = [args], env = env, inputs = depset( direct = object_files, transitive = [ cc_toolchain.all_files, ], ), use_default_shell_env = True, outputs = [output_file], mnemonic = "CppArchive", )
So as you can see, on the master version (>8.1.0), these variables will still not be there.
So I don't know if it's worth waiting for it at all, or using what I suggested, or improving it somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It we do end up landed this alternative method of have 5 different scripts, I think we should at least document why its needed, and perhaps link to a bazel bug, so we can potentially revert to the more common pattern in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I can report this to bazel and add a link to the report here.
The question is, where should I add a comment/TODO with a link in the sourcecode to make it clear enough?
In a comment in each of the five scripts? Or somewhere in a collective place? Maybe link the comment to this PR there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a comment in each of the 5 scripts sounds good.
Something like "This scripts is needed because we cannot use the emscripten emdwp.py entry point because of a limiation/bug with bazel: "
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'll prepare something for tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you think about it, it's probably possible to do it like this:
- add an
emdwp-template.py
file with the following content:
#!/bin/bash
#
# This script…
#
export EMCC_WASM_BACKEND=__EMCC_WASM_BACKEND__
export EM_BIN_PATH=__EM_BIN_PATH__
export EM_CONFIG_PATH=__EM_CONFIG_PATH__
export NODE_JS_PATH=__NODE_JS_PATH__
source $(dirname $0)/env.sh
exec python3 $EMSCRIPTEN/tools/emdwp.py "$@"
- add some bazel-stuff to extract those variable using
select
functionality to something like TemplateVariableInfo - add
genrule
to updateemdwp-template.py
based onTemplateVariableInfo
- and connect the output of this
genrule
as a input ofcc_toolchain.dwp_files
I think this should work. Unfortunately it's a bit beyond the time I wanted to spend on it.
On the other hand, I think it will be less readable than it is now (although I may be biased/lazy).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reported it to bazel: bazelbuild/bazel#25336
I also updated all scripts to add this comment:
This script differs in form from emcc.{py,bat}/…, because bazel are limited/bugged in the way of executing dwp tool. Bazel dwp action configuration does not pass environment variables, so we cannot use them in this script. For more info, see PR discussion and bazel issue: - https://github.com/emscripten-core/emsdk/pull/1531#discussion_r1962090650 - https://github.com/bazelbuild/bazel/issues/25336
filegroup( | ||
name = "dwp_files", | ||
srcs = [ | ||
"bin/llvm-dwp{bin_extension}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this need emdwp.py like the above uses emar.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is also related to the problem of passing environment variables to the dwp tool:
emdwp.py
script base on some envs to ensure where are llvm bins are.
#1531 (comment)
@@ -0,0 +1,3 @@ | |||
#!/bin/bash | |||
|
|||
exec external/emscripten_bin_linux/bin/llvm-dwp "$@" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The emdwp.py wrapper should be able to find these files based on LLVM_ROOT, so I don't think you need 5 different wrappers here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but it doesn't work / or I don't know how to do it, #1531 (comment)
Also, install before `proxy-wasm-cpp-sdk` to ensure it is used/patched Currently its installed after, making our setup redundant includes patch of emscripten-core/emsdk#1531 the context for this is that its required to use a hermetic llvm toolchain (other than being generally useful) without it, redundant hackery/patching would be required as the newer platforms way of setting up toolchains expects _something_ to be setup for various bins. Rather than adding a stub i figured add the fix. This has historically been the main blocker to adding llvm toolchains, or using the bazel platforms setup Signed-off-by: Ryan Northey <ryan@synca.io>
would be great for this to land it unblocks us (envoy) |
@phlax do you know much about Bazel itself? Would you be interested in sending them very small (maybe just one line) PR so that we don't need maintain all these extra wrapper scripts: bazelbuild/bazel#25336 |
parser.add_argument('--allow_empty_outputs', help='If an output listed in --outputs does not exist, create it anyways.', action='store_true') | ||
args = parser.parse_args() | ||
|
||
args.archive = os.path.normpath(args.archive) | ||
args.outputs = args.outputs.split(",") | ||
args.dwp_file = os.path.normpath(args.dwp_file) if args.dwp_file else None | ||
|
||
if args.dwp_file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put args.dwp_file = os.path.normpath(args.dwp_file)
here instead of line 36 (one less conditional).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm.. using dwp
files is optional, won't that be a problem for the normpath
method?
|
||
if args.dwp_file: | ||
for idx, output in enumerate(args.outputs): | ||
if output.endswith(".dwp"): # also update extension 'binary.dwp' to 'binary.wasm.dwp' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use single quotes here
@@ -26,11 +27,20 @@ def main(): | |||
parser = argparse.ArgumentParser() | |||
parser.add_argument('--archive', help='The archive to extract from.') | |||
parser.add_argument('--outputs', help='Comma separated list of files that should be extracted from the archive. Only the extname has to match a file in the archive.') | |||
parser.add_argument('--dwp_file', help='Optional dwp input file, generated when fission flags set.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why doesn't the dwp file live in the archive like the rest on the files?
Is is because bazel actually supports the dwp file natively (unlike the other extra files we generate). If there is support for dwp natively in bazel why do we need to involve the wasm_binary.py script at all here? i.e. if its not part of the archive why we even involve the wasm-binary script at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This issue is more complex, although I am no expert, I will try to shed more light on it:
- during compilation we can generate .o files containing debugging information, which can be extracted from the binary after the linking phase (but this is not the case),
- debug fission, is the extraction of part of the debug information during compilation, then during each compilation, two files are created: .o and .dwo,
- a .dwp file is a collection of .dwo files
- bazel aggregates .dwo files into (multiple, if necessary) .dwp files, and multiple .dwp files into a single .dwp file (~100-leaf tree)
- dwp file is created as a separate target (file) next to the output file from the cc_binary target (the first time it wasn't obvious to me that if I have target
cc_binary(name = "foo", …)
, then to get a .dwp file I have to add a requirement to the target (file):foo.dwp
) - considering this, if we want to generate both files at once, we can add a filegroup that will have a dependency on both files (
foo
andfoo.dwp
) - forwarding dependencies to the dwp file via the wasm_binary.py file (~via wasm_cc_binary), simply makes the same configuration apply to both files,
doing it differently would involve duplicating thewasm_cc_binary
method.
A bit of text, but I hope you know what I mean.
@sbc100 i can take a look if i get some spare cycles - but can i suggest we land this with the multiple scripts ... iiuc the fix would depend on the end users bazel version - and i know at least for us if it lands on bazel main now it may be a year+ before we are going to be up to that version |
I would also suggest not to wait for changes bazelbuild/bazel#25336 as it will block merging this MR only with the latest Bazel version (which is not necessary now) and in worst case this version will be released in the next major version. If we agreed not to wait for those changes, just add TODO in the code, then I can try to come back to this MR, merge the current branch, apply the reported fixes, etc. |
Support for debug fission. In short, debugging fission should:
To use this, follow the
--fission=yes
flag.References: