-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Fix parsing of dwarfdump output to be more resilient #20777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
tools/wasm-sourcemap.py
Outdated
@@ -198,12 +211,10 @@ def read_dwarf_entries(wasm, options): | |||
|
|||
entries = [] | |||
debug_line_chunks = re.split(r"debug_line\[(0x[0-9a-f]*)\]", output.decode('utf-8')) | |||
maybe_debug_info_content = debug_line_chunks[0] | |||
map_stmt_list_to_comp_dir = parse_debug_info_contents(debug_line_chunks[0]) | |||
for i in range(1, len(debug_line_chunks), 2): | |||
stmt_list = debug_line_chunks[i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this PR, but it looks like these two lines can be written as:
for stmt_list in debug_line_chunks[1::2]:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I included this change in the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall this looks right to me.
@sbc100 should the minimal_runtime test failures go away with a rebase?
tools/wasm-sourcemap.py
Outdated
@@ -178,6 +178,19 @@ def remove_dead_entries(entries): | |||
block_start = cur_entry | |||
|
|||
|
|||
def parse_debug_info_contents(text): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since it doesn't parse all the debug info contents (and since "debug info" is a bit ambiguous in this context since it could mean debug info generally or the specific .debug_info section), maybe this function could be called something that indicates that it extracts comp_dir information (e.g. "extract_comp_dir_map" or something like that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Renamed to extract_comp_dir_map
b6ce2cb
to
87bba5b
Compare
One other request: if this is your first contribution, please add yourself to the AUTHORS file. |
@dschuff We recently found that adding to AUTHORS is optional, Lines 1 to 7 in f9f6f13
So in general it is fine to either add to it, or not. Separately, would it be practical to add testing in this PR? I commented in one of the related PRs here: #20776 (comment) But if it would be simpler to test this after those PRs that's find of course. |
Anyway, I added myself to AUTHORS :)
I added a test but I have one question: |
test/test_other.py
Outdated
@@ -45,6 +46,8 @@ | |||
from tools import webassembly | |||
from tools.settings import settings | |||
|
|||
wasm_sourcemap = importlib.import_module('tools.wasm-sourcemap') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps move this into the test itself? But, if it is simpler to run it as an executable, that seems fine as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved import_module into the test.
IIUC running it as an executable is consistent with how most of these auxiliary scripts are executed. It might be more optimal to do a broader change there, but I think keeping consistent here is fine. |
Thanks for these fixes @kon72 ! |
To generate source maps, we run llvm-dwarfdump against wasm file and parse its plain-text output heavily relied on regex matches.
Previously, in order to retrieve comp dir from debug_info section, we search regex patterns assuming that the output has entries
DW_AT_stmt_list
andDW_AT_comp_dir
consecutively in this order, but I noticed that assumption doesn't hold in some cases.The output that the previous code can parse comp_dir successfully:
The output that the previous code fails to parse comp_dir:
With this change, it can parse comp_dir regardless of the order of
DW_*
entries