-
Notifications
You must be signed in to change notification settings - Fork 780
Add support for (ref ...) / (ref null ...) constructs #2562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@sbc100 @keithw since I don't know enough about the project, these are just some random changes in the parser. The aim is parsing https://github.com/WebAssembly/function-references/blob/main/test/core/call_ref.wast It is doing something (parsing correctly is surely an exaggeration) with Could you check that these changes makes sense? There is a Thank you for the help. |
|
This patch goes to a new direction. The 32 bit type filed of |
c804a98 to
c40c098
Compare
|
There is something I don't understand. There is a The key change of the patch is that Btw, may I ask how actively maintained the wabt project? |
|
I believe |
|
That is possible. Unfortunately not everything is described clearly in the text. For example, |
|
I have tried to track down the v8 implementation for https://github.com/v8/v8/blob/main/src/wasm/function-body-decoder-impl.h#L386 It calls https://github.com/v8/v8/blob/main/src/wasm/function-body-decoder-impl.h#L217 Starts with: So it looks like it uses the s33 decoding which the specification mentioned. |
c94e5a1 to
a959acb
Compare
|
I have some good news! This test is fully working now (including the interpreter): Only 9 fails remained, they all look like CallRef syntax change related, so some tests need to be updated. The patch is quite large. There are some parts which are questionable. I will comment about these parts later to help review. |
78596d6 to
b79aead
Compare
| }; | ||
|
|
||
| struct Var { | ||
| // Var can represent variables or types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a major internal change. It does not increase the size of Var. I think it is better than constructing a Var/Type pair.
include/wabt/leb128.h
Outdated
| size_t ReadU32Leb128(const uint8_t* p, const uint8_t* end, uint32_t* out_value); | ||
| size_t ReadU64Leb128(const uint8_t* p, const uint8_t* end, uint64_t* out_value); | ||
| size_t ReadS32Leb128(const uint8_t* p, const uint8_t* end, uint32_t* out_value); | ||
| size_t ReadS33Leb128(const uint8_t* p, const uint8_t* end, uint64_t* out_value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec adds this new type, the output is uint64_t. It could be merged with ReadS32Leb128 by adding a bool* argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is S33 used somewhere in the binary format now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. And they use it even more places in gc :(
https://webassembly.github.io/function-references/core/binary/types.html#heap-types
a 33 bit integer can hold a 32 bit unsigned index or a type value. Its maximum (leb) size is the same as 32 bit integers.
ref.null uses this for example.
https://webassembly.github.io/function-references/core/syntax/instructions.html#syntax-instr-ref
src/binary-reader-logging.cc
Outdated
| Result BinaryReaderLogging::name(Type type) { \ | ||
| LOGF(#name "(%s)\n", type.GetName().c_str()); \ | ||
| return reader_->name(type); \ | ||
| #define DEFINE_TYPE(name) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, this might be reverted. I will check.
src/leb128.cc
Outdated
| } | ||
| } | ||
|
|
||
| size_t ReadS33Leb128(const uint8_t* p, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These readers don't check that the minimum amount of bytes is used for encoding. This is important for utf, not sure for leb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain a little more what you mean here?
For sure we allow encoding that are over-long, i.e. zero-padded. This is used a lot in the llvm wasm writer, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not know that. Some encodings, such as utf8 expects a minimum number of byte encodings, so a 7 bit character cannot be encoded more than a single byte, even if the type bits are correct for longer encodings. If padding for leb is allowed then it is perfectly ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can avoid duplicating code here and instead just use ReadS64Leb128 here (with a bounds check after reading it)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is possible. We could add a default false boolean flag to the ReadS64Leb128 as well.
Anyway, what is the reason for the lot of code duplications (and manual loop unrollings) in this file? Performance? For consistency, I also added another duplication, but this part could be improved (reducing code size) in many ways. It can be done in another patch as well, before or after this patch.
What is your suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, to keep this patch focused how about just removing the LEB changes from this PR?
I think its reasonable for now to just read S33 values using ReadS64Leb128. Apparently this is what binaryen does. We can come back and try to cleanup this stuff and remove the duplication later perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I removed the S33 parsing. We should do something with it later. Please, let me know your opinion, and I will do it.
src/wast-parser.cc
Outdated
| } | ||
|
|
||
| if (func) { | ||
| if (!func->local_type_list.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of a bugfix. Named references has been never resolved for local types. Maybe we could do it on the compressed list, but that can be done in another patch.
6587cf7 to
0f7353f
Compare
| @@ -0,0 +1,19 @@ | |||
| ;;; TOOL: run-interp-spec | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turned out that call_ref.wast is already in the repository, and it is passing now. There are other tests in the directory (e.g. binary.wast) which is also passing now, they could be updated later as well.
|
I think the patch is ready. There are lots of follow-up work ahead. |
58411a5 to
bf0400f
Compare
9cb09d6 to
293236b
Compare
|
I have found a bug in RecursiveMatch and fixed it. It seems there will be no other comments to this patch, @sbc100 you are my only hope. |
I think this change is mostly looking good. I'm still trying to find some time to really understand what is going on with the Var / opt_type thing as it seems a little odd to me still. |
|
I'm on vacation for the next couple of week, but I will make a concerted effort to review this change when I get back in July. |
|
Thank you. The GC support will also likely completed by that time. Only 5 instructions remained, all type cast related. |
ac4adaf to
cacacc7
Compare
|
It looks like WebAssembly 3.0 is released, and GC is part of it: https://webassembly.org/news/2025-09-17-wasm-3.0/ |
sbc100
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I finally go around looking at this PR again. Sorry for the long delay.
I think we are pretty much ready to land this. Just a couple more nits.
Should we update the README to mention that we now support the function references proposal?
include/wabt/shared-validator.h
Outdated
|
|
||
| struct LocalReferenceMap { | ||
| Type type; | ||
| Index bit_index; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, this is an index into local_ref_is_set_? Perhaps it should use be size_t rather than Index?
Perhaps it should be called is_set_index? Or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I have added a comment which should clear its use.
src/shared-validator.cc
Outdated
| void SharedValidator::IgnoreLocalRefs() { | ||
| if (!local_ref_is_set_.empty()) { | ||
| std::fill(local_ref_is_set_.begin(), local_ref_is_set_.end(), true); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just remove the if check here?
| return AppendExpr(std::make_unique<CallRefExpr>()); | ||
| Result BinaryReaderIR::OnCallRefExpr(Type sig_type) { | ||
| auto expr = std::make_unique<CallRefExpr>(); | ||
| expr->sig_type = Var(sig_type, GetLocation()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the contructor for CallRefExpr tage this sig_type as an argument? (Like the RefNullExpr below? Or is there some reason they should be different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized (I made the code months ago) I just followed the other Call structs.
https://github.com/WebAssembly/wabt/blob/main/include/wabt/ir.h#L725
https://github.com/WebAssembly/wabt/blob/main/include/wabt/ir.h#L694
The wast parser first creates the object, then parses its arguments. Shall I change this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the best pattern is but we should try to reach consistency eventually. Happy to land this now and cleanup later.
test/parse/bad-references.txt
Outdated
| ) | ||
| (;; STDERR ;;; | ||
| out/test/parse/bad-references.txt:6:31: error: reference type out of range: 2 (max: 2) | ||
| out/test/parse/bad-references.txt:6:4: error: reference 2 is out of range in params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it ok that we will the mac: xxx in these error messages? They seem maybe useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added it back. However, I am not sure it is that helpful in this case. The original messages were produced by the parser, now the shared validator produces them. The difference is, that implicit function declarations are now part of the max, since the shared validator has no idea that they are implicit declarations. Furthermore, just because you pass a number less than max, it is not necessarily valid, which can also be confusing for some people. An interesting side effect is that max also grows with every new type declaration, so it is a bit more precise.
| @@ -0,0 +1,78 @@ | |||
| ;;; TOOL: run-interp-spec | |||
| ;;; STDIN_FILE: third_party/testsuite/proposals/function-references/br_table.wast | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the update-spec-tests.py script be updated to remove the exclusion of function-references?
|
The title was misleading, I am sorry. I thought I can do this in one patch, but the full support is divided into three patches. I have updated the title. |
cacacc7 to
ba771cb
Compare
Needed by function references proposal Support named references for globals, locals, tables, elems Support named references for call_ref, ref_null Extend Var variables with an optional type field
ba771cb to
aebb15c
Compare
|
I have rebased and updated the patch. Please let me know if you need more changes. |
|
lgtm! (What about update-spec-tests.py?) |
I think we should update it in the third patch when the feature is complete. |
|
Thank you Zoltan, and again sorry for the absurd review time. |
|
No problem. These changes are rather complex which means higher risk from project perspective, and needs more time to review. There are two more patches for function references: #2567 implements new opcodes (I have rebased it) and #2593 implements table initializers. I will add README and update-spec-tests.py changes to the last one. |
This large patch adds function references support for those parts of the wabt library, which already present in the code. The current code is designed for an older (outdated) variant of the function references proposal.
Key changes: