Skip to content

Conversation

@Mossaka
Copy link
Member

@Mossaka Mossaka commented Mar 8, 2023

This PR adds wit parsing for include and include ... with { } syntax as proposed by the union of worlds proposal and followed the WIT lexical structured proposed by this PR.

This PR, however, does not implement the include path resolver. Since the path to include to is a world id path instead of a interfaceid, making the code for resolve_path un-reusuable for resolving include path. A follow up PR or continuation of this PR will resolve this by introducing a resolve_include_path function to assist.

Signed-off-by: Jiaxiao Zhou <jiazho@microsoft.com>
let (name, kind, desc, spans, interfaces) = match item {
// handled in `resolve_types`
ast::WorldItem::Use(_) | ast::WorldItem::Type(_) => continue,
ast::WorldItem::Use(_) | ast::WorldItem::Type(_) | ast::WorldItem::Include(_) => continue,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is temporary and needs to be implemented for the resolver to work.

@Mossaka Mossaka changed the title feat(wit-parser): add parser for include ... with { } feat(wit-parser): add parser for include Mar 8, 2023
@alexcrichton
Copy link
Member

Oh I should also clarify, I would personally prefer to get more of include implemented before landing incremental support. The patch here is pretty small and I don't mind reviewing something larger, and I think it's better to land a whole feature if possible.

@Mossaka Mossaka marked this pull request as draft March 9, 2023 23:42
@Mossaka Mossaka mentioned this pull request May 23, 2023
@Mossaka
Copy link
Member Author

Mossaka commented May 25, 2023

I've implemented the basic shape for include worlds to form a bigger world and added a few
examples to make self, pkg and deps working. This PR is not ready to be merged yet because I still need to implement the de-duplicate logic and add lots of tests. But if you could give an early quick review, it would be very helpful for me to iterate through this. @alexcrichton

@Mossaka
Copy link
Member Author

Mossaka commented May 25, 2023

Oh, should clarfiy that you can ignore all the log debug that's just temporary for me to debug something and will be cleaned up later.

@Mossaka Mossaka marked this pull request as ready for review May 25, 2023 16:43
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good to me!

from: UsePath<'a>,

/// The names of the imports and exports to be replaced with
names: Vec<UseName<'a>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this isn't quite exactly UseName since the as keyword is required? For example with { a, b, c } doesn't mean much and probably shouldn't parse

itsrainy and others added 19 commits May 29, 2023 09:03
- Move AddCustomSectionMutator to custom.rs
- Add CustomSectionMutator to lib.rs so it starts getting used

Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
…iance#955)

This gives you the line-by-line WAT disassembly for the given Wasm bytes, along
with the binary offset associated with each line.
No major updates necessary in the implementation, just updating tests
and their expectations.
Include the parsed documentation comment in the AST's `Import` and `Export`
records so that, for example, documentation for the following function
can be included in the generated bindings.

```
default world component {
    /// Say hello!
    export hello-world: func() -> string
}
```

This will make the documentation generated by `cargo-component doc`
added in bytecodealliance/cargo-component#63 more useful, and will
allow me to fix a bug in a test which is supposed to be testing that the
documentation comments show up.
Match the maximum number of imports/exports to ensure these are all in
alignment.
* preserve `cabi_post_` exports when appropriate

Also, consider `canonical_abi_realloc` import equivalent to
`cabi_realloc` when scanning adapter imports.  This helps when
adapting older modules.

Signed-off-by: Joel Dice <joel.dice@fermyon.com>

* add `adapt-export-with-post-return` test

Signed-off-by: Joel Dice <joel.dice@fermyon.com>

---------

Signed-off-by: Joel Dice <joel.dice@fermyon.com>
)

Sync to WebAssembly/wasi-http@53e2d25

Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
…n` (bytecodealliance#957)

* `wasmparser`: turn table index in the elements section into an `Option`

* Update test
* wasmparser: Pack `RefType` into a `[u8; 3]`

Fixes bytecodealliance#924.

This does not have the validation speed ups that bytecodealliance#970 has. Its performance delta
is a little more amorphous: we have a couple small speed ups in parsing real
world benchmarks but some slow downs on the lots-of-types synthetic benchmark.

Overall, I'm still inclined to go with this version over bytecodealliance#970 because it is less
disruptive to consumers of the crate and continues to allow the better
ergonomics of exhaustive matching on `ValType`.

```
parse/tests             time:   [2.4281 ms 2.4392 ms 2.4521 ms]
                        change: [-2.2326% -1.1193% -0.1863%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe

validate/tests          time:   [8.2324 ms 8.2653 ms 8.3027 ms]
                        change: [-1.8686% -1.1821% -0.4847%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 16 outliers among 100 measurements (16.00%)
  8 (8.00%) high mild
  8 (8.00%) high severe

validate/spidermonkey   time:   [45.287 ms 45.443 ms 45.623 ms]
                        change: [-0.4603% +0.1749% +0.7378%] (p = 0.57 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

parse/spidermonkey      time:   [22.109 ms 22.239 ms 22.396 ms]
                        change: [-1.9989% -1.2525% -0.4325%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  7 (7.00%) high mild
  5 (5.00%) high severe

Benchmarking validate/pulldown-cmark: Warming up for 3.0000 s
validate/pulldown-cmark time:   [1.8312 ms 1.8472 ms 1.8651 ms]
                        change: [-0.1019% +0.8062% +1.5879%] (p = 0.08 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe

parse/pulldown-cmark    time:   [883.62 us 888.87 us 895.30 us]
                        change: [-3.5930% -2.6022% -1.5580%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe

Benchmarking validate/lots-of-types: Warming up for 3.0000 s
validate/lots-of-types  time:   [981.43 us 987.44 us 994.30 us]
                        change: [+2.7850% +4.3651% +6.3771%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  6 (6.00%) high mild
  7 (7.00%) high severe

parse/lots-of-types     time:   [234.12 us 235.16 us 236.44 us]
                        change: [+5.5025% +7.6456% +9.4458%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) high mild
  10 (10.00%) high severe

validate/bz2            time:   [770.89 us 774.68 us 779.05 us]
                        change: [-3.2093% -0.6903% +1.4449%] (p = 0.60 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  7 (7.00%) high mild
  4 (4.00%) high severe

parse/bz2               time:   [360.49 us 362.07 us 363.93 us]
                        change: [-4.7194% -2.3152% -0.5706%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

Benchmarking validate/intgemm-simd: Warming up for 3.0000 s
validate/intgemm-simd   time:   [1.6264 ms 1.6409 ms 1.6576 ms]
                        change: [-4.3743% -0.1229% +4.3713%] (p = 0.96 > 0.05)
                        No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
  7 (7.00%) high mild
  7 (7.00%) high severe

parse/intgemm-simd      time:   [728.06 us 731.08 us 734.71 us]
                        change: [-2.7176% -1.9405% -1.0816%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe
```

* Move index size checking to `FromReader for RefType`
* Implement the GC proposal bottom types

The latest version of the GC proposal has a
'bottom' heap type for each type hierarchy
any, func, and extern. It's purpose is to be
a type for nulls in each hierarchy.

* Update GC instructions to latest proposal snapshot

This commit updates the GC proposal instructions to the
V8 snapshot version 6 [1].

New instructions:
  1. array.fill, init_data, init_elem
  2. refactored casting instructions
    - ref.cast/test/br_on_cast now take reftypes
    - ref.as_HEAP_TYPE are gone

[1] https://docs.google.com/document/d/1DklC3qVuOdLHSXB5UXghM_syCh-4cMinQ50ICiXnK3Q/edit#heading=h.rz955hp0nqsw

* Fix nesting order of rec/sub definitions

The text format is not completely specified for type
definitions, but there's current agreement that the
nesting of constructs goes:

(rec (type (sub (func/struct/array)))

'rec' and 'sub' may be omitted, but 'type' may not.

This commit fixes the text parser to follow this.
Previously it went rec -> sub -> type.
…e#974)

* wit-component: Implement merging `Resolve`s together

This commit addresses the fundamental issue exposed by bytecodealliance#967, namely that
worlds were not managed well between adapters and the "main module". On
the surface the issue is that wit-component maintains worlds separately
for each adapter and for the main module, and then ends up tripping over
itself when the worlds overlap in imports, for example. The first fix
applied to handle this was to refactor wit-component to instead merge
the worlds of the adapters into the main module's world. This merging
operation is similar to the union-of-worlds proposed in
WebAssembly/component-model#169.

Prior to this commit, however, the merge operation between worlds was
pretty naive and would bail out if there was any overlap at all. This
meant that the panic from bytecodealliance#967 was turned into a first-class error
message, but the use case shouldn't return an error and instead should
succeed as the underlying packages and definitions were all shared. This
comes back to bytecodealliance#962 which is an initial attempt at merging worlds which
are structurally equivalent. The full implementation of this, however,
was deeper than what bytecodealliance#962 uncovered and further required updating merging.

One of the issues with bytecodealliance#962 (and initial implementations of this commit)
is that a `Resolve` would still have multiple `Interface`s and
`TypeDef`s and such floating around which are all the same where some
were referred to by worlds and some weren't. What this commit chose to
address this was to update the `Resolve::merge` operation. Previously
this was an infallible simple "move the things from A to B" operation
but now it's a much more involved "weaving things together" operation
where overlap between A and B doesn't copy between them but instead uses
items in-place. For example if a document in A isn't present in B, it's
still added like before. If the document A is already present in B,
though, then it's no longer added and instead it's recursed within to
see if extra items are added to the new document.

This new "fancy" merge operation is intended to be a more long-term
implementation of this. At a high level this enables projects to
separately depend on WASI, for example, without every project
coordinating and using the exact same WIT-level dependency within one
build invocation (which isn't possible with separate compilation). All
WASI worlds will, in the end, get merged together into one picture of
the WASI world when a component is produced.

This was a fair bit of support which wasn't easily supported or tested
before. I've modified the `components` test suite to be able to have
shared WIT dependencies between the adapter and the main module to add
tests for that. Additionally I've added a dedicated set of tests for
just merging resolves which showcases some of the features at this time.

With all that said, there are a few aspects and limitations of merging
resolves that are worth mentioning:

* Packages are merged as keyed by their name and URL. For now this
  basically means that only packages under `deps/` are candidates for
  merging (as only they have URLs) and they must be placed in same-named
  subfolders.

* When merging packages are allowed to pick up more documents. This is
  intended to model where each resolve may have a subset of the whole
  picture file-wise. The use case in bytecodealliance#967 specifically exercises this,
  for example.

* When merging documents it's allowed for new interfaces and worlds not
  previously present in the document to get merged in. This is intended
  to model a bit of forward-compatible-ness where for example the
  preview1 adapter may have been built against a historical version of
  WASI whereas a project using that may use a bleeding edge version that
  has more interfaces. This merge should still be ok since they're both
  only using what they declared, so by adding new interfaces and worlds
  it shouldn't ever tamper with the others' view of the world.

* Merging interfaces and worlds, however, currently require exact
  "structural equivalence". The thinking behind this is that if one
  component exports an `interface` but then a merge operation suddenly
  adds items to that `interface` then the component would no longer
  validly export the interface. This should be possible for the
  converse, though, where adding items to an interface that's only ever
  imported should be valid. This doesn't implement this sort of
  detection logic and takes a more conservative stance that interfaces
  and worlds must exactly match.

* Currently merging does not actually test for structural equivalence
  beyond top-level names. That's deferred to later if necessary, for
  example by updating bytecodealliance#962 to change where the tests happen.

The result of all this is to fix the panic found in bytecodealliance#967 and more
generally enable adapters and main modules in `wit-component` to
import/use the same WIT interfaces. This means it's now possible to have
a project use new preview2 interfaces with a preview1 standard library
and the preview1 adapter and everything should work.

Closes bytecodealliance#962
Closes bytecodealliance#967

* Fix an outdated comment
* Shrink the size of the `Type` enum with a `Box` on less-frequent types
  like instances/modules/components/etc.

* Merge the `push_{anon,defined}` functions into one, avoiding a hash
  table insertion on `push_defined`. This may have been needed
  historically but it's no longer required since pushing a fresh index
  will always return a fresh unique ID.
* Encode instance exports properly

* Fix tests

* Bless test

* Test component by decoding it with wit_component

* Make error message clearer
alexcrichton and others added 25 commits May 29, 2023 09:03
This commit adds a simple `addr2line` subcommand to `wasm-tools` which
can be used for parsing a WebAssembly binary's custom sections looking
for DWARF and mapping addresses to filenames and line numbers. This is
modelled somewhat after the native `addr2line` utility where multiple
addresses can be passed and inlined frames and such are printed. Overall
this is a pretty thin wrapper around the `addr2line` crate and is
similar to Wasmtime's support for symbolicating a backtrace.
* Add a check on CI that `addr2line` builds in isolation.
* Add a mention in the README of the new subcommand.
)

This commit updates `wasmprinter` to print an index/name for
imports-in-component-types  where previously the printed indices were
incorrect in the face of imports (since imports weren't accounted for).
* Add --color

* Recognize --color

* Colorize sections

* Some fixes

* Fix color choice
* Freeze field layout of `TypeId`

When compiling with `-Zrandomize-layout` the fields of `TypeId` can be rearranged in a way where the type becomes larger than 16 bytes which triggered a static assert.

Resolves: bytecodealliance#1013

* Add comment
Signed-off-by: Jiaxiao Zhou (Mossaka) <duibao55328@gmail.com>
Signed-off-by: Jiaxiao Zhou <jiazho@microsoft.com>

apply rustfmt

Signed-off-by: Jiaxiao Zhou <jiazho@microsoft.com>

apply review suggestions

Signed-off-by: Jiaxiao Zhou <jiazho@microsoft.com>
…odealliance#1026)

* Add encoders for coremodules and coreinstances custom sections

* Update wasmparser coredump parsing to include coremodules and
coreinstances sections.

* [wasm-encoder/parser] Add roundtrip tests for coremodules and coreinstances
* Add some color to error messages

This commit updates the errors printed by the CLI to have `error` in
red, the main error message in bold, and the rest of it is unchanged.

* Fix tests

* Fix no-default-features build
The wasm-compose crate now depends on wit-component
I forgot this in bytecodealliance#1025 and otherwise installs of `wit-bindgen` are not
working at this time since the 0.8 track of wit-component is using 0.5
of wasm-metadata resulting in duplicate version errors.
Apparently upstream has decided that `drop(foo)` is not how one should
ignore `foo`, instead it must be spelled `let _ = foo`

apply rustfmt

Signed-off-by: Jiaxiao Zhou <jiazho@microsoft.com>

apply review suggestions

Signed-off-by: Jiaxiao Zhou <jiazho@microsoft.com>
Signed-off-by: Jiaxiao Zhou (Mossaka) <duibao55328@gmail.com>
* Add trace-level log to verbose.

* modify the verbose flag docs

Signed-off-by: Jiaxiao Zhou (Mossaka) <duibao55328@gmail.com>

---------

Signed-off-by: Jiaxiao Zhou (Mossaka) <duibao55328@gmail.com>
* GC: wasmparser: Add user defined array types

* Address review comments

* Not sure how it even compiled.

* Address feedback

- Make RefType::indexed_array(), RefType::indexed_struct() and RefType::indexed_func() separate public functions.
- Make RefType::indexed() non-public (as common implementation for the above).
- Fail on invalid input parsing array mutability byte.
- Remove unneeded `types: &TypeList` arguments
- Make stop-list instead of allow-list of GC tests.

* broken windows

* Address review feedback.
`HeapType::TypedFunc(u32)` has been (as of [typed references](https://github.com/WebAssembly/function-references) proposal) used to refer function types defined in the module.

It is parsed from `(ref null? {index})`. With introduction of new indexed types for GC, there's unfortunately no way to tell what type the index is pointing to while parsing: they used to be all `func` types, but now can also be `array`, `struct` and more.

Renaming HeapType::TypedFunc(u32) to Indexed(u32) reflects this change.

Summary of the changes:
- Match RefTypes where one or both has HeapType::Indexed.
- `RefType::new()` will now create indexed RefType with unknown kind.
- Use `RefType::indexed_func()` instead of `RefType::new()` for indexed func refs.
- Add `Self::ARRAY_KIND` to the list of matched RefType kinds - when getting HeapType of a RefType.
- Check if two indexed types are equal.
All types are final by default unless the sub keyword
is used. A type marked `(sub final)` will be final.
Signed-off-by: Jiaxiao Zhou (Mossaka) <duibao55328@gmail.com>

fixed regressed test failures

Signed-off-by: Jiaxiao Zhou (Mossaka) <duibao55328@gmail.com>

renaming works

rustfmt
This commit adds textual syntax for the `producers` custom section to
the `wast` parser. This support is added with the [annotations
proposal][annot] where the `@producers` field is now reserved for this
custom section. The impact of this commit is that the `@producers` field
will now show up by default for many binaries such as those produced by
LLVM. This will make the textual format unparseable for any text tool
which doesn't support the custom annotations proposal. Note, though,
that `wast` has supported this proposal for awhile now and the text
should work seamlessly with older versions.

Some extra validation is added during parsing that the field names are
`language`, `sdk`, and `processed-by` to ensure other names don't leak
through.

[annot]: https://github.com/WebAssembly/annotations/
This commit adds gates for types in the GC proposal when present in
value types which gates presence in locations like function parameters,
block types, instruction payloads, etc.
* Add gc feature gate to validate

Fix bytecodealliance#1048.

* Add missing-features test case for `(type (array i8))`
Signed-off-by: Jiaxiao Zhou (Mossaka) <duibao55328@gmail.com>

adds span to failed resolve-include

cleanup

Signed-off-by: Jiaxiao Zhou (Mossaka) <duibao55328@gmail.com>
@Mossaka
Copy link
Member Author

Mossaka commented May 29, 2023

Closed by #1054

@Mossaka Mossaka closed this May 29, 2023
@Mossaka Mossaka deleted the union-wit-parser branch June 23, 2023 00:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.