Coredump: switch format to Wasm module #197
The idea of using the data section to encode the dumped memory makes sense in principle. It does raise some things to think about.
- Memory size. If there's a data section it probably makes sense to have a memory section as well, to declare the memory's size, and then this could also eventually extend to multiple memories.
- Partial dumps. Often memory dumps only include part of the process' memory space, but I don't see a way here to tell the difference between a full dump that just includes a lot of 0 values (which wouldn't be included in any segment) and a partial dump.
I wonder if it makes sense to put the process info or some other custom section as the first section in the binary, to make it easier to identify coredumps. Especially given that we're using some known sections (i.e. memory) the fact that it's a coredump may change how a tool might process or interpret those known sections.
Thanks for the review @dschuff! I agree about 1.
We can specify multiple data segments with their corresponding offset in memory:
(data (i32.const 1) "...10 bytes")
(data (i32.const 100) "...100 bytes")
This also plays well with multiple memories. Unlike ELF, Wasm modules don't include a memory mapping table that would help with partial dumps or with identifying the memory segments. Compilers could emit such a section for coredumps to rely on, but that is outside the scope of coredumps. Maybe it's only me, but I don't see any reference to partial coredumps in the ELF spec.
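To make the segment layout above concrete, here is a minimal Python sketch of how a tool might rebuild a linear memory image from (offset, bytes) pairs. The function name `rebuild_memory` and the fill bytes are made up for illustration; the only fixed fact it relies on is that a Wasm page is 64 KiB. Anything not covered by a segment is left as zero, which is exactly the ambiguity discussed in this thread.

```python
WASM_PAGE_SIZE = 64 * 1024  # one Wasm linear-memory page, in bytes


def rebuild_memory(pages, segments):
    """Return a zero-filled memory of `pages` pages with each segment copied in.

    `segments` is a list of (offset, data) pairs, mirroring
    (data (i32.const offset) "...") entries. Hypothetical helper.
    """
    memory = bytearray(pages * WASM_PAGE_SIZE)
    for offset, data in segments:
        end = offset + len(data)
        if end > len(memory):
            raise ValueError(f"segment at offset {offset} overruns the declared memory")
        memory[offset:end] = data
    return memory


# Two segments shaped like the ones above: 10 bytes at 1, 100 bytes at 100.
segments = [(1, b"\xaa" * 10), (100, b"\xbb" * 100)]
memory = rebuild_memory(1, segments)
```

A memory section declaring the page count would supply the `pages` argument here; without it, a reader can only infer a lower bound from the highest segment end.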
Yes, I agree. The only reason I haven't done this is that it would break my early tooling. I like that ELF coredumps can be identified by reading the first few bytes.
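As a rough sketch of what "identify by the first few bytes" could look like if a custom section came first: check the standard Wasm preamble (`\0asm` plus version 1), then check that the first section is a custom section (id 0) with a known name. The section name `"core"` below is only a placeholder, since naming is still undecided, and the helper names are hypothetical.

```python
WASM_MAGIC = b"\x00asm"
WASM_VERSION = b"\x01\x00\x00\x00"
CUSTOM_SECTION_ID = 0


def read_u32_leb128(buf, pos):
    """Decode an unsigned LEB128 integer at `pos`; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def first_custom_section_name(buf):
    """Return the first section's name if it is a custom section, else None."""
    if buf[:4] != WASM_MAGIC or buf[4:8] != WASM_VERSION:
        return None
    if len(buf) <= 8 or buf[8] != CUSTOM_SECTION_ID:
        return None
    _section_size, pos = read_u32_leb128(buf, 9)
    name_len, pos = read_u32_leb128(buf, pos)
    return buf[pos:pos + name_len].decode("utf-8")


def looks_like_coredump(buf, marker="core"):
    # `marker` is a placeholder name, not one fixed by the proposal.
    return first_custom_section_name(buf) == marker


def custom_section(name, payload=b""):
    """Encode a tiny custom section (only valid while the body is < 128 bytes)."""
    name_bytes = name.encode("utf-8")
    body = bytes([len(name_bytes)]) + name_bytes + payload
    return bytes([CUSTOM_SECTION_ID, len(body)]) + body


sample_dump = WASM_MAGIC + WASM_VERSION + custom_section("core")
plain_module = WASM_MAGIC + WASM_VERSION
```

With this layout, `looks_like_coredump(sample_dump)` is true while an empty module or one starting with any other section is rejected after reading only a handful of bytes.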
What I meant about data segments is:
FWIW, Wizer's memory snapshots are literally Wasm files with data segments for the nonzero ranges (although we have to be careful not to run into implementation limits on the number of data segments, and we merge nearby data segments together when we get close to the limit). (Aside: I'm interested in this proposal! But I haven't had time to dig in yet, unfortunately. Sorry about that!)
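For illustration, the general idea of "segments for the nonzero ranges, merged when they are close" can be sketched as follows. This is not Wizer's actual code; the function names and the `max_gap` parameter are assumptions, and a real implementation would more likely merge based on a segment-count limit rather than a fixed gap.

```python
def nonzero_ranges(memory):
    """Yield half-open (start, end) ranges of consecutive nonzero bytes."""
    start = None
    for i, byte in enumerate(memory):
        if byte != 0 and start is None:
            start = i
        elif byte == 0 and start is not None:
            yield (start, i)
            start = None
    if start is not None:
        yield (start, len(memory))


def merge_nearby(ranges, max_gap):
    """Merge sorted ranges that are separated by at most `max_gap` zero bytes."""
    merged = []
    for start, end in ranges:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)  # extend the previous range
        else:
            merged.append((start, end))
    return merged


def to_segments(memory, max_gap=0):
    """Return (offset, data) pairs covering every nonzero byte of `memory`."""
    ranges = merge_nearby(nonzero_ranges(memory), max_gap)
    return [(start, bytes(memory[start:end])) for start, end in ranges]


memory = bytearray(32)
memory[2:4] = b"\x01\x02"
memory[6:8] = b"\x03\x04"
memory[20] = 5

exact = to_segments(memory)              # three segments, no zeros stored
merged = to_segments(memory, max_gap=2)  # first two segments merged, zeros included
```

Merging trades a few stored zero bytes for fewer segments, which also means a segment boundary by itself says nothing about whether the surrounding bytes were zero or simply not captured.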
@dschuff got it now. Coredumps aren't instantiated like regular Wasm files (they use the Wasm binary encoding for generation/decoding convenience).
Glad to hear, I'm happy to have a video chat if that helps.
@dschuff could you please merge the PR, or do you have more questions about memory segments? I also added a global section to the coredump. I'm planning to reach out to potential implementers to get more feedback and input.
Sorry, I didn't mean to hold this up. The way it's currently written suggests that any dump using multiple segments is partial (i.e. incomplete), but that situation is still indistinguishable from a dump that is known to be complete, but is encoded with multiple segments (so that zeros don't need to be written into the image). Practically speaking it may not matter, since a debugging tool might not do anything different. But if e.g. it finds a pointer that points to a missing portion, it might be good to know whether the pointed-to data is expected to be zero or is just missing from the dump.
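A small sketch of the distinction being asked for: given the dump's segments and some way of knowing whether the dump is complete, a debugger could report "zero" versus "missing" for an address outside every segment. The `complete` flag here is purely hypothetical; nothing in the current format carries it, which is the point of the concern.

```python
from enum import Enum


class Lookup(Enum):
    PRESENT = "present"  # address is covered by a data segment
    ZERO = "zero"        # not covered, but the dump is known to be complete
    MISSING = "missing"  # not covered, and the dump may be partial


def classify(address, segments, complete):
    """Classify `address` against (offset, data) segments.

    `complete` is a hypothetical flag saying the dump covers all of memory.
    """
    for offset, data in segments:
        if offset <= address < offset + len(data):
            return Lookup.PRESENT
    return Lookup.ZERO if complete else Lookup.MISSING


segments = [(1, b"\xaa" * 10), (100, b"\xbb" * 100)]
```

For example, `classify(50, segments, complete=True)` yields `Lookup.ZERO`, while the same address with `complete=False` yields `Lookup.MISSING`; without such a flag the tool cannot tell the two apart.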
Reuse the Wasm module container to store a coredump. Debugging information is stored in custom sections and the main memory in the data section.
Note that the naming of the custom sections is still a work in progress, and that we can add more information to process/thread-info if and when needed.