[wasm] Jiterpreter monitoring phase take 2 #83489

kg · 2023-03-16T04:19:34Z

This PR adds a monitoring phase like #83432, but instead of measuring the displacement returned by the trace, it instruments the trace (using the cheapest kind of instrumentation I could come up with) to track whether it bails out early on for specific reasons, like calls.
All traces are now passed a pointer to a small 'jiterpreter call info' blob that contains a "has taken backward branch" flag along with an offset field (in # of compiled 'high value' jiterpreter opcodes) that is written to if an undesirable bailout happens (a bailout is referred to as an 'exit' in the trace generator). Undesirable bailouts are ones that, if they happen frequently, indicate that the trace may be unproductive. A bailout for a failed null check or for a 'throw' opcode is not undesirable, since both are unlikely events by nature and it's not important to measure them. A method call or delegate call is something the jiterpreter can't handle, and if it happens frequently early in a trace, that trace is probably just calling a method early on by design.
If the backward branch flag is set, we assume that the trace ran at least once to completion before looping, since most backward branches are going to be loops. It's fine for a loop to eventually bail out due to a function call, since it almost certainly got work done.
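To make the bookkeeping above concrete, here is a minimal sketch in Python of how the call info blob could behave. This is a model for illustration, not the code in this PR: the names `CallInfo`, `BailoutReason`, `on_bailout`, `bailed_out_early` and the `early_limit` parameter are all hypothetical, and the real blob is a small native structure written to by the compiled trace.

```python
from dataclasses import dataclass
from enum import Enum, auto


class BailoutReason(Enum):
    # Illustrative subset of bailout kinds; the names are hypothetical.
    NULL_CHECK = auto()
    THROW = auto()
    CALL = auto()
    DELEGATE_CALL = auto()


# Only bailouts that hint at an unproductive trace get recorded.
# Null checks and throws are unlikely by nature, so they are ignored.
UNDESIRABLE = {BailoutReason.CALL, BailoutReason.DELEGATE_CALL}


@dataclass
class CallInfo:
    backward_branch_taken: bool = False
    # Offset (in opcodes) of the undesirable bailout; -1 means none recorded.
    bailout_opcode_count: int = -1


def on_backward_branch(info: CallInfo) -> None:
    info.backward_branch_taken = True


def on_bailout(info: CallInfo, reason: BailoutReason, opcode_count: int) -> None:
    if reason in UNDESIRABLE:
        info.bailout_opcode_count = opcode_count


def bailed_out_early(info: CallInfo, early_limit: int) -> bool:
    # A trace that looped is assumed to have done useful work,
    # even if it later bails out for a call.
    if info.backward_branch_taken:
        return False
    return 0 <= info.bailout_opcode_count < early_limit
```

The point of the design is that a trace only ever performs a single store per event, which keeps the instrumentation cheap enough to leave in place.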

A better form of instrumentation would actually count opcodes or have a loop counter, but we don't have the luxury of compiling traces multiple times, so we have to keep the instrumentation as cheap as possible. The read + increment for a loop counter would make tight loops like reversing or comparing spans much slower, I think.

EDIT: No merge because I want to delay this until after P3.

ghost · 2023-03-16T04:19:45Z

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	kg
Assignees:	-
Labels:	`arch-wasm`, `area-Codegen-Jiterpreter-mono`
Milestone:	-

kg · 2023-03-16T04:21:32Z

In a local run, this doesn't appear to reject any important traces from browser-bench, but it's hard to say for sure without performance runs.

…average distance (in bytes) they travel Then reject traces that have a low average distance after the monitoring period Cleanup Cleanup Do monitoring in terms of opcode count, and record the location of specific bailout types Also track whether a backward branch was taken during a trace so that it can be factored in Cleanup Cleanup Cleanup Cleanup

kg · 2023-03-21T08:01:54Z

Reworked the value being measured based on some suggestions. It now assigns a 'penalty score' to each trace execution depending on how far it got before bailing out. This removes the need for a dummy distance value for traces that didn't bail out, and allows smoothly ramping up and down by setting two (short and long) distance thresholds. You then set an average penalty threshold between 0 and 200, and traces whose average penalty is above that threshold are rejected.
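As a rough illustration of the penalty scheme described above, here is a sketch in Python. It assumes a linear ramp between the two distance thresholds and a per-execution maximum penalty of 200; the function names, the `max_penalty` default, and the threshold values used in the example are assumptions for illustration, not values taken from this PR.

```python
from typing import Optional, Sequence


def execution_penalty(distance: Optional[int],
                      short_distance: int,
                      long_distance: int,
                      max_penalty: float = 200.0) -> float:
    """Penalty for one trace execution.

    distance is the number of opcodes run before an undesirable bailout,
    or None if the trace did not bail out that way (no dummy distance needed).
    """
    if distance is None:
        return 0.0
    if distance <= short_distance:
        return float(max_penalty)
    if distance >= long_distance:
        return 0.0
    # Slope down linearly from the short threshold to the long one.
    span = long_distance - short_distance
    return max_penalty * (long_distance - distance) / span


def should_reject(penalties: Sequence[float], max_average_penalty: float) -> bool:
    """Reject the trace if its average penalty over the monitoring period is too high."""
    if not penalties:
        return False
    return sum(penalties) / len(penalties) > max_average_penalty
```

With hypothetical thresholds of 4 and 12 opcodes, a bailout after 8 opcodes lands halfway down the ramp, so `execution_penalty(8, 4, 12)` gives half of the maximum penalty.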

I also changed the opcode count it uses: instead of the number of opcodes compiled, it's now the number of opcodes we can be certain have run at this point, i.e. the number of opcodes leading up to the first branch plus the number of opcodes since the last branch target. We could do some static analysis to come up with a better number than this, but it seems to be good enough as a starting point so far.
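Here is one way the conservative opcode counter described above could be modeled, sketched in Python. The `Op` type and `certain_opcode_counts` function are hypothetical, and the sketch assumes the first branch opcode itself counts toward "opcodes leading up to the first branch"; the counter in the PR may differ in those details.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Op:
    name: str
    is_branch: bool = False
    is_branch_target: bool = False


def certain_opcode_counts(ops: List[Op]) -> List[int]:
    """For each opcode in a trace, a lower bound on how many opcodes
    have definitely run by the time execution reaches it."""
    counts = []
    to_first_branch = 0      # opcodes up to and including the first branch
    since_last_target = 0    # opcodes since the most recent branch target
    seen_branch = False
    for op in ops:
        if op.is_branch_target:
            # Anything between the first branch and this target may have been skipped.
            since_last_target = 0
        if seen_branch:
            since_last_target += 1
        else:
            to_first_branch += 1
        counts.append(to_first_branch + since_last_target)
        if op.is_branch:
            seen_branch = True
    return counts


trace = [
    Op("ldloc"), Op("ldc"), Op("brtrue", is_branch=True),
    Op("add"), Op("stloc", is_branch_target=True), Op("ldloc"), Op("call"),
]
print(certain_opcode_counts(trace))  # [1, 2, 3, 4, 4, 5, 6]
```

In this example the `add` after the branch might be skipped, so once execution passes the branch target the counter stops crediting it: the `call` is the seventh opcode compiled but only six are certain to have run.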

In local testing this reduces Perf_Dictionary.ContainsValue from 1.0 sec to around 650 ms by rejecting the ContainsValue trace. It's a massive branchy trace that doesn't actually run much code and always bails out for a call before looping, so the previous two metrics (displacement and average number of opcodes) didn't work.

kg · 2023-03-21T20:44:57Z

Test failures are unrelated.
From local testing this shouldn't cause any browser-bench regressions, but it's worth checking it in now to see what kind of deltas we get in dotnet/performance. I expect to see some improvements and some regressions there, and then I'll be able to dig into those and figure out whether this is actually an acceptable iteration of the heuristic.

kg added the `arch-wasm` (WebAssembly architecture) and `area-Codegen-Jiterpreter-mono` labels Mar 16, 2023

kg requested review from lewing, pavelsavara, vargaz, lambdageek, BrzVlad and kotlarmilos as code owners March 16, 2023 04:19

ghost assigned kg Mar 16, 2023

This was referenced Mar 16, 2023

Roslyn source generator crash on mono/linux/arm64 #81123

Closed

IOException running NuGet-Migrations during tests in dotnet CLI first run #80619

Closed

Crashes when initializing System.Net.Security.Native #83540

Closed

kg added the `NO-MERGE` label (The PR is not ready for merge yet; see discussion for detailed reasons) Mar 17, 2023

kg added 4 commits March 20, 2023 17:05

Fix bailout counting

d1f6e1a

Switch to a penalty value based on distance to bailout instead of avg…

c0b4fc5

… distance For the monitoring opcode counter, use (opcodes_to_first_branch + opcodes_from_last_branch_target) Fix traces aborting for monitor points

Fix threshold calculation and raise threshold

18b1f63

kg force-pushed the wasm-jiterpreter-monitoring-phase-2 branch from 66989da to c78801f Compare March 21, 2023 07:37

Improve penalty calculation to slope down linearly from the short thr…

962d137

…eshold to the long one

kg force-pushed the wasm-jiterpreter-monitoring-phase-2 branch from c78801f to 962d137 Compare March 21, 2023 07:38

kg added 4 commits March 21, 2023 01:05

Code cleanup

3452f31

Remove now unused size_of_trace

f0c260e

Don't assert when hitting the traceinfo limit, just stop compiling traces and print a warning Raise traceinfo limit to make sure it's not realistic to hit it

Remove dead comment

01b8812

Improve approximate opcode counter to work better for large methods

6048bce

build-analysis bot mentioned this pull request Mar 21, 2023

WasmTestOnBrowser-System.* test failures in CI #83655

Closed

Adjust thresholds

23b1dbf

vargaz approved these changes Mar 21, 2023

build-analysis bot mentioned this pull request Mar 21, 2023

[release/6.0] Doublelinklist GC failures on Mono #83245

Closed

kg merged commit 2ea6288 into dotnet:main Mar 21, 2023

build-analysis bot mentioned this pull request Mar 21, 2023

[wasm] SimpleSourceChangeRebuildTest.SimpleStringChangeInSource - Long running test error, cannot access file used by another process #83752

Closed

ghost locked as resolved and limited conversation to collaborators Apr 21, 2023
