
JS Self-Profiling API #366

Closed
3 of 5 tasks
acomminos opened this issue Apr 25, 2019 · 11 comments
Labels
Progress: in progress · Review type: CG early review · Topic: performance · Topic: privacy · Topic: scripting · Venue: WICG

Comments


acomminos commented Apr 25, 2019

Góðan dag TAG!

I'm requesting a TAG review of:

Further details (optional):

You should also know that...

  • An experimental Blink/V8 implementation is in the works.
  • I'd be interested in feedback on whether we should work on optimizing a structurally compressed JS object profile format, or allow traces to be returned GZIP'd in an ArrayBuffer. The latter is much more viable for transmission over the wire, but harder to work with on the client side.
    • The mean GZIP compression ratio in our testing was 7.5.
  • Additionally, it would be great to get feedback on whether or not we should set a maximum sample count for each profiler, or maximum trace duration. The latter seems easier to reason about.
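The trade-off in the last bullet can be made concrete with a sketch of the proposed API surface (option names follow the WICG js-self-profiling draft; the `Profiler` global exists only in supporting browsers, so the sketch feature-detects and the duration value is illustrative):

```javascript
// Sketch: start a sampling profiler with a cap on sample *count* rather than
// wall-clock duration. A count cap bounds memory directly; a duration cap
// only bounds it indirectly via the sample interval.
async function captureTrace(durationMs) {
  if (typeof Profiler === "undefined") return null; // API not available here
  const profiler = new Profiler({
    sampleInterval: 10,   // desired ms between samples (the UA may coarsen)
    maxBufferSize: 10000, // maximum number of samples retained
  });
  await new Promise((resolve) => setTimeout(resolve, durationMs));
  return profiler.stop(); // resolves with the collected trace
}
```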

We'd prefer the TAG provide feedback as (please select one):

  • open issues in our GitHub repo for each point of feedback
  • open a single issue in our GitHub repo for the entire review
  • leave review feedback as a comment in this issue and @-notify @acomminos

Thanks!


annevk commented May 17, 2019

What's the plan for standardizing the profile format across engines?

@acomminos
Author

What's the plan for standardizing the profile format across engines?

For function labels, we want to leverage the engine-set "name" property of captured functions from ECMA-262 wherever possible. Function line/column/resource info will be specified normatively, which will be critical to ensure compatibility with sourcemaps.

Regarding the inclusion of functions, whether we should allow some inlined/optimized functions to be omitted from traces is an open question: folks from Apple preferred this, but it may make it harder to aggregate traces across different browsers/JS engines.


lknik commented May 22, 2019

That's interesting, but I wonder whether it might result in some unexpected leaks around CPU speed, etc. I'm glad you already mention timing. Shall we look at this in more detail?


lknik commented May 22, 2019

Another question: what's the intended use case? Do you foresee it being used against live users, as in A/B testing (without the user being aware), or in a more testing-oriented environment such as developer settings? Did you consider using permissions?

@acomminos
Author

That's interesting, but I wonder whether it might result in some unexpected leaks around CPU speed, etc. I'm glad you already mention timing. Shall we look at this in more detail?

Absolutely. We don't anticipate this providing more timing information than SharedArrayBuffer/performance.now() (see slides here regarding a polyfill), but UA-specific mitigations may be necessary, as with any timing source.

Another question: what's the intended use case? Do you foresee it being used against live users, as in A/B testing (without the user being aware), or in a more testing-oriented environment such as developer settings? Did you consider using permissions?

The intended use case is for live users in the wild, to get representative traces. Permissions aren't a great fit for this use case, particularly because that biases sampling (limiting the ability to collect profiles from the first load, as well as selecting for users with a higher propensity to accept permissions).


lknik commented Jun 14, 2019

Absolutely. We don't anticipate this providing more timing information than SharedArrayBuffer/performance.now() (see slides here regarding a polyfill), but UA-specific mitigations may be necessary, as with any timing source.

How about timing the renderer/etc areas? It's not quite the same concern as with SharedArrayBuffer / performance.now().

The intended use case is for live users in the wild, to get representative traces. Permissions aren't a great fit for this use case, particularly because that biases sampling (limiting the ability to collect profiles from the first load, as well as selecting for users with a higher propensity to accept permissions).


dbaron commented Sep 12, 2019

@kenchris and I are looking at this in a breakout at our Tokyo face-to-face.

First thought:

I'd be interested in feedback on whether we should work on optimizing a structurally compressed JS object profile format, or allow traces to be returned GZIP'd in an ArrayBuffer. The latter is much more viable for transmission over the wire, but harder to work with on the client side.

We both think that, even if doing the compression for sending is a little difficult right now, that's a platform gap that's likely to be fixed in the relatively near future (through work in or related to the WHATWG Streams spec, like the work discussed in #410), so it's probably better to do the easy-to-use thing, with the knowledge that being able to do compression in client JS should become easier pretty soon.
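The Streams-based compression work anticipated here later shipped as the Compression Streams API. A minimal sketch of gzipping a JSON-serializable trace client-side (the trace shape and function name are illustrative):

```javascript
// Gzip a trace object with CompressionStream (available in modern browsers
// and Node 18+). Returns an ArrayBuffer suitable for fetch()/sendBeacon().
async function gzipTrace(trace) {
  const json = new TextEncoder().encode(JSON.stringify(trace));
  const compressed = new Blob([json])
    .stream()
    .pipeThrough(new CompressionStream("gzip"));
  return new Response(compressed).arrayBuffer();
}
```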

Additionally, it would be great to get feedback on whether or not we should set a maximum sample count for each profiler, or maximum trace duration. The latter seems easier to reason about.

I don't think we have thoughts on this -- but maybe we would if you described what the advantages and disadvantages were a little more clearly.

One additional issue is that @kenchris was concerned about what the effects on the users randomly chosen for this sort of profiling would be -- it seems like it could have a serious effect on their battery life, etc. (On the flip side, many web features can use a lot of battery life.) It's something worth thinking about, though probably not a blocking issue. (Should there be any interactions with, say, battery saver mode? Though if there were, that might pose privacy concerns.)


dbaron commented Sep 12, 2019

BTW, one concern I heard from colleagues is that they'd like to avoid content depending on what stack frames look like (which is already apparently a problem with Error.stack) since they'd like to be able to change/improve that over time. Having content depend on it means that, even though the spec says implementations can do what they want, they end up being unable to change things. It's possible this might be an argument for going the other way on compression -- although on the other hand doing un-gzipping is likely to be pretty trivial on the client soon, so it's likely not a very strong argument.


dbaron commented Oct 2, 2019

... though (continuing from the previous comment) this is probably actually less of an issue because the API is asynchronous, so it's less likely to result in bad dependencies.

Also, more generally, curious if @acomminos has thoughts on any of the various comments above.

@acomminos
Author

Thank you for the feedback!

We both think that, even if doing the compression for sending is a little difficult right now, that's a platform gap that's likely to be fixed in the relatively near future [...]

Yes! The work on stream-based compression will complement this excellently.

Additionally, it would be great to get feedback on whether or not we should set a maximum sample count for each profiler, or maximum trace duration. The latter seems easier to reason about.

I don't think we have thoughts on this -- but maybe we would if you described what the advantages and disadvantages were a little more clearly.

We ended up going with a maximum sample count in the latest proposal, because memory usage and ingestion pipelines are likely to be more sensitive to the number of samples than to the duration of time spent sampling (especially with a low sample rate).
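The relationship between the two caps is simple arithmetic (the numbers below are illustrative, not from the spec): with a fixed sample interval, a sample-count cap implies a bounded trace length while bounding memory directly.

```javascript
// Longest trace (in seconds) that a sample-count cap permits at a given
// sample interval. Memory scales with maxBufferSize, not with wall time.
function maxTraceSeconds(sampleIntervalMs, maxBufferSize) {
  return (sampleIntervalMs * maxBufferSize) / 1000;
}
console.log(maxTraceSeconds(10, 10000)); // 100 seconds of coverage at most
```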

One additional issue is that @kenchris was concerned about was what the effects on the users randomly chosen for this sort of profiling would be -- it seems like it could have a serious effect on their battery life, etc. (On the flip side, many web features can use a lot of battery life.)

That would depend pretty heavily on the VM's architecture, as well as the sample intervals the user agent chooses to support. IIRC, in V8 we're seeing roughly 4% of CPU cycles spent on sampling with a 10ms sample interval.

One note of interest: @toddreifsteck mentioned at the TPAC Web Perf F2F that they had achieved 1% overhead in Chakra's sampling profiler with a 1ms sample rate.

BTW, one concern I heard from colleagues is that they'd like to avoid content depending on what stack frames look like (which is already apparently a problem with Error.stack) since they'd like to be able to change/improve that over time.

Recently, after feedback from implementers asking for more rigorous specification, the spec was updated to require that traces return the ECMA-262-defined "name" property of function objects. These names are fairly well specified, although they lack some detail for object members.

In practice, I don't expect the function names returned by traces to be terribly useful on their own: more often than not, they'll be unminified via sourcemaps on the server, with the raw names serving as a backup when sourcemap lookup fails.

@kenchris

Thanks for the answers, this was useful. Thanks for flying TAG!
