RFC: add a flag similar to --prof, but based on v8 CPU profiler #26878

Closed

Description

@joyeecheung

Hi all,

I am currently working on adding a flag similar to --prof that uses the better-maintained V8 CPU profiler instead. A working prototype is here; it reuses the recently refactored NODE_V8_COVERAGE implementation (and also depends on another refactor in #26874).

When running

node --cpu-prof entry.js

Node.js starts the CPU profiler at startup. When the instance (the main thread or a worker thread) exits, it writes the CPU profile under the current working directory with the file name returned by the unified DiagnosticFileName('CPU', 'cpuprofile') (for example, CPU.20190323.191111.95080.0.cpuprofile).

It also provides --cpu-prof-name for the user to specify the path that the profile will be written to (the directory of that path will be created if it does not exist yet). We can't make the path a value of --cpu-prof itself, because node --cpu-prof entry.js would then be ambiguous (or at least our current option parser is not smart enough to work around that).
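
For example (the path here is purely illustrative, and it assumes --cpu-prof-name is used together with --cpu-prof):

node --cpu-prof --cpu-prof-name=./profiles/entry.cpuprofile entry.js

would write the profile to ./profiles/entry.cpuprofile, creating ./profiles first if it does not exist.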

The generated CPU profile can be visualized with Chrome's dedicated DevTools for Node.js (available in chrome://inspect), which is much easier to use than --prof-process is for --prof. V8's CPU profiler is currently only capable of profiling JavaScript, but there are also discussions about including native stacks there as well (according to @psmarshall).

We may also be able to hook this into our benchmark runner to get CPU profiles for the core benchmarks.

Some questions regarding this feature:

  • Should the CPU profiler be started before bootstrap, before pre-execution, or right before we start to execute the first user script? (See the first sketch after this list for what starting and stopping the profiler looks like mechanically.)
    • Bootstrap includes the setup of the Node.js environment that does not depend on any runtime state (e.g. CLI flags, environment variables). This part of the execution will go away once an embedded V8 snapshot is implemented and used in core (because then we will be deserializing the context from binary data instead).
    • Pre-execution includes the setup that depends on runtime state (e.g. process.argv, process.env), as well as the preparations done before executing the entry point (e.g. running preloaded modules specified with --require). Currently these two are mixed together, but we should be able to refactor and separate them.
  • When serializing the results, the prototype currently does something similar to coverage collection: it runs a JSON::Parse on the message returned from the inspector protocol, and then a JSON::Stringify on the specific subtree (in this case, message.result.profile); see the second sketch after this list. This should be fine as an initial implementation, however:
    • It is not exactly cheap, but it saves us the trouble of maintaining a CPU profile serializer ourselves (one is not currently available in v8-profiler.h).
    • For reference, v8-inspector maintains its own serializer, but the format of the CPU profile is unspecified and subject to change, so that would require manual maintenance unless the support is added upstream.
    • Is it possible to customize the inspector protocol, or improve the inspector API, to get the results directly and avoid this detour? cc @nodejs/v8-inspector
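
To make the first question more concrete, here is a rough sketch of what starting and stopping the profiler looks like with the public v8-profiler.h API. The prototype actually drives the profiler through the inspector protocol, and StartNodeInstance(), Bootstrap(), PreExecution() and RunUserEntryPoint() are placeholders for the real startup code; only the v8::CpuProfiler calls are actual API.

#include "v8.h"
#include "v8-profiler.h"

// Placeholder for the code that runs a Node.js instance from startup to exit.
void StartNodeInstance(v8::Isolate* isolate) {
  v8::HandleScope handle_scope(isolate);
  v8::CpuProfiler* profiler = v8::CpuProfiler::New(isolate);
  v8::Local<v8::String> title =
      v8::String::NewFromUtf8(isolate, "node",
                              v8::NewStringType::kInternalized)
          .ToLocalChecked();

  // Option 1: start before bootstrap (setup independent of runtime state).
  profiler->StartProfiling(title, /* record_samples */ true);
  // Bootstrap();

  // Option 2: ...or start here instead, before pre-execution
  // (process.argv/process.env-dependent setup, --require preloads).
  // PreExecution();

  // Option 3: ...or here, right before the first user script runs.
  // RunUserEntryPoint();

  // When the instance exits: stop profiling, serialize, and clean up.
  v8::CpuProfile* profile = profiler->StopProfiling(title);
  // ... write the profile out as a .cpuprofile file ...
  profile->Delete();
  profiler->Dispose();
}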
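
And for the second question, a minimal sketch of the JSON::Parse/JSON::Stringify detour using only public V8 APIs. The function name SerializeProfile is hypothetical, message stands for the raw Profiler.stop response received from the inspector session, and error handling is reduced to bailing out.

#include "v8.h"

using v8::Context;
using v8::Isolate;
using v8::Local;
using v8::MaybeLocal;
using v8::NewStringType;
using v8::Object;
using v8::String;
using v8::Value;

// Re-serialize message.result.profile from the inspector response; the
// returned JSON string is what ends up in the .cpuprofile file.
// Assumes the caller has an active HandleScope and entered Context.
MaybeLocal<String> SerializeProfile(Local<Context> context,
                                    Local<String> message) {
  Isolate* isolate = context->GetIsolate();

  // Parse the whole protocol message back into a JS value.
  Local<Value> parsed;
  if (!v8::JSON::Parse(context, message).ToLocal(&parsed) ||
      !parsed->IsObject()) {
    return MaybeLocal<String>();
  }

  // Walk down to message.result.profile.
  Local<Value> result;
  if (!parsed.As<Object>()
           ->Get(context, String::NewFromUtf8(isolate, "result",
                                              NewStringType::kInternalized)
                              .ToLocalChecked())
           .ToLocal(&result) ||
      !result->IsObject()) {
    return MaybeLocal<String>();
  }
  Local<Value> profile;
  if (!result.As<Object>()
           ->Get(context, String::NewFromUtf8(isolate, "profile",
                                              NewStringType::kInternalized)
                              .ToLocalChecked())
           .ToLocal(&profile)) {
    return MaybeLocal<String>();
  }

  // Stringify just the profile subtree instead of the whole message.
  return v8::JSON::Stringify(context, profile);
}

The cost mentioned above is mostly the extra Parse/Stringify round-trip over the full message; avoiding it would require either a serializer for the profile data or changes to the inspector API.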

Any feedback regarding this idea is welcome!

Metadata

Labels

cli: Issues and PRs related to the Node.js command line interface.
discuss: Issues opened for discussions and feedbacks.
inspector: Issues and PRs related to the V8 inspector protocol.
performance: Issues and PRs related to the performance of Node.js.
