Description
Hi all,
I am currently working on adding a flag similar to `--prof` that uses the better-maintained V8 CPU profiler instead. A working prototype is here, which reuses the recently refactored `NODE_V8_COVERAGE` implementation (it also depends on another refactor in #26874).
When running `node --cpu-prof entry.js`, it starts the CPU profiler at startup. When the Node.js instance (the main thread or a worker thread) exits, it writes the CPU profile to the current working directory with the file name returned by the unified `DiagnosticFileName('CPU', 'cpuprofile')` (example: `CPU.20190323.191111.95080.0.cpuprofile`).
It also provides `--cpu-prof-name` for the user to specify the path the profile will be written to (the directory of the path will be created if it does not exist yet). We can't make that a value of `--cpu-prof`, because then `node --cpu-prof entry.js` would be ambiguous — or at least our current option parser is not smart enough to work around that.
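If the flags land as described, usage would look roughly like this (a sketch; `entry.js` and `run1.cpuprofile` are throwaway placeholders):

```shell
# Create a small script to profile (placeholder for the real entry point):
echo 'for (let i = 0; i < 1e6; i++) Math.sqrt(i);' > entry.js

# Writes the profile to the cwd under the generated file name,
# e.g. CPU.20190323.191111.95080.0.cpuprofile:
node --cpu-prof entry.js

# Writes the profile under a user-specified name instead:
node --cpu-prof --cpu-prof-name run1.cpuprofile entry.js
```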
The generated CPU profile can be visualized using Chrome's dedicated DevTools for Node.js (available in `chrome://inspect`), which is much easier to use than `--prof-process` is for `--prof`, although V8's CPU profiler is currently only capable of profiling JavaScript. There are, however, discussions about including native stacks there as well (according to @psmarshall).
We may also be able to hook this into our benchmark runner to get CPU profiles for the core benchmarks.
Some questions regarding this feature:
- Should the CPU profiler be started before bootstrap, before pre-execution, or right before we start to execute the first user script?
  - Bootstrap includes the setup of the Node.js environment that does not depend on any runtime state (e.g. CLI flags, environment variables). This part of the execution will go away once the embedded V8 snapshot is implemented and used in core (because then we'll deserialize the context from binary data instead).
  - Pre-execution includes the setup that depends on runtime state (e.g. `process.argv`, `process.env`), as well as preparations done before executing the entry point (e.g. running preloaded modules specified with `--require`). Currently these two are mixed together, but we should be able to refactor and separate them.
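As a concrete illustration of the pre-execution phase: a module preloaded with `--require` runs before the entry point, so starting the profiler before pre-execution would capture the preloaded module as well (the files below are throwaway examples):

```shell
# preload.js stands in for a module loaded with --require; entry.js for the
# user entry point:
echo 'console.log("preload");' > preload.js
echo 'console.log("entry");' > entry.js

# Pre-execution runs the preloaded module before the entry point,
# so this prints "preload" then "entry":
node --require ./preload.js entry.js
```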
- The prototype currently does something similar to coverage collection when it serializes the results: it does a `JSON::Parse` on the message returned from the inspector protocol, and then a `JSON::Stringify` on the specific subtree (in this case, `message.result.profile`). This should be fine as an initial implementation, however:
  - This is not exactly cheap, but it saves us the trouble of maintaining a CPU profile serializer ourselves (one is not currently available in `v8-profiler.h`).
    - For reference, v8-inspector maintains its own serializer, but the format of the CPU profile is unspecified and subject to change, so that would require manual maintenance unless the support is added upstream.
  - Is it possible to customize the inspector protocol or improve the inspector API to get the results directly and avoid this detour? cc @nodejs/v8-inspector
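In JavaScript terms, the detour looks roughly like this (the `message` shape below is a minimal stand-in modeled on the inspector protocol's `Profiler.stop` response, not real profiler output):

```javascript
// Analogue of what the C++ prototype does with JSON::Parse / JSON::Stringify:
// parse the whole protocol message, then re-serialize only the subtree we
// actually need for the .cpuprofile file.
const rawMessage = JSON.stringify({
  id: 1,
  result: {
    profile: {
      nodes: [{ id: 1, callFrame: { functionName: '(root)' } }],
      startTime: 0,
      endTime: 1000,
    },
  },
});

// Parse the full message, even though we only need result.profile...
const message = JSON.parse(rawMessage);
// ...then stringify just that subtree to get the file contents.
const serialized = JSON.stringify(message.result.profile);
```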
Any feedback regarding this idea is welcome!