Support for file-system based persistent code cache in user-land module loaders #47472

Closed
@joyeecheung

This stemmed from a Twitter thread. Specifically, I am wondering if there are any concerns over having something in core similar to what https://github.com/zertosh/v8-compile-cache does. The general idea is:

  1. If the user enables this feature (it should probably be off by default), e.g. via an environment variable, then whenever we compile a module we produce the code cache for it, and on process exit we store any newly produced cache in a cache directory on the file system.
  2. The next time the process is launched (with this feature enabled again), whenever we load a module we attempt to load its cache from that directory and use it when compiling the module, in order to speed up startup (where most of the time is usually spent on compilation).

This is also similar to what Chrome does with the V8 code cache.
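
As a rough illustration of the two steps above, here is a minimal sketch built on the public `vm.Script` API (`cachedData`, `cachedDataRejected`, `createCachedData()`). It is not how core would implement the feature: `compileWithCache`, `cachePathFor`, and the hash-based file naming are hypothetical names chosen for the example, and the cache is written right after compilation rather than on process exit to keep the sketch short.

```javascript
'use strict';
const crypto = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');
const vm = require('node:vm');

// Hypothetical helper: derive a cache file name from the file name and source
// text, so that an edited source never reuses a stale cache entry.
function cachePathFor(cacheDir, filename, source) {
  const key = crypto.createHash('sha256')
    .update(filename)
    .update('\0')
    .update(source)
    .digest('hex');
  return path.join(cacheDir, `${key}.cache`);
}

// Hypothetical helper: compile `source`, consuming a code cache from
// `cacheDir` when one exists and V8 accepts it, and producing one otherwise.
function compileWithCache(cacheDir, filename, source) {
  const cachePath = cachePathFor(cacheDir, filename, source);
  let cachedData;
  try {
    cachedData = fs.readFileSync(cachePath);
  } catch {
    // No cache on disk yet; compile from scratch below.
  }

  const script = new vm.Script(source, { filename, cachedData });

  // `cachedDataRejected` is only defined when `cachedData` was supplied.
  const hit = cachedData !== undefined && script.cachedDataRejected === false;
  if (!hit) {
    // Assumption: write immediately; the proposal defers this to process exit.
    fs.mkdirSync(cacheDir, { recursive: true });
    fs.writeFileSync(cachePath, script.createCachedData());
  }
  return { script, hit };
}
```

Calling `compileWithCache(dir, 'example.js', source)` twice with the same arguments should produce a cache file on the first call and attempt to consume it on the second; `script.runInThisContext()` then evaluates the compiled code either way.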

The motivation for implementing this in core is that, for a user-land module to do this for CJS, it has to monkey-patch the CJS loader, which increases the compatibility burden: v8-compile-cache has 17M weekly downloads, and it needs to monkey-patch Module.prototype._compile to work. From a glance at its issue tracker, it seems some of the issues are not really fixable in user land either, such as piping into the internal source maps cache. For ESM, user land can currently only use --loader to customize the compilation, which has a cost of its own (especially when we move it to a separate thread) and creates a disparity with CJS. Even with --loader, I doubt that user-land code can integrate into e.g. the source map cache without asking us to expose too many internals.

The most risky part of this feature might be the growth of the cache, but it seems manageable if:

  1. The feature is opt-in (via an environment variable, for example, or a method that can be called from user land to enable or disable it).
  2. We check the size of the cache directory when this feature is used, and set a default cache size limit to prevent unbounded growth.
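
To make these two safeguards concrete, here is a minimal sketch of an opt-in check plus a size-limited cache directory. Everything named here is an assumption for illustration: `EXAMPLE_COMPILE_CACHE_DIR` is not a real Node.js environment variable, the 64 MiB default is arbitrary, and evicting the oldest files first is just one possible policy.

```javascript
'use strict';
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical opt-in switch and default limit; neither exists in Node.js.
const ENV_VAR = 'EXAMPLE_COMPILE_CACHE_DIR';
const DEFAULT_LIMIT_BYTES = 64 * 1024 * 1024;

// Returns the cache directory if the user opted in, otherwise null.
function cacheDirFromEnv(env = process.env) {
  const dir = env[ENV_VAR];
  return dir ? path.resolve(dir) : null;
}

// Deletes the least recently modified files until the directory fits the limit.
function pruneCache(cacheDir, limitBytes = DEFAULT_LIMIT_BYTES) {
  const entries = fs.readdirSync(cacheDir)
    .map((name) => {
      const file = path.join(cacheDir, name);
      const stat = fs.statSync(file);
      return { file, size: stat.size, mtimeMs: stat.mtimeMs, isFile: stat.isFile() };
    })
    .filter((entry) => entry.isFile)
    .sort((a, b) => a.mtimeMs - b.mtimeMs); // oldest first

  let total = entries.reduce((sum, entry) => sum + entry.size, 0);
  const removed = [];
  for (const entry of entries) {
    if (total <= limitBytes) break;
    fs.unlinkSync(entry.file);
    total -= entry.size;
    removed.push(entry.file);
  }
  return { total, removed };
}
```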

This doesn't seem too radical. For example, we already persist something like the REPL history by default, and we also have features like NODE_V8_COVERAGE that do a similar "write a lot of data to a directory when enabled" thing. I don't think this would increase the code complexity much either (we might also want a read-only version of this for SEA in the future). So I am opening this issue to see if there are any concerns about having this in core before implementing it.

Labels

discuss, feature request, loaders, module
