@michaelsmithxyz
Contributor

@michaelsmithxyz michaelsmithxyz commented Oct 27, 2025

Fixes #60397

In #59888, the nearest-parent package.json cache in package_json_reader.js was changed from a map keyed by module path (each entry a representation of that module's parent package.json file) to a map keyed by package.json path (each entry a deserialized representation of that file's content). This addressed excessive memory usage caused by repeatedly caching identical deserialized package.json objects for modules that shared a parent package.json, but it also reintroduced a filesystem traversal in package_json_reader.js that finds the nearest parent package.json file for a given module. The stat calls in this traversal are not cached, so we can end up issuing them repeatedly for the same paths. In the reported issue, this leads to poor performance for users on potentially high-latency network filesystems. Similar poor performance is also observed in Node versions that lack #59086, which (re)introduced the JS-side cache initially.

This PR addresses this by unwinding the changes in #59888 and instead making the C++-side package.json cache a bit more expressive: it now caches both the deserialized representation of a package.json file at a given path and an indicator that no such file exists (modeled as an std::optional). This addresses the poor performance reported in #60397 by:

  1. Removing the repeated stat calls in package_json_reader.js
  2. Avoiding repetitive attempts to read non-existent package.json paths on the C++ side, which also perform poorly on high-latency filesystems
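
To illustrate the idea of caching negative lookups, here is a minimal JavaScript sketch. It is not the code in this PR (the real cache lives on the C++ side and uses std::optional); `readPackageJSON`, `readCached`, `nearestParentPackageJSON`, and the fake filesystem are hypothetical names made up for the example, and the traversal is simplified compared to what Node actually does.

```javascript
// Hypothetical sketch: none of these names are Node.js internals.
const path = require('node:path');

// Fake filesystem for the example: only this package.json exists.
const existingFiles = new Set(['/app/node_modules/pkg/package.json']);
let probes = 0; // counts simulated filesystem accesses

// Stand-in for an expensive filesystem read; undefined means "no such file".
function readPackageJSON(jsonPath) {
  probes++;
  return existingFiles.has(jsonPath) ? { path: jsonPath } : undefined;
}

// Maps a package.json path to its parsed form, or to null when the file
// is known to be missing (the role std::optional plays in the description).
const cache = new Map();

function readCached(jsonPath) {
  if (cache.has(jsonPath)) return cache.get(jsonPath);
  const result = readPackageJSON(jsonPath) ?? null;
  cache.set(jsonPath, result);
  return result;
}

// Simplified upward traversal from a module to its nearest package.json.
function nearestParentPackageJSON(modulePath) {
  let dir = path.posix.dirname(modulePath);
  while (true) {
    const found = readCached(path.posix.join(dir, 'package.json'));
    if (found !== null) return found;
    const parent = path.posix.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}

const a = nearestParentPackageJSON('/app/node_modules/pkg/lib/deep/a.js');
const b = nearestParentPackageJSON('/app/node_modules/pkg/lib/deep/b.js');
```

In this sketch the second lookup walks the same directories as the first but touches the simulated filesystem zero times, because both the hit and the two misses are already cached.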

While analyzing the performance of these changes, I noticed a confounding factor: the lazy parsing and caching of imports and exports on deserialized package configuration objects in deserializePackageJSON wasn't working as expected, which also contributed to the varying performance we've been seeing across these changes:

  1. The attempt to define lazy properties to parse and cache the JSON on demand didn't work as expected because the resulting object was immediately spread, meaning we'd immediately run the JSON parsing code anyway
  2. Because the parsed representation of imports and exports is cached on deserialized package.json objects, it's important that a given package.json file map to the same deserialized object. If we don't do this, we repeatedly re-parse these fields redundantly across calls. This motivates the sort of strange two-level caching scheme in getNearestParentPackageJSON that these changes introduce. The downside here is that we potentially redundantly call into modulesBinding.getNearestParentPackageJSON for a given path just to resolve the path to a package.json file that we may already have cached, but I don't see any way to avoid this.
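
Both points can be reproduced in isolation. The following is a simplified sketch, not the actual deserializePackageJSON code: `makeConfig` and `parseCount` are hypothetical names, and the lazy field is modeled with a plain getter that replaces itself with a data property on first access.

```javascript
// Illustrative only: makeConfig is a hypothetical stand-in, not Node.js code.
let parseCount = 0;

function makeConfig(rawExports) {
  return {
    name: 'pkg',
    // Lazy field: parse on first access, then cache the result by
    // redefining the accessor as a plain data property on this object.
    get exports() {
      parseCount++;
      const value = JSON.parse(rawExports);
      Object.defineProperty(this, 'exports', { value, enumerable: true });
      return value;
    },
  };
}

const raw = '{".":"./index.js"}';

// Point 1: spreading reads every enumerable own property, so the getter
// runs immediately and the laziness is lost.
const lazy = makeConfig(raw);
const parsesBeforeSpread = parseCount; // nothing parsed yet
const copy = { ...lazy };
const parsesAfterSpread = parseCount; // the spread triggered a parse

// Point 2: the cached value lives on the object, so two objects created
// for the same file each pay for their own parse.
const first = makeConfig(raw);
const second = makeConfig(raw);
first.exports;
second.exports;
first.exports; // already cached on `first`, no extra parse
const parsesForTwoObjects = parseCount - parsesAfterSpread;
```

The second half is why mapping each package.json path to a single shared object matters: the per-object cache only helps if callers keep getting the same object back.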

Benchmarks

I benchmarked this change with the same scripts I used in #59888. The first is the reproduction script from #58126:

require('dd-trace').init();
const cdk = require('aws-cdk-lib');

const app = new cdk.App();
for (let i = 0; i < 1000; i++) {
  new cdk.Stack(app, `DdTraceStack${i}`);
}

The second is this:

for (let i = 0; i < 1000; i++) {
  require('date-fns');
}

Each benchmark compares v22.19.0 (which does not include #59888), v25.1.0 (the latest current release, which does include #59888), and this change (which is just the node directory in the output).

Fast disk

ddtrace + CDK

➜ hyperfine --warmup 10 -L node_path ../node/node,../node_worktrees/v25.1.0/node,../node_worktrees/v22.19.0/node "{node_path} dd-cdk-benchmark.js"
Benchmark 1: ../node/node dd-cdk-benchmark.js
  Time (mean ± σ):     161.6 ms ±   1.6 ms    [User: 170.7 ms, System: 20.9 ms]
  Range (min … max):   159.2 ms … 164.9 ms    18 runs

Benchmark 2: ../node_worktrees/v25.1.0/node dd-cdk-benchmark.js
  Time (mean ± σ):     164.9 ms ±   0.9 ms    [User: 174.7 ms, System: 22.8 ms]
  Range (min … max):   163.7 ms … 166.8 ms    17 runs

Benchmark 3: ../node_worktrees/v22.19.0/node dd-cdk-benchmark.js
  Time (mean ± σ):     169.5 ms ±   1.0 ms    [User: 172.4 ms, System: 20.6 ms]
  Range (min … max):   167.9 ms … 172.0 ms    17 runs

Summary
  ../node/node dd-cdk-benchmark.js ran
    1.02 ± 0.01 times faster than ../node_worktrees/v25.1.0/node dd-cdk-benchmark.js
    1.05 ± 0.01 times faster than ../node_worktrees/v22.19.0/node dd-cdk-benchmark.js

date-fns

➜ hyperfine --warmup 10 -L node_path ../node/node,../node_worktrees/v25.1.0/node,../node_worktrees/v22.19.0/node "{node_path} date-fns-benchmark.js"
Benchmark 1: ../node/node date-fns-benchmark.js
  Time (mean ± σ):      71.3 ms ±   1.1 ms    [User: 73.4 ms, System: 13.3 ms]
  Range (min … max):    69.7 ms …  75.2 ms    41 runs

Benchmark 2: ../node_worktrees/v25.1.0/node date-fns-benchmark.js
  Time (mean ± σ):      66.3 ms ±   0.7 ms    [User: 66.8 ms, System: 10.4 ms]
  Range (min … max):    64.9 ms …  69.1 ms    43 runs

Benchmark 3: ../node_worktrees/v22.19.0/node date-fns-benchmark.js
  Time (mean ± σ):     115.9 ms ±   0.7 ms    [User: 139.6 ms, System: 18.4 ms]
  Range (min … max):   114.1 ms … 117.5 ms    25 runs

Summary
  ../node_worktrees/v25.1.0/node date-fns-benchmark.js ran
    1.07 ± 0.02 times faster than ../node/node date-fns-benchmark.js
    1.75 ± 0.02 times faster than ../node_worktrees/v22.19.0/node date-fns-benchmark.js

Slow disk

I emulated this by mounting an NFS volume from localhost with the noac mount option (which disables attribute caching, and with it most of the caching that would otherwise hide the latency).

ddtrace + CDK

➜ hyperfine --warmup 10 -L node_path ../../node/node,../../node_worktrees/v25.1.0/node,../../node_worktrees/v22.19.0/node "{node_path} dd-cdk-benchmark.js"
Benchmark 1: ../../node/node dd-cdk-benchmark.js
  Time (mean ± σ):      3.976 s ±  1.193 s    [User: 0.210 s, System: 0.148 s]
  Range (min … max):    1.924 s …  5.314 s    10 runs

Benchmark 2: ../../node_worktrees/v25.1.0/node dd-cdk-benchmark.js
  Time (mean ± σ):      7.668 s ±  2.805 s    [User: 0.210 s, System: 0.345 s]
  Range (min … max):    5.307 s … 12.359 s    10 runs

Benchmark 3: ../../node_worktrees/v22.19.0/node dd-cdk-benchmark.js
  Time (mean ± σ):      3.810 s ±  1.343 s    [User: 0.205 s, System: 0.167 s]
  Range (min … max):    2.266 s …  5.542 s    10 runs

Summary
  ../../node_worktrees/v22.19.0/node dd-cdk-benchmark.js ran
    1.04 ± 0.48 times faster than ../../node/node dd-cdk-benchmark.js
    2.01 ± 1.02 times faster than ../../node_worktrees/v25.1.0/node dd-cdk-benchmark.js

date-fns

➜ hyperfine --warmup 10 -L node_path ../../node/node,../../node_worktrees/v25.1.0/node,../../node_worktrees/v22.19.0/node "{node_path} date-fns-benchmark.js"
Benchmark 1: ../../node/node date-fns-benchmark.js
  Time (mean ± σ):     699.7 ms ±  20.3 ms    [User: 65.4 ms, System: 46.3 ms]
  Range (min … max):   668.5 ms … 727.1 ms    10 runs

Benchmark 2: ../../node_worktrees/v25.1.0/node date-fns-benchmark.js
  Time (mean ± σ):     977.5 ms ±  41.0 ms    [User: 68.1 ms, System: 62.0 ms]
  Range (min … max):   923.2 ms … 1038.0 ms    10 runs

Benchmark 3: ../../node_worktrees/v22.19.0/node date-fns-benchmark.js
  Time (mean ± σ):     825.4 ms ± 311.2 ms    [User: 112.7 ms, System: 60.5 ms]
  Range (min … max):   542.0 ms … 1350.4 ms    10 runs

Summary
  ../../node/node date-fns-benchmark.js ran
    1.18 ± 0.45 times faster than ../../node_worktrees/v22.19.0/node date-fns-benchmark.js
    1.40 ± 0.07 times faster than ../../node_worktrees/v25.1.0/node date-fns-benchmark.js

@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/loaders

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. module Issues and PRs related to the module subsystem. needs-ci PRs that need a full CI run. typings labels Oct 27, 2025
@michaelsmithxyz michaelsmithxyz force-pushed the fix_module_loading_perf_regression branch from cd265c8 to 9c5c99f Compare October 29, 2025 15:08
@michaelsmithxyz michaelsmithxyz force-pushed the fix_module_loading_perf_regression branch from 9c5c99f to 003f911 Compare November 6, 2025 02:28
@michaelsmithxyz michaelsmithxyz marked this pull request as ready for review November 6, 2025 02:31
@michaelsmithxyz michaelsmithxyz force-pushed the fix_module_loading_perf_regression branch from 003f911 to ede35c9 Compare November 6, 2025 02:36
return nullptr;
}

void BindingData::GetNearestParentPackageJSON(
Member

  1. Can we use existing methods rather than adding a new method?
  2. I believe a THROW_IF_INSUFFICIENT_PERMISSIONS call is missing in this method.

Contributor Author

This was re-added as a straight revert of the code I deleted in #59888, but yeah it is 99% the same as GetNearestParentPackageJSONType. I can refactor this to remove that redundancy if you'd prefer!

I think TraverseParent handles the permissions checking you're referring to here, but it doesn't throw if there are insufficient permissions, it just returns nullptr. I'm not sure I have enough overall context to judge if that's right or wrong, but it's what this has always done!

Contributor Author

Pulled the common path-handling code into a helper method to reduce the redundancy here a bit. Rebasing onto main also pulled in some changes in this area, so it looks a tiny bit different now.

@anonrig
Member

anonrig commented Nov 6, 2025

After these 2 review comments I've left, I think we should merge this.

@codecov

codecov bot commented Nov 6, 2025

Codecov Report

❌ Patch coverage is 96.62921% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.54%. Comparing base (4e7f9c9) to head (86c48ea).
⚠️ Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
src/node_modules.cc 90.00% 0 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #60425      +/-   ##
==========================================
- Coverage   88.56%   88.54%   -0.02%     
==========================================
  Files         704      704              
  Lines      208101   208109       +8     
  Branches    40083    40078       -5     
==========================================
- Hits       184299   184279      -20     
- Misses      15827    15870      +43     
+ Partials     7975     7960      -15     
Files with missing lines Coverage Δ
lib/internal/modules/package_json_reader.js 99.44% <100.00%> (+1.82%) ⬆️
src/node_modules.h 100.00% <ø> (ø)
src/node_modules.cc 77.16% <90.00%> (+0.99%) ⬆️

... and 31 files with indirect coverage changes

@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Nov 6, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Nov 6, 2025
@michaelsmithxyz michaelsmithxyz force-pushed the fix_module_loading_perf_regression branch from ede35c9 to 86c48ea Compare November 8, 2025 15:45
@michaelsmithxyz
Contributor Author

Lint issue is unrelated: #60636

Development

Successfully merging this pull request may close these issues.

Perf regression in Node 22/24 when loading JS files

3 participants