Skip to content

Conversation

@ericcornelissen
Copy link
Contributor

Update the regular expression that is used to remove trailing slashes from the path in Mount#getPath - introduced in #98 - to avoid super-linear runtimes1. I believe this does not affect the st export because the only path to this API is through Mount#serve which sanitizes the req.url using path.normalize (in Mount#getUriPath). However, the Mount class is also exported directly, through which getPath could be called on unsanitized inputs.

The super-linear runtime could occur because the regex engine will backtrack through a repetition of '/'s if it is not at the end. This is resolved by anchoring a repetition '/' on a non-'/' with a negative lookbehind.

The issue can be tested and reproduced using this script:

const st = require('st')
const path = require('path')
const mount = new st.Mount({
  path: path.join(__dirname, '/static'),
  url: '/static'
})

for (let i = 0; i < 90_000; i += 10_000) {
  console.time('timer')
  mount.getPath(`a${"/".repeat(i)}a`)
  console.timeEnd('timer')
}

Footnotes

  1. while some consider this an availability problem, I decided to contribute this update in the open because 1) it only affects an undocumented API, and 2) this project lacks a security policy.

st.js Outdated
Comment on lines 219 to 221
// trailing slash removal to fix Node.js v23 bug
// https://github.com/nodejs/node/pull/55527
// can be removed when this is resolved and released
return path.join(this.path, u.replace(/\/+$/, ''))
return path.join(this.path, u.replace(/(?<!\/)\/+$/, ''))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the negative look-behind gets us in the wrong place here because it'll not match multiple slashes, voiding the + all together. I think the right answer here is to just avoid regexes entirely; plus the original comment should be changed since we've now essentially baked in normalised paths that help us get consistent rendering when returning a directory listing.

Suggested change
// trailing slash removal to fix Node.js v23 bug
// https://github.com/nodejs/node/pull/55527
// can be removed when this is resolved and released
return path.join(this.path, u.replace(/\/+$/, ''))
return path.join(this.path, u.replace(/(?<!\/)\/+$/, ''))
// Normalise paths by removing trailing slashes
// This ensures consistent paths for directory content rendering
while (u.length > 0 && u[u.length - 1] === '/') {
u = u.slice(0, -1)
}
return path.join(this.path, u)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the negative look-behind gets us in the wrong place here because it'll not match multiple slashes, voiding the + all together.

That is incorrect, you can test it with let u = "test//"; u.replace(/(?<!\/)\/+$/, '') which will return "test". This is because it will match 1-or-more / starting from a / that is preceded by a non-/. The negative look-behind only applies to the start of \/+, not every \/ in the sequence (though I understand the confusion).

I think the right answer here is to just avoid regexes entirely;

Having said the above, happy to adopt your change if that's what you prefer.

plus the original comment should be changed since we've now essentially baked in normalised paths that help us get consistent rendering when returning a directory listing.

And I'll update this regardless of whether we keep the regex or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @rvagg

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, i didn't realise I needed a follow-up here -- tbh I'd prefer to just avoid the regex, I think it's going to be much quicker anyway and it's much more self-documenting than making the regex more complicated. I appreciate the effort though!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem - updated now, see d45d312

@rvagg
Copy link
Collaborator

rvagg commented Aug 5, 2025

Well, thanks for the education, I didn't realise we didn't have $ anchored optimisations in V8, so your critique makes sense in that light. But as per the comment, I think the solution is to just not use regexes for something this simple.

@ericcornelissen
Copy link
Contributor Author

$ anchored optimisations in V8

Never heard this term before 🤔 Are you referring to V8's non-backtracking RegExp engine?

@rvagg
Copy link
Collaborator

rvagg commented Aug 5, 2025

No, just the fact that there's a $ anchor in the regex so the engine should be able to make assumptions about how to handle it rather than take the naive left-to-right approach of processing it. The problem that you've identified is that it's going to spin its wheels looking for consecutive / chars anywhere in the string, but they only should matter when they bump up against the end of the string, but V8 apparently doesn't perform any optimisation. I think the Perl regex engine does, at least. But it's probably due to other complicated factors related to needing to support all the variations JS needs to support.

ericcornelissen and others added 2 commits August 5, 2025 10:35
Update the regular expression that is used to remove trailing slashes
from the path in `Mount#getPath` to avoid super-linear runtimes. I
believe this does not affect the `st` export because the only path to
this API is through `Mount#serve` which sanitizes the `req.url` using
`path.normalize` (in `Mount#getUriPath`). However, the `Mount` class
is also exported directly, through which `getPath` could be called on
unsanitized inputs.

The super-linear runtime could occur because the regex engine will
backtrack through a repetition of '/'s if it is not at the end. This
is resolved by anchoring a repetition '/' on a non-'/' with a negative
lookbehind.

The issue can be tested and reproduced using this script:

    const st = require('st')
    const path = require('path')
    const mount = new st.Mount({
      path: path.join(__dirname, '/static'),
      url: '/static'
    })

    for (let i = 0; i < 90_000; i += 10_000) {
      console.time('timer')
      mount.getPath(`a${"/".repeat(i)}a`)
      console.timeEnd('timer')
    }
Co-authored-by: Rod Vagg <rod@vagg.org>
@rvagg rvagg merged commit 485e7fc into isaacs:master Aug 14, 2025
6 checks passed
@ericcornelissen ericcornelissen deleted the patch-1 branch August 14, 2025 06:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants