-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fs: add isFile and isDirectory methods #46345
Conversation
Attempting to add a benchmark, any preliminary reviews would be appreciated cc @anonrig |
lib/fs.js
Outdated
const stats = binding.stat(pathModule.toNamespacedPath(path), true, undefined, ctx); | ||
if (hasNoEntryError(ctx)) { | ||
return false; | ||
} | ||
return validateIfFileFromStats(stats); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if a different error occurs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have updated the error handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problem in isDirectory
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Such claims really need realistic benchmarks. I would expect sync disk I/O to far outweigh the cost of creating the JS object, except perhaps when repeatedly accessing the same disk block on an SSD in a loop. This also promotes the anti-pattern of fs operations that result in race conditions. |
Have added a benchmark for this, should also commit the comparison benchmark file? |
have run the benchmarks on my local machine the results look something like this:
So looks somewhat faster? also i am unable to understand how to run the benchmark with statistics for proper comparison so any help for that would be appreciated |
couldnt get you here 😅😅 could you elaborate? |
That is going to be tricky. OS kernels are extremely effective at caching inodes and since this feature does not depend on actual file contents, any I/O during a benchmark will likely hit the inode cache. And even if it doesn't, it'll probably hit the kernel's disk cache. And if it doesn't, most SSDs maintain their own internal cache. In other words, as long as the benchmark accesses a predictable and/or smallish part of the disk only (and assuming the disk is not a shared medium, network-attached storage, etc.), it won't reflect actual I/O times.
APIs such as |
Thank you so much for explaining much appreciated |
Went through the discussion it seems like its DX vs Discouraging anti-patterns issue and I don't think there is any consensus on that yet? |
Also for the original intent of the issue nodejs/performance#46 (comment) of speeding up checking during module resolution do you think this could speed it up? since (i think please correct me if am wrong 😅) during module resolution, we would be accessing predictable parts of the disk multiple times? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll wait for benchmark numbers, but I really don't think we should add these as public API.
lib/fs.js
Outdated
isFile, | ||
isDirectory, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like these should have "sync" as part of their name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactoring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, could you say how I could post the comparison benchmarks?
If you're referring to the ESM loader alone, I wonder why we aren't using the InternalModuleStat function there like we are using it in the CJS loader. Maybe that would speed things up without exposing more public APIs 🤔 |
Oh ok didn't know this could explore in a separate PR I guess, I created this one for the general comment in nodejs/performance#46 (comment) |
Ok I tried replacing the implementation of Running on my local machine:
and using
|
2d0f314
to
f20092b
Compare
since there isn't any consensus on making this change yet, and also the benchmark perf isn't hugely better closing this for now I think it may be possible to improve the |
Attempted to add a isFile, isDirectory apis as per the referenced issue, basically this would avoid the creation of the stats object so I think this should provide somewhat of a speed up, I believe we could speedup from the cpp side too but not much luck in finding what to improve there, any help would be appreciated!
Refs: nodejs/performance#46