Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs: implement fsp.exists #39968

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

Ayase-252
Copy link
Member

Fixes: #39960

cc @bnb

@nodejs-github-bot nodejs-github-bot added fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. labels Sep 1, 2021
doc/api/fs.md Show resolved Hide resolved
@@ -808,11 +808,22 @@ async function readFile(path, options) {
return handleFdClose(readFileHandle(fd, options), fd.close);
}

async function exists(path) {
validateString(path, 'path');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think access() has already validated the path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but access() is called in the try block.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the problem is, if we got a wrong path, should exists() return false or throw an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node/lib/fs.js

Lines 291 to 295 in 73d5f8a

try {
path = getValidatedPath(path);
} catch {
return false;
}

fs.existsSync() returns false if path validation failed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to

node/lib/fs.js

Lines 279 to 284 in 73d5f8a

// fs.existsSync never throws, it only returns true or false.
// Since fs.existsSync never throws, users have established
// the expectation that passing invalid arguments to it, even like
// fs.existsSync(), would only get a false in return, so we cannot signal
// validation errors to users properly out of compatibility concerns.
// TODO(joyeecheung): deprecate the never-throw-on-invalid-arguments behavior
, I think we should not make the same mistake with fsPromises.exists.
Maybe we should even go further and also throw if the error is not ENOENT. For example a permission error now ends up in returning false but we actually don't know if the file exists or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, fs.existsSync(path) returns false when path validation failed, even path is not a string. e.g.

> const fs = require('fs')
undefined
> fs.existsSync({})
false

I would think it is a bug(?), since docs says it only allows path of type <string> | <Buffer> | <URL>.

Copy link
Contributor

@XadillaX XadillaX Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to

node/lib/fs.js

Lines 279 to 284 in 73d5f8a

// fs.existsSync never throws, it only returns true or false.
// Since fs.existsSync never throws, users have established
// the expectation that passing invalid arguments to it, even like
// fs.existsSync(), would only get a false in return, so we cannot signal
// validation errors to users properly out of compatibility concerns.
// TODO(joyeecheung): deprecate the never-throw-on-invalid-arguments behavior

, I think we should not make the same mistake with fsPromises.exists.
Maybe we should even go further and also throw if the error is not ENOENT. For example a permission error now ends up in returning false but we actually don't know if the file exists or not.

If we throws, the behaivor of 3 APIs are very different:

  • fs.exists(): wrong callback signature;
  • fs.existsSync(): eats errors;
  • fsPromise.exists(): throws errors.

And actually they just do the same thing in different async ways. I think we shouldn't make the behaivor more chaotic.

I think there're 2 ways to choose:

  1. fsPromise.exists(): never throws;
  2. Delete fs.existsSync()'s TODO and make it throws.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to make fs.existsSync throw, we should probably make the error swallowing go through a complete deprecation first to make sure this won't break the ecosystem.

doc/api/fs.md Outdated Show resolved Hide resolved
@Ayase-252 Ayase-252 added the wip Issues and PRs that are still a work in progress. label Sep 1, 2021
lib/internal/fs/promises.js Outdated Show resolved Hide resolved
@Ayase-252 Ayase-252 removed the wip Issues and PRs that are still a work in progress. label Sep 1, 2021
@tniessen tniessen added the semver-minor PRs that contain new features and should be released in the next minor version. label Sep 1, 2021
* Returns: {Promise} Resolved as `true` if the given path exists, otherwise,
resolved as `false`.

Test whether or not the given path exists by checking with the file system.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't really check if the file exists, which is why the name is misleading, see #39960 (comment).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. It makes sense. Do you have a better name suggestion? Or like os.path.exists on Python, is a sepecial note sufficient?

@jasnell
Copy link
Member

jasnell commented Sep 3, 2021

Leaving this out in the fs.promises impl was intentional given that it's a fairly significant anti-pattern, especially in an async api like fs.promises. The check is never going to be 100% reliable because the path could be created/deleted in between the time it is checked and the time the microtask continuation happens.

@Linkgoron
Copy link
Member

Linkgoron commented Sep 4, 2021

Leaving this out in the fs.promises impl was intentional given that it's a fairly significant anti-pattern, especially in an async api like fs.promises. The check is never going to be 100% reliable because the path could be created/deleted in between the time it is checked and the time the microtask continuation happens.

While this is true, how is this different than something like access? access's documentation does have a fairly big warning that using it and then accessing the file can lead to race conditions etc. Those warnings should at least be added to this method as well. From what I've seen, users expect exists to exist, and it's not immediately clear to users why there is no exists method in fsPromises, or that they should use access or stat, or just start writing/reading without checking (especially as existsSync is not deprecated, and the docs state that exists is explicitly deprecated because of its non-standard callback).

The main selling point over access of this implementation, IMO, is that using this method users don't need to understand how to use the error code to understand if they don't have access or if the file just doesn't exist, as it throws an error when it can't check the file's status.

@XadillaX
Copy link
Contributor

XadillaX commented Sep 6, 2021

My point about error checking is to choose one way of:

  1. Make fsp.exists() and fs.existsSync() both throw;
  2. Make fsp.exists() and fs.existsSync() both not throw.

Any of the two ways is OK. But they shouldn't be different (e.g. fsp.exists() throws and fs.existsSync() not throws).

@tniessen
Copy link
Member

tniessen commented Sep 7, 2021

Make fsp.exists() and fs.existsSync() both throw;

It might be difficult to change the behavior of existsSync. At the very least, it would take multiple semver-major changes just to deprecate the current behavior. I believe this is the reason why we also don't perform any validation of the parameter to fs.exists, e.g., fs.existsSync({ makesSense: false }) === false.

Copy link
Contributor

@aduh95 aduh95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking this as changes requested to make sure it doesn't land by accident. This PR should not land until:

  • a consensus is reached regarding the need for this API to exist at all given the anti-patterns it might encourage.
  • a consensus is reached regarding error handling (throw when FS permissions doesn't let it know if a file exist? Mimic existsSync behavior?).

I also think the documentation could use some "Instead of this, do this" code examples to illustrate which anti-patterns users should avoid.

@theoludwig
Copy link
Contributor

The last comment on this PR was about 1 year ago and given that this is a useful feature to have in Node.js core, I'm asking if discussions happened or if a consensus was reached?

Please reconsider this feature to be included in Node.js core, I often end up copying/pasting this code:

import fs from 'node:fs'

export const isExistingPath = async (path: string): Promise<boolean> => {
  try {
    await fs.promises.access(path, fs.constants.F_OK)
    return true
  } catch {
    return false
  }
}

Whereas, I could be using:

import fs from 'node:fs'

await fs.promises.exists(path)

IMO about the consensus:

  • Checking if a file exists is not an anti-pattern.
    As said by @tniessen in Somehow support checking if a path exists while using fs.promises #39960 (comment):

    I don't think the race condition is the main problem.
    In most cases, in automation scripting, race conditions are less likely to be problematic.

    If you think, introducing this API might encourage an anti-pattern, consider writing detailed documentation, when this API is "safe to use", users are responsible enough to know what is the potential behavior in their code, and when this API might be an anti-pattern or useful.

  • This API should mimic existsSync behavior, giving a path that doesn't exist or that simply can't be accessed (e.g: '/root/snap') should returns false, it's better that similar APIs behave the same way, easier to think about. If needed we could also introduce a new boolean option checkAccess, to change this behavior according to user needs.

Comment on lines +817 to +822
if (err.code === 'ENOENT') {
return false;
}
// We are unable to determine if the file exists or not,
// like encountering a permission error.
throw err;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should always (including the fact that this is a permission error) return false in this catch block, and should never throw an error to mimic existsSync API.

@tniessen
Copy link
Member

tniessen commented Sep 11, 2022

Please reconsider this feature to be included in Node.js core, I often end up copying/pasting this code:

@divlo Since you claim to use this quite often and because you appear to prefer the variant that does not distinguish between ENOENT and other errors, maybe you can describe a few use cases? In particular, use cases that

  • do not care about actual existence but rather if the calling process can see the file system entry at the given path (otherwise, exists() is insufficient),
  • do not care if the file system entry is readable or writable or has some other properties (otherwise, exists() is insufficient),
  • do not care if the file system entry is a file or a directory or some other file system entry (otherwise, exists() is insufficient),
  • do not benefit from directly performing some operation on the file system entry and gracefully handling failure due to non-existence (which would avoid race conditions).

@theoludwig
Copy link
Contributor

@tniessen In my use cases specifically, I could replace isExistingPath with fs.existsSync with no problems, and it will already avoid me to copy/paste this function in all my projects. However, to be consistent, I try to use only fs promises-based APIs.
I don't really care about permissions issues, it should not be relevant.

The only reason, I suggest not distinguishing between errors, is to have the same behavior.
My use cases mostly include checking if a directory exists before deleting it, checking if a directory already exists and throwing a custom error, saying that the directory already exists, and things like that...
So yeah I don't really have use cases meeting your conditions.
Nevertheless, I don't see why Node.js would not want to add the promise version, given that the synchronous version exists.

Note: You missed the ping for my name (typo).

@bnb
Copy link
Contributor

bnb commented Sep 11, 2022

@tniessen I'm 90% certain that basically every use case I have meets those criteria.

Literally anytime I'm checking to see if something exists, it's just to ensure I'm not overwriting a file that I've already written or to follow up with a read, which AFAIK also doesn't necessarily require any of those situations to be the case (I can handle checking if the file is readable separately).

1 similar comment
@bnb
Copy link
Contributor

bnb commented Sep 11, 2022

@tniessen I'm 90% certain that basically every use case I have meets those criteria.

Literally anytime I'm checking to see if something exists, it's just to ensure I'm not overwriting a file that I've already written or to follow up with a read, which AFAIK also doesn't necessarily require any of those situations to be the case (I can handle checking if the file is readable separately).

@tniessen
Copy link
Member

The only reason, I suggest not distinguishing between errors, is to have the same behavior.

Personally, I am not convinced that the existing behavior is desirable. The existing function is even deprecated.

So yeah I don't really have use cases meeting your conditions.
Nevertheless, I don't see why Node.js would not want to add the promise version, given that the synchronous version exists.

As far as I can tell, based on the discussion here, the main questions are:

  • Is exists() misleading in that it does not actually tell the caller if the given path exists?
  • Is exists() mostly a footgun, e.g., by introducing race conditions?
  • Are most use cases better handled using access(), stat(), or by attempting to open the file system entry?

Maybe the answer to all of these is "no", in which case adding the Promise version of exists() might make sense. But, personally, I am not convinced that this is the case.

Literally anytime I'm checking to see if something exists, it's just to ensure I'm not overwriting a file that I've already written

I have a feeling that these are the race conditions that were mentioned earlier. When an application is about to write a file and does not want to overwrite an existing file, it's often a good idea to let the OS handle it by specifying O_EXCL when opening the file. This guarantees that the existence of the file does not change between the existence check and its creation.

But I do realize that such explicit checks might be needed if a significant amount of work needs to be done between checking the existence of the file and creating it.

or to follow up with a read, which AFAIK also doesn't necessarily require any of those situations to be the case (I can handle checking if the file is readable separately).

Similarly, in many cases, it probably makes sense to just read the file and to handle ENOENT. Otherwise, again, a race condition occurs and the file might be deleted or renamed in-between the existence check and the read operation.

Again, I do acknowledge that there are use cases where an existence check really is the best approach, even at the risk of introducing race conditions etc. Those use cases can already be handled with a few lines of code, and the question remains: do these use cases justify adding a utility function to node that might be misleading or even incorrect and that might promote anti-patterns, including race conditions, like the existing fs.exists() function?

@bnb
Copy link
Contributor

bnb commented Sep 11, 2022

Are most use cases better handled using access(), stat(), or by attempting to open the file system entry?

My genuine and honest response to all discussion about the general topic is that the DX of having to go to access() and stat() is sub-par and exists() should do what it's name implies (setting aside whatever it currently does).

I don't really care what this looks like. I honestly just want... better DX around this. It's a relatively common thing to do and it's unnecessarily complicated as-is.

If that means we semver major replace exists() functionality with the best practice for doing what you can do with access(), stat(), or directly opening the file system entry, I'm fine with that. If that means doing all that (or something else) under a new method that accurately describes the function in a similar way to exists() with a strong API, I'm fine with that too.

@jasnell
Copy link
Member

jasnell commented Sep 11, 2022

I'm still -1 on this for all the reasons already expressed. I don't see having this available is any improvement over what can already be done

@theoludwig
Copy link
Contributor

theoludwig commented Sep 27, 2022

Quick question, given that example:

import fs from 'node:fs'

const files = [] // array of strings, containing the path of different files

await Promise.all(
  files.map(async (file) => {
    if (await fs.promises.exists(file)) {
      // do something
    }
  })
)

Is the promise version (with Promise.all), more performant than the synchronous version?
The more there are files, the more the performance gain of using fs.promises will be perceptible as it is running in parallel, right?

Am I making wrong assertion?
Then, if it is indeed the case, isn't that already an improvement (mostly DX) over what can already be done?
By adding this API, we ease the usage of promises for this use case.

@tniessen
Copy link
Member

Is the promise version (with Promise.all), more performant than the synchronous version?
The more there are files, the more the performance gain of using fs.promises will be perceptible as it is running in parallel, right?

It's difficult to say since it depends on various properties related to the thread pool, but it is likely to perform better.

Then, if it is indeed the case, isn't that already an improvement (mostly DX) over what can already be done?
By adding this API, we ease the usage of promises for this use case.

If you want the existing buggy behavior of fs.exists, it doesn't need to be as complicated as in #39968 (comment):

const exists = util.promisify(fs.exists);
// or
const exists = (p) => new Promise((r) => fs.exists(p, r));

It's also not difficult to arrive at a solution that handles errors properly. For example:

const exists = (p) =>
  fs.promises.access(p).then(
    () => true,
    (e) => (e.code === 'ENOENT' ? false : Promise.reject(e))
  );

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. semver-minor PRs that contain new features and should be released in the next minor version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Somehow support checking if a path exists while using fs.promises
10 participants