Skip to content

Suggested lint: use "!is_dir()" instead of "is_file()" #4503

Closed
@kentfredric

Description

@kentfredric

A lot of people have a tendency to assume files and directories are all there are they have to think about, and that if its not a file, its clearly a directory, and vice versa.

The documentation even suggests this if you don't think it through, by saying that is_file() and is_dir() are mutually exclusive.

However, not all things that return false from is_file() are directories, and plenty of special file types return false for both is_dir() and is_file()

A lot of the time, when people make the is_file() check, they're not asking "is this a regular file as opposed to a special one", they're asking "is this a thing that I can read and get bytes from".

A particular situation where one might be calling is_file() and prematurely rejecting it is when the file in question is a user specified path, and that specified path is a pipe, as constructed by the bash command:

fooprogram <( program that outputs )

fooprogram gets passed /dev/fd/63 (or similar) as the first parameter, and that is not a regular file, but it is for all general purposes something that can be safely treated as a file.

Subsequently, the better advice here is to check for if !file.is_dir() in preparation for doing reads from it.

Excerpts from man 7 inode

 The following mask values are defined for the file type:

           S_IFMT     0170000   bit mask for the file type bit field

           S_IFSOCK   0140000   socket
           S_IFLNK    0120000   symbolic link
           S_IFREG    0100000   regular file
           S_IFBLK    0060000   block device
           S_IFDIR    0040000   directory
           S_IFCHR    0020000   character device
           S_IFIFO    0010000   FIFO

       Thus, to test for a regular file (for example), one could write:

           stat(pathname, &sb);
           if ((sb.st_mode & S_IFMT) == S_IFREG) {
               /* Handle regular file */
           }

strace output of an example of this usage (noting the value of st_mode):

strace -v cat <( echo hello )
execve("/bin/cat", ["cat", "/dev/fd/63"], .... )
... 
openat(AT_FDCWD, "/dev/fd/63", O_RDONLY) = 3
fstat(3, {
  st_dev=makedev(0, 0xb),
  st_ino=63079295,
  st_mode=S_IFIFO|0600, 
  st_nlink=1,
  st_uid=1000, 
  st_gid=1000,
  st_blksize=4096,
  st_blocks=0,
  st_size=0,
  st_atime=1567670286 /* 2019-09-05T19:58:06.632258848+1200 */, 
  st_atime_nsec=632258848, 
  st_mtime=1567670286 /* 2019-09-05T19:58:06.633258850+1200 */, 
  st_mtime_nsec=633258850, 
  st_ctime=1567670286 /* 2019-09-05T19:58:06.633258850+1200 */, 
  st_ctime_nsec=633258850
}) = 0
...
read(3, "hello\n", 131072)              = 6
write(1, "hello\n", 6)                  = 6
read(3, "", 131072)                     = 0
close(3)                                = 0
close(1)                                = 0

Current rust code for sys::unix::fs::FileType
https://github.com/rust-lang/rust/blob/a24f636e60a5da57ab641d800ac5952bbde98b65/src/libstd/sys/unix/fs.rs#L195-L201

impl FileType {
    pub fn is_dir(&self) -> bool { self.is(libc::S_IFDIR) }
    pub fn is_file(&self) -> bool { self.is(libc::S_IFREG) }
    pub fn is_symlink(&self) -> bool { self.is(libc::S_IFLNK) }

    pub fn is(&self, mode: mode_t) -> bool { self.mode & libc::S_IFMT == mode }
}

But it may be worth noting the users assumptions do seem to hold true on Windows, where "is_file()" is just shorthand for "!is_symlink() && !is_directory()". But even then, if you're only asking "Can I read a thing", the "!is_symlink()" check is something you don't want to be doing, so even there, the "!is_dir()" is possibly the right call. Ugh.

https://github.com/rust-lang/rust/blob/a24f636e60a5da57ab641d800ac5952bbde98b65/src/libstd/sys/windows/fs.rs#L586-L617

impl FileType {
    fn new(attrs: c::DWORD, reparse_tag: c::DWORD) -> FileType {
        FileType {
            attributes: attrs,
            reparse_tag: reparse_tag,
        }
    }
    pub fn is_dir(&self) -> bool {
        !self.is_symlink() && self.is_directory()
    }
    pub fn is_file(&self) -> bool {
        !self.is_symlink() && !self.is_directory()
    }
    pub fn is_symlink(&self) -> bool {
        self.is_reparse_point() && self.is_reparse_tag_name_surrogate()
    }
    pub fn is_symlink_dir(&self) -> bool {
        self.is_symlink() && self.is_directory()
    }
    pub fn is_symlink_file(&self) -> bool {
        self.is_symlink() && !self.is_directory()
    }
    fn is_directory(&self) -> bool {
        self.attributes & c::FILE_ATTRIBUTE_DIRECTORY != 0
    }
    fn is_reparse_point(&self) -> bool {
        self.attributes & c::FILE_ATTRIBUTE_REPARSE_POINT != 0
    }
    fn is_reparse_tag_name_surrogate(&self) -> bool {
        self.reparse_tag & 0x20000000 != 0
    }
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-lintArea: New lintsL-suggestionLint: Improving, adding or fixing lint suggestionsgood first issueThese issues are a good way to get started with Clippy

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions