fs::read_dir iterator loops forever on same entry #50619

Closed
@sharkdp

Bug description

There are certain directories that cause the fs::read_dir iterator (ReadDir) to loop indefinitely. While I am not sure what the exact properties of these directories are, I know that they do appear in /proc in the presence of zombie processes.

Consider the following Rust program ...

use std::env;
use std::fs;

fn main() {
    let path = env::args().nth(1).unwrap();

    for entry in fs::read_dir(path).unwrap() {
        println!("{:?}", entry);
    }
}

... and a zombie process with process ID $ZOMBIE_PID (see below for how to create one on purpose). Running the above program with:

cargo run -- /proc/$ZOMBIE_PID/net

results in an infinite loop, printing:

Err(Os { code: 22, kind: InvalidInput, message: "Invalid argument" })
Err(Os { code: 22, kind: InvalidInput, message: "Invalid argument" })
Err(Os { code: 22, kind: InvalidInput, message: "Invalid argument" })
...
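
Until the iterator itself is fixed, a caller can avoid the hang by giving up at the first error instead of continuing to iterate. The following is only a sketch of that workaround; the function name list_dir_stop_on_error is made up for illustration and is not part of the standard library.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collects the paths of all entries in `dir`, stopping at the first error.
/// (Hypothetical helper, named for this example only.)
fn list_dir_stop_on_error(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for entry in fs::read_dir(dir)? {
        // `?` returns from the function on the first Err, so an iterator
        // that keeps yielding the same error cannot make this loop spin.
        paths.push(entry?.path());
    }
    Ok(paths)
}
```

Called on a directory like the one above, this would presumably return the EINVAL error once rather than printing it forever. The trade-off is that it also gives up on errors that might affect only a single entry.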

How to create a zombie process to reproduce this?

  1. Copy the code from https://stackoverflow.com/a/25228579/704831 into a file called zombie.c
  2. Compile it: gcc -o zombie zombie.c
  3. Run it: ./zombie
  4. Get the PID of the "defunct"/zombie process: ps -ef | grep '<defunct>'
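
As an alternative to the C program, a zombie can probably also be created from Rust, because the standard library does not reap a child process unless wait (or try_wait) is called on it. The sketch below is an assumption-laden stand-in, not the code from the linked answer: the helper name spawn_unreaped is hypothetical, and it assumes a short-lived program such as true is available on the PATH.

```rust
use std::io;
use std::process::{Child, Command};

/// Spawns `program` and returns the handle without waiting on it.
/// Once the child exits, it stays a zombie until the caller calls
/// `wait()` on the handle or the parent process itself exits.
/// (Hypothetical helper, named for this example only.)
fn spawn_unreaped(program: &str) -> io::Result<Child> {
    Command::new(program).spawn()
}
```

To use it, call spawn_unreaped("true"), print child.id(), and keep the parent alive (for example with std::thread::sleep) while inspecting /proc. The child should then show up as <defunct> in the ps output from step 4.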

Analysis

I did some debugging and I believe I found the cause of this.

When called on /proc/$ZOMBIE_PID/net, the readdir_r(3) function

int readdir_r(DIR *dirp, struct dirent *entry, struct dirent **result);

returns error code 22 (EINVAL) and, at the same time, sets *result to NULL, which signals the end of the directory stream.

If I am reading the code in the standard library correctly, this case cannot be handled properly at the moment:

Code from the next function of impl Iterator for ReadDir:

loop {
    if readdir64_r(self.dirp.0, &mut ret.entry, &mut entry_ptr) != 0 {
        return Some(Err(Error::last_os_error()))
    }
    if entry_ptr.is_null() {
        return None
    }
    if ret.name_bytes() != b"." && ret.name_bytes() != b".." {
        return Some(Ok(ret))
    }
}

To handle this properly (without looping forever), one would probably have to check entry_ptr.is_null() in the first (Some(Err(...))) case as well. Whether or not the call returned a NULL pointer would probably have to be stored in some internal state of the iterator, so that the next call to next can return None.
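
The idea can be sketched with a small self-contained model. Everything here is hypothetical and for illustration only: RawRead stands in for the outcome of one readdir64_r call (a return code plus an entry, where None plays the role of a NULL *result), and FusedReadDir is not the actual ReadDir type from the standard library.

```rust
use std::io;

/// Outcome of one low-level read, modelled on readdir_r.
/// `entry == None` plays the role of a NULL `*result`.
struct RawRead {
    code: i32,
    entry: Option<String>,
}

/// Iterator wrapper that remembers when the stream has ended.
struct FusedReadDir<F: FnMut() -> RawRead> {
    read_raw: F,         // stand-in for the readdir64_r call
    end_of_stream: bool, // set once a NULL entry has been seen
}

impl<F: FnMut() -> RawRead> FusedReadDir<F> {
    fn new(read_raw: F) -> Self {
        FusedReadDir { read_raw, end_of_stream: false }
    }
}

impl<F: FnMut() -> RawRead> Iterator for FusedReadDir<F> {
    type Item = io::Result<String>;

    fn next(&mut self) -> Option<Self::Item> {
        // A previous call already hit the end of the stream.
        if self.end_of_stream {
            return None;
        }
        loop {
            let raw = (self.read_raw)();
            if raw.entry.is_none() {
                // Record the NULL result even if an error is reported too.
                self.end_of_stream = true;
            }
            if raw.code != 0 {
                return Some(Err(io::Error::from_raw_os_error(raw.code)));
            }
            match raw.entry {
                None => return None,
                Some(name) if name != "." && name != ".." => return Some(Ok(name)),
                Some(_) => continue, // skip "." and ".."
            }
        }
    }
}
```

With a reader that always reports code 22 and no entry, this iterator yields the error once and then None, instead of repeating the error forever.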

Meta

> rustc --version
rustc 1.25.0 (84203cac6 2018-03-25)

> uname -s -r -v -m -p -i -o
Linux 4.16.7-1-ARCH #1 SMP PREEMPT Wed May 2 21:12:36 UTC 2018 x86_64 unknown unknown GNU/Linux

Labels: O-linux (Operating system: Linux), T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue.)