-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Amend RFC 517: Add material on std::fs #739
Conversation
A general comment: some of the functionality currently in These are not permanent changes, nor or they intended to be stabilized immediately. In many cases, the movement just signifies that we do not have a cross-platform implementation yet. Very little in In other words, this RFC stabilizes only the parts that land in |
Note that with this revision, both The recursive make/remove directory APIs have not yet been renamed, though the proposed names are not ideal. I'd like to suggest |
By renamed, do you mean in your branch where you are making these changes? |
Ah, no sorry. I meant, from the original version of the RFC. |
There were some worries about the name |
For reference, here's why |
#### Files | ||
[Files]: #files | ||
|
||
The `File` type will largely stay as it is today, except that it will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens when close fails, because you are using an async filesystem? (note this is different than caring about the server crashing). and please don't say fsync/flush each time because some people might want to run some rust programs from a script and then run 'sync' manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This RFC is largely concerned with the API of std::fs
and how it's affected by the implementation details. I believe that we'll always have a Drop
implementation which calls close
on the file descriptor, so in some sense this is an implementation detail. Do you have an idea in mind where this would change the API of File
, however?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API http://doc.rust-lang.org/std/io/fs/struct.File.html doesn't tell me how I know close succeeded, so I view it as incomplete. I also think what functions panic is part of the public API that shouldn't change.
As I posted http://discuss.rust-lang.org/t/fs-file-should-panic-if-implicit-close-fails-and-no-panic-is-active/1349 , I think there should be an explicit .close() method that people can call, that returns an error, and otherwise drop should panic if implicit close() fails and there is not already a panic in progress. I think that makes the most code correct.
edit: and having an explicit .close() could change what errors are returned by other functions, e.g. read() could then panic or return an error ("file is closed").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may want to continue the discussion about explicit close over here.
Are you sure you want hard- and sym-links in the cross-platform portion of the filesystem API? They're ubiquitous on POSIX, but (as I understand it) not terribly well supported on Windows, and then there's platforms that don't support them at all, or support links that are neither hard nor soft (like Mac OS/HFS+ "shortcuts"). It seems to me that should definitely be squirreled away in std::platform, along with things like mknod() and mkfifo(). |
The `open_mode` function will take an `OpenOptions` struct, which will | ||
encompass today's `FileMode` and `FileAccess` and support a | ||
builder-style API. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One method I think we forgot to consider is truncate
on files today. Right now the function matches Unix semantics where it basically acts as a set_len
function. It will shorten the file or extend it (filling in with 0s) if necessary.
This function, however doesn't exist on Windows to my knowledge (cc @retep998, I may be missing something). Currently we're using SetEndOfFile
which operates slightly different than the Unix variant. On Windows you must first seek
to a position and then "truncate" the file, which is to say that Windows will simply set the length of the file on disk to the current pointer in the file.
Unfortunately, the current semantics may lead to some loss on windows. Our "emulation" of ftruncate
on Windows involves seeking the file to the desired length, calling SetEndOfFile
, and then seeking back to where the file used to be. The "seek back" operation, however, may fail, which would mean the operation actually completed successfully but we're forced to still return an error.
I think I would personally recommend matching Windows semantics rather than unix semantics in this case. We would drop the u64
argument to the function, call SetEndOfFile
on Windows, and unix would be a combination of lseek
(to learn the position) followed by ftruncate
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha! @retep998 has pointed out SetFileInformationByHandle
to me which worked like a charm!
In that case I think we're fine :)
@Screwtapello Windows supports hard and symbolic links just fine. Really the issue is not whether the OS supports them, but whether the filesystem supports them. In which case all we have to do is error if the filesystem doesn't support hard or symbolic links. |
It may also be worth talking about Should we return |
my 2cents on EINTR. If the user goes out of their way to set up a signal handler for SIGINT, they probably don't want you to blindly loop on it when it happens. And if they really want to mask that signal temporarily (e.g. during a certain close() or other operation), they can also do that with global sigprocmask() state. (threading issues and unwinding / restore issues around that not withstanding). |
I'd also note that on bsd and linux, SA_RESTART can be used with sigaction(2), to give more control over syscall restarting. edit: and my gist is, the user can do this if they dont want the rust functions to fail, rather than have rust re-implement the wheel here |
**Links**: | ||
|
||
* `hard_link` (renamed from `link`). Take `AsPath` bound. | ||
* `sym_link` (renamed from `symlink`). Take `AsPath` bound. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never see 'symlink' broken up into 'sym link', it's a full-blown portmanteau. Seems like this should be either 'symlink' or 'soft_link'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am in favor of soft_link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the value in adding an underscore the sym_link
here? Even the wikipedia article on symbolic links notes the term symlink
as an alias.
As @retep998 mentioned, Windows does have good support for these functions. I believe that the symlink-creating function did not exist in XP which is certainly one of the pain points of exposing the API. We will likely just return an error for XP, however (as @retep998 mentioned).
Could you elaborate a bit on this? Do you mean that the some filesystems that OSX uses do not support this? Or are you thinking that some distributions simply don't have the @kjpgit I find your comments pretty convincing, it sounds like we should punt |
I take back what I said about EINTR handling, it's pretty hard (forgot about SIGCHLD, SIGWNCH, etc., and SA_RESTART special cases are a mile long). Also, Python appears to be going down the 'retry everything` route: https://www.python.org/dev/peps/pep-0475/ ("PEP 475 - Retry system calls failing with EINTR") That does seem saner and makes things much easier to test and reason about. edit: to be clear, python can get away with that because they have integrated the 'catch SIGINT' into the interpreter and raise an exception on that, because that's one thing people actually want to let change the control flow of their program (although it's still pretty hard / impossible to test that, I can say from experience, when literally every single line of code can throw an exception). for rust that might be harder, as panics aren't recoverable and I'm not sure if you register your own signal handlers by default (would be obviously bad for FFI). However, I think it's generally good practice to force people to 'opt in' to potentially unsafe or buggy or really hard to test behavior, and having to 'opt in' to use system calls that randomly fail due to EINTR (based on some totally random SIGCHLD caused by another library) might be more prudent, even if it irritates someone who wants a blocking read/write to be 'cancelable' WITH A CUSTOM SIGNAL HANDLER THAT KEEPS THE PROGRAM STILL RUNNING (Default ctrl-c behavior terminates process which will still work). |
`perm` accessors. The various `os::platform` modules will offer | ||
extension methods on this structure. | ||
|
||
* `set_perm` (renamed from `chmod`). Take `AsPath` bound, and a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider calling this set_permissions
. perm
is not a commonly used abbreviation of permissions
, while dir
is for directory
for instance.
I just looked at my web server code and I have a bare .write() call on a socket (as well as .push), and no retry loop (it was in a try!). I've been writing sockets code for 15 years and I still wouldn't expect that to fail spuriously with EINTR due to some random library potentially having a signal handler. So I think it's time to put 1970s-era signals interrupting 'blocking' system calls in the grave, and for people who want interruption either 1) use non blocking io, which is is far saner (you control where and what thread(s) get notified) 2) use explicit _noretry versions of "blocking" system calls. |
Modern filesystems can store lots of different kinds of things. Probably nothing is universal, but files and directories are nearly so. Hard-links and symlinks are widespread, followed by a long tail of less-common things like HFS+ "shortcuts", FIFOs, alternate data streams, and so forth. A cross-platform API has to draw the line somewhere, and I was expecting Rust's cross-platform API to draw the line at files and directories. It's true that both POSIX and Windows support creating symlinks, but first-class support would also require If it's really possible to implement solid support for hard-links and symlinks on all supported platforms, then fair enough, let's do it. |
What is the thread safety (and expensiveness there of) of File / BufferedWriter? Is it going to be crappy like glibc, where every fwrite() takes a mutex lock, and you have to use fwrite_unlocked() to get decent performance, and pinky swear it's not being used in multiple threads? It would be great if the ownership semantics of rust mean you don't have to internally lock for no reason. But what about stdout/stderr handles? It looks like stdout() creates a private buffer for the existing fd (1,2) and flushes but doesn't close on drop. So... no internal mutex lock should ever be needed, right? Just clarifying. If a close() method gets added to Writer, the stdout()/stderr() handles might have to intercept that and actually not close the underlying fd. Only when main function ends would stdout/stderr get closed? (you could close stdout first and if that fails, panic to stderr). |
fyi LineBufferedWriter.into_inner seems like a bad method WRT error handling. it calls flush but no error response? Also, should stdout() return a fully buffered writer if there is no terminal, like c does? |
* `remove_dir` (renamed from `rmdir`). Take `AsPath` bound. | ||
* `remove_dir_all` (renamed from `rmdir_recursive`). Take | ||
`AsPath` bound. | ||
* `walk_dir`. Take `AsPath` bound. Yield an iterator over `IoResult<DirEntry>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between read_dir
and walk_dir
as they both return iterators over DirEntry
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
walk
is recursive, read
is not, as they are today in std::old_io
. This is standard terminology.
This RFC looks like it's fairly close to being accepted, so I've created a PR for its implementation: rust-lang/rust#21936. As with the previous PRs, nothing is set in stone and I'll be sure to update with any final revisions to this RFC! |
One part I'd like to bring up here as well is that the It's unclear to me to what extent we should be providing this contextual information. |
When creating/opening a file, there are actually 3 combinations that need to be represented:
The current implementation proposed merely has a |
Another possible option along that route would be: /// If the `create` option is specified then this flag indicates whether
/// creation will fail if the file already exists.
fn exclusive(&mut self, exclusive: bool) -> &mut OpenOptions; |
I've updated the RFC to match the implementation; the changes are pretty minor renamings that have already been proposed on the thread above. |
This discussion on this RFC seems to have slowed down quite a bit and it seems like the overall design is well agreed-upon. There's still a number of points to bikeshed on in a few aspects such as various naming here and there, but this RFC has reached a point where it's ready to merge. There's also a number of extensions to this API which can happen over time which should all be backwards compatible within the framework created here. I'd encourage lingering comments to be placed on the current PR (rust-lang/rust#21936) as well. Thanks again for the comments everyone! |
This commit is an implementation of [RFC 739][rfc] which adds a new `std::fs` module to the standard library. This module provides much of the same functionality as `std::old_io::fs` but it has many tweaked APIs as well as uses the new `std::path` module. [rfc]: rust-lang/rfcs#739
This commit is an implementation of [RFC 739][rfc] which adds a new `std::fs` module to the standard library. This module provides much of the same functionality as `std::old_io::fs` but it has many tweaked APIs as well as uses the new `std::path` module. [rfc]: rust-lang/rfcs#739
The IO reform RFC is being split into several semi-independent pieces, posted as PRs like this one.
This RFC amendment adds the file system section.
Rendered