Description
Recently I've been looking at ways I can improve the standard library implementation on Windows.
There are many issues which have been open for a long time which I believe I can help to fix but I was worried about the disruption doing so may cause. @m-ou-se suggested I create an issue here documenting my current plans in case there were anything the libs team felt needed discussing. Or at least to give everyone a heads up.
A big chunk of the open Windows issues fall into one of two categories:
std::env
andstd::process
. SpecificallyCommand
but also argument parsing (and #85270 fixes an issue with environment variables). I recently made two other PRs (#86406 and #86467) to address a couple of smaller issues withCommand
.- File paths. This is often an issue because Rust's understanding of Windows paths is limited. Which, to be fair, is quite an involved topic.
There are other issues particularly within std::fs
(e.g. #81357 and #29497) and I've suggested a workaround (#86447) for a driver problem that is the root cause of a number of issues.
But anyway, back to the two categories mentioned above.
std::{env, process}
Argument parsing
In Windows, arguments are passed to a program as a string. This string is commonly parsed by the C runtime (CRT) in order to recover the arguments. In 2008 Microsoft subtly changed the way the CRT parses arguments. However, Rust still uses the old method. The change is small, it's centred around how consecutive double-quotes (e.g. ""
) are handled when encountered in a quoted block. See #44650. Fixing this should be fairly easy but is it technically a breaking a change?
Command
uses the CRT rules to build an argument string to pass to an application. However, not every application parses commands according to these rules. Currently there's no good way to bypass Rust's handling. #85832 (by kornelski) seeks to address this. It might also be good to be able to override the zeroth argument, which by convention is set to the name of the binary being run but it needn't be. But I think both of these would be purely additions to the library and not change existing behaviour.
Command
executable search
The way Command
finds executables is weird. By default it uses the CreateProcessW
search sequence. However if you use Commad::env
at all then it first attempts to emulate Linux's behaviour, which is rather surprising. See #37519 and #50870. The specific implementation of that could be improved but I question if it's a good idea in the first place.
Also I'm not sure that the default search sequence is good. On internals, a CVE was discussed about Command
looking for executables in the current directory. The problem is that CreateProcessW
itself is, as with most of the Windows API, concerned with backwards compatibility rather than doing the "best" thing out of the box. Fortunately it provides for using a path to an executable instead of relying on its search.
It was brought up in that thread that the way of resolving the executable name could be made configurable, which is a good idea. However, I still think changing the defaults might be good. But it might be disruptive.
There are also a number of other issues, mostly around child processes inheriting handles from the parent (#70719, #67616, #54760 and #38227. But I have not looked too deeply into these yet.
File paths
Most of the open issues here centre around the legacy MAX_PATH
limits and verbatim paths (aka \\?\
prefixed paths) which allow bypassing the limit.
- #77576: rust-lang/rust CI build failed when generating test file paths exceeding about 260 chars on Windows
- #76586: fs::write fails on long file name in deep directory path on windows
- #67403: fs::copy on Windows has a 260 character filename limit
- #32689: Automatically insert
\\?\
where needed on Windows - #31789: Windows paths should not be prefixed with
\\?\
when displayed - #42869: std::fs::canonicalize returns UNC paths on Windows, and a lot of software doesn't support UNC paths
- #56030: Support path prefix which refers to per-user DosDevices (
"\??\C:\..."
) - #49342: Windows: success of canonicalizing
r"\"
depends on whether set_current_dir has been called
Ideally verbatim paths would be purely an implementation detail that's not normally visible unless necessary. The std would use whatever path type makes most sense in the context, unless explicitly asked to do otherwise. I'm not sure if Rust can reach that ideal or not (at least not unless there's a std 2.0
). What I think it can do is handle verbatim paths a little bit better.
In Win32, resolving .
and ..
paths is done purely lexically. That is, the path is made absolute without ever touching the filesystem. The Win32 APIs will usually do this for you. However, verbatim paths are passed to the kernel (almost) verbatim. So .
and ..
are never resolved and are instead treated as if they were actual files or directories (which they're not, according to Windows filesystem drivers). This has consequences for Rust. What should happen when using Path::push
on such paths? Should it automatically resolve the path as you push? Should it panic? Should it resolve them at the point of use (e.g. fs::File::open
)? What if the user really really did want to pass a ..
component to the kernel? (note: I can't think why anyone would but it is hypothetically possible that some device could use such components in some way.)
There is also a problem with /
. The Win32 API will automatically convert it to \
before passing the path to the kernel... unless verbatim paths are used. In that case it's again passed to the kernel verbatim, where it will be treated like any other character in a file name instead of being a path separator.
I'm saying all this not to find answers here (internals or the main repo would be a better venue for discussing the particulars) but to give a flavour of the changes that may be required to handle Windows paths better. But I worry that such improvements may mean breaking the documented behaviour of Path
, if only in small ways. That said, I am looking for ways to improve the implementation that have much less risk of being disruptive.