Skip to content

make GetUserProfileDirectoryW more strictly match documented behavior #4332

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented May 19, 2025

I don't see an indication in the docs that size would be written to on success.

However, std seems to rely on that. We get UB otherwise:

error: Undefined Behavior: using uninitialized data, but this operation requires initialized memory
  --> /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/clone.rs:512:5
   |
LL | /     impl_clone! {
LL | |         usize u8 u16 u32 u64 u128
LL | |         isize i8 i16 i32 i64 i128
LL | |         f16 f32 f64 f128
LL | |         bool char
LL | |     }
   | |_____^ using uninitialized data, but this operation requires initialized memory
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
   = note: BACKTRACE:
   = note: inside `std::clone::impls::<impl std::clone::Clone for u16>::clone` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/clone.rs:505:25: 505:30
   = note: inside `std::option::Option::<&u16>::cloned` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/option.rs:1948:29: 1948:38
   = note: inside `<std::iter::Cloned<std::slice::Iter<'_, u16>> as std::iter::Iterator>::next` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/iter/adapters/cloned.rs:41:9: 41:32
   = note: inside `<std::char::DecodeUtf16<std::iter::Cloned<std::slice::Iter<'_, u16>>> as std::iter::Iterator>::next` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/char/decode.rs:46:21: 46:37
   = note: inside `std::sys_common::wtf8::Wtf8Buf::from_wide` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys_common/wtf8.rs:237:21: 237:58
   = note: inside `<std::ffi::OsString as std::os::windows::ffi::OsStringExt>::from_wide` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/os/windows/ffi.rs:94:44: 94:68
   = note: inside `std::sys::pal::windows::os2path` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys/pal/windows/mod.rs:276:19: 276:41
   = note: inside `<for<'a> fn(&'a [u16]) -> std::path::PathBuf {std::sys::pal::windows::os2path} as std::ops::FnOnce<(&[u16],)>>::call_once - shim(for<'a> fn(&'a [u16]) -> std::path::PathBuf {std::sys::pal::windows::os2path})` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5: 250:71
   = note: inside `std::sys::pal::windows::fill_utf16_buf::<{closure@std::sys::pal::windows::os::home_dir_crt::{closure#0}}, for<'a> fn(&'a [u16]) -> std::path::PathBuf {std::sys::pal::windows::os2path}, std::path::PathBuf>` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys/pal/windows/mod.rs:269:27: 269:36
   = note: inside `std::sys::pal::windows::os::home_dir_crt` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys/pal/windows/os.rs:201:9: 216:10
   = note: inside `<fn() -> std::option::Option<std::path::PathBuf> {std::sys::pal::windows::os::home_dir_crt} as std::ops::FnOnce<()>>::call_once - shim(fn() -> std::option::Option<std::path::PathBuf> {std::sys::pal::windows::os::home_dir_crt})` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5: 250:71
   = note: inside `std::option::Option::<std::path::PathBuf>::or_else::<fn() -> std::option::Option<std::path::PathBuf> {std::sys::pal::windows::os::home_dir_crt}>` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/option.rs:1577:21: 1577:24
   = note: inside `std::sys::pal::windows::os::home_dir` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys/pal/windows/os.rs:252:5: 255:31
   = note: inside `std::env::home_dir` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/env.rs:652:5: 652:23
note: inside `main`
  --> tests/pass/shims/env/home.rs:12:5
   |
LL |     env::home_dir().unwrap();
   |     ^^^^^^^^^^^^^^^
   = note: this error originates in the macro `impl_clone` (in Nightly builds, run with -Z macro-backtrace for more info)

@ChrisDenton any idea why?

The logic for this is here.

@ChrisDenton
Copy link
Member

An undocumented behaviour of the function is that it will indeed set the size on success. The standard library has been relying on this for >10 years at this point https://github.com/rust-lang/rust/blob/70ed3a48dfa301c5bb56de3e0a7c25214539b83c/src/libstd/sys/windows/os.rs#L368.

Maybe we shouldn't and instead the string should just be treated as null-terminated (which is guaranteed). Or maybe the Microsoft docs should just be updated to note the behaviour.

@RalfJung
Copy link
Member Author

Or maybe the Microsoft docs should just be updated to note the behaviour.

Is that realistically achievable? Cc @cgettys-microsoft @sivadeilra

@sivadeilra
Copy link

Or maybe the Microsoft docs should just be updated to note the behaviour.

Is that realistically achievable? Cc @cgettys-microsoft @sivadeilra

Yes. Most of the Windows SDK is managed in a Git repo and Microsoft does accept (after review) edits to them. There is an "Edit this doc" link on the page for that API and it points to this URL:

https://github.com/MicrosoftDocs/sdk-api/blob/docs/sdk-api-src/content/userenv/nf-userenv-getuserprofiledirectoryw.md

@RalfJung
Copy link
Member Author

Wow, nice. :)

I know way too little about Windows to attempt that, so I'll close this PR for now -- rust-lang/rust#141244 at least adds docs to std for this so it's more clear what the Miri comment refers to.

@ChrisDenton do you think it is worth opening a rustc issue about this?

@RalfJung RalfJung closed this May 19, 2025
@ChrisDenton
Copy link
Member

@ChrisDenton do you think it is worth opening a rustc issue about this?

For sure. We should be using the API as documented and it'd be good to track that how ever we end up fixing it.

@RalfJung
Copy link
Member Author

Done: rust-lang/rust#141254

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request May 19, 2025
… r=ChrisDenton

windows: document that we rely on an undocumented property of GetUserProfileDirectoryW

r? `@ChrisDenton`
Also see rust-lang/miri#4332
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request May 19, 2025
Rollup merge of rust-lang#141244 - RalfJung:GetUserProfileDirectoryW, r=ChrisDenton

windows: document that we rely on an undocumented property of GetUserProfileDirectoryW

r? `@ChrisDenton`
Also see rust-lang/miri#4332
@workingjubilee
Copy link
Member

@RalfJung A PR for this was already opened in 2024: MicrosoftDocs/sdk-api#1810

@cgettys-microsoft
Copy link
Contributor

cgettys-microsoft commented May 21, 2025

I'm of the (personal) opinion that Rust is already within the documented contract laid out by the static analysis annotations on the function signature; they say unambigiously that the size parameter is written on success since it's Inout - see MicrosoftDocs/sdk-api#1810 (comment)

The documentation talks about the failure case because that's not expressed in the SAL annotations.
A documentation update, IMO, is warranted, but I think you could safely close rust-lang/rust#141254 on the basis that the SAL annotations do promise this.

@cgettys-microsoft
Copy link
Contributor

cgettys-microsoft commented May 21, 2025

It might be a tiny bit fuzzy - technically the _Out_writes_opt_ does not guarantee the number of valid characters ("A pointer to an array of s elements (resp. bytes) that will be written by the function. The array elements don't have to be valid in pre-state, and the number of elements that are valid in post-state is unspecified. If there are annotations on the parameter type, they're applied in post-state. For example, consider the following code.") - but in conjunction with the length being Inout, it's at the very least strongly implied.

@cgettys-microsoft
Copy link
Contributor

cgettys-microsoft commented May 21, 2025

Or in other words:

  • the SAL annotations say that we write some value to *lpcchSize on success; docs tell you we write it on failure too, and what value it has in failure case
  • they also say that we will write up to *lpcchSize characters into lpProfileDir
  • Technically, there's a tiny bit of ambiguity in the semantics here - we don't explicitly specify that *lpcchSize after the call == the number of characters written.
  • However, the SAL annotations also do not specify that the return value is null terminated (edit: but the docs do!)
  • So there's no other sane value lpcchSize could have, (edit: removed incorrect statement)

@cgettys-microsoft
Copy link
Contributor

I'm going to read the SAL annotations again tomorrow when I'm better rested... and maybe @sivadeilra can take a look too.
I'm a bit rusty on them at this point, and they're subtle enough to begin with.

@RalfJung
Copy link
Member Author

These SAL annotations do not show up in the docs, right? How would I even query them without having access to the SDK? And is there some statement that makes them normative on their own even if they say more than the docs say?

@cgettys-microsoft
Copy link
Contributor

  • Right, not in the docs :(
  • Not sure what the official story here is. If you don't have a Windows machine, I think win32metadata is your best bet:microsoft/win32metadata.
    I believe this is what powers windows-rs and at least one other C++ language projection of ours ( @kennykerr, anything to add?).

Here's the header in question over there: https://github.com/microsoft/win32metadata/blob/c7ffce1c97bb724580580f3856b94c6d4ad677ed/generation/WinSDK/RecompiledIdlHeaders/um/UserEnv.h#L375

The SDK itself is also freely available, but not necessarily super accessible if you don't have a Windows machine, since it comes as an installer and a bunch of compressed files: https://developer.microsoft.com/en-us/windows/downloads/windows-sdk/

  • RE: them being normative, is something like this sufficient? https://learn.microsoft.com/en-us/cpp/code-quality/understanding-sal?view=msvc-170
    "The Microsoft source-code annotation language (SAL) provides a set of annotations that you can use to describe how a function uses its parameters, the assumptions that it makes about them, and the guarantees that it makes when it finishes. The annotations are defined in the header file <sal.h>. Visual Studio code analysis for C++ uses SAL annotations to modify its analysis of functions. For more information about SAL 2.0 for Windows driver development, see SAL 2.0 Annotations for Windows Drivers.

Natively, C and C++ provide only limited ways for developers to consistently express intent and invariance. By using SAL annotations, you can describe your functions in greater detail so that developers who are consuming them can better understand how to use them."
They're explicitly designed for expressing guarantees/invariants that C or C++ do not have the ability to express.

Of course, they're an imperfect tool, and they're not always used to their fullest extent, as they're a pretty complicated thing to understand. E.g. I'm pretty sure the question we're going back and forth on boils down to "should lpProfileDir be annotated with the more precise _Out_writes_to_opt_(_Old_(*lpcchSize), *lpcchSize) rather than its current _Out_writes_to(*lpcchSize)?" (doc link: https://learn.microsoft.com/en-us/cpp/code-quality/annotating-function-parameters-and-return-values?view=msvc-170 )

To which I personally believe the answer is "yes, as that's the current behavior and as far as I know, has been for a long time, and there's no other conceivable behavior that makes sense given that lpcchSize is annotated _In_out_". But insisting on at least docs being updated or the annotation being updated would be a reasonable stance too, since technically the semantics of the current annotation may not quite guarantee what you're after, even if it's reasonably clear what's meant from context.

@ChrisDenton
Copy link
Member

However, the SAL annotations also do not specify that the return value is null terminated (edit: but the docs do!)

The type is an LPWSTR which has the _Null_terminated_ attribute so it must be null-terminated.

@kennykerr
Copy link

These older APIs can be challenging as you have to sometimes rely on some combination of the signature, SAL annotations, documentation, and observed behavior. Backward compatibility is also taken very seriously. In this case, the implementation specifically notes that the length is returned upon success for backward compatibility.

@cgettys-microsoft
Copy link
Contributor

However, the SAL annotations also do not specify that the return value is null terminated (edit: but the docs do!)

The type is an LPWSTR which has the _Null_terminated_ attribute so it must be null-terminated.

Ah, right. I'm used to working in a code base that doesn't always use LPWSTR and usually uses the z suffix instead

@cgettys-microsoft
Copy link
Contributor

Closing the loop here for completeness - Arlie Davis (@sivadeilra) and Kenny Kerr (@kennykerr) reviewed that the implementation does in fact make this guarantee, and Drew Batchelor (@drewbatgit) kindly reviewed and merged the documentation update (MicrosoftDocs/sdk-api#1810).

It'll probably take a bit to show up in the public doc pages - but it's been merged, so it will happen :).

Thanks @ChrisDenton for the contribution!

@RalfJung
Copy link
Member Author

I already have a PR up based on that. :)
rust-lang/rust#141405

@cgettys-microsoft
Copy link
Contributor

I saw a second after I replied here :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants