
Implement 32-bit widestrings for Linux/WSL #1874

Closed
@MarijnS95

Description


With recent work we've finally enabled windows-rs to run on Linux; many thanks for opening this up 🥳!

However, for my specific use case of wrapping DirectXShaderCompiler in Rust there's one more thing I need: wide-string types are prevalent in the DirectXShaderCompiler API, but they don't currently work on Linux.

Wide chars and strings are typically 32-bit; Windows, at 16-bit, is the exception rather than the rule. That is also how DXC implements them when compiled for non-Windows targets, making them incompatible (read: segfaults) with the UTF-16 "assumption" currently baked into the windows crate.

To support this format we can "simply" apply some cfg() changes that turn the already-custom BSTR / P*WSTR types into *mut u32-carrying structures, with an API to match. In my experiment I have simply written:

```rust
#[cfg(windows)]
type WCHAR = u16;
#[cfg(not(windows))]
type WCHAR = u32;
```
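
The string wrappers would then follow the same cfg; roughly like this (a sketch only, not the actual windows-rs definitions):

```rust
// Sketch: the existing wrappers carrying the platform-dependent WCHAR
// alias from above instead of a hard-coded u16.
#[repr(transparent)]
pub struct PWSTR(pub *mut WCHAR);

#[repr(transparent)]
pub struct PCWSTR(pub *const WCHAR);
```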

Problem 1: converting to and from str/String

In a previous (unaccepted) contribution I opted to use widestring, since it implements everything we need. That was (understandably) frowned upon, as the windows crate should be the base for many other crates rather than depend on them (and should not pull in many dependencies in any case).

The widestring implementation simply uses .chars() to acquire an iterator over char, which is a Unicode scalar value and therefore maps one-to-one onto UTF-32 code units. It works and makes sense, but I'm no Unicode expert and don't know if this is the right way to implement it.
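
For illustration, a minimal round trip along those lines (function names here are mine, not widestring's or windows-rs's API):

```rust
// Encode a &str as a null-terminated UTF-32 buffer: each Rust `char` is a
// Unicode scalar value, which is exactly one UTF-32 code unit.
fn encode_utf32(s: &str) -> Vec<u32> {
    s.chars().map(|c| c as u32).chain(std::iter::once(0)).collect()
}

// Decode back, stopping at the null terminator and rejecting invalid
// code units (surrogates or values above U+10FFFF).
fn decode_utf32(units: &[u32]) -> Option<String> {
    units
        .iter()
        .take_while(|&&u| u != 0)
        .map(|&u| char::from_u32(u))
        .collect()
}
```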

Problem 2: Allocations and length

Current implementations use HeapAlloc + GetProcessHeap + HeapFree to store owned strings. As per #1842 we can leave it up to users on non-Windows machines to provide fallbacks for these. However, it's probably more efficient and friendlier to provide direct fallbacks behind a cfg() in the windows crate (i.e. inside heap_alloc and heap_free, or even inside the string implementations).
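
As a rough idea of what such a fallback could look like (purely a sketch, assuming a libc dependency on non-Windows targets, which may or may not be acceptable), malloc/free conveniently do not need the allocation size on free:

```rust
// Hypothetical non-Windows fallbacks for the internal heap helpers.
#[cfg(not(windows))]
unsafe fn heap_alloc(bytes: usize) -> *mut libc::c_void {
    libc::malloc(bytes)
}

#[cfg(not(windows))]
unsafe fn heap_free(ptr: *mut libc::c_void) {
    // free() does not require the allocation size, sidestepping Problem 2.
    libc::free(ptr)
}
```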

Regardless, this API is annoying (or impossible) to use correctly, since most if not all Rust allocation APIs need to know the size of the allocation they're freeing. This includes a theoretical Box<[WCHAR]> (where the slice has a dynamic size), and std::alloc::dealloc requires the same via its Layout argument. Passing a size of 0 seems to work (the underlying allocator on my machine doesn't care about it), but I don't think that's a fair assumption/workaround to rely on.

Perhaps Param::Boxed can and should store this length? The string types could also have a .len() function that computes it at runtime from the null terminator (or from the BSTR length prefix).
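
A sketch of that runtime-length idea (hypothetical helper names, assuming the string was allocated through std::alloc with exactly length + 1 units), which would also give std::alloc::dealloc the Layout it needs:

```rust
use std::alloc::{dealloc, Layout};

// Count WCHAR units up to (not including) the null terminator;
// the wide-string equivalent of wcslen. Uses the WCHAR alias from above.
unsafe fn wide_len(mut ptr: *const WCHAR) -> usize {
    let mut len = 0;
    while *ptr != 0 {
        len += 1;
        ptr = ptr.add(1);
    }
    len
}

// Free an owned, null-terminated wide string allocated via std::alloc.
unsafe fn free_wide(ptr: *mut WCHAR) {
    let units = wide_len(ptr) + 1; // include the terminator
    let layout = Layout::array::<WCHAR>(units).unwrap();
    dealloc(ptr.cast::<u8>(), layout);
}
```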
