Description
Internally, String is implemented as a Vec<u8>. However, CString is implemented as a Box<[u8]>.
This means that CString is currently treated as noalias, like Box (but not Vec) is. (See rust-lang/unsafe-code-guidelines#326.) As a result, there are additional rules on how pointers into the CString buffer can be used, and violating these rules leads to UB. As far as I can tell, the fact that this applies to CString is not documented anywhere.
I don't know if this has produced any issues in practice, but I believe that this is an undesirable footgun that should be eliminated (e.g., by changing CString to use NonNull<[u8]> instead), especially since CString is intended to be turned into a pointer and passed to FFI, potentially resulting in subtle UB that Miri can't detect.
For example, consider the following code, which is a stand-in for how a program with C FFI might use CString:
use std::ffi::{CString, c_char, CStr};

fn main() {
    let my_string = CString::new("abc").unwrap();
    let ptr: *const c_char = my_string.as_ptr();
    // Suppose that `ptr` is given to some C FFI function here
    let _moved_my_string = my_string;
    // Suppose that `ptr` is re-obtained from C FFI here
    let _my_str = unsafe { CStr::from_ptr(ptr) };
}
This code might seem intuitively fine to many Rust users, but it causes UB according to Miri.
Error from Miri
error: Undefined Behavior: attempting a read access using <2513> at alloc1005[0x0], but that tag does not exist in the borrow stack for this location
--> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ffi/c_str.rs:740:22
|
740 | unsafe { strlen(s) }
| ^^^^^^^^^
| |
| attempting a read access using <2513> at alloc1005[0x0], but that tag does not exist in the borrow stack for this location
| this error occurs as part of an access at alloc1005[0x0..0x1]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <2513> was created by a SharedReadOnly retag at offsets [0x0..0x4]
--> src/main.rs:5:30
|
5 | let ptr: *const c_char = my_string.as_ptr();
| ^^^^^^^^^^^^^^^^^^
help: <2513> was later invalidated at offsets [0x0..0x4] by a Unique retag (of a reference/box inside this compound value)
--> src/main.rs:7:28
|
7 | let _moved_my_string = my_string;
| ^^^^^^^^^
= note: BACKTRACE (of the first span):
= note: inside `core::ffi::c_str::strlen::runtime` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ffi/c_str.rs:740:22: 740:31
= note: inside `core::ffi::c_str::strlen` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics/mod.rs:3886:9: 3886:61
= note: inside `std::ffi::CStr::from_ptr::<'_>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ffi/c_str.rs:267:28: 267:39
note: inside `main`
--> src/main.rs:9:28
|
9 | let _my_str = unsafe { CStr::from_ptr(ptr) };
| ^^^^^^^^^^^^^^^^^^^
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace
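For comparison, the documented CString::into_raw / CString::from_raw pair hands ownership to the raw pointer, so there is no live CString left to move while the pointer is outstanding. The following is a minimal sketch of that pattern; round_trip_via_raw is a helper name made up for this illustration, and the code has not been run under Miri here.

```rust
use std::ffi::{c_char, CStr, CString};

/// Hypothetical helper: passes a string through a raw pointer and back.
/// Panics if `s` contains an interior nul byte.
fn round_trip_via_raw(s: &str) -> Vec<u8> {
    let my_string = CString::new(s).unwrap();
    // Give up ownership: the buffer is now reachable only through `ptr`.
    let ptr: *mut c_char = my_string.into_raw();
    // Suppose that `ptr` is given to some C FFI function here
    // Suppose that `ptr` is re-obtained from C FFI here
    // SAFETY: `ptr` came from `CString::into_raw` and has not been freed.
    let bytes = unsafe { CStr::from_ptr(ptr) }.to_bytes().to_vec();
    // SAFETY: `ptr` came from `CString::into_raw` and is reclaimed exactly once,
    // so the buffer is freed when the reclaimed `CString` is dropped.
    drop(unsafe { CString::from_raw(ptr) });
    bytes
}
```

Calling round_trip_via_raw("abc") returns the three bytes of "abc" (without the nul terminator). This does not remove the footgun; it only avoids the as_ptr-then-move sequence that triggers the error above.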
(It's debatable whether CString not having extra capacity is good or not, but that's a separate issue.)
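The NonNull<[u8]> alternative suggested above can be sketched roughly as follows. RawCString and read_after_move are hypothetical names invented for this illustration and are not the actual or proposed std layout. The sketch assumes that a raw pointer field is not retagged as unique when its owner is moved, and it has not been run under Miri here.

```rust
use std::ffi::{c_char, CStr};
use std::ptr::NonNull;

/// Hypothetical stand-in for a `CString` whose buffer is stored as a raw
/// `NonNull<[u8]>` instead of a `Box<[u8]>`.
struct RawCString {
    buf: NonNull<[u8]>,
}

impl RawCString {
    /// Copies `s` and appends a nul terminator; `None` on an interior nul.
    fn new(s: &str) -> Option<Self> {
        if s.as_bytes().contains(&0) {
            return None;
        }
        let mut bytes = Vec::with_capacity(s.len() + 1);
        bytes.extend_from_slice(s.as_bytes());
        bytes.push(0);
        let raw: *mut [u8] = Box::into_raw(bytes.into_boxed_slice());
        // SAFETY: `Box::into_raw` never returns a null pointer.
        Some(Self { buf: unsafe { NonNull::new_unchecked(raw) } })
    }

    fn as_ptr(&self) -> *const c_char {
        self.buf.cast::<c_char>().as_ptr() as *const c_char
    }
}

impl Drop for RawCString {
    fn drop(&mut self) {
        // SAFETY: `buf` came from `Box::into_raw` in `new` and is freed once.
        unsafe { drop(Box::from_raw(self.buf.as_ptr())) };
    }
}

/// Mirrors the example from the report: take a pointer, move the owner,
/// then read through the pointer while the moved owner is still alive.
fn read_after_move(s: &str) -> Option<Vec<u8>> {
    let my_string = RawCString::new(s)?;
    let ptr = my_string.as_ptr();
    let _moved_my_string = my_string;
    // SAFETY: the buffer is nul-terminated and still owned by `_moved_my_string`.
    let bytes = unsafe { CStr::from_ptr(ptr) }.to_bytes().to_vec();
    Some(bytes)
}
```

Here read_after_move("abc") returns Some of the bytes of "abc". The design point is only that the owner holds a raw pointer, so moving it is a plain copy of that pointer rather than a move of a Box.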
@rustbot labels +A-FFI +A-box +T-libs-api +T-opsem
Meta
The Miri error was from running the code on the playground with Rust version 1.86.0-nightly (2025-02-08 43ca9d18e333797f0aa3).