Skip to content

Add CStr::bytes iterator #104353

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 14, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Reduce unsafe code, use more NonNull APIs per @cuviper review
  • Loading branch information
clarfonthey committed Mar 12, 2024
commit a38a556ceb4b65c397a47bcbe35d004bc9b8626f
20 changes: 11 additions & 9 deletions library/core/src/ffi/c_str.rs
Original file line number Diff line number Diff line change
Expand Up @@ -507,6 +507,13 @@ impl CStr {
self.inner.as_ptr()
}

/// We could eventually expose this publicly, if we wanted.
#[inline]
#[must_use]
const fn as_non_null_ptr(&self) -> NonNull<c_char> {
NonNull::from(&self.inner).as_non_null_ptr()
}

Comment on lines +510 to +516
Copy link
Contributor Author

@clarfonthey clarfonthey Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided to put this because it seems like something we'd probably add in the future, but I don't really feel like doing an ACP for this right now. It felt reasonable enough to factor into its own method, at least.

I thought about why as_ptr uses addr_of! instead of a safe version (since, well, we require a reference to &self to make this work, and thus any potential UB arguments don't make sense) and concluded that we don't need to here, but if I'm wrong about that, please let me know!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you're referring to -- CStr::as_ptr() just calls the slice as_ptr(), which is just a couple pointer casts. I only found addr_of! used in Rc/Arc::as_ptr, and those are digging into a further heap allocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, I totally mixed up the implementations for as_ptr and to_bytes_with_nul. The latter uses addr_of to convert the slice into *const [u8] instead of the existing method, but that's going to be unsafe anyway, so, it's fair.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I don't see any reason it's needed there either, just a little bit shorter than it was with double pointer casts before addr_of!. No matter.

/// Returns the length of `self`. Like C's `strlen`, this does not include the nul terminator.
///
/// > **Note**: This method is currently implemented as a constant-time
Expand Down Expand Up @@ -776,20 +783,15 @@ pub struct Bytes<'a> {
impl<'a> Bytes<'a> {
#[inline]
fn new(s: &'a CStr) -> Self {
Self {
// SAFETY: Because we have a valid reference to the string, we know
// that its pointer is non-null.
ptr: unsafe { NonNull::new_unchecked(s.as_ptr() as *const u8 as *mut u8) },
phantom: PhantomData,
}
Self { ptr: s.as_non_null_ptr().cast(), phantom: PhantomData }
}

#[inline]
fn is_empty(&self) -> bool {
// SAFETY: We uphold that the pointer is always valid to dereference
// by starting with a valid C string and then never incrementing beyond
// the nul terminator.
unsafe { *self.ptr.as_ref() == 0 }
unsafe { self.ptr.read() == 0 }
}
}

Expand All @@ -806,11 +808,11 @@ impl Iterator for Bytes<'_> {
// it and assume that adding 1 will create a new, non-null, valid
// pointer.
unsafe {
let ret = *self.ptr.as_ref();
let ret = self.ptr.read();
if ret == 0 {
None
} else {
self.ptr = NonNull::new_unchecked(self.ptr.as_ptr().offset(1));
self.ptr = self.ptr.offset(1);
Some(ret)
}
}
Expand Down