
std::io: vectored reads with uninitialized memory (Read::read_buf_vec) #104

Closed
@nrc

Proposal

This proposal covers a part of RFC 2930 (read_buf) that the RFC did not specify in detail.

Problem statement

Implement Read::read_buf_vec.

Motivation, use-cases

Vectored IO without having to initialise all the buffers first.

Solution sketches

This mostly follows BorrowedBuf/BorrowedCursor, extended to slices for vectored IO. We need a new version of IoSliceMut which does not assume that all data in it is initialised. Note that this is an improvement over the read_vec API because the underlying buffers can be reused.

An alternative would be to use the old single buffer design, but this has problems with a complex API and a soundness footgun. We could also pass a slice of BorrowedCursors, but this is pretty horrible ergonomically.
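For context, here is a minimal sketch of the vectored read that is available on stable today, `Read::read_vectored` with `IoSliceMut`. It only illustrates the motivation: both buffers have to be zeroed before the call, even though the reader overwrites them. The helper name `read_two_chunks` is made up for this example and is not part of the proposal.

```rust
use std::io::{IoSliceMut, Read};

/// Reads from `src` into two 4-byte buffers with the stable vectored API.
/// Returns the number of bytes read together with both buffers.
pub fn read_two_chunks(mut src: &[u8]) -> std::io::Result<(usize, [u8; 4], [u8; 4])> {
    // Both buffers must be fully initialised (zeroed here) before the call,
    // because `IoSliceMut` wraps a `&mut [u8]`.
    let mut a = [0u8; 4];
    let mut b = [0u8; 4];
    let n = {
        let mut bufs = [IoSliceMut::new(&mut a), IoSliceMut::new(&mut b)];
        // `&[u8]` implements `Read`, so it serves as an in-memory source.
        src.read_vectored(&mut bufs)?
    };
    Ok((n, a, b))
}
```

The API proposed below is meant to let the caller skip the zeroing step by passing buffers of `MaybeUninit<u8>` instead.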

/// A buffer type used with `Read::read_buf_vectored`. Unlike `IoSliceMut`, there is no guarantee
/// that its memory has been initialised.
///
/// It is semantically a wrapper around an &mut [MaybeUninit<u8>], but is guaranteed to be ABI
/// compatible with the `iovec` type on Unix platforms and WSABUF on Windows.
pub struct IoSliceMaybeUninit<'a> {}

impl Debug for IoSliceMaybeUninit<'_> {}

/// Create a new `IoSliceMaybeUninit` from an uninitialized buffer.
impl<'a> From<&'a mut [MaybeUninit<u8>]> for IoSliceMaybeUninit<'a> {
    fn from(buf: &'a mut [MaybeUninit<u8>]) -> IoSliceMaybeUninit<'a> {}
}

impl<'a> IoSliceMaybeUninit<'a> {
    /// Create a new `IoSliceMaybeUninit` from an existing mutable slice of `u8`s.
    ///
    /// # Safety
    ///
    /// All bytes in the slice must be initialized by the time the `IoSliceMaybeUninit` is
    /// destroyed and thus access to the data is restored for `slice` or other views of the data.
    pub unsafe fn from_slice(slice: &'a mut [u8]) -> IoSliceMaybeUninit<'a> {}

    /// Create a new IoSliceMaybeUninit with its internal cursor advanced.
    pub fn advance(&self, n: usize) -> Self {}

    /// View the slice as a slice of `u8`s.
    ///
    /// # Safety
    ///
    /// The caller must ensure that all elements of the slice have been initialized.
    pub unsafe fn as_slice(&self) -> &[u8] {}

    /// View the slice as a mutable slice of `u8`s.
    ///
    /// # Safety
    ///
    /// The caller must ensure that all elements of the slice have been initialized.
    pub unsafe fn as_mut_slice(&mut self) -> &mut [u8] {}

    /// View the slice as a mutable slice of `MaybeUninit<u8>`.
    pub fn as_maybe_init_slice(&mut self) -> &mut [MaybeUninit<u8>] {}

    /// Returns the number of elements in the slice.
    pub fn len(&self) -> usize {}
}
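
To make the "wrapper around `&mut [MaybeUninit<u8>]`" description concrete, the following is a rough sketch using only stable `std` items. `SketchIoSlice` and `fill_and_read` are hypothetical names for illustration. Unlike the proposed type, the sketch makes no ABI guarantee with respect to `iovec` or `WSABUF`, and it covers only a few of the methods listed above.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for the proposed `IoSliceMaybeUninit`: a plain
/// wrapper around `&mut [MaybeUninit<u8>]` with no ABI guarantees.
pub struct SketchIoSlice<'a> {
    buf: &'a mut [MaybeUninit<u8>],
}

impl<'a> From<&'a mut [MaybeUninit<u8>]> for SketchIoSlice<'a> {
    fn from(buf: &'a mut [MaybeUninit<u8>]) -> Self {
        SketchIoSlice { buf }
    }
}

impl<'a> SketchIoSlice<'a> {
    /// Number of bytes in the slice, initialised or not.
    pub fn len(&self) -> usize {
        self.buf.len()
    }

    /// Safe: a `MaybeUninit<u8>` view makes no claim about initialisation.
    pub fn as_maybe_init_slice(&mut self) -> &mut [MaybeUninit<u8>] {
        &mut *self.buf
    }

    /// # Safety
    ///
    /// Every byte of the slice must have been initialised.
    pub unsafe fn as_slice(&self) -> &[u8] {
        // SAFETY: `MaybeUninit<u8>` has the same layout as `u8`, and the
        // caller guarantees that all `len()` bytes are initialised.
        unsafe { std::slice::from_raw_parts(self.buf.as_ptr().cast::<u8>(), self.buf.len()) }
    }
}

/// Fills an uninitialised 4-byte stack buffer through the wrapper and
/// copies the result out.
pub fn fill_and_read() -> [u8; 4] {
    let mut storage = [MaybeUninit::<u8>::uninit(); 4];
    let mut slice = SketchIoSlice::from(&mut storage[..]);
    for (i, byte) in slice.as_maybe_init_slice().iter_mut().enumerate() {
        byte.write(i as u8 + 1);
    }
    // SAFETY: the loop above wrote every byte of the slice.
    let bytes = unsafe { slice.as_slice() };
    let mut out = [0u8; 4];
    out.copy_from_slice(bytes);
    out
}
```

The split between a safe `MaybeUninit` view and an `unsafe` `u8` view mirrors the signatures above: reading bytes as `u8` is only sound once they have been written.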

/// A borrowed byte buffer, consisting of multiple underlying buffers, which is incrementally filled
/// and initialized. Primarily designed for vectored IO.
///
/// This type is a sort of "double cursor". It tracks three regions in the buffer: a region at the beginning of the
/// buffer that has been logically filled with data, a region that has been initialized at some point but not yet
/// logically filled, and a region at the end that is fully uninitialized. The filled region is guaranteed to be a
/// subset of the initialized region.
///
/// In summary, the contents of the buffer can be visualized as:
/// ```not_rust
/// [   |   |      |    |   |       |   ] Underlying buffers
/// [             capacity              ]
/// [ filled |         unfilled         ]
/// [    initialized    | uninitialized ]
/// ```
///
/// A `BorrowedSliceBuf` is created around some existing data (or capacity for data) via a unique reference
/// (`&mut`). The `BorrowedSliceBuf` can be configured (e.g., using `clear` or `set_init`), but otherwise
/// is read-only. To write into the buffer, use `unfilled` to create a `BorrowedSliceCursor`. The cursor
/// has write-only access to the unfilled portion of the buffer (you can think of it as a
/// write-only iterator).
///
/// The lifetime `'a` is a bound on the lifetime of the underlying data.
pub struct BorrowedSliceBuf<'a> {}

impl<'a> BorrowedSliceBuf<'a> {
    /// Create a new `BorrowedSliceBuf` from a slice of possibly initialized io slices.
    pub fn new<'b: 'a>(bufs: &'a mut [IoSliceMaybeUninit<'b>]) -> BorrowedSliceBuf<'a> {}

    /// Returns the length of the filled part of the buffer.
    pub fn len(&self) -> usize {}

    /// Returns the number of completely filled slices in the buffer.
    pub fn len_filled_slices(&self) -> usize {}

    /// Returns the number of filled elements in any partially filled slice.
    ///
    /// If there are no partially filled slices, then this method returns `0`.
    pub fn len_partial_filled_slice(&self) -> usize {}

    /// Iterate over the filled portion of the buffer.
    pub fn iter_filled_slices(&self) -> FilledSliceIterator<'_, 'a> {}

    /// Returns a cursor over the unfilled part of the buffer.
    pub fn unfilled<'this>(&'this mut self) -> BorrowedSliceCursor<'this> {}

    /// Clears the buffer, resetting the filled region to empty.
    ///
    /// The number of initialized bytes is not changed, and the contents of the buffer are not modified.
    pub fn clear(&mut self) -> &mut Self {}

    /// Asserts that a prefix of the underlying buffers is initialized. The initialized prefix is
    /// all of the first `b - 1` buffers and the first `n` bytes of the `b`th buffer. In other words,
    /// `(b, n)` are the coordinates of the first uninitialized byte in the buffers.
    ///
    /// `BorrowedSliceBuf` assumes that bytes are never de-initialized, so this method does nothing when called with fewer
    /// bytes than are already known to be initialized.
    ///
    /// # Safety
    ///
    /// The caller must ensure that all of the `(b, n)` prefix has already been initialized.
    pub unsafe fn set_init(&mut self, b: usize, n: usize) -> &mut Self {}
}
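
As an illustration of how `len`, `len_filled_slices` and `len_partial_filled_slice` could relate to each other, the sketch below splits a single filled byte count across buffers of known lengths. This is an assumption about one possible bookkeeping scheme, not a description of the draft implementation, and `split_filled` is a hypothetical helper.

```rust
/// Splits a total `filled` byte count across buffers with the given lengths.
///
/// Returns `(completely_filled_buffers, bytes_in_partially_filled_buffer)`.
/// Panics if `filled` exceeds the combined length of all buffers.
pub fn split_filled(buf_lens: &[usize], mut filled: usize) -> (usize, usize) {
    let capacity: usize = buf_lens.iter().sum();
    assert!(filled <= capacity, "filled exceeds total capacity");

    let mut full = 0;
    for &len in buf_lens {
        // Stop at the first buffer that the remaining bytes do not cover.
        if filled == 0 || filled < len {
            break;
        }
        filled -= len;
        full += 1;
    }
    // Whatever is left over sits at the start of the next buffer.
    (full, filled)
}
```

For buffers of 4, 4 and 8 bytes with 6 bytes filled, the first buffer is complete and the second holds 2 filled bytes, so the pair is `(1, 2)`.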

/// A writeable view of the unfilled portion of a [`BorrowedSliceBuf`](BorrowedSliceBuf).
///
/// Provides access to the initialized and uninitialized parts of the underlying `BorrowedSliceBuf`.
/// Data can be written directly to the cursor by using [`append`](BorrowedSliceCursor::append) or
/// indirectly by writing into a view of the cursor (obtained by calling `as_mut`, `next_init_mut`,
/// etc.) and then calling `advance`.
///
/// Once data is written to the cursor, it becomes part of the filled portion of the underlying
/// `BorrowedSliceBuf` and can no longer be accessed or re-written by the cursor. I.e., the cursor tracks
/// the unfilled part of the underlying `BorrowedSliceBuf`.
///
/// The lifetime `'a` is a bound on the lifetime of the underlying data.
pub struct BorrowedSliceCursor<'a> {}

impl<'a> BorrowedSliceCursor<'a> {
    /// Reborrow this cursor by cloning it with a shorter lifetime.
    ///
    /// Since a cursor maintains unique access to its underlying buffer, the original cursor is not
    /// accessible while the reborrowed cursor is alive.
    pub fn reborrow<'this>(&'this mut self) -> BorrowedSliceCursor<'this> {}

    /// Returns the available space in the cursor.
    pub fn capacity(&self) -> usize {}

    /// Returns the number of bytes written to this cursor since it was created from a `BorrowedSliceBuf`.
    ///
    /// Note that if this cursor is a reborrow of another, then the count returned is the count written
    /// via either cursor, not the count since the cursor was reborrowed.
    pub fn written(&self) -> usize {}

    /// Returns a mutable reference to the whole cursor.
    ///
    /// Returns a guard type which dereferences to a `&mut [IoSliceMaybeUninit<'a>]`.
    ///
    /// # Safety
    ///
    /// The caller must not uninitialize any bytes in the initialized portion of the cursor.
    pub unsafe fn as_mut<'this>(&'this mut self) -> BorrowedSliceGuard<'this, 'a> {}

    /// Returns a shared reference to the initialized portion of the first (at least partially)
    /// initialised buffer in the cursor.
    ///
    /// Returns a reference to a slice of a single underlying buffer. That buffer will be the first
    /// unfilled buffer which is at least partially initialized. The returned slice is the part of
    /// that buffer which is initialized. If there is no part of any buffer which is both unfilled
    /// and initialised, then this method returns `None`.
    ///
    /// Does not iterate over buffers in any way. Calling this method multiple times will return
    /// the same slice unless data is either filled or initialized (e.g., by calling `advance`).
    pub fn next_init_ref(&self) -> Option<&[u8]> {}

    /// Returns a mutable reference to the initialized portion of the first (at least partially)
    /// initialised buffer in the cursor.
    ///
    /// Returns a reference to a slice of a single underlying buffer. That buffer will be the first
    /// unfilled buffer which is at least partially initialized. The returned slice is the part of
    /// that buffer which is initialized. If there is no part of any buffer which is both unfilled
    /// and initialised, then this method returns `None`.
    ///
    /// Does not iterate over buffers in any way. Calling this method multiple times will return
    /// the same slice unless data is either filled or initialized (e.g., by calling `advance`).
    pub fn next_init_mut(&mut self) -> Option<&mut [u8]> {}

    /// Returns a mutable reference to the uninitialized portion of the first (at least partially)
    /// uninitialised buffer in the cursor.
    ///
    /// It is safe to uninitialize any of these bytes.
    ///
    /// Returns `None` if `self` is entirely initialized.
    ///
    /// Does not iterate over buffers in any way. Calling this method multiple times will return
    /// the same slice unless data is either filled or initialized (e.g., by calling `advance`).
    pub fn next_uninit_mut(&mut self) -> Option<&mut [MaybeUninit<u8>]> {}

    /// Returns a mutable reference to the first buffer in the cursor.
    ///
    /// Returns `None` if `self` is empty.
    ///
    /// Does not iterate over buffers in any way. Calling this method multiple times will return
    /// the same slice unless data is filled (e.g., by calling `advance`).
    ///
    /// # Safety
    ///
    /// The caller must not uninitialize any bytes in the initialized portion of the cursor.
    pub unsafe fn next_mut(&mut self) -> Option<&mut [MaybeUninit<u8>]> {}

    /// Initializes all bytes in the cursor.
    pub fn ensure_init(&mut self) -> &mut Self {}

    /// Initializes all bytes in the first (at least partially unfilled) buffer in the cursor.
    pub fn ensure_next_init(&mut self) -> &mut Self {}

    /// Advance the cursor by asserting that `n` bytes have been filled.
    ///
    /// After advancing, the `n` bytes are no longer accessible via the cursor and can only be
    /// accessed via the underlying `BorrowedSliceBuf`. I.e., the `BorrowedSliceBuf`'s filled portion
    /// grows by `n` elements and its unfilled portion (and the capacity of this cursor) shrinks by
    /// `n` elements.
    ///
    /// # Safety
    ///
    /// The caller must ensure that the first `n` bytes of the cursor have been properly
    /// initialised.
    pub unsafe fn advance(&mut self, mut n: usize) -> &mut Self {}

    /// Appends data to the cursor, advancing position within its buffer.
    ///
    /// # Panics
    ///
    /// Panics if `self.capacity()` is less than `buf.len()`.
    pub fn append(&mut self, mut buf: &[u8]) {}

    /// Sets the initialized region to the maximum of the currently initialized region
    /// and the filled region.
    ///
    /// The caller must ensure all invariants for `self.bufs.filled` are satisfied (see the field definition).
    ///
    /// # Safety
    ///
    /// The caller must ensure that the filled region is entirely initialized.
    unsafe fn update_init_to_filled(&mut self) {}
}

impl<'a> Write for BorrowedSliceCursor<'a> {
    fn write(&mut self, buf: &[u8]) -> Result<usize> {}
    fn flush(&mut self) -> Result<()> {}
}
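
The behaviour of `append` when the data spans a buffer boundary can be sketched with plain slices of `MaybeUninit<u8>`. The code below is an assumption about the intended semantics (copy into the unfilled region in order, panic when capacity is insufficient). It tracks the filled position as a single byte count and does not model the initialized region. `append_across` and `demo_append` are hypothetical names.

```rust
use std::mem::MaybeUninit;

/// Copies `data` into `bufs`, starting at byte offset `filled` counted across
/// all buffers, and returns the new filled count.
///
/// Panics if the remaining capacity is smaller than `data.len()`.
pub fn append_across(
    bufs: &mut [&mut [MaybeUninit<u8>]],
    filled: usize,
    mut data: &[u8],
) -> usize {
    let capacity: usize = bufs.iter().map(|b| b.len()).sum();
    assert!(
        filled <= capacity && data.len() <= capacity - filled,
        "not enough capacity"
    );
    let new_filled = filled + data.len();

    let mut skip = filled; // bytes still to skip before the unfilled region
    for buf in bufs.iter_mut() {
        if data.is_empty() {
            break;
        }
        if skip >= buf.len() {
            // This buffer is already completely filled.
            skip -= buf.len();
            continue;
        }
        let dst = &mut buf[skip..];
        skip = 0;
        let n = dst.len().min(data.len());
        for (slot, &byte) in dst[..n].iter_mut().zip(&data[..n]) {
            slot.write(byte);
        }
        data = &data[n..];
    }
    new_filled
}

/// Appends "ab" and then "cdef" into two 3-byte buffers and returns them.
pub fn demo_append() -> ([u8; 3], [u8; 3]) {
    let mut a = [MaybeUninit::<u8>::uninit(); 3];
    let mut b = [MaybeUninit::<u8>::uninit(); 3];
    let filled = {
        let mut bufs = [&mut a[..], &mut b[..]];
        let filled = append_across(&mut bufs, 0, b"ab");
        append_across(&mut bufs, filled, b"cdef")
    };
    assert_eq!(filled, 6);
    // SAFETY: the two appends above wrote all six bytes.
    unsafe { (a.map(|x| x.assume_init()), b.map(|x| x.assume_init())) }
}
```

The second call starts in the middle of the first buffer and continues into the second one, which is the case a single-buffer cursor never has to handle.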

/// An iterator over the filled slices of a `BorrowedSliceBuf`.
///
/// See `BorrowedSliceBuf::iter_filled_slices`.
pub struct FilledSliceIterator<'buf, 'data> {}

impl<'buf, 'data> Iterator for FilledSliceIterator<'buf, 'data> {
    type Item = &'buf [u8];
    fn next(&mut self) -> Option<&'buf [u8]> {}
}

/// Guard type used by `BorrowedSliceCursor::as_mut`.
///
/// Presents a view of the cursor containing only the unfilled data (via the `Deref` impls). Also
/// resets the state of the underlying BorrowedSliceBuf to a view of the complete
/// buffer when dropped.
pub struct BorrowedSliceGuard<'buf, 'data> {}

impl<'buf, 'data> Drop for BorrowedSliceGuard<'buf, 'data> {}
impl<'buf, 'data> Deref for BorrowedSliceGuard<'buf, 'data> {
    type Target = [IoSliceMaybeUninit<'data>];
    fn deref(&self) -> &[IoSliceMaybeUninit<'data>] {}
}
impl<'buf, 'data> DerefMut for BorrowedSliceGuard<'buf, 'data> {}

Links and related work

Draft PR with incomplete implementation: rust-lang/rust#101842

Labels: ACP-accepted (API Change Proposal is accepted: seconded with no objections), T-libs-api, api-change-proposal (a proposal to add or alter unstable APIs in the standard libraries)