Skip to content

ACP: ref access to core::net::Ip*Addr octet bytes, impl AsRef<[u8]> for core::net::Ip*Addr #535

Closed
@mammothbane

Description

@mammothbane

Proposal

Problem statement

IpAddr, Ipv4Addr, and Ipv6Addr don't provide reference access to their contents, while they do provide an array copy via .octets(). This is a problem in situations where we want the Ip*Addr to own and support a lifetime for the address octets as a slice — &[u8] is very common in function arguments, as is AsRef<[u8]>, for instance.

Current Ip*Addr types can only support a signature like fn(&[u8]) by first stack-allocating a local via .octets():

fn f(_: &[u8]);

let addr = Ipv4Addr::LOCALHOST;

// This works
let ary = addr.octets();
f(&ary); 

// Desugars to the above -- implicit stack local
f(&addr.octets());

// Does not exist
f(addr.as_ref()); 

This is workable, if awkward, in situations where the lifetime of the array is functionally equivalent to the lifetime of the Ip*Addr. However, in other cases, the lack of ref access is a problem:

static ADDR: Ipv4Addr::new(1, 2, 3, 4);

let bytes: &[u8] = {
    // This array's lifetime is limited to this scope.
    let ary = ADDR.octets();

    // We "should" be able to provide &'static access to the contents of ADDR
    // due to its lifetime, but we can't because we need to copy first -- this doesn't work.
    &ary
};

We could always carry around a copy of the contents if we know we need a reference:

static ADDR: Ipv4Addr::new(1, 2, 3, 4);
static ADDR_BYTES: [u8; 4] = ADDR.octets();

let bytes = &ADDR_BYTES;

// But that implies that we probably want something like this:
struct Ipv4AddrWithAccessibleStorage(pub Ipv4Addr, pub [u8; 4]);

// ... But this type rather absurdly duplicates the IpAddr's storage, and we would naturally implement AsRef on it
// Which begs the question why this functionality isn't available on the Ip*Addrs themselves

Motivating examples or use cases

This code is messy/WIP, but see DynAddress here. This is part of a userspace packet switch experiment that can operate at L2 or L3, and I define and provide trait DynAddress for IPv4, IPv6, Ethernet, and IEEE802.15.4 addresses. The addresses are treated as opaque bytestrings to make the switch implementation agnostic to the address representation.

Relevant code subset pulled out for readability:

pub trait DynAddress: Debug {
    fn as_bytes(&self) -> &[u8];
    // ...
}

impl DynAddress for core::net::Ipv4Addr {
    fn as_bytes(&self) -> &[u8] {
        &self.octets() // doesn't work
    }
    // ...
}

impl DynAddress for smoltcp::wire::EthernetAddress {
    fn as_bytes(&self) -> &[u8] {
        self.as_bytes() // this exists
    }
    // ...
}

Notably, I can't associate a const LEN: usize such that fn as_bytes(&self) -> [u8; LEN] because this would make DynAddress not object-safe. I also can't bound DynAddress by Hash e.g. (which I need, and which &[u8] provides) because it's not object-safe.

My code (this line specifically) used to work with smoltcp's independent Ipv4Addr and Ipv6Addr definitions, but when smoltcp switched to the core::net types, my code stopped working, because this removed the as_bytes method. I can't restore the functionality now because Ip*Addr doesn't provide ref access to the contents.

My patch solution is to add an associated type to DynAddress: type BytesRepr<'a>: AsRef<[u8]> = &[u8] where Self: 'a;, which is overridden with [u8; 4] and [u8; 16] for the Ip*Addr types.

Solution sketch

Implement (using the same byte order as .octets()):

  • pub const fn Ipv4Addr::as_array(&self) -> &[u8; 4]
  • pub const fn Ipv6Addr::as_array(&self) -> &[u8; 16]
  • pub const fn IpAddr::as_slice(&self) -> &[u8]
  • impl AsRef<[u8]> for Ipv4Addr
  • impl AsRef<[u8]> for Ipv6Addr
  • impl AsRef<[u8]> for IpAddr

Example from PR (linked below) for visibility:

impl Ipv4Addr {
    pub const fn as_array(&self) -> &[u8; 4] {
        &self.octets
    }
}

I've opened a PR with an initial implementation.

Alternatives

Deal with it

This clearly isn't such a huge problem that people are coming out of the woodwork to demand a change, so this ACP could be rejected. As my example shows, it's possible to work around this with traits, e.g. AsRef<[u8]> or Borrow<Borrowed=[u8]>.

One possible reason for this rejection would be to leave the internals of the Ip*Addr types flexible. Currently, I believe the internal representation of these types is totally undetermined by the API — the address could be stored in a string and parsed at access time repeatedly, for all the user knows. Stabilizing this feature would mean that Ipv4Addr must contain (or eventually point to, I suppose) a contiguous sequence of four bytes representing the address (ditto with Ipv6Addr / 16 bytes), such that we can construct the returned refs for the new API with a valid lifetime. I think the current internal representations are canonical, so I think it is worth doing this, but the argument bears mentioning.

If this result ends up being chosen, I'd at least like to leave a comment explaining the reasoning and linking back to this ACP in the source code.

Endianness

There's an endianness problem here — .octets() is very clear that it represents the octet (big-endian) order of the address, so a user calling this function shouldn't be confused about this. But the names as_slice, as_array, and as_ref don't make the same specification, which could possibly lead users to expect different behavior. Without bikeshedding a specific name, an alternative like .octets_ref() that represents this more clearly could be appropriate.

AsRef

AsRef also has a canonicity problem -- there can only be one AsRef<[u8]> for each type, and so using octet order is an opinionated choice. For Ipv4Addr, I think providing AsRef<[u8]> is correct, as octet order is pragmatically speaking canonical (u32 interpretations of IPv4s are little more than a historical curiosity now), but for Ipv6Addr this is less clear. Indeed, segment order is the most common interpretation, so providing AsRef<[u8]> could be ambiguous.

On the other hand, the Ipv6Addr type as currently implemented cannot provide an AsRef<[u8]> in any other order than big-endian, and if we provided as_array, that would be fixed by the API (or require internal duplicate storage in the other endianness), so this is some argument for providing AsRef.

as_slice() vs. as_array()

My initial implementation in the PR below provides as_slice() -> &[u8] instead of as_array() -> [u8; N]@scottmcm mentioned that this throws away information, but still provides the ref access, so I'm planning on changing it.

Links and related work

I've opened a PR implementing the functionality here. It currently provides as_slice rather than as_array, but I'm planning on updating that.

AsRef<[u8]> impls are pulled out here to avoid needing an FCP to land an initial implementation.

Metadata

Metadata

Assignees

No one assigned

    Labels

    ACP-acceptedAPI Change Proposal is accepted (seconded with no objections)T-libs-apiapi-change-proposalA proposal to add or alter unstable APIs in the standard libraries

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions