
Derived serialization of Cow<'a, [u8]> is not always reversible #2940

@E-Mans-Application

Hello,

Currently, the automatically derived Deserialize implementation (which uses serde::private::de::borrow_cow_bytes) can only deserialize Cow<'_, [u8]> from an actual byte slice (either a str or a [u8]), not from a "sequence".

As a result, it is not possible to decode bytes from a JSON array. Since byte slices (including non-UTF-8 ones) serialize to a JSON array, serialization is not reversible.

See this minimal reproducible example:

use serde::{Deserialize, Serialize};
use std::borrow::Cow;

#[derive(Serialize, Deserialize, Debug, PartialEq, Eq)]
struct Foo<'a>(
    #[serde(borrow)]
    Cow<'a, [u8]>,
);

#[test]
fn test() {
    let foo = Foo(Cow::Owned(vec![0, 159, 146, 150])); // Invalid UTF-8
    let json = serde_json::to_string(&foo).unwrap();
    println!("{json}"); // [0,159,146,150]

    let foo2 = serde_json::from_str(&json).unwrap();
    // called `Result::unwrap()` on an `Err` value: Error("invalid type: sequence, expected a byte array", line: 1, column: 1)

    assert_eq!(foo, foo2);
}

Conceptually, I'm not sure whether the issue lies in serde_json or in serde itself. Either way, it seems easy to fix directly in serde by overriding visit_seq in the implementation of Visitor for CowBytesVisitor (private API):

fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
where
    A: serde::de::SeqAccess<'a>,
{
    let mut bytes = Vec::with_capacity(seq.size_hint().unwrap_or_default());
    while let Some(byte) = seq.next_element()? {
        bytes.push(byte);
    }
    self.visit_byte_buf(bytes)
}
