
@Abestanis Abestanis commented Nov 12, 2025

This is a new attempt to add serialization support for maps and flattened fields, since #223 seems to have stalled out. This adds checks to ensure a stable column order.

Why

There are multiple reasons:

  • To support a row that is made up of nested structs. Sometimes you have data in a struct that is used throughout the program, and you want to write it to CSV with an additional column. Currently you would have to create two structs and copy all members of the first struct into the second, which is annoying for structs with a large number of members:

    Example without this PR
    struct Data {
        data1: f64,
        // .. // Lots more fields
        data_n: f64,
    }
    
    #[derive(Serialize)]
    struct Row {
        time: u32, // One extra field for the csv, then a copy of all of the other fields.
        data1: f64,
        // .. // Lots more fields
        data_n: f64,
    }
    
    impl Row {
        fn from_data(time: u32, value: Data) -> Self {
            Self {
                time,
                data1: value.data1,
                // .. // Lots more fields
                data_n: value.data_n,
            }
        }
    }
    Example with this PR
    #[derive(Serialize)]
    struct Data {
        data1: f64,
        // .. // Lots more fields
        data_n: f64,
    }
    
    #[derive(Serialize)]
    struct Row {
        time: u32, // One extra field for the csv.
        #[serde(flatten)]
        data: Data,
    }
    
    impl Row {
        fn from_data(time: u32, data: Data) -> Self {
            Self { time, data }
        }
    }
  • To support dynamic columns that depend on configuration not known at compile time, in addition to common fields that are known at compile time.

    Example
    #[derive(Serialize)]
    struct Row {
        data1: f64,
        // .. // More fields
        #[serde(flatten)]
        extra_fields: BTreeMap<String, f64>,
    }
  • To achieve symmetry with the deserializer, which already supports the serde(flatten) attribute; it is surprising that serialization does not.

How

This implementation focuses on minimizing the impact on serialization performance. Adding serialization support on its own does not add any overhead and is implemented in the first commit (30cc9a0). The required steps are as follows:

  • When encountering a map or a struct with a flattened member, serde calls serialize_map. Similar to serialize_struct, we check that we are not already in the process of serializing a row, so nested maps are not allowed.
  • Then, for each entry of the map or member of the struct, SerializeMap::serialize_key and SerializeMap::serialize_value are called.
  • Finally, SerializeMap::end is called.
The flatten attribute works the same way.
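The call sequence above can be sketched with a stdlib-only toy (these are not the real serde traits; `RowSerializer` and its fields are illustrative stand-ins, not types from this PR):

```rust
// Toy stand-in for the csv serializer's map handling; names are
// illustrative, not the actual types in this PR.
struct RowSerializer {
    headers: Vec<String>, // column names, collected from the keys
    values: Vec<String>,  // serialized values of the current row
    in_row: bool,         // guards against nested maps
}

impl RowSerializer {
    // Mirrors serialize_map: reject a map inside an in-progress row.
    fn serialize_map(&mut self) -> Result<(), &'static str> {
        if self.in_row {
            return Err("nested maps are not allowed");
        }
        self.in_row = true;
        Ok(())
    }
    // Mirrors SerializeMap::serialize_key: each key becomes a column.
    fn serialize_key(&mut self, key: &str) {
        self.headers.push(key.to_string());
    }
    // Mirrors SerializeMap::serialize_value.
    fn serialize_value(&mut self, value: f64) {
        self.values.push(value.to_string());
    }
    // Mirrors SerializeMap::end: the row is finished and written out.
    fn end(&mut self) -> String {
        self.in_row = false;
        let row = self.values.join(",");
        self.values.clear();
        row
    }
}
```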

serde treats any struct with one or more flattened members as a map, so the following inputs are equivalent to the serializer:

// 1. A plain map:
let mut input = BTreeMap::new();
input.insert("a", 2.0);
input.insert("b", 1.0);

// 2. A struct with a flattened struct:
#[derive(Serialize)]
struct Inner {
    b: f64,
}
#[derive(Serialize)]
struct Outer {
    a: f64,
    #[serde(flatten)]
    inner: Inner,
}
let input = Outer { a: 2.0, inner: Inner { b: 1.0 } };

// 3. A struct with a flattened map:
let mut extra = BTreeMap::new();
extra.insert("b", 1.0);
#[derive(Serialize)]
struct Row {
    a: f64,
    #[serde(flatten)]
    extra: BTreeMap<&'static str, f64>,
}
let input = Row { a: 2.0, extra };

However, this falls apart when used with a map that has an unstable entry order, like HashMap. This is why commit 22acfdc adds a check that errors when we encounter out-of-order keys. It does that by keeping a list of serialized keys and comparing each incoming key with the next expected key.
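A minimal sketch of such a check (illustrative names, not the actual code in commit 22acfdc): remember the keys of the first row, then compare each later key against the next expected one.

```rust
// Illustrative key-order check; not the actual implementation.
struct KeyOrderChecker {
    keys: Vec<String>, // keys seen in the first row, in order
    index: usize,      // position of the next expected key
    first_row: bool,
}

impl KeyOrderChecker {
    fn check(&mut self, key: &str) -> Result<(), String> {
        if self.first_row {
            // First row: record the key order that later rows must follow.
            self.keys.push(key.to_string());
            return Ok(());
        }
        match self.keys.get(self.index) {
            Some(expected) if expected == key => {
                self.index += 1;
                Ok(())
            }
            expected => Err(format!(
                "found key {:?} but expected {:?}",
                key, expected
            )),
        }
    }
}
```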

Alternatives

This implementation only collects keys and checks them for map-based data, not for pure structs without flattened members. This is done so that the check has no impact when no maps are used, but it means it is still possible to get out-of-order columns when mixing non-map and map-based data:

Example
#[derive(Serialize)]
struct Row {
    b: f64,
    a: f64,
}

let mut writer = Writer::from_writer(vec![]);
writer.serialize(Row {
    b: 1.0,
    a: 2.0,
}).unwrap();
let mut map = BTreeMap::new();
map.insert("a", 2.0);
map.insert("b", 1.0);
writer.serialize(map).unwrap();
Output:
b,a
1.0,2.0
2.0,1.0
In my opinion it is very unlikely that anyone will do something like this; nevertheless, it would be possible to collect all keys regardless of whether they come from a map or a struct. This would add overhead for non-map-based data.

An alternative to generating an error for out-of-order keys is to support them by accumulating the serialized values and then writing them out at the end (in SerializeMap::end) in the right order. That way HashMap would be supported, but we'd lose support for two columns with the same name, and we'd need to store the serialized values in addition to the serialized keys.

Future work

I did not add an option to enable or disable the key order check, but this can easily be added to the builder if we agree on this solution.


I hope this helps to finally bring this feature to the crate. Thank you for your work in maintaining it!

@Abestanis Abestanis force-pushed the feature/flatten_serialize branch from d3852ca to 22acfdc on November 12, 2025 14:07
@Abestanis Abestanis force-pushed the feature/flatten_serialize branch from 49a7662 to 62fb407 on November 12, 2025 17:34
@ElouanPetereau

Greetings,

This merge request is relatively new, but since it fixes the unordered-key issue, and having #[serde(flatten)] support added to this crate would be really great, would you have any time to check it, @BurntSushi?

If not, is there anything we could help either you or @Abestanis with to make progress on this?
