Add support for serializing flattened attributes #415
+302
−22
This is a new attempt to add serialization support for maps and flattened fields, since #223 seems to have stalled. It also adds checks to ensure a stable column order.
Why
There are multiple reasons:
To support rows that are made up of nested structs. Sometimes you have data in a struct that is used throughout the program, and you want to write it to CSV with an additional column. Currently you have to create two structs and copy all members of the first struct into the second, which is annoying for structs with a large number of members:
Example without this PR
Example with this PR
To support dynamic columns that depend on configuration not known at compile time, in addition to common fields that are known at compile time.
Example
To achieve symmetry with the deserializer, which already supports the `serde(flatten)` attribute. It is surprising that the serializer does not.
How
This implementation focuses on minimal impact on serialization performance. Adding serialization support on its own does not add any overhead and is implemented in the first commit (30cc9a0). The required steps are as follows:
When a struct has a `flatten` member, `serde` will call `serialize_map`. Similar to `serialize_struct`, we check that we are not already in the process of serializing a row, so nested maps are not allowed.
`SerializeMap::serialize_key` and `SerializeMap::serialize_value` are called for each entry.
`SerializeMap::end` is called.
The `flatten` attribute works the same way: `serde` treats any struct with one or more `flatten` members as a map, so the following inputs are equivalent to the serializer:
However, this falls apart when used with a map whose entry order is not guaranteed, like `HashMap`. This is why commit 22acfdc adds a check that errors when we encounter out-of-order keys. It does that by keeping a list of serialized keys and comparing each incoming key with the next expected key.
Alternatives
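As a baseline for the alternatives, the key order check described above can be pictured with a small standard-library sketch. It is not the code from commit 22acfdc: `KeyOrderCheck`, `check_key`, `end_row` and `check_rows` are hypothetical names, and the sketch assumes the key list is recorded from the first row and compared against on every later row.

```rust
/// Hypothetical helper (not the crate's code): remembers the key order of the
/// first row and rejects later rows whose keys arrive in a different order.
#[derive(Default)]
struct KeyOrderCheck {
    expected: Vec<String>, // keys recorded while serializing the first row
    first_row_done: bool,  // true once the first row has ended
    next: usize,           // index of the next expected key in the current row
}

impl KeyOrderCheck {
    /// Would be called once per key, in the role of `SerializeMap::serialize_key`.
    fn check_key(&mut self, key: &str) -> Result<(), String> {
        if !self.first_row_done {
            // First row: just record the key order.
            self.expected.push(key.to_owned());
            return Ok(());
        }
        match self.expected.get(self.next) {
            Some(want) if want.as_str() == key => {
                self.next += 1;
                Ok(())
            }
            Some(want) => Err(format!("expected key `{}`, got `{}`", want, key)),
            None => Err(format!("unexpected extra key `{}`", key)),
        }
    }

    /// Would be called at the end of a row, in the role of `SerializeMap::end`.
    fn end_row(&mut self) -> Result<(), String> {
        if self.first_row_done && self.next != self.expected.len() {
            return Err(format!("missing key `{}`", self.expected[self.next]));
        }
        self.first_row_done = true;
        self.next = 0;
        Ok(())
    }
}

/// Runs the check over several rows, each given as its sequence of keys.
fn check_rows(rows: &[Vec<&str>]) -> Result<(), String> {
    let mut check = KeyOrderCheck::default();
    for row in rows {
        for key in row {
            check.check_key(key)?;
        }
        check.end_row()?;
    }
    Ok(())
}
```

In this sketch, `check_rows(&[vec!["a", "b"], vec!["b", "a"]])` returns an error on the second row, while two rows with identical key order pass. Because keys are compared by position, repeated column names are still accepted.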
This implementation only collects keys and checks them for map-based data, not for pure structs without `flatten` members. This is done so that the check has no impact when no maps are used, but it means it is still possible to get out-of-order columns when mixing non-map and map-based data:
Example
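One way such a mix-up might arise is sketched below as a toy model using only the standard library. It does not use the crate's API: `ToyWriter`, `write_struct`, `write_map` and `mixed_rows_demo` are hypothetical, and the model simply assumes that the header comes from the first row of any kind while the key list is recorded from the first map row only.

```rust
/// Toy model (hypothetical, not the crate's writer): only map rows take part
/// in the key order check; plain struct rows bypass it.
#[derive(Default)]
struct ToyWriter {
    header: Option<Vec<String>>,   // taken from the first row, whatever its kind
    map_keys: Option<Vec<String>>, // recorded from the first *map* row only
    rows: Vec<Vec<String>>,        // values of every row written so far
}

impl ToyWriter {
    /// Plain struct row: field names are neither recorded nor checked.
    fn write_struct(&mut self, fields: &[(&str, &str)]) {
        self.push_row(fields);
    }

    /// Map row: keys are compared with those of the first map row.
    fn write_map(&mut self, entries: &[(&str, &str)]) -> Result<(), String> {
        let keys: Vec<String> = entries.iter().map(|(k, _)| k.to_string()).collect();
        let expected = self.map_keys.get_or_insert_with(|| keys.clone());
        if *expected != keys {
            return Err(format!("keys out of order: {:?}", keys));
        }
        self.push_row(entries);
        Ok(())
    }

    fn push_row(&mut self, cells: &[(&str, &str)]) {
        if self.header.is_none() {
            self.header = Some(cells.iter().map(|(k, _)| k.to_string()).collect());
        }
        self.rows.push(cells.iter().map(|(_, v)| v.to_string()).collect());
    }
}

/// A struct row followed by a map row whose keys are in the opposite order.
fn mixed_rows_demo() -> Result<ToyWriter, String> {
    let mut writer = ToyWriter::default();
    writer.write_struct(&[("a", "1"), ("b", "2")]); // header becomes a,b
    writer.write_map(&[("b", "3"), ("a", "4")])?; // first map row: accepted as-is
    Ok(writer)
}
```

In this model the second row is written as `3,4` under the header `a,b` without any error, because the struct row never contributed to the key list. A later map row in `a`, `b` order would then be rejected, since it no longer matches the first map row.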
An alternative to generating an error for out-of-order keys is to support them by accumulating the serialized values and then writing them out at the end (in `SerializeMap::end`) in the right order. That way `HashMap` would be supported, but we would lose support for two columns with the same name, and we would need to store the serialized values in addition to the serialized keys.
Future work
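Should the accumulating alternative be revisited later, a rough standard-library sketch of the idea could look like this. The function `reorder_row` is hypothetical and assumes the header order is already known from the first row; it buffers one row's values by key and emits them in header order, which is roughly what `SerializeMap::end` would have to do.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: buffer one row's entries, then emit the values in
/// header order regardless of the order in which the entries arrived.
fn reorder_row(header: &[&str], entries: &[(&str, &str)]) -> Result<Vec<String>, String> {
    // Buffer the serialized values by key. A repeated key overwrites the
    // earlier value, which is why two columns with the same name cannot be
    // told apart with this approach.
    let mut buffered: HashMap<&str, &str> = HashMap::new();
    for (key, value) in entries {
        buffered.insert(*key, *value);
    }

    // Write the values out in header order; a column without a value is an error.
    header
        .iter()
        .map(|col| {
            buffered
                .get(col)
                .map(|value| value.to_string())
                .ok_or_else(|| format!("missing column `{}`", col))
        })
        .collect()
}
```

With this sketch, entries arriving as `b`, `a` are written in `a`, `b` order, so a `HashMap` source would work. The cost is the extra per-row buffer, and a row with two `a` entries ends up with the last value in both `a` columns.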
I did not add an option to enable or disable the key order check, but this can easily be added to the builder if we agree on this solution.
I hope this helps to finally bring this feature to the crate. Thank you for your work in maintaining it!