Infer naming convention when converting objects to structs #636
Struct types support renaming of fields for encoding/decoding. A common use of this is to enforce a camelCase convention in the serialized format:
Previously, when converting an object to a struct, we always used the renamed field names rather than the original attribute names. This was true whether the input was a `dict`, a non-dict mapping, or an arbitrary object read via attributes (if `from_attributes=True`). The latter two inputs will rarely, if ever, come from a serialization framework; they are more commonly database/ORM-like objects. In that case the original attribute names are likely more useful, as both the database and struct representations are internal to the application (unlike the serialized names, which may have to match some external convention like camelCase).

We now infer the intended naming scheme when a non-dict mapping or object is passed to `msgspec.convert` to convert to a `msgspec.Struct` type. The inference process is as follows:

- The attribute names are tried first.
- If an attribute name is present in the input AND the attribute name doesn't match the renamed name, then attribute names are used exclusively for the remainder of the conversion process.
- If an attribute name isn't present AND the attribute name doesn't match the renamed name, then the renamed name is tried. If the renamed name is present, then renamed names are used exclusively for the remainder of the conversion process.
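
The inference steps can be modelled in plain Python. This is an illustrative sketch only, not msgspec's actual implementation: the helper `collect_fields`, the `_MISSING` sentinel, and the use of `LookupError` are hypothetical names chosen here. It treats every field as required, and reporting the attribute name when neither name is present is an arbitrary choice of the sketch.

```python
from types import SimpleNamespace

_MISSING = object()  # hypothetical sentinel marking an absent attribute


def collect_fields(obj, names):
    """Read fields from obj; names is a list of (attribute_name, renamed_name)."""
    scheme = None  # None = undecided, 0 = attribute names, 1 = renamed names
    values = {}
    for pair in names:
        attr_name, renamed = pair
        if scheme is not None:
            # Scheme already decided: use it exclusively.
            key = pair[scheme]
            value = getattr(obj, key, _MISSING)
        else:
            # Undecided: the attribute name is tried first.
            key = attr_name
            value = getattr(obj, attr_name, _MISSING)
            if attr_name != renamed:
                if value is not _MISSING:
                    scheme = 0
                else:
                    # The one extra lookup: try the renamed name.
                    value = getattr(obj, renamed, _MISSING)
                    if value is not _MISSING:
                        scheme = 1
        if value is _MISSING:
            raise LookupError(f"missing field {key!r}")
        values[attr_name] = value
    return values


NAMES = [("field_one", "fieldOne"), ("field_two", "fieldTwo")]
snake = collect_fields(SimpleNamespace(field_one=1, field_two="two"), NAMES)
camel = collect_fields(SimpleNamespace(fieldOne=1, fieldTwo="two"), NAMES)
```

Once `scheme` is set it never changes, which is why an input cannot satisfy some fields by attribute name and others by renamed name.
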
A key point here is that inputs may not mix attribute names and renamed names - the inference process settles on one scheme or the other depending on which names are present. Using `Example` above:

- An input with `field_one` and `field_two` would be valid.
- An input with `fieldOne` and `fieldTwo` would be valid.
- An input with `field_one` and `fieldTwo` would error saying `field_two` is missing.
- An input with `fieldOne` and `field_two` would error saying `fieldTwo` is missing.

The overhead of this inference process is low - at worst only one excess `getattr` call is made to determine whether to use the original or renamed names.

To reiterate, this change only affects object (non-dict mapping or arbitrary object) inputs to `msgspec.convert` when converting to a `Struct` type. Inputs of other types like `dict` are still assumed to have come from a serialization protocol and will always use the renamed names.

Fixes #630.