-
-
Notifications
You must be signed in to change notification settings - Fork 791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When a serde(untagged) enum can't deser, show all the reasons why #2376
When a serde(untagged) enum can't deser, show all the reasons why #2376
Conversation
I was writing some code recently that uses `serde(untagged)` for an enum when deserializing certain JSON schema, but I was unhappy with the quality of the error messages I was getting, because it doesn't really tell you why each possibility doesn't work. I noticed that there is a TODO item in the `serde_derive` proc macro code that generates this deserialization. I decided that trying to improve the error messages upstream in serde-derive is easier than trying to change how I'm using serde, so I took a stab at implementing this TODO, and updating the tests that test error messages, and writing some more tests. I have tried to follow the patterns and conventions that I have seen elsewhere in the serde-derive source code, and I think that the code gen is good in that it uses `format_args!` like other parts of serde error handling and avoids making a new dynamic memory allocation. When untagged deserialization fails, the errors are collected on the stack rather than in a new dynamically-sized container. Let me know if you think this is a good direction, I'm happy to iterate if this patch is interesting to you. Thanks for building serde, it's great.
b0a649e
to
7a2e9f5
Compare
This comment was marked as spam.
This comment was marked as spam.
Untagged enums do not provide good error messages and likely never will, given that there are multiple PRs which are just completely ignored ([serde#2376](serde-rs/serde#2376) and [serde#1544](serde-rs/serde#1544)). Instead using `content::de` the untagged enums can be replaced by custom buffering. The error messages for `OneOrMany` and `PickFirst` now look like this, including the original failure for each variant. ```text OneOrMany could not deserialize any variant: One: invalid type: map, expected u32 Many: invalid type: map, expected a sequence ``` ```text PickFirst could not deserialize any variant: First: invalid type: string "Abc", expected u32 Second: invalid digit found in string ``` The implementations of `VecSkipError` and `DefaultOnError` are updated too, but should not result in any visible changes.
Untagged enums do not provide good error messages and likely never will, given that there are multiple PRs which are just completely ignored ([serde#2376](serde-rs/serde#2376) and [serde#1544](serde-rs/serde#1544)). Instead using `content::de` the untagged enums can be replaced by custom buffering. The error messages for `OneOrMany` and `PickFirst` now look like this, including the original failure for each variant. ```text OneOrMany could not deserialize any variant: One: invalid type: map, expected u32 Many: invalid type: map, expected a sequence ``` ```text PickFirst could not deserialize any variant: First: invalid type: string "Abc", expected u32 Second: invalid digit found in string ``` The implementations of `VecSkipError` and `DefaultOnError` are updated too, but should not result in any visible changes.
586: Improve error messaged by dropping untagged enums r=jonasbb a=jonasbb Untagged enums do not provide good error messages and likely never will, given that there are multiple PRs which are just completely ignored ([serde#2376](serde-rs/serde#2376) and [serde#1544](serde-rs/serde#1544)). Instead using `content::de` the untagged enums can be replaced by custom buffering. The error messages for `OneOrMany` and `PickFirst` now look like this, including the original failure for each variant. ```text OneOrMany could not deserialize any variant: One: invalid type: map, expected u32 Many: invalid type: map, expected a sequence ``` ```text PickFirst could not deserialize any variant: First: invalid type: string "Abc", expected u32 Second: invalid digit found in string ``` The implementations of `VecSkipError` and `DefaultOnError` are updated too, but should not result in any visible changes. Co-authored-by: Jonas Bushart <jonas@bushart.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll discuss this PR with dtolnay next time we chat. May take a few weeks, but if I don't comment here again in the next month, please ping me.
// We need two copies of this iterator | ||
let err_identifiers1 = (0..num_variants).map(|idx| format_ident!("_err{}", idx)); | ||
let err_identifiers2 = err_identifiers1.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While not strictly necessary, as you could probably do the iteration over the attempts
directly in the format_args!
arguments, this seems better to avoid any question about the order of evaluation.
// The format string we are building will have the following structure: | ||
// "data did not match any variant of untagged enum{}\nvar1: {}\nvar2: {}\nvar3: {}" | ||
let mut err_format_string = fallthrough_msg.to_owned(); | ||
let mut num_variants = 0usize; | ||
for var in variants.iter().filter(|variant| !variant.attrs.skip_deserializing()) { | ||
err_format_string.push_str("\n"); | ||
err_format_string.push_str(&var.ident.to_string()); | ||
err_format_string.push_str(": {}"); | ||
num_variants += 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not confident this will render well in all the error libraries out there, but we can adjust it once we have seen some of these messages in the wild. The Display/Debug output isn't something we provide back compat guarantees for anyway.
We had a chat, and are both worried about the diagnostic's usefulness. It will mostly appear to users of your library that have given bad input, and whether it helps them actually solve the problem is not clear. We feel like while this solves the immediate issue you are seeing, we can do better than just patching a diagnostic. Thus we have an alternate proposal: Add new examples to https://github.com/serde-rs/serde-rs.github.io/tree/master/_src that explains how to use untagged enums.
Thanks for your work. I'm going to close this PR in my attempt to get a hold of our PR queue, but feel free to use this PR to ask follow up questions. |
I was writing some code recently that uses
serde(untagged)
for an enum when deserializing certain JSON schema, but I was unhappy with the quality of the error messages I was getting, because it doesn't really tell you why each possibility doesn't work.I noticed that there is a TODO item in the
serde_derive
proc macro code that generates this deserialization.I decided that trying to improve the error messages upstream in serde-derive is easier than trying to change how I'm using serde, so I took a stab at implementing this TODO, and updating the tests that test error messages, and writing some more tests.
I have tried to follow the patterns and conventions that I have seen elsewhere in the serde-derive source code, and I think that the code gen is good in that it uses
format_args!
like other parts of serde error handling and avoids making a new dynamic memory allocation. When untagged deserialization fails, the errors are collected on the stack rather than in a new dynamically-sized container.Let me know if you think this is a good direction, I'm happy to iterate if this patch is interesting to you. Thanks for building serde, it's great.