When a serde(untagged) enum can't deser, show all the reasons why #2376

cbeck88 · 2023-02-18T19:09:26Z

I was recently writing code that uses `serde(untagged)` on an enum to deserialize a JSON schema, and I was unhappy with the error messages I was getting: they don't tell you why each variant failed to match.

I noticed that there is a TODO item for this in the `serde_derive` proc macro code that generates the deserialization.

Improving the error messages upstream in `serde_derive` seemed easier than changing how I use serde, so I took a stab at implementing this TODO, updated the tests that check error messages, and wrote some more tests.

I have tried to follow the patterns and conventions that I have seen elsewhere in the `serde_derive` source code. I think the generated code is good in that it uses `format_args!`, like other parts of serde's error handling, and avoids making a new dynamic memory allocation: when untagged deserialization fails, the errors are collected on the stack rather than in a new dynamically-sized container.

Let me know if you think this is a good direction. I'm happy to iterate if this patch is interesting to you. Thanks for building serde, it's great.
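
To make the approach concrete, here is a minimal, std-only sketch written by hand rather than generated by `serde_derive`. The `Port` enum, the `try_number` and `try_name` helpers, and the local `Error` type are hypothetical stand-ins (real code would go through `serde::de::Error::custom`). Each failed attempt is kept in its own local binding, and the final message is assembled with `format_args!`, so no `Vec` of errors is created.

```rust
use std::fmt::Display;

/// Hypothetical stand-in for a deserializer's error type.
#[derive(Debug, PartialEq)]
struct Error(String);

impl Error {
    // Same shape as `serde::de::Error::custom`: accepts any `Display` value.
    // This toy version stores the message in a `String`.
    fn custom<T: Display>(msg: T) -> Self {
        Error(msg.to_string())
    }
}

/// Hypothetical enum with two "untagged" representations.
#[derive(Debug, PartialEq)]
enum Port {
    Number(u16),
    Name(String),
}

fn try_number(input: &str) -> Result<Port, Error> {
    input.parse::<u16>().map(Port::Number).map_err(Error::custom)
}

fn try_name(input: &str) -> Result<Port, Error> {
    if !input.is_empty() && input.chars().all(|c| c.is_ascii_alphabetic()) {
        Ok(Port::Name(input.to_owned()))
    } else {
        Err(Error::custom("expected a non-empty alphabetic name"))
    }
}

fn parse_port(input: &str) -> Result<Port, Error> {
    // One local binding per variant; there is no growable container of errors.
    let err0 = match try_number(input) {
        Ok(value) => return Ok(value),
        Err(err) => err,
    };
    let err1 = match try_name(input) {
        Ok(value) => return Ok(value),
        Err(err) => err,
    };
    // The variant names are baked into the format string; the per-variant
    // errors are passed as arguments in the same order.
    Err(Error::custom(format_args!(
        "data did not match any variant of untagged enum Port\nNumber: {}\nName: {}",
        err0.0, err1.0
    )))
}
```

Calling `parse_port("80 80")` fails both attempts and returns one error whose message starts with the generic line and then lists one line per variant.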

Untagged enums do not provide good error messages and likely never will, given that there are multiple PRs which are just completely ignored ([serde#2376](serde-rs/serde#2376) and [serde#1544](serde-rs/serde#1544)). Instead, using `content::de`, the untagged enums can be replaced by custom buffering.

The error messages for `OneOrMany` and `PickFirst` now look like this, including the original failure for each variant.

```text
OneOrMany could not deserialize any variant:
  One: invalid type: map, expected u32
  Many: invalid type: map, expected a sequence
```

```text
PickFirst could not deserialize any variant:
  First: invalid type: string "Abc", expected u32
  Second: invalid digit found in string
```

The implementations of `VecSkipError` and `DefaultOnError` are updated too, but should not result in any visible changes.

oli-obk

I'll discuss this PR with dtolnay next time we chat. May take a few weeks, but if I don't comment here again in the next month, please ping me.

oli-obk · 2023-04-20T19:04:19Z

serde_derive/src/de.rs

+    // We need two copies of this iterator
+    let err_identifiers1 = (0..num_variants).map(|idx| format_ident!("_err{}", idx));
+    let err_identifiers2 = err_identifiers1.clone();


While not strictly necessary, as you could probably do the iteration over the attempts directly in the format_args! arguments, this seems better to avoid any question about the order of evaluation.

oli-obk · 2023-04-20T19:05:43Z

serde_derive/src/de.rs

+    // The format string we are building will have the following structure:
+    // "data did not match any variant of untagged enum{}\nvar1: {}\nvar2: {}\nvar3: {}"
+    let mut err_format_string = fallthrough_msg.to_owned();
+    let mut num_variants = 0usize;
+    for var in variants.iter().filter(|variant| !variant.attrs.skip_deserializing()) {
+        err_format_string.push_str("\n");
+        err_format_string.push_str(&var.ident.to_string());
+        err_format_string.push_str(": {}");
+        num_variants += 1;
+    }


I'm not confident this will render well in all the error libraries out there, but we can adjust it once we have seen some of these messages in the wild. The Display/Debug output isn't something we provide back compat guarantees for anyway.
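
For anyone who wants to see the exact text this hunk produces, the format-string construction can be tried outside the proc macro. The following is a std-only sketch: the `Variant` struct is a hypothetical stand-in for `serde_derive`'s internal variant type, and `String` names replace the identifiers that `format_ident!` would create.

```rust
/// Hypothetical stand-in for the macro's internal description of a variant.
struct Variant {
    ident: &'static str,
    skip_deserializing: bool,
}

/// Builds the format string and counts the placeholders it contains.
/// Skipped variants get no line, so the count can be smaller than
/// `variants.len()`.
fn build_err_format_string(fallthrough_msg: &str, variants: &[Variant]) -> (String, usize) {
    let mut err_format_string = fallthrough_msg.to_owned();
    let mut num_variants = 0usize;
    for var in variants.iter().filter(|v| !v.skip_deserializing) {
        err_format_string.push('\n');
        err_format_string.push_str(var.ident);
        err_format_string.push_str(": {}");
        num_variants += 1;
    }
    (err_format_string, num_variants)
}

/// Names of the per-variant error bindings, one per placeholder.
fn err_identifiers(num_variants: usize) -> Vec<String> {
    (0..num_variants).map(|idx| format!("_err{}", idx)).collect()
}
```

The placeholder count returned here is what must match the number of error bindings passed to `format_args!`, which is why the hunk counts only the variants that are not skipped.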

oli-obk · 2023-04-23T21:28:33Z

We had a chat, and are both worried about the diagnostic's usefulness. It will mostly appear to users of your library that have given bad input, and whether it helps them actually solve the problem is not clear. We feel like while this solves the immediate issue you are seeing, we can do better than just patching a diagnostic.

Thus we have an alternate proposal:

Add new examples to https://github.com/serde-rs/serde-rs.github.io/tree/master/_src that explain how to use untagged enums, e.g. via a custom visitor for a custom `Deserialize` impl that produces a single unified diagnostic instead of a generic line per variant.

You could also ask https://github.com/jonasbb/serde_with whether they'd want to add helpers for automatically creating such visitors.

Or create a helper crate for easily writing visitors (e.g. with a builder pattern).
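
As a rough illustration of what "a single unified diagnostic" could mean, here is a std-only model, not actual serde code. The `Value` and `OneOrMany` types and the message wording are assumptions made for this sketch. In a real `Deserialize` impl the same dispatch would live in a `serde::de::Visitor`, with the expectation text written in `expecting` and the accepted shapes handled in methods such as `visit_u64` and `visit_seq`.

```rust
/// Hypothetical stand-in for an already-parsed input value.
#[derive(Debug)]
enum Value {
    U32(u32),
    Seq(Vec<Value>),
    Map,
    Str(String),
}

#[derive(Debug, PartialEq)]
enum OneOrMany {
    One(u32),
    Many(Vec<u32>),
}

/// Describes what was found in the input, for use in diagnostics.
fn unexpected(value: &Value) -> String {
    match value {
        Value::U32(n) => format!("integer `{}`", n),
        Value::Str(s) => format!("string {:?}", s),
        Value::Seq(_) => "sequence".to_owned(),
        Value::Map => "map".to_owned(),
    }
}

/// One expectation text that covers every accepted shape.
const EXPECTING: &str = "an integer or a sequence of integers";

/// Dispatches once on the shape of the input instead of trying each variant
/// in turn, so a mismatch yields one message rather than a line per variant.
fn one_or_many(value: &Value) -> Result<OneOrMany, String> {
    match value {
        Value::U32(n) => Ok(OneOrMany::One(*n)),
        Value::Seq(items) => items
            .iter()
            .map(|item| match item {
                Value::U32(n) => Ok(*n),
                other => Err(format!(
                    "invalid type: {}, expected an integer",
                    unexpected(other)
                )),
            })
            .collect::<Result<Vec<u32>, String>>()
            .map(OneOrMany::Many),
        other => Err(format!(
            "invalid type: {}, expected {}",
            unexpected(other),
            EXPECTING
        )),
    }
}
```

With this shape, a map input is rejected with `invalid type: map, expected an integer or a sequence of integers`, which names every accepted form in one sentence.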

Thanks for your work. I'm going to close this PR in my attempt to get a hold of our PR queue, but feel free to use this PR to ask follow up questions.

cbeck88 force-pushed the improve-serde-untagged-error-messages branch from b0a649e to 7a2e9f5 Compare February 18, 2023 19:09

cbeck88 mentioned this pull request Feb 18, 2023

make a way to store hardening advisory data with enclave measurements and load it mobilecoinfoundation/mobilecoin#3148

Merged

fixup

a5eabd9

PSeitz mentioned this pull request Feb 23, 2023

Sending a malformatted aggregation request with CLI does not return a useful message quickwit-oss/quickwit#2800

Closed

appetrosyan mentioned this pull request Mar 29, 2023

Support i128 and u128 serde-rs/json#846

Closed

jonasbb mentioned this pull request Apr 2, 2023

Improve error messaged by dropping untagged enums jonasbb/serde_with#586

Merged

oli-obk approved these changes Apr 20, 2023

oli-obk closed this Apr 23, 2023

oli-obk mentioned this pull request Jul 12, 2023

Accumulation of error messages from validation of multiple fields #2507

Closed

dtolnay mentioned this pull request Aug 4, 2023

Collect errors when deserializing untagged enums #1544

Closed

AlexTMjugador mentioned this pull request Aug 4, 2023

Consider adding helpers for better deserialize error messages for untagged enums jonasbb/serde_with#635

Open

dtolnay mentioned this pull request Aug 28, 2023

Better message when failing to match any variant of an untagged enum #2157

Closed

ssokolow mentioned this pull request Jan 23, 2024

Unhelpful Serde error on config structure the docs seem to say should work xremap/xremap#406

Closed

ysndr mentioned this pull request Jul 11, 2024

refactor: add collective type for descriptor variants flox/flox#1630

Merged

Pushkarm029 mentioned this pull request Oct 11, 2024

Better error response for wrong datetime format in REST filter qdrant/qdrant#3531

Open
