-
-
Notifications
You must be signed in to change notification settings - Fork 788
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Collect errors when deserializing untagged enums #1544
Conversation
See this more as a draft that just happens to help me debug stuff locally :) Bikeshedding of feature name and error format very welcome. |
Just used this to debug some nested enums that had a typo. Very helpful! Thanks! |
@dtolnay are you interested in merging this? I'm not sure what Serde's approach to verbose debugging is/whether there is any precedence for this. |
What would be holding us back from merging this in? Seems fantastic. |
I cannot get this branch to work. Compiling throws a lot of errors like:
|
how to use it along with serde_json cause I can override serde version but not serde_json version of serde Well after try everything, I clone serde_bytes, serde_json, serde, did some git rebase and it's worked. |
You can use `[patch]` in your Cargo.toml. Check out this section of the rust book: https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html
… On Aug 9, 2020, at 4:09 AM, Antoine PLASKOWSKI ***@***.***> wrote:
how to use it along with serde_json cause I can override serde version but not serde_json version of serde
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
This wouldn't work cause there would be two version of serde, and so as @piegamesde reported this cause Deserialize to not be implement for serde_json version of serde. |
Thanks for doing this work. Is this behind a flag because the message is too long or because collecting the debugging information comes at a performance cost (or something else)? |
This also adds a verbose-debug feature that enables the new behavior. Previously: > data did not match any variant of untagged enum ReferenceOr at line 6 column 4 Now, with `verbose-debug` feature enabled: > data did not match any variant of untagged enum `ReferenceOr` > - attempted to deserialize `Reference` but failed with: missing field `$ref` > - attempted to deserialize `Item` but failed with: invalid value: string "lol", expected expected format `\dXX` at line 6 column 4 cf. serde-rs#773
35a042b
to
59650e1
Compare
Just had a use for this again because the error messages serde gives me when reading in a config file are not helpful to users. @dtolnay I'd really appreciate if you could have a look at this :) |
+1 for adding this or more extensive error messages - some failures are very hard to debug. |
Can any of the maintainers take a look at this change? Debugging deserialization problems into enums is basically impossible without it. |
I know it has been asked already but I think maintainers should seriously consider checking this. It looks like a small change and it will be of huge help to all of us using untagged enums in our projects. |
Is there any reason why this is not being considered? This would be so helpful for actually finding errors in YAML files. |
I really need something like this. Unfortunately I was not able to use the git version in the PR repo because it's a bit outdated. My use-case: I'm querying an API that either returns an error JSON or the value requested. The API is apparently unstable because after some time, my code no longer works. I'm trying to work out what's not matching in the JSON, but it's really difficult because it's a huge chunk and it almost fits. Is it a type not matching? Is it a field that became optional? I have no way of knowing without looking at each field in my struct and trying to manually match that against the JSON I got. My workaround is likely going to be: identify an error response without deserializing (i.e. dumb string matching) and then deserializing only the concrete types and use serde_path_to_error to gain some insight. |
@dtolnay ? |
@dtolnay there seems to be a fair amount of interest for this issue. We'd all appreciate if you let us know what your concerns are, and if you see a path for it to get merged. |
For anyone who encounters this issue today, you need a branch where the serde version matches your other dependencies. I took a similar commit (that uses no compiler feature flag) from this fork: https://github.com/jonasbb/serde/tree/dev and cherry picked it on top of mainline's master. Result: [patch.crates-io]
serde_derive = { git = "https://github.com/kurtbuilds/serde" } Hopefully this is mainlined soon. |
Serde has such a fast release cadence that the repository that @kurtbuilds kindly provided is already out of date. If anyone wants this patch, they practically need to keep their own automatically-updated fork with the patch (using tools such as the What has been blocking this improvement for more than three years now? It's a really, really welcome change for those that use Serde to parse user-provided configuration. Edit: I've set up a fork that is automatically updated with Edit 2: Edit 3 (November 28th, 2022): |
@dtolnay Did you have any thoughts on this approach? When trying to build user facing tools/services this would be a big help. If you have thoughts on a different approach that's cool too. It'd be great to get some collaboration going on it. |
Untagged enums do not provide good error messages and likely never will, given that there are multiple PRs which are just completely ignored ([serde#2376](serde-rs/serde#2376) and [serde#1544](serde-rs/serde#1544)). Instead using `content::de` the untagged enums can be replaced by custom buffering. The error messages for `OneOrMany` and `PickFirst` now look like this, including the original failure for each variant. ```text OneOrMany could not deserialize any variant: One: invalid type: map, expected u32 Many: invalid type: map, expected a sequence ``` ```text PickFirst could not deserialize any variant: First: invalid type: string "Abc", expected u32 Second: invalid digit found in string ``` The implementations of `VecSkipError` and `DefaultOnError` are updated too, but should not result in any visible changes.
Untagged enums do not provide good error messages and likely never will, given that there are multiple PRs which are just completely ignored ([serde#2376](serde-rs/serde#2376) and [serde#1544](serde-rs/serde#1544)). Instead using `content::de` the untagged enums can be replaced by custom buffering. The error messages for `OneOrMany` and `PickFirst` now look like this, including the original failure for each variant. ```text OneOrMany could not deserialize any variant: One: invalid type: map, expected u32 Many: invalid type: map, expected a sequence ``` ```text PickFirst could not deserialize any variant: First: invalid type: string "Abc", expected u32 Second: invalid digit found in string ``` The implementations of `VecSkipError` and `DefaultOnError` are updated too, but should not result in any visible changes.
586: Improve error messaged by dropping untagged enums r=jonasbb a=jonasbb Untagged enums do not provide good error messages and likely never will, given that there are multiple PRs which are just completely ignored ([serde#2376](serde-rs/serde#2376) and [serde#1544](serde-rs/serde#1544)). Instead using `content::de` the untagged enums can be replaced by custom buffering. The error messages for `OneOrMany` and `PickFirst` now look like this, including the original failure for each variant. ```text OneOrMany could not deserialize any variant: One: invalid type: map, expected u32 Many: invalid type: map, expected a sequence ``` ```text PickFirst could not deserialize any variant: First: invalid type: string "Abc", expected u32 Second: invalid digit found in string ``` The implementations of `VecSkipError` and `DefaultOnError` are updated too, but should not result in any visible changes. Co-authored-by: Jonas Bushart <jonas@bushart.org>
#2403 introduced new merge conflicts for this PR, which I've just fixed in my fork (if anyone's using my fork and notices any problem due to that, please file an issue on the fork). If conflicts are keeping this PR from being merged, I'd be happy to contribute and submit my updated patches upstream. Edit (2023-08-04): 6f1f38d introduced yet another merge conflict, which I've fixed in my fork. The full list of changes from upstream that will be updated automatically as I resolve conflicts can be found here. Edit 2 (2023-08-04): the above conflict resolution was not enough to get |
When will this PR be merged? This is really useful! There is so little misinformation now! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach was rejected in a different PR at #2376.
One way to substitute this is with serde_path_to_error, for me, that has completely satisfied what this pr did. |
Thank you @dtolnay for your clarification on the status of this PR - I appreciate being decisive and explicit with PRs as important to certain users as this one has been for years. |
@dtolnay I'm sorry but this just doesn't make sense to me. The reasoning in rejecting #2376 suggests that this is a hack and there's a better way to do this, even though that's not true at all. EDIT: Implementing a visitor only works when the enum doesn't require rolling the parser back which is not possible in serde or buffering the input, which is as far as I know only possible with an internal API There's also Forking serde and serde_derive is obviously complicated since that would also require patching the entire dependency chain to use that fixed version. That is on top of maintaining the fork, keeping it up to date and resolving any conflicts. Implementing a derive macro is not a good option either since it has the same issues as implementing a custom visitor, on top of the even higher complexity of implementing the derive macro. There is quite a decent amount of interest in this since this PR is 2nd most upvoted PR and would be the 5th most upvoted issue. And there is no good solution still. This PR basically provides the only solution that is not overly complex and works in every case. I think this feature is pretty much basic and a must-have, since currently it's basically impossible to tell what went wrong in an untagged enum, especially when it's the top level type or has a lot of children, then the entire error is completely worthless. |
When trying to use serde-untagged I also ran into issues with not being able to roll the parser back thus not being able to easily deserialize an enum containing multiple structs. However I guess an alternate implementation could potentially either fork/copy over the |
Adding to some of the points exposed by @33KK:
I've been doing that maintenance work for a while now, so other people don't have to. Maintaining that fork has been straightforward so far, barring the time when precompiled binaries where introduced, but they ended up being reverted for a plethora of reasons. That does not solve the problem that patching
While I agree with this, I recently stumbled upon and documented the little-known |
It's a shame this is still an issue. Most of the issues with our configuration that is reported by users is caused by enums/unions swallowing the actual inner error, and we have no way to present that to the user. We also use For example, say we have a field that accepts a string ID, or a vec of string IDs. Very simple enum, as we can do something like |
Background ---------- This change introduces a custom serde visitor to handle single-store and chained-store variants while still producing useful error messages. The [chained authority PR](hickory-dns#2161) used the serde [untagged enum representation](https://serde.rs/enum-representations.html#untagged) in order to represent single-store (TOML table, serde map) or chained-store (TOML array-of-tables, serde sequence) variants. Unfortunately, serde is not able to show useful error messages when a syntax error is encountered and untagged enums are being used. For example, misspelling the "type" option in a zone store configuration results in this error message to the user: ``` 1727879524:INFO:hickory_dns:443:loading configuration from: "tests/test-data/test_configs/chained_blocklist.toml" Error: failed to read config file from "tests/test-data/test_configs/chained_blocklist.toml": toml decode error: TOML parse error at line 38, column 1 | 38 | [[zones.stores]] | ^^^^^^^^^^^^^^^^ data did not match any variant of untagged enum StoreConfigContainer ``` By using a minimal custom visitor, we can restore the previous, more useful error messages: ``` *Single Store* 1728310440:INFO:hickory_dns:431:loading configuration from: "tests/test-data/test_configs/example_recursor.toml" Error: failed to read config file from "tests/test-data/test_configs/example_recursor.toml": toml decode error: TOML parse error at line 38, column 1 | 38 | [zones.stores] | ^^^^^^^^^^^^^^ missing field `type` ``` ``` *Chained Stores* 1728311606:INFO:hickory_dns:431:loading configuration from: "tests/test-data/test_configs/chained_blocklist.toml" Error: failed to read config file from "tests/test-data/test_configs/chained_blocklist.toml": toml decode error: TOML parse error at line 47, column 1 | 47 | [[zones.stores]] | ^^^^^^^^^^^^^^^^ missing field `type` ``` A side effect of this change is that StoreConfig is always represented as a Vec<StoreConfig>, which simplifies a bit of the code that iterates over the stores in bin/src/hickory-dns.rs. Other Options ------------- * File a bug report with the serde team. This has been done by others, and it appears the serde maintainers have no intention of fixing this in serde proper, rejecting multiple PRs that address this issue. See, for example, [here](serde-rs/serde#1544) * Use one of the work-around crates published by the serde team. ** [Path to Error](https://docs.rs/serde_path_to_error/latest/serde_path_to_error/). This appears to work, but adds an additional crate to our dependency tree. ** [Serde Untagged](https://docs.rs/serde-untagged/latest/serde_untagged/). Has unacceptable type limitations (it only supports bool, i8, i16, i32, i64, i128, u8, u16, u32, u64, u128, f32, f64, char, string, borrowed_str, bytes, borrowed_bytes, byte_buf, unit, seq, map fields) * Make all stores an array-of-tables. In addition to breaking existing configurations, this seems counterintuitive to me. * Use a different configuration name for single- vs chained-stores: [zones.stores] for single stores and [[zones.chained_stores]] for chained stores. This still seems kind of clunky to me, particularly with the existing "stores" being plural for a single-store configuration, but is probably the next-best alternative.
Background ---------- This change introduces a custom serde visitor to handle single-store and chained-store variants while still producing useful error messages. The [chained authority PR](hickory-dns#2161) used the serde [untagged enum representation](https://serde.rs/enum-representations.html#untagged) in order to represent single-store (TOML table, serde map) or chained-store (TOML array-of-tables, serde sequence) variants. Unfortunately, serde is not able to show useful error messages when a syntax error is encountered and untagged enums are being used. For example, misspelling the "type" option in a zone store configuration results in this error message to the user: ``` 1727879524:INFO:hickory_dns:443:loading configuration from: "tests/test-data/test_configs/chained_blocklist.toml" Error: failed to read config file from "tests/test-data/test_configs/chained_blocklist.toml": toml decode error: TOML parse error at line 38, column 1 | 38 | [[zones.stores]] | ^^^^^^^^^^^^^^^^ data did not match any variant of untagged enum StoreConfigContainer ``` By using a minimal custom visitor, we can restore the previous, more useful error messages: **Single Store** ``` 1728310440:INFO:hickory_dns:431:loading configuration from: "tests/test-data/test_configs/example_recursor.toml" Error: failed to read config file from "tests/test-data/test_configs/example_recursor.toml": toml decode error: TOML parse error at line 38, column 1 | 38 | [zones.stores] | ^^^^^^^^^^^^^^ missing field `type` ``` **Chained Stores** ``` 1728311606:INFO:hickory_dns:431:loading configuration from: "tests/test-data/test_configs/chained_blocklist.toml" Error: failed to read config file from "tests/test-data/test_configs/chained_blocklist.toml": toml decode error: TOML parse error at line 47, column 1 | 47 | [[zones.stores]] | ^^^^^^^^^^^^^^^^ missing field `type` ``` A side effect of this change is that StoreConfig is always represented as a Vec<StoreConfig>, which simplifies a bit of the code that iterates over the stores in bin/src/hickory-dns.rs. Other Options ------------- * File a bug report with the serde team. This has been done by others, and it appears the serde maintainers have no intention of fixing this in serde proper, rejecting multiple PRs that address this issue. See, for example, [here](serde-rs/serde#1544) * Use one of the work-around crates published by the serde team. 1. [Path to Error](https://docs.rs/serde_path_to_error/latest/serde_path_to_error/). This appears to work, but adds an additional crate to our dependency tree. 2. [Serde Untagged](https://docs.rs/serde-untagged/latest/serde_untagged/). Has unacceptable type limitations (it only supports bool, i8, i16, i32, i64, i128, u8, u16, u32, u64, u128, f32, f64, char, string, borrowed_str, bytes, borrowed_bytes, byte_buf, unit, seq, map fields) * Make all stores an array-of-tables. In addition to breaking existing configurations, this seems counterintuitive to me. * Use a different configuration name for single- vs chained-stores: [zones.stores] for single stores and [[zones.chained_stores]] for chained stores. This still seems kind of clunky to me, particularly with the existing "stores" being plural for a single-store configuration, but is probably the next-best alternative.
Background ---------- This change introduces a custom serde visitor to handle single-store and chained-store variants while still producing useful error messages. The [chained authority PR](hickory-dns#2161) used the serde [untagged enum representation](https://serde.rs/enum-representations.html#untagged) in order to represent single-store (TOML table, serde map) or chained-store (TOML array-of-tables, serde sequence) variants. Unfortunately, serde is not able to show useful error messages when a syntax error is encountered and untagged enums are being used. For example, misspelling the "type" option in a zone store configuration results in this error message to the user: ``` 1727879524:INFO:hickory_dns:443:loading configuration from: "tests/test-data/test_configs/chained_blocklist.toml" Error: failed to read config file from "tests/test-data/test_configs/chained_blocklist.toml": toml decode error: TOML parse error at line 38, column 1 | 38 | [[zones.stores]] | ^^^^^^^^^^^^^^^^ data did not match any variant of untagged enum StoreConfigContainer ``` By using a minimal custom visitor, we can restore the previous, more useful error messages: **Single Store** ``` 1728310440:INFO:hickory_dns:431:loading configuration from: "tests/test-data/test_configs/example_recursor.toml" Error: failed to read config file from "tests/test-data/test_configs/example_recursor.toml": toml decode error: TOML parse error at line 38, column 1 | 38 | [zones.stores] | ^^^^^^^^^^^^^^ missing field `type` ``` **Chained Stores** ``` 1728311606:INFO:hickory_dns:431:loading configuration from: "tests/test-data/test_configs/chained_blocklist.toml" Error: failed to read config file from "tests/test-data/test_configs/chained_blocklist.toml": toml decode error: TOML parse error at line 47, column 1 | 47 | [[zones.stores]] | ^^^^^^^^^^^^^^^^ missing field `type` ``` A side effect of this change is that StoreConfig is always represented as a Vec<StoreConfig>, which simplifies a bit of the code that iterates over the stores in bin/src/hickory-dns.rs. Other Options ------------- * File a bug report with the serde team. This has been done by others, and it appears the serde maintainers have no intention of fixing this in serde proper, rejecting multiple PRs that address this issue. See, for example, [here](serde-rs/serde#1544) * Use one of the work-around crates published by the serde team. 1. [Path to Error](https://docs.rs/serde_path_to_error/latest/serde_path_to_error/). This appears to work, but adds an additional crate to our dependency tree. 2. [Serde Untagged](https://docs.rs/serde-untagged/latest/serde_untagged/). Has unacceptable type limitations (it only supports bool, i8, i16, i32, i64, i128, u8, u16, u32, u64, u128, f32, f64, char, string, borrowed_str, bytes, borrowed_bytes, byte_buf, unit, seq, map fields) * Make all stores an array-of-tables. In addition to breaking existing configurations, this seems counterintuitive to me. * Use a different configuration name for single- vs chained-stores: [zones.stores] for single stores and [[zones.chained_stores]] for chained stores. This still seems kind of clunky to me, particularly with the existing "stores" being plural for a single-store configuration, but is probably the next-best alternative.
Background ---------- This change introduces a custom serde visitor to handle single-store and chained-store variants while still producing useful error messages. The [chained authority PR](#2161) used the serde [untagged enum representation](https://serde.rs/enum-representations.html#untagged) in order to represent single-store (TOML table, serde map) or chained-store (TOML array-of-tables, serde sequence) variants. Unfortunately, serde is not able to show useful error messages when a syntax error is encountered and untagged enums are being used. For example, misspelling the "type" option in a zone store configuration results in this error message to the user: ``` 1727879524:INFO:hickory_dns:443:loading configuration from: "tests/test-data/test_configs/chained_blocklist.toml" Error: failed to read config file from "tests/test-data/test_configs/chained_blocklist.toml": toml decode error: TOML parse error at line 38, column 1 | 38 | [[zones.stores]] | ^^^^^^^^^^^^^^^^ data did not match any variant of untagged enum StoreConfigContainer ``` By using a minimal custom visitor, we can restore the previous, more useful error messages: **Single Store** ``` 1728310440:INFO:hickory_dns:431:loading configuration from: "tests/test-data/test_configs/example_recursor.toml" Error: failed to read config file from "tests/test-data/test_configs/example_recursor.toml": toml decode error: TOML parse error at line 38, column 1 | 38 | [zones.stores] | ^^^^^^^^^^^^^^ missing field `type` ``` **Chained Stores** ``` 1728311606:INFO:hickory_dns:431:loading configuration from: "tests/test-data/test_configs/chained_blocklist.toml" Error: failed to read config file from "tests/test-data/test_configs/chained_blocklist.toml": toml decode error: TOML parse error at line 47, column 1 | 47 | [[zones.stores]] | ^^^^^^^^^^^^^^^^ missing field `type` ``` A side effect of this change is that StoreConfig is always represented as a Vec<StoreConfig>, which simplifies a bit of the code that iterates over the stores in bin/src/hickory-dns.rs. Other Options ------------- * File a bug report with the serde team. This has been done by others, and it appears the serde maintainers have no intention of fixing this in serde proper, rejecting multiple PRs that address this issue. See, for example, [here](serde-rs/serde#1544) * Use one of the work-around crates published by the serde team. 1. [Path to Error](https://docs.rs/serde_path_to_error/latest/serde_path_to_error/). This appears to work, but adds an additional crate to our dependency tree. 2. [Serde Untagged](https://docs.rs/serde-untagged/latest/serde_untagged/). Has unacceptable type limitations (it only supports bool, i8, i16, i32, i64, i128, u8, u16, u32, u64, u128, f32, f64, char, string, borrowed_str, bytes, borrowed_bytes, byte_buf, unit, seq, map fields) * Make all stores an array-of-tables. In addition to breaking existing configurations, this seems counterintuitive to me. * Use a different configuration name for single- vs chained-stores: [zones.stores] for single stores and [[zones.chained_stores]] for chained stores. This still seems kind of clunky to me, particularly with the existing "stores" being plural for a single-store configuration, but is probably the next-best alternative.
This also adds a verbose-debug feature that enables the new behavior.
Previously:
Now, with
verbose-debug
feature enabled:cf. #773