#[derive(Recap)] doesn't compose #10

Open
ExpHP opened this issue Dec 16, 2019 · 3 comments · May be fixed by #28

Comments

@ExpHP

ExpHP commented Dec 16, 2019

💡 Feature description

Currently, the code generated by #[derive(Recap)] doesn't seem to compose in any way. Ideally, fields whose types have #[derive(Recap)] would be parsed just like primitives and built-in types.

💻 Basic example

use recap::Recap;
use serde::Deserialize;

#[derive(Debug, Deserialize, Recap)]
#[recap(regex = r#"(?P<quantity>\d+) (?P<name>\w+)"#)]
struct Chemical {
    quantity: u32,
    name: String,
}

#[derive(Debug, Deserialize, Recap)]
#[recap(regex = r#"(?P<inner>\d+ \w+)"#)]
struct Wrapper {
    inner: Chemical,
}

fn main() {
    // ok
    "1 FUEL".parse::<Chemical>().unwrap();

    // Err(Custom("invalid type: string \"1 FUEL\", expected struct Chemical"))
    "1 FUEL".parse::<Wrapper>().unwrap();
}
@softprops
Owner

Hi @ExpHP.

I'm a little confused by your example and what you mean by "compose".

Your example with Chemical seems correct. You're deriving a way to deserialize a Chemical struct with two fields. That seems valid, as you're describing the fields to expect in the regex.

With the Wrapper example, the error message reports what I'd expect. When you declare a Recap, you're describing the fields of the thing you are parsing from a string. In this example, you're describing the fields of another struct, not Wrapper.

What's your concrete use case? The use cases recap targets are the typical FromStr cases you might find around the std lib. For example:

"123".parse::<usize>() // => 123

but I wouldn't expect the following to work

struct Wrapper { inner: usize }
"123".parse::<Wrapper>() // => err

Perhaps I could extend this further if I had more context for the problem you're trying to solve.

Just out of curiosity, why not just create the wrapper by providing the chemical?

Wrapper { inner: "1 FUEL".parse::<Chemical>().unwrap() }
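
For illustration, here is a minimal std-only sketch of what "compose" could mean when written by hand: Wrapper's FromStr simply delegates to Chemical's FromStr. The hand-written parsing is a rough stand-in for the regex, not what #[derive(Recap)] actually generates, and the String error type is an assumption made for brevity.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
struct Chemical {
    quantity: u32,
    name: String,
}

impl FromStr for Chemical {
    type Err = String;

    // Rough hand-written stand-in for `(?P<quantity>\d+) (?P<name>\w+)`;
    // unlike the regex, it does not check that `name` is made of word characters.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let (quantity, name) = s
            .trim()
            .split_once(' ')
            .ok_or_else(|| format!("expected `<quantity> <name>`, got {:?}", s))?;
        let quantity = quantity
            .parse::<u32>()
            .map_err(|e| format!("bad quantity {:?}: {}", quantity, e))?;
        Ok(Chemical {
            quantity,
            name: name.to_string(),
        })
    }
}

#[derive(Debug, PartialEq)]
struct Wrapper {
    inner: Chemical,
}

impl FromStr for Wrapper {
    type Err = String;

    // Composition: the wrapper hands its text to the field's own parser.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(Wrapper {
            inner: s.parse::<Chemical>()?,
        })
    }
}

fn main() {
    let wrapper: Wrapper = "1 FUEL".parse().unwrap();
    println!("{:?}", wrapper);
}
```

With this in place, "1 FUEL".parse::<Wrapper>() succeeds because the inner field is parsed from a string rather than expected as an already-structured value.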

@ExpHP
Author

ExpHP commented Dec 17, 2019

The use case is described here:

https://users.rust-lang.org/t/deserializing-a-vector-with-recap/35726

1 LFDGN => 7 DMPX
1 PFNM, 14 MVSK => 3 VQCQ
14 HJLX, 3 KGKVK, 1 XQSVS => 6 HGSM

#[derive(Debug, Deserialize, Recap)]
#[recap(regex = r#"(?P<quantity>\d+) (?P<name>\w+)"#)]
struct Chemical {
    quantity: u32,
    name: String,
}

#[derive(Debug, Deserialize, Recap)]
#[recap(regex = r#"^(?P<inputs>\d+ \w+(, )?)+ => (?P<output>\d+ \w+)$"#)]
struct Reaction {
    inputs: Vec<Chemical>,
    output: Chemical,
}
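
To make the intended result concrete, here is a std-only sketch of the parse that the two derives above would ideally produce. It assumes hand-written FromStr impls in place of the regexes, and a String error type chosen only to keep the example short; none of this is recap's generated code.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
struct Chemical {
    quantity: u32,
    name: String,
}

impl FromStr for Chemical {
    type Err = String;

    // Parses "<quantity> <name>", e.g. "14 MVSK".
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let (quantity, name) = s
            .trim()
            .split_once(' ')
            .ok_or_else(|| format!("expected `<quantity> <name>`, got {:?}", s))?;
        let quantity = quantity
            .parse::<u32>()
            .map_err(|e| format!("bad quantity {:?}: {}", quantity, e))?;
        Ok(Chemical {
            quantity,
            name: name.to_string(),
        })
    }
}

#[derive(Debug, PartialEq)]
struct Reaction {
    inputs: Vec<Chemical>,
    output: Chemical,
}

impl FromStr for Reaction {
    type Err = String;

    // Parses "<chemical>, <chemical>, ... => <chemical>".
    fn from_str(line: &str) -> Result<Self, Self::Err> {
        let (inputs, output) = line
            .split_once(" => ")
            .ok_or_else(|| format!("missing ` => ` in {:?}", line))?;
        // Each comma-separated piece is parsed by Chemical's own FromStr;
        // the first failure aborts the whole collect.
        let inputs = inputs
            .split(", ")
            .map(|piece| piece.parse::<Chemical>())
            .collect::<Result<Vec<_>, _>>()?;
        Ok(Reaction {
            inputs,
            output: output.parse::<Chemical>()?,
        })
    }
}

fn main() {
    let lines = [
        "1 LFDGN => 7 DMPX",
        "1 PFNM, 14 MVSK => 3 VQCQ",
        "14 HJLX, 3 KGKVK, 1 XQSVS => 6 HGSM",
    ];
    for line in lines {
        let reaction: Reaction = line.parse().unwrap();
        println!("{:?}", reaction);
    }
}
```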

@mmirate

mmirate commented Jan 15, 2021

Luckily, envy (the deserializer recap uses under the hood) happens to split strings on the comma character when asked for a sequence, so a simple wrapper type is all that's needed to make this work:

use recap::Recap;
use serde::Deserialize;
use std::error::Error;

mod stringly_typed {
    use serde::Deserialize;
    use std::str::FromStr;

    pub(crate) struct StringlyTyped<T>(T);
    impl<T: std::fmt::Debug> std::fmt::Debug for StringlyTyped<T> {
        fn fmt(
            &self,
            f: &mut std::fmt::Formatter<'_>,
        ) -> std::fmt::Result {
            std::fmt::Debug::fmt(&self.0, f)
        }
    }
    impl<'de, T: Deserialize<'de> + FromStr> Deserialize<'de> for StringlyTyped<T>
    where
        <T as std::str::FromStr>::Err: std::fmt::Display,
    {
        fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
        where
            D: serde::Deserializer<'de>,
        {
            let s: String = Deserialize::deserialize(deserializer)?;
            Ok(StringlyTyped(s.parse().map_err(serde::de::Error::custom)?))
        }
    }
}

use stringly_typed::StringlyTyped;

#[derive(Debug, Deserialize, Recap)]
#[recap(regex = r#"(?P<quantity>\d+) (?P<name>\w+)"#)]
struct Chemical {
    quantity: u32,
    name: String,
}

#[derive(Debug, Deserialize, Recap)]
#[recap(regex = r#"^(?P<inputs>(\d+ \w+(, )?)+) => (?P<output>\d+ \w+)$"#)]
struct Reaction {
    // when asked for a Vec-like thing, envy's deserializer format
    // deserializes a string and splits it on ","
    inputs: Vec<StringlyTyped<Chemical>>,
    output: StringlyTyped<Chemical>,
}

fn main() -> Result<(), Box<dyn Error>> {
    let chems = r#"
1 LFDGN => 7 DMPX
1 PFNM, 14 MVSK => 3 VQCQ
14 HJLX, 3 KGKVK, 1 XQSVS => 6 HGSM"#;

    for line in chems.lines().map(str::trim).filter(|s| !s.is_empty()) {
        let entry: Reaction = line.parse()?;
        eprintln!("{:?}", entry);
    }

    Ok(())
}

The only annoying part is that the intermediate deserialized value is a Vec<String> rather than a Vec<&str> (per #2), so it's going to be either inefficient or optimization-reliant.
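
The difference between the two intermediate shapes can be sketched with std alone. This does not reproduce envy's internals; split_borrowed and split_owned are hypothetical helper names used only to contrast slices that borrow from the input with freshly allocated Strings.

```rust
/// Splits on commas, yielding slices that borrow from `inputs`
/// (no per-item string allocation).
fn split_borrowed(inputs: &str) -> Vec<&str> {
    inputs.split(',').map(str::trim).collect()
}

/// Same split, but every piece is copied into its own `String`.
fn split_owned(inputs: &str) -> Vec<String> {
    inputs.split(',').map(|s| s.trim().to_string()).collect()
}

fn main() {
    let inputs = String::from("14 HJLX, 3 KGKVK, 1 XQSVS");
    let borrowed = split_borrowed(&inputs);
    let owned = split_owned(&inputs);

    // Both hold the same text.
    assert_eq!(borrowed, owned);

    // The borrowed pieces point into the original buffer; the owned ones do not.
    let buffer = inputs.as_bytes().as_ptr_range();
    assert!(buffer.contains(&borrowed[0].as_ptr()));
    assert!(!buffer.contains(&owned[0].as_ptr()));

    println!("{:?}", borrowed);
}
```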

FuegoFro added a commit to FuegoFro/recap that referenced this issue Dec 26, 2023
This enables Recap to work for multiple layers of nested structs, as seen in the `custom_deserialize_allows_nested_structs` test. The impetus for this came from things like https://adventofcode.com/2023/day/19, where parsing the workflow with rules with optional conditions gets much easier if you can just describe it as 3 nested structs.

Details on the implementation are written in the `derive_impl_deserialize` function, but at a high level, when enabled, this creates a helper struct that mirrors the original one and then handles most deserialization requests by first deserializing into the helper struct and then moving the data to an instance of the original struct. The main difference is custom behavior for `str`, which allows us to parse the `str` using the regex and *then* forward to the helper struct.

This also adds integration tests for the derive definition and updates the documentation to describe the new behavior.

Note that currently the `Deserialize` implementation has some limitations, like not working with zero-copy structs (structs that have lifetimes) or generic types (though generics already seem not to work with the existing Recap derivation). This also changes the `Recap` proc-macro to allow/handle `serde` attributes, which could lead to false negatives: when the `handle_deserialize` attribute is *not* present, people may add attributes that end up unused without getting compiler warnings for them. The only way around this that I'm aware of would be to use a different derive name (e.g. something like `#[derive(RecapDeserialize)]`), which I'm certainly open to as an alternative to the current `#[recap(handle_deserialize)]` approach if you'd prefer.

Fixes softprops#10.
3 participants