Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Biplate Macro #2] New Uniplate Derive Macro #290

Merged
merged 2 commits into from
Sep 20, 2024

Conversation

niklasdewally
Copy link
Contributor

@niklasdewally niklasdewally commented Apr 3, 2024

This PR adds the new Uniplate derive macro.

The reason for this rewrite was to add Biplates and "multiple-type traversals"; this is discussed further in this wiki article: https://github.com/conjure-cp/conjure-oxide/wiki/2024%E2%80%9003-Implementing-Uniplates-and-Biplates-with-Structure-Preserving-Trees

Implementation

The library syn supports the creation of custom AST nodes. Using the Parse trait, one can implement parsers for these custom nodes by combining existing Rust AST nodes together as well as parsing Rust tokens by hand.

The macro code has two parts: parsing into a custom AST, and using this AST for code-gen. This simplifies code-gen, as only the data needed for Uniplate is encoded in the AST. This also allows for better error reporting, as Most errors are now reported during parsing, allowing errors to be associated with the specific tokens that caused them.

The biggest change in my AST compared to Rust's is the representation of types. Uniplate has three kinds of type: boxed pointer types, plateable types, and unpalatable types.

Each plateable types contains some wrapper types and a base type. For example, the input type Vec<Vec<A> would be split into wrappers Vec,Vec, and a base type A. This is useful as we want to generally derive implementations for A only. Vectors are implicitly unwrapped using default implementations.

Boxed pointer types represent smart pointers such as Box and Rc. These use the .borrow() function instead of & to pass the inner value to the Biplate instance. Generating instances for Box<A> and Rc<A> would mean that values are constantly moved from stack to heap - this avoids this.

Testing

I added trybuild, a library for UI testing derive macros. The Rust tester does not run at-all when test code does not compile, but we need to test whether our macro gives a compile error when it should. trybuild also checks that the error messages given by the compiler have not changed.

Based on #301

@niklasdewally niklasdewally marked this pull request as ready for review April 5, 2024 20:26
@niklasdewally niklasdewally marked this pull request as draft April 7, 2024 10:28
@niklasdewally niklasdewally force-pushed the uniplate-derive branch 9 times, most recently from 0cd0da5 to 86c6ee0 Compare April 13, 2024 10:57
@niklasdewally niklasdewally changed the title Uniplate derive New Uniplate Derive Macro Apr 20, 2024
@niklasdewally
Copy link
Contributor Author

niklasdewally commented Apr 20, 2024

@ozgurakgun @gskorokhod I think this is ready for a preliminary review.

Known issues:

  • My Biplate implementation that unwraps vectors is not working for nested vectors and I don't know why.
  • The macro currently assumes that all types inside an enum contain each-other - I need a way to say that (for example) a String cannot contain an int. This would probably be in the form of a parameter l specifying which types are recursive.

These errors can be seen by removing the comments in the examples/stmt.rs file.

I've got some ideas for how to fix these, and some examples I could writeup to show specifically the underlying problems, but I don't have the time to do this just yet. A general code review would still be helpful though.

@niklasdewally niklasdewally self-assigned this Apr 20, 2024
@niklasdewally niklasdewally added area::uniplate kind::feature New feature or request kind::refactor Improvements to existing code (style, performance, clarity, ...) labels Apr 20, 2024
@ChrisJefferson
Copy link
Contributor

There seems to be a bunch of compile issues, I tried just running cargo check to start, and got errors about private crates, and lots of errors which look like:

error[E0599]: no method named `with_children` found for reference `&Expression` in the current scope
   --> crates/conjure_core/src/rules/minion.rs:293:22
    |
293 |                 expr.with_children(sub)
    |                      ^^^^^^^^^^^^^ method not found in `&Expression`
    |
    = help: items from traits can only be used if the trait is implemented and in scope
    = note: the following trait defines an item `with_children`, perhaps you need to implement it:
            candidate #1: `Uniplate`

@niklasdewally
Copy link
Contributor Author

niklasdewally commented Apr 20, 2024

The entire project doesn't compile as I haven't refactored conjure oxide to use the new types yet.

The uniplate crate specifically should though!

@niklasdewally
Copy link
Contributor Author

niklasdewally commented Apr 30, 2024

RFC @gskorokhod @ozgurakgun @lixitrixi @ChrisJefferson

This is the blocker for this PR; any help would be appreciated! I don't really know how to explain it very well in writing (especially as I last looked at this indepth 2 weeks ago), but hopefully this gets the core of the issue across!.


To decide which Biplates to generate, each type found inside the given type is added to a list. A Biplate instance is generated from the given type to each type in this list.

#[derive(Uniplate)]
enum A {
  B(F,G),
  C(H),
  D(i32),

Deriving Uniplate on A will generate the trait instances: Uniplate for A, Biplate<F> for A,Biplate<G> for A,Biplate<H> for A, Biplate<i32> for A.

Under the hood, Biplates and Uniplates are just recursive invocations of Biplate that walk towards the target type. Currently, for each field in an enum variant, the generated code calls the Biplate that goes from that fields’ type to the target type (<FieldTy as Biplate<TargetTy>>::biplate(self)).

Consider the instance Biplate<G> for B. For the B variant, it will call <F as Biplate<G>> and <G as Biplate<G>>. Suppose F does not contain G. In that case, there will not exist a Biplate instance from F to G (as instances are only generated for inner types of F); therefore, Rust will throw a compile error.

Another issue is that it will try to call <i32 as Biplate<G>>. i32 is a primitive type we don’t control and due to trait orphaning we cannot paste an instance for Biplate for it inside the users code.

The main blocker for this PR is that we need a UI to decide what instances to generate. I’m a bit stuck on what the best way to go about this would be.

Also note that traits specialisation does not exist, and generic traits in rust do not yet have mutually exclusive bounds, so what we can do with blanket implementations is very limited.

@ozgurakgun
Copy link
Contributor

picking this up part by part, for <F as Biplate<G>> and

Suppose F does not contain G. In that case, there will not exist a Biplate instance from F to G (as instances are only generated for inner types of F); therefore, Rust will throw a compile error.

Why not generate that instance in the trivial way?

@ozgurakgun
Copy link
Contributor

Another issue is that it will try to call <i32 as Biplate<G>>. i32 is a primitive type we don’t control and due to trait orphaning we cannot paste an instance for Biplate for it inside the users code.

we might have to special case primitive types. I see we already have derive_unplateable, do these create orphan instances then?

@niklasdewally
Copy link
Contributor Author

picking this up part by part, for <F as Biplate<G>> and

Suppose F does not contain G. In that case, there will not exist a Biplate instance from F to G (as instances are only generated for inner types of F); therefore, Rust will throw a compile error.

Why not generate that instance in the trivial way?

I'll give a more concrete example:

#[derive(Eq, PartialEq, Clone, Debug, Uniplate)]
enum Stmt {
    Assign(String, Expr),
    Sequence(Vec<Stmt>),
    If(Expr, Box<Stmt>, Box<Stmt>),
    While(Expr, Box<Stmt>),
}

#[derive(Eq, PartialEq, Clone, Debug, Uniplate)]
enum Expr {
    Add(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
    Val(i32),
    Var(String),
    Neg(Box<Expr2>),
}

#[derive(Eq, PartialEq, Clone, Debug, Uniplate)]
enum Expr2 {
  Modulo(Box<Expr>)
}

When derive runs on Stmt it generates the instances:

<Stmt as Biplate<String>>
<Stmt as Biplate<Expr>>
<Stmt as Biplate<Stmt>>
<Stmt as Uniplate>

When derive runs on Expr it generates the instances:

<Expr as Biplate<i32>>
<Expr as Biplate<String>>
<Expr as Biplate<Expr>>
<Expr as Uniplate>

Here the problematic case is <Expr as Biplate<Stmt>>, which would be called in the generated code for <Stmt as Biplate<Stmt>>.

The macro runs on each type in isolation, generating biplates from that type to all detected inner types. If we generate all instances from each type and to each type in Expr, this may solve this issue, but we could get duplicate conflicting implementations of Biplate for a pair of types. Also, this would mean that we cannot use the type information of the from type properly during derivation.

We have no way of checking during macro execution if an implementation is missing or is already present, so we must instead allow the user to specify what additional instances to generate.

In the Haskell direct derive the user specifies all instances to generate by hand, using the compiler to tell them which ones are missing. I think we can do better here.

An example stack project for this is here: (uses dependencies too old for Apple Silicon unfortunately)
https://github.com/niklasdewally/uniplate-bits/tree/main/hs-derive-orig

@niklasdewally
Copy link
Contributor Author

niklasdewally commented May 1, 2024

There seems to be a bunch of compile issues, I tried just running cargo check to start, and got errors about private crates, and lots of errors which look like:

error[E0599]: no method named `with_children` found for reference `&Expression` in the current scope
   --> crates/conjure_core/src/rules/minion.rs:293:22
    |
293 |                 expr.with_children(sub)
    |                      ^^^^^^^^^^^^^ method not found in `&Expression`
    |
    = help: items from traits can only be used if the trait is implemented and in scope
    = note: the following trait defines an item `with_children`, perhaps you need to implement it:
            candidate #1: `Uniplate`

I think that I will make another PR moving Uniplate on Conjure Oxide to a manual implementation using the new types. This would help me iterate on the macro implementation without breaking Conjure Oxide as often.

@niklasdewally
Copy link
Contributor Author

niklasdewally commented May 1, 2024

Another issue is that it will try to call <i32 as Biplate<G>>. i32 is a primitive type we don’t control and due to trait orphaning we cannot paste an instance for Biplate for it inside the users code.

we might have to special case primitive types. I see we already have derive_unplateable, do these create orphan instances then?

Thats right - it is just a simple substitution filling in some predefined terminal instances impl<T> Biplate<T> for $ty

niklasdewally added a commit to niklasdewally/conjure-oxide that referenced this pull request May 1, 2024
* Replace the #[derive(Uniplate)] with a manual implementation.

* Use the new uniplate::biplate::Uniplate type instead of
  uniplate::uniplate::Uniplate. This will be the only Uniplate type in
  the future.

These changes are to help development of the new Uniplate derive macro.

In conjure-cp#290, I am in the
process of removing the old uniplate type (uniplate::uniplate::Uniplate)
and rewriting the macro - the changes made in this commit ensure that
during development of this PR Conjure Oxide still compiles
and passes the tests.
@niklasdewally niklasdewally force-pushed the uniplate-derive branch 4 times, most recently from 0ed0a27 to 1eecd51 Compare May 1, 2024 20:08
@niklasdewally
Copy link
Contributor Author

niklasdewally commented May 1, 2024

Still work in progress, but I have added some syntax thats specifies what instances to generate. It fixes all issues to do with missing trait implementations; however, it requires the user to manually specify which types should be walked into when looking for children.

use uniplate::Uniplate;

#[derive(Eq, PartialEq, Clone, Debug, Uniplate)]
#[uniplate()]
#[biplate(to=String,walk_into=(Expr))]
enum Stmt {
    Assign(String, Expr),
    //Sequence(Vec<Stmt>),
    If(Expr, Box<Stmt>, Box<Stmt>),
    While(Expr, Box<Stmt>),
}

#[derive(Eq, PartialEq, Clone, Debug, Uniplate)]
#[uniplate()]
#[biplate(to=String)]
enum Expr {
    Add(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
    Val(i32),
    Var(String),
    Neg(Box<Expr>),
}

Forgetting to add a type to walk_into could cause annoying bugs, so I think an opt-out mechanism (don't generate instances for these two types) may be better than this. However, this is significantly simpler to implement.

@niklasdewally niklasdewally force-pushed the uniplate-derive branch 3 times, most recently from 5ed75ca to 0aced1f Compare May 13, 2024 19:34
niklasdewally added a commit to niklasdewally/conjure-oxide that referenced this pull request Jun 27, 2024
* Use a manual implementation of Uniplate on Expression instead of the
  derive macro.

* Use the new uniplate::biplate::Uniplate type instead of
  uniplate::uniplate::Uniplate. This will be the only Uniplate type in
  the future.

These changes are to help development of the new Uniplate derive macro.

In conjure-cp#290, I am in the
process of removing the old uniplate type (uniplate::uniplate::Uniplate)
and rewriting the macro - the changes made in this commit ensure that
during development of this PR Conjure Oxide still compiles
and passes the tests.
niklasdewally added a commit to niklasdewally/conjure-oxide that referenced this pull request Jun 30, 2024
* Use a manual implementation of Uniplate on Expression instead of the
  derive macro.

* Use the new uniplate::biplate::Uniplate type instead of
  uniplate::uniplate::Uniplate. This will be the only Uniplate type in
  the future.

These changes are to help development of the new Uniplate derive macro.

In conjure-cp#290, I am in the
process of removing the old uniplate type (uniplate::uniplate::Uniplate)
and rewriting the macro - the changes made in this commit ensure that
during development of this PR Conjure Oxide still compiles
and passes the tests.
@niklasdewally niklasdewally changed the title New Uniplate Derive Macro [Biplate Macro #2] New Uniplate Derive Macro Jun 30, 2024
@niklasdewally niklasdewally force-pushed the uniplate-derive branch 2 times, most recently from 581ecca to cb24ca2 Compare June 30, 2024 15:08
@niklasdewally niklasdewally force-pushed the uniplate-derive branch 6 times, most recently from 5bd12a5 to 18727ed Compare September 19, 2024 13:49
Reimplement the uniplate-derive macro to allow for Biplates and
"multiple type traversals" as discussed in [Niklas Dewally, 2024‐03:
Implementing Uniplates and Biplates with Structure Preserving Trees].
(currently at
https://github.com/conjure-cp/conjure-oxide/wiki/2024%E2%80%9003-Implementing-Uniplates-and-Biplates-with-Structure-Preserving-Trees).

Implementation overview
=======================

The library syn supports the creation of custom AST nodes. Using the
Parse trait, one can implement parsers for these custom nodes by
combining existing Rust AST nodes together as well as parsing Rust
tokens by hand.

The macro code has two parts: parsing into a custom AST, and using this
AST for code-gen. This simplifies code-gen, as only the data needed for
Uniplate is encoded in the AST. This also allows for better error
reporting, as Most errors are now reported during parsing, allowing
errors to be associated with the specific tokens that caused them.

The biggest change in Uniplate's AST compared to Rust's is the
representation of types. Uniplate has three kinds of type: boxed pointer
types, plateable types, and unpalatable types.
Plateable types include the current type being derived, or other types
explicitly specified through the walk_through attribute in the macro
invocation.

Boxed pointer types represent smart pointers such as Box and Rc. These
use the .borrow() function instead of & to pass the inner value to the
Biplate instance. Generating instances for Box<A> and Rc<A> would mean
that values are constantly moved from stack to heap - this avoids this.

Implementations for standard library iterator types and primitives are
provided by Uniplate. These implementations are generated by declarative
macro, allowing easy extension to more collection types in the future.

Testing
=======

Add trybuild, a library for UI testing derive macros. The Rust tester
does not run at-all when test code does not compile, but we need to test
whether our macro gives a compile error when it should. trybuild also
checks that the error messages given by the compiler have not changed.

Two test ASTs are used for Uniplate and Biplates: a toy
statement-expression found in the original paper, and the current
Conjure Oxide AST. Property-based testing in paper.rs shows that
children() and with_children() work for random homogenous AST's,
including with boxes and iterators. As these two methods essentially act
as wrappers for the child tree and context function returned by the
uniplate() / biplate() methods, this test is sufficient to show that
automatically derived Uniplate implementations act as intended.

Signed-off-by: Niklas Dewally <niklas@dewally.com>
@niklasdewally niklasdewally marked this pull request as ready for review September 19, 2024 14:04
@niklasdewally
Copy link
Contributor Author

Working and ready for review @ozgurakgun 🚀

I think the documentation and UI of the macro still need some work, but I think merging this as is would be best.

@niklasdewally niklasdewally removed the request for review from gskorokhod September 19, 2024 14:07
Replace the manual implementation of the Uniplate and Biplate traits for
the AST with the derive macro.

Adding uniplate_derive to the ast broke is_enum_variant, therefore this
commit also removes and replaces all invocations of is_xxx() functions
with let..else or pattern matching.

Signed-off-by: Niklas Dewally <niklas@dewally.com>
@niklasdewally
Copy link
Contributor Author

cc: @lixitrixi

@ozgurakgun
Copy link
Contributor

brilliant. I think I'd like a walk through before we merge though. tomorrow afternoon? I can do 3.30.

@ozgurakgun ozgurakgun merged commit dd2d278 into conjure-cp:main Sep 20, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind::feature New feature or request kind::refactor Improvements to existing code (style, performance, clarity, ...)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants