Skip to content

Conversation

@claytonrcarter
Copy link
Collaborator

@claytonrcarter claytonrcarter commented Aug 21, 2022

This was fun to implement, and was pretty easy. Considering how flexible and powerful it could be ... I'm sort of concerned I'm overlooking something! 😄 As always, I'm 100% open to suggestions here.

  • This does nothing to check for -- let alone prevent -- recursion in aliases. If someone defines a foo alias that itself calls foo() ... they're out of luck.
  • This only allows aliasing functions; I did not consider allowing aliases for names, though it would be easy to add.
    • I kind of feel like it's not necessary, though. Aliasing names risks conflicting w/ commits and refs, while functions have no such conflict. And aliases feel a little more "functiony" anyway.
  • I made no attempt to include the defined aliases in the EvalError::UnboundFunction output. It wasn't immediately obvious to me how to list all config keys defined in the section, but I should admit that I didn't look very hard. 😄 Including them would certainly be kinder and would aid in discoverability.
  • This has me wondering about git branchless shipping w/ some predefined aliases of it's own, which would be installed via init. Doing so would allow us to use Rust for functions that need some intricate logic, while aliases could be used for queries composable from the Rust-based "building block" functions. I'm thinking of something like siblings=children(parents($1)) - $1, onlyParent=exactly(parents($1), 1), sole=exactly($1, 1), etc
    • One downside to this, though, is that errors might end up being surprising. For example, if I call siblings(foo) but get an error about children or parents, perhaps that could be confusing?
    • Anyway, just an idea, not in scope for this PR.
  • I chose branchless.revsets.alias as the config key, and this shows up in the .config/git/config file like so:
[branchless "revsets.alias"]
    grandChildren = children(children($1))
    oldest = sole(roots(intersection(ancestors($1), descendants($1))))
    sole = exactly($1, 1)
    onlyChild = sole(children($1))
  • I don't have strong feelings about the config key, and I'm happy to change to match whatever you'd prefer. As I write this, though, I see that I should drop the git- and just use branchless.revsets.alias ... I'll have to fix that tomorrow Fixed.

Thank you!

@claytonrcarter
Copy link
Collaborator Author

Oh dear! I just realized that I pushed these three recent branches directly to this repo, and not to my fork. I'm sorry about that! (It was a misfire w/ git stack.) I'll be more careful in the future. 🤦

Copy link
Owner

@arxanas arxanas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only allows aliasing functions; I did not consider allowing aliases for names, though it would be easy to add.

That seems sensible. After all, the user can just call foo() with no arguments anyways.

I made no attempt to include the defined aliases in the EvalError::UnboundFunction output. It wasn't immediately obvious to me how to list all config keys defined in the section, but I should admit that I didn't look very hard. 😄 Including them would certainly be kinder and would aid in discoverability.

That's probably fine for now. If we wanted to implement this, we would probably add variable/function lookup contexts and iterate through those. To list all config, I think we would want the Config::entries function (which currently isn't wrapped by git-branchless's Git APIs).

This has me wondering about git branchless shipping w/ some predefined aliases of it's own, which would be installed via init.

If we do, won't it be more or less equivalent to having defined them in the code?

It's worth calling attention to the ... hacky? ... crude? ... expeditious? way that I implemented this: When an unrecognized fn is encountered, we check to see if a corresponding alias is defined. If so, we serialize each element of args (from Expr back into Strings) and replace each placeholder in the alias "template" w/ the corresponding serialized arg. Then we re-parse the alias (w/ the placeholders replaced) and send the result back through eval_inner.

To speak more accurately about this, you've implemented a user-defined macro system for revset aliases (as opposed to a user-defined function system). In this case, your macro system is entirely textual.

For the sake of discussion, I've pushed a fixup commit w/ an alternate approach. Instead of serializing and reparsing expressions

And this is a syntactic macro system, in which you find the AST node and you replace it with some other nodes.

As you've noticed, there are some downsides to the macro approach:

  • The same expression might be evaluated more than once.
  • The user might get confused because error messages might refer to macro-expanded code that they can't see.

A more robust option might be to just parse the alias as is (w/ placeholders not replaced) but then pass some sort of "unbound arg resolver" callback through the eval system, and that callback could be used to resolve things like $n placeholders into parsed Exprs. I suspect that this might introduce some significant complexity into the eval system, though.

The other implementation approach would be to define lookup contexts for variables/functions, rather than using macros. This is how gitrevset does it: https://github.com/quark-zju/gitrevset/blob/50234203913331641aae1d479b5f861869f39cda/src/eval.rs#L21-L31

At startup, you would populate the contexts with the set of variables (initially none) and functions (the predefined ones and then add all aliases that the user has defined). For each user entry, you would need to analyze it and construct some kind of function object (currently we use type FnType = &'static (dyn Fn(&mut Context, &str, &[Expr]) -> EvalResult + Sync), but that might not be suitable if we need to close over more state). During the evaluation of the function in question, you would add $1/$2/etc. to the variable lookup context based on the number of actual arguments, and you would want to make sure that these are not inherited by any functions called by the alias.

The advantages would be:

  • We can prevent the evaluation of the arguments more than once.
  • We might be able to produce better error messages since we can traverse the logical call stack and say something like
In expression `sole(foo())`
               ^^^^^^^^^^^
with `sole = exactly($1, 1)`
                     ^^
error: expected `$1` to evaluate to 1 element, but got 3

That would indeed be a lot of complexity and I don't think the advantages are that important here, so we can stick with the macro approach indefinitely (unless you want a challenge 😉).

@claytonrcarter
Copy link
Collaborator Author

Thank you for the review. I've combined the previous fixup commit into the base commit, and I've added a new fixup commit w/ today's changes. If you don't mind, I'd like you to double check my work on a couple of pieces that I'm a little uncertain on. (I've noted those in other comments.)

That seems sensible. After all, the user can just call foo() with no arguments anyways.

My thinking exactly. 😉

If we do, won't it be more or less equivalent to having defined them in the code?

Yup. I don't see any real upsides, I guess, aside from perhaps providing demonstration on aliases/macros.

unless you want a challenge

Not right now. 😄 I'm happy to use this implementation, and then change it if that becomes necessary or prudent.

I think I see what you're saying though: the text-macro system (as originally implemented here) leads to repeated parsing and evaluation of sets/arguments, the syntactic-macro system (currently implemented here) avoids repeated parsing but will still do repeated evaluation of sets, while the system in use by gitrevset parses and evaluates everything only once.

Alias can be defined via git config, either globally or per repo.

For example:
```
$ git config --global \
    git-branchless.revsets.alias.oldest \
    'sole(roots(intersection(ancestors($1), descendants($1))))'

$ git query 'oldest(message(fixup))'
...

$ git config \
    git-branchless.revsets.alias.grandChildren \
    'children(children($1))'

$ git query 'grandChildren(abc123)'
...
```
@claytonrcarter claytonrcarter merged commit 70b63c0 into master Aug 30, 2022
@claytonrcarter claytonrcarter deleted the revset-aliases branch August 30, 2022 01:52
@claytonrcarter
Copy link
Collaborator Author

I've added a basic write up about aliases to the revset docs at https://github.com/arxanas/git-branchless/wiki/Reference:-Revsets#Aliases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants