Syntax extension: add default values for arguments #4328

Open
fingolfin opened this issue Mar 22, 2021 · 3 comments

Comments

@fingolfin
Member

It would be nice to have a convenient way to provide default values for function arguments. Right now, one has to write code like this:

# usage:  f(x, y[, z])  if z not given then it defaults to 42
f := function(x, y, more...)
  local z;
  if Length(more) = 0 then
    z := 42;
  elif Length(more) = 1 then
    z := more[1];
  else
    Error("Too many arguments");
  fi;
  ...
end;

It would be much nicer if one could write something like this, similar to what languages like C++, Python, Julia, ... allow you to do:

# usage:  f(x, y[, z])  if z not given then it defaults to 42
f := function(x, y, z := 42)
  ...
end

Caveats

Printing

Printing a function with default argument values should be tested; I imagine it'd look like this

function ( x, y, z := 42  )
    ...
end

Evaluation of default arguments

Default arguments can be any expression:

f := function ( x, y, z := LogInt(42, 2)  )
    ...
end

We might even allow these to refer to other arguments (to avoid ambiguities, I'd suggest that only preceding arguments may be referenced; that would also greatly simplify parsing of such expressions):

f := function ( x, y := x+1, z := x+y  )
    ...
end

That means we must be wary about when to actually evaluate these expressions, and in which context.

There are at least two major options, both with their pros and cons:

  1. evaluation at time of definition: i.e., when we parse the function, we immediately evaluate the default arguments, and only store the result.
    • That is likely by far the easiest to implement.
    • It would not support function ( x, y := x+1, z := x+y )...
    • It would allow function ( x, y := Factorial(30) ), computing Factorial(30) only once and storing the result.
    • For f := function(x := "") return x; end; we'd have f() always return the same string object.
  2. evaluation at time of execution
    • function ( x, y := x+1, z := x+y )... would work.
    • function ( x, y := Factorial(30) ) would recompute Factorial(30) each time the default value is used.
    • For f := function(x := "") return x; end; we'd have f() always return a new string.
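The cost difference between the two options can be observed in Python, which natively uses option 1. The following is only a sketch of the semantics, not of any GAP implementation; `expensive_default`, `_MISSING` and the counter are hypothetical names standing in for `Factorial(30)` and for "argument not given".

```python
import math

evaluations = 0  # how many times the default expression has been evaluated


def expensive_default():
    """Stand-in for Factorial(30) that counts its own evaluations."""
    global evaluations
    evaluations += 1
    return math.factorial(30)


# Definition-time semantics (what Python does natively): the default
# expression runs once, while the 'def' statement itself is executed.
def at_definition(x, y=expensive_default()):
    return y


_MISSING = object()  # hypothetical sentinel meaning "argument not given"


# Call-time semantics, emulated: the default expression runs on every
# call that omits y.
def at_call(x, y=_MISSING):
    if y is _MISSING:
        y = expensive_default()
    return y


at_definition(0)
at_definition(0)
count_after_definition_style = evaluations  # only the evaluation done by 'def'

at_call(0)
at_call(0)
count_after_call_style = evaluations  # two more, one per call
```

Under definition-time evaluation the counter stays at one however often the function is called; under call-time evaluation it grows with every call that relies on the default.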

More on what I think we could do below, under "Implementation".

Use in closures

f := x -> {y := x, z := y+1} -> [x, y, z]

Introspection

It should perhaps be possible to retrieve the default values from a function object? But it's not clear what that should do if they are not constants (see also the preceding section).

But the syntax tree code definitely should be extended to work with these, and of course also tests for this should be added.

Implementation

... if evaluating at definition time

If we are willing to evaluate at definition time, then this is fairly easy to implement. Start in ReadFuncArgList (from src/read.c): change the loop while (rs->s.Symbol == S_COMMA) { so that it also accepts S_ASSIGN. When we see S_ASSIGN, parse one expression (the default argument) and evaluate it immediately. Then store the result in the BodyHeader::values list, so that we can reference it later on.

... if evaluating at call time

If we want to evaluate at call time, the easiest and cleanest way (albeit not the most efficient!) to implement this would probably be to put each default value expression into a separate helper function. A definition like

f := function ( x, y := x+1, z := x+y  )
    ...
end

could be translated into something like the following; note that the helpers can only access the preceding arguments, not any later arguments, nor can they modify any other arguments or local variables:

tmp1 := {x} -> x+1;
tmp2 := {x,y} -> x+y;
f := function ( x, y, z  )
  # The following would not work in a "regular" function, but here we could
  # install special C handlers which leave unset arguments as "unbound"; this
  # is actually trivial to do, as arguments and local variables are really the
  # same thing in GAP, only distinguished by whether they are assigned at the
  # start of the function or not.
  if not IsBound(y) then y := tmp1(x); fi;
  if not IsBound(z) then z := tmp2(x, y); fi;
  ...
end;
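The behaviour this translation is meant to produce can be modelled in Python. This is a rough sketch of the semantics only; `_UNBOUND` is a hypothetical marker playing the role of an unbound argument, which Python does not have natively.

```python
_UNBOUND = object()  # hypothetical marker for an argument the caller left out


def tmp1(x):
    # helper for the default of y; it can only see the preceding argument x
    return x + 1


def tmp2(x, y):
    # helper for the default of z; it can see x and y, but nothing later
    return x + y


def f(x, y=_UNBOUND, z=_UNBOUND):
    # fill unset arguments left to right, so that each helper receives
    # the final values of all preceding arguments
    if y is _UNBOUND:
        y = tmp1(x)
    if z is _UNBOUND:
        z = tmp2(x, y)
    return [x, y, z]
```

Because the helpers are ordinary functions of the preceding arguments rather than closures, they cannot reach later arguments or locals of `f`, which matches the restriction described above.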

Implementation again starts in ReadFuncArgList (from src/read.c): change the loop while (rs->s.Symbol == S_COMMA) { so that it also accepts S_ASSIGN. When we see S_ASSIGN, call a helper which takes the arguments parsed by ReadFuncArgList so far, uses them to start a new function expression, and parses one expression (the default argument) into it.

Caveat: at the time we read the argument list, there is not yet a function body for the function whose arguments we are reading. As such, we can't easily store the default expression inside the main function; nor can we emit the desired if not IsBound(foo) then foo := BAR; fi; at that point.

As a variant, we could also define tmp1 and tmp2 as closures; this way they could also perform fancy tricks, e.g. taking later arguments into account, or modifying other arguments or local variables from within the expression. However, I think this is quite a bit more involved to implement. Then again, I am not sure I'd consider these "features"... Anyway, this variant only adds "abilities" compared to the previous form, so we could always switch to it later on, if desired.

@ChrisJefferson
Contributor

I think this would be a nice feature.

I'm personally more tempted towards evaluation at call time, because Python does evaluation at definition time, and it often confuses me. In particular, if you have a function like the following, if you build the list once and keep using the same one, the list acquires another value every time you call the function. I can't think of a way of avoiding that with "definition time", other than making the default values immutable.

f := function(list := [])
  Add(list, 1);
  return list;
end;
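For comparison, this is the Python behaviour being referred to, together with the usual workaround. It is a small sketch; the function names are made up for illustration.

```python
# Python evaluates the default once, so every call that omits the
# argument shares (and mutates) the same list object.
def shared_default(items=[]):
    items.append(1)
    return items


# The usual workaround: use None as a marker and build a fresh list
# inside the body, i.e. emulate call-time evaluation by hand.
def fresh_default(items=None):
    if items is None:
        items = []
    items.append(1)
    return items


first = shared_default()
second = shared_default()   # same object as 'first', now holding two entries

fresh_a = fresh_default()
fresh_b = fresh_default()   # a different list each time
```

With call-time evaluation of defaults, the workaround in `fresh_default` would not be needed.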

@wilfwilson
Member

I like this feature suggestion.

Evaluation at the time of definition feels more GAP-ish to me, i.e. if someone told me that this new feature existed, but I didn't ask for or read any details about it, then I would expect it to work this way.

But ultimately, I think I'm with Chris, and I'd prefer evaluation at the time of execution, and find it more useful.

Also, if I'm not mistaken, you can use the execution-time version to get the 'advantages' of the definition-time version (although it might be 'ugly'), i.e.

global_variable := Factorial(30);
function ( x, y := global_variable )

and

global_string := "";
f := function(x := global_string) return x; end;

@fingolfin
Member Author

Yeah, call time would be more useful and arguably closer to what people expect.

To implement either mode, we need a way to read the expression into an independent function context (i.e., not into a closure of the to-be-read function, or any surrounding code); this may be a bit tricky to implement... we'd wrap the expression to be read into a fake function, vaguely similar to what ReadEvalFile does (which implements ReadAsFunction, and really should be renamed accordingly).... Perhaps it could be as simple as something roughly like this:

Obj ReadDefaultValueFunc(... Int nargs, Obj nams, ...)
{
...
    Bag oldLVars = SWITCH_TO_BOTTOM_LVARS();
    memcpy(&oldIntr, &rs->intr, sizeof(rs->intr));
    memset(&rs->intr, 0, sizeof(rs->intr));
    IntrBegin(&rs->intr);

    /* fake the 'function ()'                                              */
    IntrFuncExprBegin(&rs->intr, 0, 0, nams,
                      GetInputLineNumber(rs->s.input));

    // only read a single expression
    ReadExpr(rs, S_COMMA | S_EOF, 'x');

    /* fake the 'end;'                                                     */
    TRY_IF_NO_ERROR {
        IntrFuncExprEnd(&rs->intr, nr);
    }
    CATCH_ERROR {
        IntrAbortCoding(&rs->intr);
    }

    /* end the interpreter                                                 */
    type = IntrEnd(&rs->intr, rs->s.NrError > 0, evalResult);

    // restore the execution environment
    SWITCH_TO_OLD_LVARS(oldLVars);

...
}

This returns a T_FUNCTION object containing the "default value" function.

These then could be stored somewhere, perhaps in the values list of the BodyHeader.

Then we need a way to signal "this function can take N to M arguments" or "... take N to M arguments, or more" (for variadic functions with default arguments), which would be different from plain variadics; we need this for execution (or at least it'll make things a bit easier in my mind), and for printing the function. Here, I'll assume that if the function takes N to M arguments, then there are M-N "default value functions", and those are stored as the first M-N entries in the values list. So now printing the function becomes doable.
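To make the bookkeeping concrete, here is a small Python model of a function that takes N to M arguments and keeps its M-N default value functions at the front of a values list. It is only a sketch of the idea under that stated assumption; the class `FuncWithDefaults` and its fields are hypothetical and do not correspond to actual kernel structures.

```python
class FuncWithDefaults:
    """Model of a function accepting between nmin and nmax arguments."""

    def __init__(self, body, nmin, nmax, values):
        # Assumption from the text: the first nmax - nmin entries of
        # 'values' are the default value functions, in argument order.
        if len(values) < nmax - nmin:
            raise ValueError("not enough default value functions")
        self.body = body
        self.nmin = nmin
        self.nmax = nmax
        self.values = values

    def __call__(self, *args):
        if not (self.nmin <= len(args) <= self.nmax):
            raise TypeError(
                f"expected {self.nmin} to {self.nmax} arguments, "
                f"got {len(args)}"
            )
        filled = list(args)
        # Argument position i (0-based, i >= nmin) is served by helper
        # number i - nmin, which receives all preceding arguments.
        for i in range(len(args), self.nmax):
            helper = self.values[i - self.nmin]
            filled.append(helper(*filled))
        return self.body(*filled)


# Models: function ( x, y := x+1, z := x+y ) return [x, y, z]; end
g = FuncWithDefaults(
    body=lambda x, y, z: [x, y, z],
    nmin=1,
    nmax=3,
    values=[lambda x: x + 1, lambda x, y: x + y],
)
```

Here `g(1)` fills in both defaults, `g(1, 10)` fills in only the last one, and `g()` is rejected because fewer than N arguments were given.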

For execution, we could have special variants of handlers like DoExecFunc2args: those handlers would inspect the function bag to determine how many, if any, arguments need to be filled by executing the relevant "default value function". (I guess error messages in there might have a slightly funky backtrace, but that could probably also be dealt with).

Aaaanyway, I already spent far too much time thinking about this -- I really shouldn't, I have far more pressing things to take care of. But if some enterprising soul is interested in taking this up, these notes might help.
