Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: allow self-referencing closures #33167

Closed
pjebs opened this issue Jul 18, 2019 · 60 comments
Closed

proposal: Go 2: allow self-referencing closures #33167

pjebs opened this issue Jul 18, 2019 · 60 comments
Labels
FrozenDueToAge LanguageChange Suggested changes to the Go language Proposal Proposal-FinalCommentPeriod v2 An incompatible library change
Milestone

Comments

@pjebs
Copy link
Contributor

pjebs commented Jul 18, 2019

	inspect := func(chain []ChainModel, insert bool) {

		for i := range chain {
			cm := &chain[i]
			...
			inspect(cm.Parents, insert)   <------- error here
		}
	}

Has to be changed to:

	var inspect func([]ChainModel, bool)
	inspect = func(chain []ChainModel, insert bool) {

		for i := range chain {
			cm := &chain[i]
			...
			inspect(cm.Parents, insert)   <------- allowed
		}
	}
@gopherbot gopherbot added this to the Proposal milestone Jul 18, 2019
@pjebs
Copy link
Contributor Author

pjebs commented Jul 18, 2019

I think for recursive functions, this is a hassle and should be made "as an exception".

I admit it should not be high-priority, if accepted.

@av86743
Copy link

av86743 commented Jul 18, 2019

Even if implemented, this still won't work with mutual recursion.

@earthboundkid
Copy link
Contributor

Block scope rules are already very complicated. I don't think there needs to be a new exception.

@bcmills bcmills added LanguageChange Suggested changes to the Go language v2 An incompatible library change labels Jul 18, 2019
@bcmills
Copy link
Contributor

bcmills commented Jul 18, 2019

This proposal needs more detail. If := brings the declared variables in scope for a function declaration, should it also bring them in scope for other declarations? If not, why not?

Consider the following related examples:

https://play.golang.org/p/qJQRGcCHjAU:

	if err, ok := err.(net.Error); ok {
		fmt.Println(err)
	}

https://play.golang.org/p/yoQSNsSDRJZ:

	for i := range s[:len(s)-1] {
		i := i + 1
		fmt.Printf("%c", s[i])
	}

https://golang.org/doc/faq#closures_and_goroutines:

    for _, v := range values {
        v := v // create a new 'v'.
        go func() {
            fmt.Println(v)
            done <- true
        }()
    }

@bcmills
Copy link
Contributor

bcmills commented Jul 18, 2019

To be clear: I sympathize with the underlying problem. I just think it would be clearer to try to make func […] declarations work within a function body than to try to change the behavior of :=.

func […] {
	[…]
	func inspect(chain []ChainModel, insert bool) {
		for i := range chain {
			cm := &chain[i]
			...
			inspect(cm.Parents, insert)
		}
	}
	[…]
}

@av86743
Copy link

av86743 commented Jul 18, 2019

@bcmills

Without going radically with nested func-s, you could substitute redundant closure type automatically, via trivial AST rewrite or an equivalent, in cases like

func f() {
    // ...
    var g func(int) (bool, error) = func(x int) (bool, error) {
        return x == x, nil
    }
    // ...
}

This simple language extension (relaxation) is easy not only to implement, but also to describe.

Conveniently, const g func ... is failed because (I quote):

...: const initializer func literal is not a constant

Ah well. Variable func literal it is. Cannot make an exact equivalent of nested func in terms of a source language.

@wpalmer
Copy link

wpalmer commented Jul 18, 2019

To be clear: I sympathize with the underlying problem. I just think it would be clearer to try to make func […] declarations work within a function body than to try to change the behavior of :=.

While I think that making func ... declarations work within non-global scopes would be beneficial (perhaps with some rules such as lowercase-only), I believe that is a separate concern and should not pollute this proposal.

Having := impact any scope which "begins" on that statement make so much sense, that I would go so far as to say that any exception to this rule is extremely unintuitive, and arguably goes against the current spec (after all, := does define a variable within the enclosing function)

@earthboundkid
Copy link
Contributor

...: const initializer func literal is not a constant

I think I made an issue to complain about that before. My use case is const MyHandler http.HandlerFunc = func(w http.ResponseWriter, r *http.Request) { ... }

@ianlancetaylor
Copy link
Contributor

All our previous attempts to address this foundered on the need to support mutually recursive functions while preserving a clear declaration scope.

@bcmills
Copy link
Contributor

bcmills commented Jul 18, 2019

@ianlancetaylor, do you happen to have links for previous attempts?

My knee-jerk suggestion is to allow a locally-scoped func declaration to refer to any other func declaration that is not separated by a statement or non-constant expression. (That is: the two func declarations could have any number of intervening func or const declarations, and perhaps var declarations with constant initializers.)

@av86743
Copy link

av86743 commented Jul 18, 2019

@bcmills

the two func declarations could have any number of intervening func or const declarations, and perhaps var declarations with constant initializers.

What's the point in this contrived restriction? Allow to refer from any point in the code to any function in this and all enclosing function bodies, as well as to top-level functions. Like in many well-known languages that entirely liberated themselves from the concept of variable.

@alanfo
Copy link

alanfo commented Jul 18, 2019

I don't know how often people write recursive closures (I wrote a couple earlier this week) though I don't think the current situation is all that bad.

Sure, there's duplication but at least it's clear what's happening and it can cater for mutually recursive functions.

@bcmills
Copy link
Contributor

bcmills commented Jul 18, 2019

What's the point in this contrived restriction?

In general in Go, a variable in a local scope does not exist until its point of declaration. (See https://play.golang.org/p/szHNxwcSjwL.)

It would be confusing to allow a local func to close over a variable that does not yet exist — especially if it does so by invoking a different function that has closed over that variable. What value would the function observe for that variable if invoked before the variable declaration is evaluated? (The answer is especially unclear if the variable has a non-zero initial value.)

@ianlancetaylor
Copy link
Contributor

@bcmills I don't have any links. We discussed it quite a bit in the pre-open-source days.

@av86743
Copy link

av86743 commented Jul 18, 2019

@bcmills

It would be confusing to allow a local func to close over a variable that does not yet exist — especially if it does so by invoking a different function that has closed over that variable. What value would the function observe for that variable if invoked before the variable declaration is evaluated? (The answer is especially unclear if the variable has a non-zero initial value.)

You are mixing up things. Initialization of variables in a given scope is a non-related aspect of language semantics. Each function can access (or closure may be bound to) part of the variables from its body, from enclosing function bodies, and from the top (package) scope. Whereas initialization of all variables in all enclosing scopes is hoisted within corresponding scope (to put it simplistically) and precedes any invocation of function/closure which may use it. It is obvious if you consider allocation and initialization of variables in generated code.

@ianlancetaylor
Copy link
Contributor

@av86743 You're not wrong, but I think you are not considering shadowing. Consider https://play.golang.org/p/SxLl2NwMt9D .

@bcmills
Copy link
Contributor

bcmills commented Jul 18, 2019

@av86743, consider the following example:

	func f1() int {
		return f2()
	}

	x := f1()

	func f2() int {
		return x
	}

Without restrictions on declaration order, that example would be admitted — but now the value of the variable x depends upon itself!

The simple way to break that initialization cycle is to prohibit f1 from calling f2, since the variable x enters the scope in between the two function declarations.

@bcmills
Copy link
Contributor

bcmills commented Jul 18, 2019

Note that the compiler rejects the package-level equivalent of the above example today (https://play.golang.org/p/TVXlSYqrp9H).

But variable declarations within a function today do not have a compiler-determined initialization order as package-level variables do: instead, variable initializers within a function body execute in program order.

I think that changing function-local variables to use the more complex package initialization order would be a mistake: it would make the behavior of functions much more difficult to reason about.

@av86743
Copy link

av86743 commented Jul 18, 2019

@ianlancetaylor

Consider rewriting body of for in your example (or any other kind of scope that I have not mentioned earlier) as an invocation of an anonymous function. With this transformation, you also need to take care of multiple inter-scope level exits, however this is again a separate aspect of language semantics and this is clearly doable.

If this construction seems to be too complicated, in my earlier reply extend the set of scopes with every other kind of scope that is present in the language. This does not affect the line of reasoning: it is all right to use every function/closure in a given scope and enclosing scopes, including those defined later in the corresponding scope.

Granted, the analysis on closure dependencies has to take into account forward references to closures, which may affect hoisting of variable initializations.

@av86743
Copy link

av86743 commented Jul 18, 2019

@bcmills

Without restrictions on declaration order, that example would be admitted — but now the value of the variable x depends upon itself!

Note that the compiler rejects the package-level equivalent of the above example today (https://play.golang.org/p/TVXlSYqrp9H).

Yours is an example of diverging computation. In this case, compiler detected this - nice catch! However, nothing in this example precludes it from being successfully compiled and then executed - indefinitely or perhaps until stack overflows. There are also simpler ways to do the same. And there apparently should be ways to do this in golang today so that compiler will not recognize this - if you are familiar with halting problem.

Compiler fails your example because it, practically, does not make sense. How about something meaningful using forward function/closure references? If this is doable at the top level, same holds for nested scopes as well. Granted, it may not be a fast path - but only in presence of forward references. You will likely need topo sort and cutting cycles of mutual references which will be pretty much a no-op (O(1) if you wish) on fast path.

But variable declarations within a function today do not have a compiler-determined initialization order as package-level variables do: instead, variable initializers within a function body execute in program order.

"... execute in program order" is a simplistic way to describe something which becomes more complicated and more difficult to describe precisely in the presence of forward function/closure references. Nothing, though, that would be contrary to "idiom" that every variable is initialized before it is ever used.

@bcmills
Copy link
Contributor

bcmills commented Jul 18, 2019

@av86743, there are certainly languages in which it makes sense to have loosely-defined initialization and control flow. (Haskell and Prolog in particular come to mind.)

However, Go today is a very mutative, very procedural language. I think applying that style of programming to Go would be too much of an impedance mismatch with the rest of the language.

@av86743
Copy link

av86743 commented Jul 18, 2019

However, Go today is a very mutative, very procedural language. I think applying that style of programming to Go would be too much of an impedance mismatch with the rest of the language.

I support that. Simple and by extension efficient in compilation and execution - that's golang that we want.

@loren-osborn
Copy link

Making the := operator accept function values like this reminds me of how Go handles the untyped-ness of literal constants. A numeric literal can’t have an in-memory representation until it has a type (to give it at least a bit-width, for instance). := should only need the type of the RHS function in order to have the LHS variable declared with its correct type. The “value” of the function needn’t be evaluated until the variable has been declared from the type inference. I think this is at least rationalization that this is not a special case for function values.

I’m interested to hear any opinions on this. Am I off in left field on this? Can the function be parsed properly into an AST, to get type information, then be evaluated and validated in a separate step of compilation?

@ianlancetaylor
Copy link
Contributor

@loren-osborn If I understand you correctly, then you are correct: the function does not need to be evaluated to determine the type of the variable. But that's not the issue here. The issue here is scoping: in a variable declaration, when does the variable itself come into scope? And when doing mutual recursion of function closures, how does scoping work so that the closures can refer to each other?

The current scoping rules are, in my opinion, simple and clear. I don't see how to change them without making them less simple and less clear. And that seems like a bad tradeoff for a case that rarely arises in practice, and that can be done today, when it does arise, in a slightly more awkward manner.

@pjebs
Copy link
Contributor Author

pjebs commented Jul 25, 2019

Let me reframe the perspective of the issue. I can see that people are discussing it from the angle of Go's scope laws that will need to be adjusted to accomodate the proposal.

I wrote that an exception should be made for closures for this reason:

Let's assume you created a closure for calculating the factorial of a number:

var factorial func(x uint) uint = func(x uint) uint {
	if x == 0 {
		return 1
	}
	
	return x * factorial(x-1)  <-------- recursion
}

My proposal is really about expanding the current capabilities of closures to include common mathematical concepts like recursion.

Currently a closure can't be used with recursion. You have to rewrite it as a Func Declaration to make use of recursion.

This is especially the case if you define a closure as a package level variable - where you can't use the "trick" of first declaring a func variable and then defining the func body in the next statement.

You can see with this code snippet: https://play.golang.org/p/GQHtC-2v8Ep

The compiler produces this error:

./prog.go:7:5: initialization loop:
	prog.go:7:5 factorial refers to
	prog.go:7:5 factorial

@earthboundkid
Copy link
Contributor

If it’s at the package level, you can recurse by using a non-variable declaration. I can’t think of why you’d need to write a recursing variable function at package level, but if it comes up, you can use an init func.

@earthboundkid
Copy link
Contributor

Changing the language means everyone who learns Go has to learn the new syntax. I've run into this problem maybe twice in 5 years of writing Go. It was easy enough to solve by just declaring the function before use. It looked a little ugly to have the same function declaration twice in a row, but it was easy to write and not confusing to read. I don't think this problem is worth adding new language features for.

@ianlancetaylor
Copy link
Contributor

If a func literal is set to a variable, that variable is available within the func literal for the function to refer to itself.

It's important to note that this is not backward compatible. If the variable shadows another variable defined in an enclosing block, the reference inside the function literal would silently change from the outer variable to the inner variable. This silent change in language semantics would not be permitted by the language transition rules (https://github.com/golang/proposal/blob/master/design/28221-go2-transitions.md).

@pjebs
Copy link
Contributor Author

pjebs commented Aug 27, 2019

I will amend proposal since you are correct

@pjebs
Copy link
Contributor Author

pjebs commented Aug 27, 2019

Perhaps ^()?

@ianlancetaylor
Copy link
Contributor

The current proposal is not backward compatible. I don't think it has been amended.

@pjebs
Copy link
Contributor Author

pjebs commented Oct 15, 2019

I think ^() would be backward compatible.

z := ^(arg1, arg2)

@ianlancetaylor
Copy link
Contributor

I think that would be backward compatible, but it seems quite obscure.

@pjebs
Copy link
Contributor Author

pjebs commented Oct 15, 2019

^ can be seen as an arrow to "above function"

@rogpeppe
Copy link
Contributor

z = ^(...) is not only obscure, it's not backwardly compatible either, as you can have a single argument and z=^(a) already means bitwise-not of the expression a.

@ianlancetaylor
Copy link
Contributor

We still don't have a clean, backward compatible, way to handle a self-referencing closure. And we don't have any way at all to handle mutually referencing closures, though that may be less important.

At this point we think this is a likely decline. We can reconsider if we see a clean, non-obscure, approach for this.

-- for @golang/proposal-review

@networkimprov
Copy link

Has the Javascript approach been considered? Functions defined within a scope are visible anywhere within it, and variables defined anywhere within that scope are visible to those functions (as if the functions were defined at the end).

func x() {
   wantsCallback(cb)
   func cb() {
      fmt.Println(v)
   }
   v := 1
}

Re initialization loops, the compiler already catches them for top-level scope.

@robpike
Copy link
Contributor

robpike commented Oct 30, 2019

It would be unwise and profoundly incompatible to change Go's scoping rules.

@networkimprov
Copy link

OK, require that any named closure like cb() above be defined at the end of its parent scope.

@ianlancetaylor
Copy link
Contributor

I don't see the need for "variables defined anywhere within that scope are visible to those functions." If we set that aside, I think that the amended proposal would be something like

  • A function may contain an embedded function (not function literal) (this is currently forbidden)
  • Embedded functions are visible within the entire scope of the block in which they are defined (including before they are defined)

This does raise an issue.

    {
        fmt.Println(f())
        var v = 2
        func f() int { return v }
    }

Is the variable v initialized when we call f, even though we have not yet reached the declaration? At package scope we enforce dependency order of initialization, but it's not clear that can or should work within a function.

And of course there is

    {
        fmt.Println(f())
        var v = f() + 1
        func f() int { return v }
    }

@networkimprov
Copy link

The latter can be disallowed as an initialization loop; those are barred in package scope.

We can't run code to initialize closure-ref'd variables not already initialized at closure invocation, because you'd have to allow:

fmt.Println(f())
v := resource_intensive()
func f() int { return v }

Therefore func f() int {...} must either cause implicit var v int at top of scope, or that must be written. Presumably the latter is more Goish.

And that approximates "variables defined anywhere within that scope are visible to those functions" because now you must define them at the top in order to use them :-)

@alanfo
Copy link

alanfo commented Nov 28, 2019

The present approach of using function literals rather than nested functions works fine for non-recursive cases.

Although introducing nested functions would deal better with recursive or mutually recursive cases, this feels like overkill to me for solving something which is a relatively minor problem for most people.

As well as the kind of issues which @ianlancetaylor drew attention to, in non-recursive cases (probably the vast majority) it would create a confusion in people's minds whether to use a function literal or a nested function.

A simpler way of avoiding the present duplication would be to use the 'dotdotdot' approach, where the function literal's parameter and return types could be inferred from its previous declaration:

func F() int {
	var f1, f2 func(i int) int
	f1 = func(...) ... { 
		if i <= 0 {
			return 0
		}
		return i + f2(i - 1)
	}
	f2 = func(...) ... {
		return i + f1(i - 1)
	}
	return f2(1)
}

However, it's not as clear and it would need to be a compile time error if the previous declaration did not include the parameter names as well as the types

My conclusion, frankly, is that it's best to leave matters as they are.

@networkimprov
Copy link

networkimprov commented Nov 29, 2019

There's a greater benefit than eliminating an extra declaration; you can define embedded functions in order of invocation, which makes code much more readable. I use this technique in Javascript often.

func f() {
   if needs_work {
      schedule_work(next_steps) // async 
   } else {
      next_steps()
   }
   func next_steps() {
      ... // continues for many lines
   }
}

@alanfo
Copy link

alanfo commented Nov 30, 2019

There's a greater benefit than eliminating an extra declaration; you can define embedded functions in order of invocation, which makes code much more readable.

I guess whether you find that a benefit or not depends on your programming mindset.

I tend to think like a C programmer and prefer to see functions declared before I call them. To me having to have a separate variable declaration for recursive function literals is analogous to a forward declaration in C and seems perfectly natural.

Even at top level, I always place the main() function last in the file even though order is immaterial in Go.

Having said all that, I have nothing against nested functions as such, I just don't think it's a good idea for a simple language such as Go to support both them and function literals. In a more complicated language such as C#, which does have both, you need to think carefully before opting for one or the other and the choice is not always straightforward. In Go there's nothing to think about - there's only one way to do it.

Also, there must have been a good reason why the Go team opted for function literals in the first place, despite the difficulties with recursivity. I imagine this was because it made the scoping rules simpler though @ianlancetaylor may be able to enlighten us on that.

@networkimprov
Copy link

networkimprov commented Nov 30, 2019

A separate declaration would not help in the example I gave above.

The key case for function literals is callback arguments: sort.Slice(a, func(a, b int)bool {...})

EDIT: Note that there are already two ways to define package functions: var F = func() {...}

@alanfo
Copy link

alanfo commented Dec 2, 2019

Note that there are already two ways to define package functions

True, though the considerations are different at package level - declaration order is immaterial and there are no 'capturing' aspects.

Off the top of my head, the only reason I can think of why one would prefer to use a function literal rather than an 'ordinary' function at package level is if you wanted to assign different functions (of the same type) to the one variable.

Of course, you could do the same thing at function level as well which might be one of the reasons why function literals were preferred to nested functions when the language was being designed.

@earthboundkid
Copy link
Contributor

earthboundkid commented Dec 2, 2019

Another reason to use var F = at package level is so you can set the type to http.HandlerFunc. I opened an issue ( #26351) at some point to allow const F http.HandlerFunc = ... but it never went anywhere.

@ianlancetaylor
Copy link
Contributor

In #33167 (comment) we said

We still don't have a clean, backward compatible, way to handle a self-referencing closure. And we don't have any way at all to handle mutually referencing closures, though that may be less important.

At this point we think this is a likely decline. We can reconsider if we see a clean, non-obscure, approach for this.

There was then further discussion and we reset the clock. But the further discussion still has complex scoping issues, and raises problems with variable initialization ordering.

For these reasons this is again a likely decline. Leaving open for four weeks for final comments.

-- for @golang/proposal-review

@ianlancetaylor
Copy link
Contributor

There were no further comments.

@golang golang locked and limited conversation to collaborators Jan 6, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge LanguageChange Suggested changes to the Go language Proposal Proposal-FinalCommentPeriod v2 An incompatible library change
Projects
None yet
Development

No branches or pull requests