Description
A common convention among Go programmers is to avoid returning interfaces, because doing so always forces a pointer to escape, and thus triggers a heap allocation if the pointer didn't come from an argument. In fact, there's a linter to stop people from hitting themselves with this: https://github.com/butuzov/ireturn.
It also means that if the function is not inlined, Go will not devirtualize interface calls. For example:
package x

type i interface{ f() }

type k int

func (*k) f() {}

//go:noinline
func x() i {
	return new(k)
}

func y() {
	x().f()
}
Go emits the following:
TEXT .y(SB), ABIInternal, $16-0
CMPQ SP, 16(R14)
PCDATA $0, $-2
JLS ...
PUSHQ BP
MOVQ SP, BP
SUBQ $8, SP
CALL .x(SB)
MOVQ 24(AX), AX
MOVQ AX, CX
MOVQ BX, AX
CALL CX
ADDQ $8, SP
POPQ BP
RET
However, by inspection, we can tell that CALL CX will always call (*k).f. Go simply does not pipe the information needed to devirtualize this call.
There is a relatively simple optimization opportunity here. There are two cases of interest:
- A function that returns an interface, where, within that function, it is possible to devirtualize every return statement's argument for that return value into the same concrete type.
- The above, but nil is also possible.
We could rewrite the above example as follows:
// x-devirt stands for a compiler-generated symbol; the name is not a legal Go identifier.
func x-devirt() *k {
	return new(k)
}

func x() i { // Kept only for converting to a function pointer.
	return x-devirt()
}

func y() {
	x-devirt().f()
}
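The rewrite can be approximated by hand in ordinary Go. The following is a minimal sketch, not what the compiler would emit: xDevirt is a hypothetical, hand-written stand-in for the generated x-devirt symbol, f is given an observable side effect so that the two call paths can be compared, and yViaInterface is an extra helper added only for that comparison.

```go
package main

import "fmt"

type i interface{ f() }

type k int

// f increments the receiver so that a call is observable (an assumption
// made for this sketch; the original f is empty).
func (p *k) f() { *p++ }

// xDevirt is a hand-written stand-in for the compiler-generated variant
// that returns the concrete type instead of the interface.
//
//go:noinline
func xDevirt() *k {
	return new(k)
}

// x keeps the original signature and only wraps the concrete result.
func x() i {
	return xDevirt()
}

// y calls through the concrete type, so (*k).f is a static call.
func y() k {
	p := xDevirt()
	p.f()
	return *p
}

// yViaInterface calls through the interface, as the original y does.
func yViaInterface() k {
	v := x()
	v.f()
	return *v.(*k)
}

func main() {
	fmt.Println(y(), yViaInterface()) // prints: 1 1
}
```

Both paths observe the same behavior; the difference is only in whether the call to f goes through the itab.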
This is essentially the optimization realized by Rust's -> impl Trait syntax, which requires that the function return precisely one concrete type, as if the return type were not a trait, while callers only see the trait. The version I suggest doesn't require changing language semantics, but probably opens up the usual optimization opportunities you get out of devirtualization.
Ideally the devirtualized return type would be advertised to callers directly to aid in devirtualization of their own return values, e.g. scribbled somewhere in ir.Func.
This can be easily extended to functions that return either a concrete value or nil by also returning a bool indicating whether nil was returned. Given that this is a relatively common case, it feels worthwhile to try to address it, too.
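A hand-written sketch of the nil-aware variant might look like the following. The names xDevirt and the fail parameter are hypothetical and exist only to drive both branches. The bool matters because a nil *k stored in an interface is not a nil interface, so the wrapper has to return an untyped nil explicitly.

```go
package main

import "fmt"

type i interface{ f() }

type k int

func (*k) f() {}

// xDevirt is a hypothetical devirtualized variant: it returns the concrete
// pointer plus a flag that is false when the original returned a nil interface.
//
//go:noinline
func xDevirt(fail bool) (*k, bool) {
	if fail {
		return nil, false
	}
	return new(k), true
}

// x keeps the original signature. It must not return a typed (*k)(nil),
// because callers comparing the result against nil would see non-nil.
func x(fail bool) i {
	p, ok := xDevirt(fail)
	if !ok {
		return nil
	}
	return p
}

// y is the devirtualized form of: if v := x(fail); v != nil { v.f() }.
// It reports whether f was called.
func y(fail bool) bool {
	p, ok := xDevirt(fail)
	if !ok {
		return false
	}
	p.f() // static call to (*k).f
	return true
}

func main() {
	fmt.Println(x(true) == nil, x(false) == nil) // prints: true false
	fmt.Println(y(true), y(false))               // prints: false true
}
```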
Of course, once the optimization actually works it may be worthwhile to investigate places in the standard library where multiple unexported types are returned that could return just one, and take advantage of this optimization. For example, fmt.Errorf returns three different concrete types, but could really get away with one.
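The fmt.Errorf claim can be checked with a small program that prints the dynamic type of the result for zero, one, and two %w verbs. This is a sketch that assumes Go 1.20 or later, where multiple %w verbs are supported; errorfTypes is a helper name made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// errorfTypes returns the dynamic type names of the errors that fmt.Errorf
// produces for format strings with zero, one, and two %w verbs.
func errorfTypes() []string {
	base := errors.New("base")
	return []string{
		fmt.Sprintf("%T", fmt.Errorf("plain")),
		fmt.Sprintf("%T", fmt.Errorf("one: %w", base)),
		fmt.Sprintf("%T", fmt.Errorf("two: %w, %w", base, base)),
	}
}

func main() {
	for _, name := range errorfTypes() {
		fmt.Println(name)
	}
}
```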
Note that this is necessary but not sufficient to eliminate the allocation penalty of x above. It would also take a separate optimization that changes the ABI so that, when a value is returned by pointer, the caller passes in its own memory, possibly on the stack if the return value does not escape. This probably also requires making escape analysis track more information. (The C++ ABI on x86_64-unknown-linux does something like this for returning large types by value; the caller allocates space for them and passes its address to the callee, rather than the callee returning them in registers.)
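At the source level, the caller-provided-memory idea resembles the pattern below. This is only an analogy for the ABI change, not a description of how the compiler would implement it: xInto is a hypothetical name, and f is again given a side effect so the result is observable. Whether slot actually stays on the stack depends on what escape analysis can prove.

```go
package main

import "fmt"

type k int

// f increments the receiver so that a call is observable.
func (p *k) f() { *p++ }

// xInto is a hypothetical variant of x that initializes storage owned by
// the caller instead of allocating its own.
//
//go:noinline
func xInto(dst *k) *k {
	*dst = 0
	return dst
}

// y owns the storage for the returned value. If escape analysis can prove
// that slot does not escape, no heap allocation is needed.
func y() k {
	var slot k
	p := xInto(&slot)
	p.f()
	return slot
}

func main() {
	fmt.Println(y()) // prints: 1
}
```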