
proposal: testing: add Keep, to force evaluation in benchmarks #61179

Description

Benchmarks frequently need to prevent compiler optimizations that would otherwise eliminate parts of the code the programmer intends to benchmark. Usually this comes up in two situations where the benchmark’s use of an API is slightly artificial compared to “real” use of that API. The following example comes from @davecheney's 2013 blog post, How to write benchmarks in Go, and demonstrates both issues:

func BenchmarkFib10(b *testing.B) {
	// run the Fib function b.N times
	for n := 0; n < b.N; n++ {
		Fib(10)
	}
}
  1. Most commonly, the result of the function under test is not used, because we only care about the call’s timing. In the example, since Fib is a pure function, the compiler could optimize away the call completely. Indeed, in “real” code, the compiler would often be expected to do exactly this. But in benchmark code we care only about how long the call takes, and this optimization would destroy that measurement.

  2. An argument to the function under test may be unintentionally constant-folded into the function. In the example, even if we addressed the first issue, the compiler may compute Fib(10) entirely at compile time, again destroying the benchmark. This is more subtle because sometimes the intent is to benchmark a function with a particular constant-valued argument, and sometimes the constant argument is simply a placeholder.

There are ways around both of these, but they are difficult to use and tend to introduce overhead into the benchmark loop. For example, a common workaround is to add the result of the call to an accumulator. However, there’s not always a convenient accumulator type, this introduces some overhead into the loop, and the benchmark must then somehow ensure the accumulator itself doesn’t get optimized away.
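For illustration, here is a minimal sketch of that workaround applied to the Fib example; the package-level sink variable is a common convention I'm assuming here, not part of any API:

var fibSink int // package-level sink; illustrative, not part of any API

func BenchmarkFib10Sink(b *testing.B) {
	for n := 0; n < b.N; n++ {
		// Accumulating into fibSink keeps the call alive, but the
		// add itself contributes to the measured time, and the
		// constant argument 10 can still be folded into the call.
		fibSink += Fib(10)
	}
}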

In both cases, these optimizations can be partial, where part of the function under test is optimized away and part isn’t, as demonstrated in @eliben’s example. This is particularly subtle because it leads to timings that are incorrect but also not obviously wrong.

Proposal

I propose we add the following function to the testing package:

package testing

// Keep returns its argument. It ensures that its argument and result
// will be evaluated at run time and treated as non-constant.
// This is for use in benchmarks to prevent undesired compiler optimizations.
func Keep[T any](v T) T

(This proposal is an expanded and tweaked version of @randall77’s comment.)

The Keep function can be used on the result of a function under test, on arguments, or even on the function itself. Using Keep, the corrected version of the example would be:

func BenchmarkFib10(b *testing.B) {
	// run the Fib function b.N times
	for n := 0; n < b.N; n++ {
		testing.Keep(Fib(testing.Keep(10)))
	}
}

(Or testing.Keep(Fib)(10), but this is subtle enough that I don’t think we should recommend this usage.)

Unlike various other solutions, Keep also lets the benchmark author choose whether to treat an argument as constant or not, making it possible to benchmark expected constant folding.
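As a sketch of that choice (the benchmark names are illustrative, and Keep is the proposed API, not an existing one), the two variants might look like this:

// Measure Fib with an argument the compiler must treat as unknown.
func BenchmarkFibRuntimeArg(b *testing.B) {
	for n := 0; n < b.N; n++ {
		testing.Keep(Fib(testing.Keep(10)))
	}
}

// Measure Fib with a genuinely constant argument: leaving 10 bare
// deliberately permits constant-based optimization of the call,
// while Keep on the result still prevents dead-code elimination.
func BenchmarkFibConstArg(b *testing.B) {
	for n := 0; n < b.N; n++ {
		testing.Keep(Fib(10))
	}
}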

Alternatives

  • Keep may not be the best name. This is essentially equivalent to Rust’s black_box, and we could call it testing.BlackBox. Other options include Opaque, NoOpt, Used, and Sink.

  • testing: document best practices for avoiding compiler optimizations in benchmarks #27400 asks for documentation of best practices for avoiding unwanted optimization. While we could document workarounds, the basic problem is that Go doesn’t currently have a good way to write benchmarks that don’t run afoul of compiler optimizations.

  • proposal: testing: a less error-prone API for benchmark iteration #48768 proposes testing.Iterate, which forces evaluation of all arguments and results of a function, in addition to abstracting away the b.N loop, which is another common benchmarking mistake. However, its heavy use of reflection would be difficult to make zero or even low overhead, and it lacks static type-safety. It also seems likely that users would often just pass a func() with the body of the benchmark, negating its benefits for argument and result evaluation.

  • runtime.KeepAlive can be used to force evaluation of the result of a function under test. However, this isn’t the intended use and it’s not clear how this might interact with future optimizations to KeepAlive. It also can’t be used for arguments because it doesn’t return anything. @cespare has some arguments against KeepAlive in this comment.
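For comparison, a minimal sketch of the KeepAlive workaround (the benchmark name is illustrative, and as noted above this is not KeepAlive’s intended use):

func BenchmarkFib10KeepAlive(b *testing.B) {
	for n := 0; n < b.N; n++ {
		// runtime.KeepAlive forces r to be materialized, so the call
		// is not eliminated as dead code; but because KeepAlive
		// returns nothing, it cannot hide the constant argument 10
		// from the compiler.
		r := Fib(10)
		runtime.KeepAlive(r)
	}
}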
