ranger: Improve intsRanger performance #202

matheusd · 2022-12-12T13:05:54Z

This uses an indirection trick to avoid an allocation in the intsRanger.Range call, significantly improving its performance during iteration.

The trick involves calling reflect.ValueOf on the address of the field instead of using its concrete value. This avoids an allocation performed by the Go runtime to convert the int value to an interface value.
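A minimal sketch of the difference, not the actual Jet code: the `counter` type, its two methods, and `sink` are hypothetical names introduced only for illustration. It uses `testing.AllocsPerRun` to count allocations for each approach. The counter starts at 1000 on the assumption that the runtime may box very small integers without allocating; the exact by-value figure can vary with Go version and escape analysis.

```go
package main

import (
	"fmt"
	"reflect"
	"testing"
)

// counter is a hypothetical stand-in for a ranger with an int field.
type counter struct {
	i int
}

// byValue passes the int itself to reflect.ValueOf. The int must first be
// converted to an interface value, which may require a heap allocation.
func (c *counter) byValue() reflect.Value {
	c.i++
	return reflect.ValueOf(c.i)
}

// byAddr passes a pointer to the field instead. A pointer is stored in the
// interface directly, and Elem() yields a Value that refers to the field.
func (c *counter) byAddr() reflect.Value {
	c.i++
	return reflect.ValueOf(&c.i).Elem()
}

// sink keeps the results alive so the calls are not optimized away.
var sink reflect.Value

func main() {
	c := &counter{i: 1000}
	valAllocs := testing.AllocsPerRun(1000, func() { sink = c.byValue() })
	addrAllocs := testing.AllocsPerRun(1000, func() { sink = c.byAddr() })
	fmt.Println("by value allocs/op:  ", valAllocs)
	fmt.Println("by address allocs/op:", addrAllocs)
}
```

Note that the Value returned by `byAddr` aliases the field: it reads whatever the field holds at the time it is inspected, not a snapshot.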

In order to use this trick, the interpretation of the 'i' field needed to be changed from "next index" to "previous index" and its initialization needed to be offset by -1.
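The reason for the changed interpretation can be sketched as follows. This is a simplified, hypothetical ranger (`intRange`, `newIntRange`, and `Next` are illustrative names, and the half-open range `[from, to)` is an assumption of the sketch, not necessarily Jet's convention). Because the returned Value aliases the field, the field has to be advanced before the Value is handed out; advancing it afterwards would change what the caller sees. That makes the stored field the previous index between calls, so it starts at `from - 1`.

```go
package main

import (
	"fmt"
	"reflect"
)

// intRange is a hypothetical, simplified ranger over [from, to).
type intRange struct {
	i  int // previous index; the first call to Next advances it to from
	to int
}

// newIntRange offsets the initial index by -1 so that the first
// increment in Next lands on from.
func newIntRange(from, to int) *intRange {
	return &intRange{i: from - 1, to: to}
}

// Next advances the index and returns a Value that aliases r.i.
// The returned Value is only meaningful until the next call to Next.
func (r *intRange) Next() (v reflect.Value, done bool) {
	if r.i+1 >= r.to {
		return reflect.Value{}, true
	}
	r.i++ // increment first, then expose the field's address
	return reflect.ValueOf(&r.i).Elem(), false
}

func main() {
	r := newIntRange(3, 6)
	for {
		v, done := r.Next()
		if done {
			break
		}
		fmt.Println(v.Int()) // prints 3, 4, 5 on separate lines
	}
}
```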

The impact of this change on the relevant benchmark is as follows:

name          old time/op    new time/op    delta
IntsRanger-4    61.0ns ± 2%    23.1ns ± 1%   -62.12%  (p=0.000 n=10+9)

name          old alloc/op   new alloc/op   delta
IntsRanger-4     15.0B ± 0%      0.0B       -100.00%  (p=0.000 n=9+10)

name          old allocs/op  new allocs/op  delta
IntsRanger-4      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)

Additionally, this PR adds a benchmark to measure custom renderer performance.

It's now possible to execute templates with guaranteed zero memory allocations (during execution) by using a combination of int and custom rangers and either simple field access or custom renderers.



sauerbraten · 2022-12-15T21:28:22Z

Very cool, I think the new intsRanger code is still pretty easy to read, so am definitely gonna merge this. Thank you!


matheusd added 3 commits December 12, 2022 09:53

eval/test: Add ints ranger benchmark

80439ac

eval/test: Add custom renderer benchmark

6d68eea

sauerbraten reviewed Dec 15, 2022


docs/syntax.md (review thread resolved)

matheusd added 2 commits December 16, 2022 09:04

docs: Remove TODO from ranger section

b86ea01

eval/test: Add benchmarks for fn and pipeline execution

2544d9a

This demonstrates function calling is cheaper than pipeline execution in the current implementation.

matheusd force-pushed the speedup2 branch from a56a8e1 to 2544d9a December 16, 2022 12:04

sauerbraten merged commit 41ef507 into CloudyKit:master Dec 16, 2022

matheusd deleted the speedup2 branch December 16, 2022 15:40
