
ranger: Improve intsRanger performance #202

Merged
merged 5 commits into from
Dec 16, 2022

Conversation

matheusd
Contributor

@matheusd matheusd commented Dec 12, 2022

This uses an indirection trick to avoid an allocation in each intsRanger.Range call, significantly improving its performance during iteration.

The trick involves calling reflect.ValueOf on the address of the field instead of using its concrete value. This avoids an allocation performed by the Go runtime to convert the int value to an interface value.

In order to use this trick, the interpretation of the 'i' field needed to be changed from "next index" to "previous index" and its initialization needed to be offset by -1.

Following is the impact of this change in the relevant benchmark:

name          old time/op    new time/op    delta
IntsRanger-4    61.0ns ± 2%    23.1ns ± 1%   -62.12%  (p=0.000 n=10+9)

name          old alloc/op   new alloc/op   delta
IntsRanger-4     15.0B ± 0%      0.0B       -100.00%  (p=0.000 n=9+10)

name          old allocs/op  new allocs/op  delta
IntsRanger-4      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)

Additionally, it adds a benchmark to assert custom renderer performance.

It's now possible to execute templates with guaranteed zero memory allocations (during execution) by using a combination of int and custom rangers and either simple field access or custom renderers.

@sauerbraten
Collaborator

Very cool, I think the new intsRanger code is still pretty easy to read, so am definitely gonna merge this. Thank you!

This demonstrates function calling is cheaper than pipeline execution in
the current implementation.
@sauerbraten sauerbraten merged commit 41ef507 into CloudyKit:master Dec 16, 2022
@matheusd matheusd deleted the speedup2 branch December 16, 2022 15:40