Reduce variability in benchmarks via deterministic allocation #89

Open
overlookmotel opened this issue Aug 15, 2024 · 1 comment

@overlookmotel

oxc-project/oxc#4483 was an experiment in reducing variance in our benchmarks by using an allocator which has deterministic behavior.

The experiment was in one sense a success - it reduced variance in benchmarks almost to zero. That suggests nondeterminism in the system allocator is the largest cause of benchmark variance. In fact, I wonder why it didn't reduce variance to absolutely zero - what else could be causing variance?

But there is a problem: the simple allocator I used, talc, is too fast. This makes allocation unrealistically cheap, which in turn makes our benchmarks unrealistic.

Let's say we introduce a change that removes a bunch of allocations, but requires some extra work to do that (caching structs, bookkeeping etc). That is very likely to be a performance gain in the real world, but if the allocator we use for benchmarks makes allocation unrealistically cheap (as this one does), benchmarks will lie to us and tell us it's a perf regression. For example, with this allocator, benchmarks probably would have told us oxc-project/oxc#4213 was a perf regression, whereas in fact it gave a 5% speed-up.
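To make that concrete, here is a toy cost model. It is only a sketch: `Workload`, `compare_allocators` and every number in it are hypothetical and were not measured on oxc. It shows how the same change can flip from a speed-up to a regression purely because the benchmark allocator's per-allocation cost is lower.

```rust
/// Hypothetical workload: every number here is invented for illustration.
#[derive(Clone, Copy)]
struct Workload {
    /// Number of heap allocations performed.
    allocations: u64,
    /// Time spent on everything that is not allocation, in nanoseconds.
    other_work_ns: f64,
}

/// Total run time under a simple linear model: each allocation costs
/// `ns_per_alloc`, and all other work has a fixed cost.
fn total_ns(w: Workload, ns_per_alloc: f64) -> f64 {
    w.allocations as f64 * ns_per_alloc + w.other_work_ns
}

/// Percentage of run time saved by going from `before` to `after`.
/// Positive means faster, negative means a regression.
fn speedup_percent(before: Workload, after: Workload, ns_per_alloc: f64) -> f64 {
    let b = total_ns(before, ns_per_alloc);
    let a = total_ns(after, ns_per_alloc);
    (b - a) / b * 100.0
}

/// Evaluates one change under two assumed allocator costs.
/// Returns `(result with a realistic allocator, result with a too-cheap one)`.
fn compare_allocators() -> (f64, f64) {
    // The change halves the allocations but adds 10_000 ns of bookkeeping.
    let before = Workload { allocations: 1_000, other_work_ns: 100_000.0 };
    let after = Workload { allocations: 500, other_work_ns: 110_000.0 };

    // Assumed costs: 50 ns per allocation for a "realistic" allocator,
    // 10 ns for an unrealistically cheap one.
    let realistic = speedup_percent(before, after, 50.0);
    let too_cheap = speedup_percent(before, after, 10.0);
    (realistic, too_cheap)
}
```

With these made-up numbers the change is a win under the 50 ns allocator and a loss under the 10 ns one, which is exactly the kind of false signal described above.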

NB: Allocations are a small part of the code overall. So if we're seeing a 10% perf boost on some benchmarks from replacing the allocator, that probably means the new allocator is ~double the speed of the system one. That's a very big discrepancy.
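The "~double" estimate can be checked with an Amdahl-style calculation. This is a sketch under stated assumptions: `implied_allocator_speedup` is a hypothetical helper, "10% perf boost" is read as a 10% reduction in run time, and the share of run time spent allocating is a guess rather than a measurement.

```rust
/// Given the fraction of run time spent in the allocator (`alloc_fraction`)
/// and the observed fractional reduction in total run time (`time_reduction`),
/// returns how many times faster the replacement allocator must be.
///
/// Model: new_time = (1 - f) + f / s, with old_time = 1,
/// so time_reduction = f - f / s, which gives s = f / (f - time_reduction).
///
/// Returns `None` when the inputs are inconsistent: even an infinitely fast
/// allocator cannot save more time than was spent allocating.
fn implied_allocator_speedup(alloc_fraction: f64, time_reduction: f64) -> Option<f64> {
    let valid_fraction = (0.0..=1.0).contains(&alloc_fraction);
    if !valid_fraction || time_reduction < 0.0 || time_reduction >= alloc_fraction {
        return None;
    }
    Some(alloc_fraction / (alloc_fraction - time_reduction))
}
```

Under this model, `implied_allocator_speedup(0.2, 0.1)` comes out at 2, so "~double" corresponds to allocation being roughly 20% of run time (an assumed figure). If allocation is a smaller share than that, the implied allocator speed-up is even larger.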

What we need is an allocator which is as close as possible to real-world allocators (e.g. libc's, or jemallocator) but does not include any random elements.
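For contrast, here is a toy allocator at the opposite extreme: fully deterministic, but far cheaper than anything used in production. It is a sketch only. It is not talc and not what oxc-project/oxc#4483 used; `BumpAlloc`, `ARENA_SIZE` and `record_offsets` are hypothetical names. It implements the standard `GlobalAlloc` trait, so it could be installed with `#[global_allocator]`, but the example calls it directly instead.

```rust
use std::alloc::{GlobalAlloc, Layout};
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

const ARENA_SIZE: usize = 64 * 1024;
const ARENA_ALIGN: usize = 4096;

/// Backing memory, aligned so that aligned offsets yield aligned pointers.
#[repr(align(4096))]
struct Arena(UnsafeCell<[u8; ARENA_SIZE]>);

/// Toy bump allocator: hands out consecutive offsets and never frees.
pub struct BumpAlloc {
    arena: Arena,
    next: AtomicUsize,
}

// Safety: distinct callers only ever receive non-overlapping byte ranges,
// because `next` is advanced atomically.
unsafe impl Sync for BumpAlloc {}

impl BumpAlloc {
    pub const fn new() -> Self {
        Self { arena: Arena(UnsafeCell::new([0; ARENA_SIZE])), next: AtomicUsize::new(0) }
    }

    fn base(&self) -> *mut u8 {
        self.arena.0.get() as *mut u8
    }
}

unsafe impl GlobalAlloc for BumpAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if layout.align() > ARENA_ALIGN {
            return std::ptr::null_mut();
        }
        let mut current = self.next.load(Ordering::Relaxed);
        loop {
            // Round the offset up to the requested (power-of-two) alignment.
            let start = (current + layout.align() - 1) & !(layout.align() - 1);
            let end = match start.checked_add(layout.size()) {
                Some(end) if end <= ARENA_SIZE => end,
                _ => return std::ptr::null_mut(), // arena exhausted
            };
            match self.next.compare_exchange_weak(current, end, Ordering::Relaxed, Ordering::Relaxed) {
                Ok(_) => return self.base().wrapping_add(start),
                Err(observed) => current = observed,
            }
        }
    }

    // Freeing is a no-op, which is a large part of why this is unrealistically cheap.
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {}
}

/// Allocates each size (8-byte aligned) from a fresh arena and returns the
/// offset of every allocation relative to the arena start.
fn record_offsets(sizes: &[usize]) -> Vec<usize> {
    let allocator = BumpAlloc::new();
    sizes
        .iter()
        .map(|&size| {
            let layout = Layout::from_size_align(size, 8).expect("valid layout");
            let ptr = unsafe { allocator.alloc(layout) };
            assert!(!ptr.is_null(), "arena exhausted");
            ptr as usize - allocator.base() as usize
        })
        .collect()
}
```

The same sequence of requests always produces the same sequence of offsets, which is the deterministic property we want. But there are no free lists, size classes or fragmentation, so its cost profile looks nothing like libc's or jemalloc's, which is the discrepancy described above.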

We might have more luck with https://crates.io/crates/dlmalloc, which sounds like a closer analogue to the default system allocator (from libc), and so may reduce this discrepancy. But its API isn't as easy to work with.

I also asked for help on CodSpeed Discord but it seems they're not sure how to solve this either.

I'll try to figure this out when I have more time.

@overlookmotel

@MichaReiser mentioned on the CodSpeed Discord that he had some success reducing variance in benchmarks by tweaking some of jemalloc's config:

astral-sh/ruff#13299
