perf: use faster allocators #2874

baszalmstra · 2025-01-10T10:06:58Z

I stole this approach from uv. Apparently using specialized allocators can have a significant performance benefit. However, it also adds additional compilation overhead that we dont really need when compiling locally therefor we use this clever trick with features to enable the allocators conditionally.

You can build a "performance" build with cargo build --release --features performance. This should allow you to benchmark this locally.

ruben-arts · 2025-01-10T13:21:38Z

I've tested this on M4 PRO, whith the holoviews repo and the following command

PIXI_CACHE_DIR=/tmp/bench_pixi hyperfine -p "pixi clean && rm -f pixi.lock" "main-pixi install -vv" "fast-pixi install" -w 2 -r 5 --show-output

It was inconclusive. running it twice give both as fastest.

We should test it on a linux musl build.

Hofer-Julian · 2025-01-10T16:22:30Z

@ruben-arts and I stumbled over #2878 while testing this PR

ruben-arts · 2025-01-13T10:23:29Z

Huge improvement after building correctly:

❯ hyperfine -p "rm pixi.lock -f" "pixi-perf-musl list --no-install" "pixi list --no-install" -r 3 -w 1
# THIS PR WITH `--feature performance`
Benchmark 1: pixi-perf-musl list --no-install
  Time (mean ± σ):     10.475 s ±  0.296 s    [User: 121.102 s, System: 10.084 s]
  Range (min … max):   10.260 s … 10.813 s    3 runs
# THIS PR WITHOUT THE `--feature performance`
Benchmark 2: pixi list --no-install
  Time (mean ± σ):     132.517 s ±  3.027 s    [User: 517.171 s, System: 1443.620 s]
  Range (min … max):   129.974 s … 135.865 s    3 runs
 
Summary
  pixi-perf-musl list --no-install ran
   12.65 ± 0.46 times faster than pixi list --no-install

baszalmstra · 2025-01-13T10:36:41Z

Lucky find I guess!

Hofer-Julian · 2025-01-13T10:42:50Z

A few things that would be good to figure out:

Does it actually increase compile time, if not we can do it without the "performance" feature gate
- Check Windows
- Check macOS
Does it increase the binary size, if yes, we should only enable it on platforms where it actually speeds things up
- Check windows
- Check macOS

baszalmstra · 2025-01-13T13:49:35Z

I took the approach from uv where the explicitly gate this to reduce the compile time. It introduces a number of heavy dependencies.

With my simple benchmark it speeds things up on Windows, but just marginally. Unless the binary size increases a lot I say we leave it as is.

wolfv · 2025-01-13T14:15:34Z

I compared artifacts produced by different CI pipelines and they are pretty similar in size. But I am not sure if the artifact produces is using the feature.

baszalmstra · 2025-01-13T14:18:18Z

Only the artifact produced by dist is using the performance features atm.

baszalmstra · 2025-01-13T16:46:30Z

Just ran the same benchmark on my windows machine and compared it also to a few older versions. The results are actually quite insane..

❯ hyperfine --shell nu --prepare 'rm pixi.lock -f' 'pixi-0.39.4 list --no-install' 'pixi-0.39.5 list --no-install' 'pixi-0.40.0 list --no-install' 'pixi-0.40.0-performance list --no-install' -r 3 -w 1
Benchmark 1: pixi-0.39.4 list --no-install
  Time (mean ± σ):     13.292 s ±  0.282 s    [User: 112.317 s, System: 17.124 s]
  Range (min … max):   13.051 s … 13.602 s    3 runs

Benchmark 2: pixi-0.39.5 list --no-install
  Time (mean ± σ):     10.560 s ±  0.380 s    [User: 80.650 s, System: 7.936 s]
  Range (min … max):   10.168 s … 10.926 s    3 runs

Benchmark 3: pixi-0.40.0 list --no-install
  Time (mean ± σ):     10.481 s ±  0.257 s    [User: 80.286 s, System: 7.691 s]
  Range (min … max):   10.202 s … 10.707 s    3 runs

Benchmark 4: pixi-0.40.0-performance list --no-install
  Time (mean ± σ):      6.330 s ±  0.236 s    [User: 62.807 s, System: 7.415 s]
  Range (min … max):    6.188 s …  6.602 s    3 runs

Summary
  pixi-0.40.0-performance list --no-install ran
    1.66 ± 0.07 times faster than pixi-0.40.0 list --no-install
    1.67 ± 0.09 times faster than pixi-0.39.5 list --no-install
    2.10 ± 0.09 times faster than pixi-0.39.4 list --no-install

I ran this on the holoviews repository.

Hofer-Julian

Would be interesting if macOS also improves.

Independently of that, I think we can bring this in.

Let's remember to compare binary sizes after the next release.

ruben-arts · 2025-01-14T08:57:30Z

With the caveat that I got a crazy outlier on the run before this it doesn't seem to do much on macos

hyperfine -p "rm -f pixi.lock" "pixi-perf list --no-install" "pixi-0.40.0 list --no-install" -w 1 -r 3 
Benchmark 1: pixi-perf list --no-install
  Time (mean ± σ):      9.825 s ±  0.424 s    [User: 38.959 s, System: 8.087 s]
  Range (min … max):    9.368 s … 10.206 s    3 runs
 
Benchmark 2: pixi-0.40.0 list --no-install
  Time (mean ± σ):      9.905 s ±  1.550 s    [User: 42.261 s, System: 4.414 s]
  Range (min … max):    8.146 s … 11.072 s    3 runs
 
Summary
  pixi-perf list --no-install ran
    1.01 ± 0.16 times faster than pixi-0.40.0 list --no-install

perf: use faster allocators

b918f25

baszalmstra requested review from wolfv, ruben-arts and Hofer-Julian January 10, 2025 10:06

fmt

fac49c4

Hofer-Julian mentioned this pull request Jan 13, 2025

Downloading of repodata with linux-musl pixi builds can be very slow #2878

Open

2 tasks

Merge remote-tracking branch 'upstream/main' into perf/allocators

93d007c

Hofer-Julian approved these changes Jan 14, 2025

View reviewed changes

Merge branch 'main' into perf/allocators

fd7d002

ruben-arts enabled auto-merge (squash) January 14, 2025 09:02

ruben-arts approved these changes Jan 14, 2025

View reviewed changes

ruben-arts merged commit 9e371c4 into prefix-dev:main Jan 14, 2025
28 checks passed

BrewTestBot mentioned this pull request Jan 14, 2025

pixi 0.40.1 Homebrew/homebrew-core#204237

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: use faster allocators #2874

perf: use faster allocators #2874

baszalmstra commented Jan 10, 2025 •

edited

Loading

ruben-arts commented Jan 10, 2025

Hofer-Julian commented Jan 10, 2025

ruben-arts commented Jan 13, 2025

baszalmstra commented Jan 13, 2025

Hofer-Julian commented Jan 13, 2025

baszalmstra commented Jan 13, 2025

wolfv commented Jan 13, 2025

baszalmstra commented Jan 13, 2025

baszalmstra commented Jan 13, 2025 •

edited

Loading

Hofer-Julian left a comment

ruben-arts commented Jan 14, 2025

perf: use faster allocators #2874

perf: use faster allocators #2874

Conversation

baszalmstra commented Jan 10, 2025 • edited Loading

ruben-arts commented Jan 10, 2025

Hofer-Julian commented Jan 10, 2025

ruben-arts commented Jan 13, 2025

baszalmstra commented Jan 13, 2025

Hofer-Julian commented Jan 13, 2025

baszalmstra commented Jan 13, 2025

wolfv commented Jan 13, 2025

baszalmstra commented Jan 13, 2025

baszalmstra commented Jan 13, 2025 • edited Loading

Hofer-Julian left a comment

Choose a reason for hiding this comment

ruben-arts commented Jan 14, 2025

baszalmstra commented Jan 10, 2025 •

edited

Loading

baszalmstra commented Jan 13, 2025 •

edited

Loading