It's hard to be sure how exactly zsh-bench is measuring what it claims to be measuring, by looking at its code.
I'm going to trust a screen recording as the source of truth for the prompt times on my machine. I've recorded zim and zsh4humans, 3 times each. Each time the steps are:
- Sleep 1 second. In this time I type `exit` and ENTER, so it's used as input in the next steps.
- Print the framework name. I consider the counting starts when the output of this print appears in the recording.
- Start a new shell with `HOME` as the specific framework installation dir. I consider the counting ends when the prompt appears.
This translates to: `for f in zim zsh4humans; do repeat 3 do; sleep 1; print ${f}; HOME=${PWD:h}/${f} zsh -li; done; done`
The steps above are executed in a dir with a git repo with 1,000 directories and 10,000 files, set up using the code here and here.
Each framework was installed with the setup script here and here, respectively for zim and zsh4humans.
This should be enough to guarantee that the recordings are using the same scenario used by zsh-bench.
This is the screen recording:
These are the times extracted by checking the recording frames:
| run | begin_frame | prompt_frame | frames | ms |
|---|---|---|---|---|
| zim 1 | 161 | 169 | 9 | 300.000 |
| zim 2 | 200 | 207 | 8 | 266.667 |
| zim 3 | 238 | 246 | 9 | 300.000 |
| zsh4humans 1 | 276 | 282 | 7 | 233.333 |
| zsh4humans 2 | 316 | 321 | 6 | 200.000 |
| zsh4humans 3 | 354 | 359 | 6 | 200.000 |
The recording has 30fps, so each frame represents 33.333ms. We can consider the error of each measurement above to be +/- 33.333ms.
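As a minimal sketch of how the `frames` and `ms` columns follow from the frame numbers, assuming both the begin frame and the prompt frame are counted (which is what the table values imply). The function and variable names are just illustrative:

```python
FPS = 30  # frame rate of the recording, as stated above

# (run, begin_frame, prompt_frame), copied from the table
runs = [
    ("zim 1", 161, 169),
    ("zim 2", 200, 207),
    ("zim 3", 238, 246),
    ("zsh4humans 1", 276, 282),
    ("zsh4humans 2", 316, 321),
    ("zsh4humans 3", 354, 359),
]

def frames_to_ms(begin, prompt, fps=FPS):
    """Return (frame count, duration in ms) for one recorded run."""
    # Both the begin frame and the prompt frame are counted, hence the +1.
    frames = prompt - begin + 1
    return frames, frames * 1000 / fps

for name, begin, prompt in runs:
    frames, ms = frames_to_ms(begin, prompt)
    print(f"{name}: {frames} frames, {ms:.3f} ms")
```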
This is what I get when running zsh-bench on the same machine, 3 times for each of the mentioned frameworks:
❯ ./zsh-bench zim
==> setting up a container for benchmarking ...
==> benchmarking zim ...
creates_tty=0
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=115.790
first_command_lag_ms=149.910
command_lag_ms=62.535
input_lag_ms=28.830
exit_time_ms=47.530
❯ ./zsh-bench zim
==> setting up a container for benchmarking ...
==> benchmarking zim ...
creates_tty=0
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=196.126
first_command_lag_ms=229.765
command_lag_ms=142.110
input_lag_ms=29.402
exit_time_ms=48.146
❯ ./zsh-bench zim
==> setting up a container for benchmarking ...
==> benchmarking zim ...
creates_tty=0
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=196.878
first_command_lag_ms=231.291
command_lag_ms=140.756
input_lag_ms=27.556
exit_time_ms=48.560
❯ ./zsh-bench zsh4humans
==> setting up a container for benchmarking ...
==> benchmarking zsh4humans ...
creates_tty=1
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=30.361
first_command_lag_ms=100.323
command_lag_ms=4.997
input_lag_ms=14.536
exit_time_ms=10.895
❯ ./zsh-bench zsh4humans
==> setting up a container for benchmarking ...
==> benchmarking zsh4humans ...
creates_tty=1
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=30.458
first_command_lag_ms=101.612
command_lag_ms=5.198
input_lag_ms=12.339
exit_time_ms=11.051
❯ ./zsh-bench zsh4humans
==> setting up a container for benchmarking ...
==> benchmarking zsh4humans ...
creates_tty=1
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=29.819
first_command_lag_ms=108.122
command_lag_ms=5.123
input_lag_ms=13.146
exit_time_ms=11.05
What is odd
zim `first_prompt_lag_ms` times in zsh-bench fluctuate quite a lot: the average is 169.598ms with a stdev of 46.601! In the recordings the average was 288.889ms with a stdev of 19.245, which is actually a stdev of 0.577 in terms of frames (one recording is just one frame shorter than the other two). zsh-bench should be more precise than a 30fps recording.
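For reference, a small sketch that reproduces the zim averages and standard deviations quoted above from the raw values. It assumes the sample standard deviation (n - 1 in the denominator), which is what matches the numbers; the names are illustrative:

```python
from statistics import mean, stdev

MS_PER_FRAME = 1000 / 30  # 30fps recording

# first_prompt_lag_ms from the three zsh-bench runs of zim
zim_bench_ms = [115.790, 196.126, 196.878]

# frame counts of the three zim recordings, from the table
zim_frames = [9, 8, 9]
zim_recording_ms = [f * MS_PER_FRAME for f in zim_frames]

def summarize(values):
    """Return (mean, sample standard deviation) of the values."""
    return mean(values), stdev(values)

print("zsh-bench: mean=%.3f stdev=%.3f" % summarize(zim_bench_ms))
print("recording: mean=%.3f stdev=%.3f" % summarize(zim_recording_ms))
print("recording stdev in frames: %.3f" % stdev(zim_frames))
```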
Also, from zsh-bench's code it looks like this output is the minimum value. It would be good to know the stdev of the values within a single zsh-bench run, to make sure they are not fluctuating.
I could compare the minimum recording time with the minimum zsh-bench time for each framework, but let's use the averages since the stdev of zsh-bench for zim was so high. So for completeness of information, the `first_prompt_lag_ms` average for zsh4humans is 30.213ms, with a stdev of 0.344. And in the recordings the average was 211.111ms (stdev was 19.245, but again that is actually just one frame more in one of the recordings).
Then, it's odd that zim is 1.703x faster in zsh-bench than in the recordings (I was expecting much closer values), and zsh4humans is 6.987x faster in zsh-bench than in the recordings!
If anything, zsh-bench is favoring zsh4humans by a lot, both by producing more stable measurements for it and by giving it times way faster than the real ones.
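A hedged sketch for recomputing the recording-to-zsh-bench ratios directly from the raw numbers listed earlier (the frame counts in the table and the `first_prompt_lag_ms` lines of each run). The dictionary and function names are my own:

```python
from statistics import mean

MS_PER_FRAME = 1000 / 30  # 30fps recording

# first_prompt_lag_ms reported by each zsh-bench run
bench_ms = {
    "zim": [115.790, 196.126, 196.878],
    "zsh4humans": [30.361, 30.458, 29.819],
}

# frame counts of each recording, from the table
recording_frames = {
    "zim": [9, 8, 9],
    "zsh4humans": [7, 6, 6],
}

def recording_mean_ms(framework):
    """Average recorded time to first prompt, in ms."""
    return mean([f * MS_PER_FRAME for f in recording_frames[framework]])

def ratio(framework):
    """How many times larger the recorded average is than the zsh-bench average."""
    return recording_mean_ms(framework) / mean(bench_ms[framework])

for fw in bench_ms:
    print(f"{fw}: recording {recording_mean_ms(fw):.3f} ms, "
          f"zsh-bench {mean(bench_ms[fw]):.3f} ms, ratio {ratio(fw):.3f}x")
```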
