Prompt times don't match the real times in a screen recording, and zsh4humans seems to be getting quite a good advantage #5

@ericbn

It's hard to be sure how exactly zsh-bench is measuring what it claims to be measuring, by looking at its code.

I'm going to trust a screen recording as the source of truth for the prompt times on my machine. I've recorded zim and zsh4humans, 3 times each. Each time the steps are:

  1. Sleep 1 second. In this time I type exit and ENTER, so it's used as input in the next steps.
  2. Print the framework name. I consider the counting starts when the output of this print appears in the recording.
  3. Start a new shell with HOME as the specific framework installation dir. I consider the counting ends when the prompt appears.

This translates to: for f in zim zsh4humans; do repeat 3 do; sleep 1; print ${f}; HOME=${PWD:h}/${f} zsh -li; done; done

The steps above are executed in a dir with a git repo with 1,000 directories and 10,000 files, set up using the code here and here.

Each framework was installed with the setup script here and here, respectively for zim and zsh4humans.

This should be enough to guarantee that the recordings are using the same scenario used by zsh-bench.

This is the screen recording:

[Screen recording: Kapture 2021-10-24 at 16 54 36]

These are the times extracted by checking the recording frames:

framework   run  begin_frame  prompt_frame  frames  ms
zim         1    161          169           9       300.000
zim         2    200          207           8       266.667
zim         3    238          246           9       300.000
zsh4humans  1    276          282           7       233.333
zsh4humans  2    316          321           6       200.000
zsh4humans  3    354          359           6       200.000

The recording has 30fps, so each frame represents 33.333ms. We can consider the error of each measurement above to be +/- 33.333ms.
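As a cross-check of the table, here is a minimal sketch of the frame-to-milliseconds conversion. It assumes frames are counted inclusively (begin frame and prompt frame both count), which is what the frames column implies; the function name `frames_to_ms` is only illustrative.

```python
FPS = 30  # frame rate of the recording, as stated above


def frames_to_ms(begin_frame, prompt_frame, fps=FPS):
    """Return (frame count, elapsed ms) for one recorded run.

    Assumption: frames are counted inclusively, matching the table.
    """
    frames = prompt_frame - begin_frame + 1
    return frames, frames * 1000 / fps


# (framework, run, begin_frame, prompt_frame) copied from the table above
recordings = [
    ("zim", 1, 161, 169),
    ("zim", 2, 200, 207),
    ("zim", 3, 238, 246),
    ("zsh4humans", 1, 276, 282),
    ("zsh4humans", 2, 316, 321),
    ("zsh4humans", 3, 354, 359),
]

for name, run, begin, prompt in recordings:
    frames, ms = frames_to_ms(begin, prompt)
    print(f"{name} {run}: {frames} frames = {ms:.3f} ms")
```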

This is what I get when running zsh-bench on the same machine, 3 times for each of the mentioned frameworks:

❯ ./zsh-bench zim
==> setting up a container for benchmarking ...
==> benchmarking zim ...
creates_tty=0
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=115.790
first_command_lag_ms=149.910
command_lag_ms=62.535
input_lag_ms=28.830
exit_time_ms=47.530

❯ ./zsh-bench zim
==> setting up a container for benchmarking ...
==> benchmarking zim ...
creates_tty=0
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=196.126
first_command_lag_ms=229.765
command_lag_ms=142.110
input_lag_ms=29.402
exit_time_ms=48.146

❯ ./zsh-bench zim
==> setting up a container for benchmarking ...
==> benchmarking zim ...
creates_tty=0
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=196.878
first_command_lag_ms=231.291
command_lag_ms=140.756
input_lag_ms=27.556
exit_time_ms=48.560

❯ ./zsh-bench zsh4humans
==> setting up a container for benchmarking ...
==> benchmarking zsh4humans ...
creates_tty=1
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=30.361
first_command_lag_ms=100.323
command_lag_ms=4.997
input_lag_ms=14.536
exit_time_ms=10.895

❯ ./zsh-bench zsh4humans
==> setting up a container for benchmarking ...
==> benchmarking zsh4humans ...
creates_tty=1
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=30.458
first_command_lag_ms=101.612
command_lag_ms=5.198
input_lag_ms=12.339
exit_time_ms=11.051

❯ ./zsh-bench zsh4humans
==> setting up a container for benchmarking ...
==> benchmarking zsh4humans ...
creates_tty=1
has_compsys=1
has_syntax_highlighting=1
has_autosuggestions=1
has_git_prompt=1
first_prompt_lag_ms=29.819
first_command_lag_ms=108.122
command_lag_ms=5.123
input_lag_ms=13.146
exit_time_ms=11.05

What is odd

zim first_prompt_lag_ms times in zsh-bench fluctuate quite a lot: an average of 169.598ms with a stdev of 46.601! In the recordings the average was 288.889ms with a stdev of 19.245, which is a stdev of just 0.577 in terms of frames (the runs differ by only one frame). zsh-bench should be more precise than a 30fps recording.
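The zim figures can be reproduced with a few lines of Python. This is only a sketch: it assumes the stdev quoted here is the sample standard deviation (n - 1 denominator), which is what `statistics.stdev` computes.

```python
from statistics import mean, stdev  # stdev = sample standard deviation

MS_PER_FRAME = 1000 / 30  # 30fps recording

# first_prompt_lag_ms from the three zsh-bench runs of zim above
zim_bench_ms = [115.790, 196.126, 196.878]

# frame counts of the three zim recordings, from the table
zim_recording_frames = [9, 8, 9]
zim_recording_ms = [f * MS_PER_FRAME for f in zim_recording_frames]

for label, values in [
    ("zsh-bench (ms)", zim_bench_ms),
    ("recording (ms)", zim_recording_ms),
    ("recording (frames)", zim_recording_frames),
]:
    print(f"{label}: mean={mean(values):.3f} stdev={stdev(values):.3f}")
```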

Also, from zsh-bench's code it looks like this output is the min value. It would be good to know the stdev of the values inside a zsh-bench run, to make sure the values are not fluctuating.

I could compare the minimum recording time with the minimum zsh-bench time for each framework, but let's use the averages since the stdev of zsh-bench for zim was so high. So, for completeness, the first_prompt_lag_ms average for zsh4humans is 30.213ms, with a stdev of 0.344. In the recordings the average was 211.111ms (the stdev was 19.245, but again it was just one frame more in one of the recordings).

Then, it's odd that zim is 1.703x faster in zsh-bench than in the recordings (I was expecting much closer values), and zsh4humans is 6.987x faster in zsh-bench than in the recordings!
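The ratios between the recorded and reported averages can be recomputed directly from the frame counts in the table and the first_prompt_lag_ms values above. This is a minimal sketch; the helper name `slowdown` is only illustrative.

```python
from statistics import mean

MS_PER_FRAME = 1000 / 30  # 30fps recording

# frame counts per recording, from the table
recording_frames = {
    "zim": [9, 8, 9],
    "zsh4humans": [7, 6, 6],
}

# first_prompt_lag_ms per zsh-bench run, from the outputs above
bench_ms = {
    "zim": [115.790, 196.126, 196.878],
    "zsh4humans": [30.361, 30.458, 29.819],
}


def slowdown(name):
    """Average recorded time divided by average zsh-bench time."""
    recorded_ms = mean(recording_frames[name]) * MS_PER_FRAME
    return recorded_ms / mean(bench_ms[name])


for name in recording_frames:
    print(f"{name}: recording is {slowdown(name):.3f}x the zsh-bench average")
```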

If anything, zsh-bench is favoring zsh4humans by a lot, both by giving it more stable measurements and by giving it times way faster than the real ones.
