Skip to content

Improving the performance of a 'typical' bat run #2545

Open
@sharkdp

Description

@sharkdp

I found this cool performance analysis tool called magic-trace (https://github.com/janestreet/magic-trace) and used a trace recording of bat to try it out. I actually spotted a few things that could potentially be improved.

Recently, we made fantastic progress on reducing bats startup speed (see #951), mostly by lazy-loading assets thanks to @Enselic changes. Here, I'm not focusing on startup-speed per-se, but rather on a "typical" run of bat, i.e. input files which are not gigantic or weird in any sense. Let's use bats Cargo.toml as an example. I'm disabling the pager because I want to focus on bat. I'm deliberately not using --no-config to simulate a real world use case (my config only contains comments and --italic-text=always). I don't have any custom assets.

First, let's measure the full execution time using hyperfine:

▶ hyperfine --warmup=20 "bat --paging=never --force-colorization Cargo.toml"
Benchmark 1: bat --paging=never --force-colorization Cargo.toml
  Time (mean ± σ):       8.5 ms ±   1.8 ms    [User: 7.5 ms, System: 1.6 ms]
  Range (min … max):     4.7 ms …  14.6 ms    271 runs

That's pretty fast, but cat only takes 1.0 ms ± 0.5 ms, so there is some room for improvement. Let's record a trace. I'm running bat once beforehand to warm up the disk cache (in order to be comparable with the benchmark above):

▶ bat Cargo.toml > /dev/null; magic-trace run -full-execution bat -- --paging=never --force-colorization Cargo.toml

The resulting trace.fct is attached here: trace.zip

Now let's look at the results on https://magic-trace.org/. The full run (16ms with tracing) looks like this:

image

The first 2.3ms are occupied with low-level startup procedures that we can't influence(?):

image

Next, we have some fast initialization stuff, config file loading, parsing and command-line option handling. Alltogether, this part only takes 600µs, so probably not much room for improvement:

image

Up next, we see the first surprise. Calling App::config ("get the final Config object based on config, command-line options, etc") takes 3.5 ms:

image

Zooming in closer:

image

Most of this time is being spent compiling regexes (for glob patterns) for our syntax mappings (not syntax highlighting!), i.e. things like this:

image

This is certainly something that could be optimized. Either by somehow pre-compiling syntax mappings (although this would only work for the builtin mappings). Or by parallelization. Or by using a faster glob matcher(?).

Next up, we query the Git status (pretty cool that we can see libgit2 internals here), which takes another 2 ms:

image

There's probably nothing to optimize here, but potentially we could already start other things in parallel! Like what comes next: loading and deserializing the theme and syntax sets, which also takes 2.5 ms:

image

Finally, we can start printing something to the terminal:

image

Notice how the first few calls to print_line take a bit longer as some regexes are being compiled lazily (I suppose):

image

image

There's also this weird part here, which I don't understand:

image

To summarize:

  • magic-trace is cool
  • Our SyntaxMapping initialization is much slower than it could be
  • I think we can optimize a few things in bats startup procedure by parallelizing individual tasks (e.g. querying Git information and loading assets)

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions