Allow baseline per category #617

Edit: if there's already a way to do this and I'm just being ignorant, please say!

Often I'm trying to compare alternative implementations that can impact multiple scenarios - essentially baseline vs option 1 (vs option 2 etc), but for multiple *separate* tests A, B, C.

The `Baseline = true` feature is great, but it only really allows a single baseline. If multiple methods are marked as `Baseline`, the test runner fails and complains at you.

However, `[BenchmarkCategory(...)]` exists (via #248). It is currently only used to *filter* tests to run, but it could be much richer:

- the result grids could be split by category
- the relative comparisons against baseline could be *computed* by category

so instead of:

```
                   Method |         Mean |      Error |     StdDev |      Op/s | Scaled | ScaledSD |    Gen 0 |   Gen 1 |  Allocated |
------------------------- |-------------:|-----------:|-----------:|----------:|-------:|---------:|---------:|--------:|-----------:|
         'single, Stream' |     61.32 us |  1.2102 us |  1.8842 us | 16,309.04 |   1.00 |     0.00 |   2.2583 |  0.0292 |   13.88 KB |
 'single, ReadOnlyBuffer' |     52.98 us |  0.1676 us |  0.1568 us | 18,874.14 |   0.86 |     0.03 |   2.2583 |  0.0042 |   13.88 KB |
          'multi, Stream' | 14,285.07 us | 54.3681 us | 50.8560 us |     70.00 | 233.18 |     6.95 | 564.1667 | 10.0000 | 3470.77 KB |
  'multi, ReadOnlyBuffer' | 13,219.59 us | 34.1087 us | 31.9053 us |     75.65 | 215.79 |     6.41 | 564.1667 |  0.8333 | 3470.76 KB |
```

we could have:

```

Category: 'single'

                   Method |         Mean |      Error |     StdDev |      Op/s | Scaled | ScaledSD |    Gen 0 |   Gen 1 |  Allocated |
------------------------- |-------------:|-----------:|-----------:|----------:|-------:|---------:|---------:|--------:|-----------:|
                 'Stream' |     61.32 us |  1.2102 us |  1.8842 us | 16,309.04 |   1.00 |     0.00 |   2.2583 |  0.0292 |   13.88 KB |
         'ReadOnlyBuffer' |     52.98 us |  0.1676 us |  0.1568 us | 18,874.14 |   0.86 |     0.03 |   2.2583 |  0.0042 |   13.88 KB |

Category: 'multi'

                   Method |         Mean |      Error |     StdDev |      Op/s | Scaled | ScaledSD |    Gen 0 |   Gen 1 |  Allocated |
------------------------- |-------------:|-----------:|-----------:|----------:|-------:|---------:|---------:|--------:|-----------:|
                 'Stream' | 14,285.07 us | 54.3681 us | 50.8560 us |     70.00 |   1.00 |     *.** | 564.1667 | 10.0000 | 3470.77 KB |
         'ReadOnlyBuffer' | 13,219.59 us | 34.1087 us | 31.9053 us |     75.65 |   0.93 |     *.** | 564.1667 |  0.8333 | 3470.76 KB |
```

with `Scaled` / `ScaledSD` being relative to the `Baseline` (if there is one) in that same category.

(`*.**` is just where I haven't "done the math" by hand. To be clear: "single" and "multi" here are completely different tests, not just more of the same; naming is hard.)
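To spell out what "computed by category" means for `Scaled`, here's a rough sketch of the arithmetic, not anything from the library itself. The `Result` record and `PerCategoryScaling` class are made-up names for illustration, the means are copied from the first table above, and it assumes exactly one baseline per category. It doesn't attempt `ScaledSD`.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical shape for one result row; not a BenchmarkDotNet type.
public record Result(string Category, string Method, double MeanUs, bool IsBaseline);

public static class PerCategoryScaling
{
    // Scaled = mean / mean of the baseline in the *same* category, rounded to 2 places.
    public static IEnumerable<(string Category, string Method, double Scaled)> Scale(
        IEnumerable<Result> results) =>
        results
            .GroupBy(r => r.Category)
            .SelectMany(group =>
            {
                // Assumption: exactly one baseline per category; Single() throws otherwise.
                double baselineMean = group.Single(r => r.IsBaseline).MeanUs;
                return group.Select(r =>
                    (r.Category, r.Method, Scaled: Math.Round(r.MeanUs / baselineMean, 2)));
            });

    public static void Main()
    {
        var results = new[]
        {
            new Result("single", "Stream", 61.32, true),
            new Result("single", "ReadOnlyBuffer", 52.98, false),
            new Result("multi", "Stream", 14285.07, true),
            new Result("multi", "ReadOnlyBuffer", 13219.59, false),
        };

        foreach (var (category, method, scaled) in Scale(results))
        {
            string text = scaled.ToString("0.00", CultureInfo.InvariantCulture);
            Console.WriteLine($"{category,-7}{method,-15}{text}");
        }
    }
}
```

With those means, the "single" rows come out as 1.00 and 0.86, and the "multi" rows as 1.00 and 0.93, instead of the 233.18 and 215.79 you get when everything is scaled against one global baseline.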

If necessary, this could be an opt-in `SplitResultsByCategory` feature on custom options. Or it could be implicit: "there's multiple baselines == split by category" (since this won't have worked previously, this can't change existing *working* behaviour)
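Either way, the benchmark class I'd like to be able to write looks roughly like this. It's only a sketch: the class name, method names and the `Work` placeholder are invented for illustration, and it needs the BenchmarkDotNet package. The attributes are the existing `[Benchmark(Baseline = true)]` and `[BenchmarkCategory(...)]`; the only "new" thing is having one baseline per category, which is the part that currently fails as described above.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical benchmark class: two unrelated tests ("single", "multi"),
// each comparing a baseline (Stream) against an alternative (ReadOnlyBuffer).
public class ParseBenchmarks
{
    [BenchmarkCategory("single"), Benchmark(Description = "Stream", Baseline = true)]
    public int SingleStream() => Work(1);

    [BenchmarkCategory("single"), Benchmark(Description = "ReadOnlyBuffer")]
    public int SingleReadOnlyBuffer() => Work(1);

    // Second baseline: the request is for this to be the baseline of "multi" only.
    [BenchmarkCategory("multi"), Benchmark(Description = "Stream", Baseline = true)]
    public int MultiStream() => Work(250);

    [BenchmarkCategory("multi"), Benchmark(Description = "ReadOnlyBuffer")]
    public int MultiReadOnlyBuffer() => Work(250);

    // Placeholder workload so the sketch compiles; the real tests do different things.
    private static int Work(int iterations)
    {
        int sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += i;
        }
        return sum;
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<ParseBenchmarks>();
}
```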
  
