@danielmarbach commented Feb 10, 2026

This PR aligns pipeline execution with a simplified trampoline model, keeps state on the context bag, and improves AOT/trimming friendliness while minimizing broader architectural change.

The trampoline model executes the pipeline as a flat loop-like progression over prebuilt parts, instead of recursively composing many nested delegates at runtime.

  • A PipelinePart[] represents the execution plan.
  • A small mutable frame on the context (Index, RangeEnd) tracks current position/range.
  • Start initializes the frame and dispatches the first part.
  • Each behavior calls next, which routes to StageRunners.Next, increments index, and dispatches the next part.
  • Stage connectors adjust the frame to jump into child ranges (and restore progression semantics through the same Next entrypoint).

This gives predictable control flow, avoids runtime code generation, and keeps hot-path dispatch compact.
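The control flow described above can be sketched in a few lines of C#. This is a minimal illustration, not the code in this PR: `SketchContext`, `SketchBehavior`, `TracingBehavior`, and `SketchRunner` are hypothetical names, the frame is reduced to two public fields, and the separate parts array is folded into the behaviors array to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical types for illustration only; not the NServiceBus implementation.
public sealed class SketchContext
{
    public int Index;      // position of the part currently executing
    public int RangeEnd;   // exclusive end of the range being executed
    public SketchBehavior[] Behaviors = Array.Empty<SketchBehavior>();
    public List<string> Trace { get; } = new();
}

public abstract class SketchBehavior
{
    public abstract Task Invoke(SketchContext context, Func<SketchContext, Task> next);
}

public sealed class TracingBehavior : SketchBehavior
{
    readonly string name;

    public TracingBehavior(string name) => this.name = name;

    public override async Task Invoke(SketchContext context, Func<SketchContext, Task> next)
    {
        context.Trace.Add($"enter {name}");
        await next(context);
        context.Trace.Add($"exit {name}");
    }
}

public static class SketchRunner
{
    // Cached once so that handing "next" to a behavior does not allocate per hop.
    static readonly Func<SketchContext, Task> NextDelegate = Next;

    // Initializes the frame and dispatches the first part.
    public static Task Start(SketchContext context, SketchBehavior[] behaviors)
    {
        context.Behaviors = behaviors;
        context.Index = 0;
        context.RangeEnd = behaviors.Length;
        return Dispatch(context);
    }

    // Single entry point behind every "next" call: advance the frame, then dispatch.
    static Task Next(SketchContext context)
    {
        context.Index++;
        return Dispatch(context);
    }

    static Task Dispatch(SketchContext context) =>
        context.Index < context.RangeEnd
            ? context.Behaviors[context.Index].Invoke(context, NextDelegate)
            : Task.CompletedTask;
}
```

Running `SketchRunner.Start` with two `TracingBehavior` instances named A and B records `enter A`, `enter B`, `exit B`, `exit A` in `Trace`. The same `Next` method serves every hop, so no per-pipeline delegate chain has to be composed or compiled.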

Benefits

  • More AOT/trimming-friendly:
    • No runtime MakeGenericType, expression-tree compilation, or dynamic method generation required in the core dispatch path.
  • Better hot-path characteristics:
    • Compact, predictable dispatch with fewer dynamic layers.
  • Cleaner design:
    • Terminator transitions handled uniformly.
    • Reflection-derived metadata computed once and reused.
  • No source generator required: the model relies on precomputed parts + static invoker dispatch, so it remains AOT/trimming-friendly without introducing source-gen infrastructure, generator maintenance, or build-time generator
    coupling.
  • Behavior extensibility is unchanged: behaviors remain fully pluggable as before, as long as they target supported stage/context transitions for the pipeline. This preserves the existing extension model while making execution
    more static.

Additional Notes

  • The pipeline already had a precomputed behaviors array per built pipeline.
  • This PR extends the same precomputation concept to parts (PipelinePart[]), so execution now uses two aligned, indexed arrays:
    • behaviors[index] for the behavior instance
    • parts[index] for how that behavior/stage is invoked
  • This keeps runtime work minimal and shifts complexity to build time.
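As a rough sketch of the two aligned arrays, assuming a hypothetical `PartSketch` struct that carries only an invoker id: the plan is computed once at build time, and the run-time path is a static `switch` over that id. The type names, id values, and the `IConnectorSketch` marker interface are all invented for this example and do not reflect the actual `PipelinePart` layout.

```csharp
using System;

// Hypothetical shapes for illustration; names and id values are assumptions.
public readonly record struct PartSketch(byte InvokerId);

public static class InvokerIds
{
    public const byte Behavior = 0;
    public const byte StageConnector = 1;
}

public interface IConnectorSketch { }
public sealed class LoggingStep { }
public sealed class TransportToLogicalStep : IConnectorSketch { }

public static class PlanBuilder
{
    // Build time: decide once per behavior how it will be invoked.
    public static PartSketch[] Build(object[] behaviors)
    {
        var parts = new PartSketch[behaviors.Length];
        for (var i = 0; i < behaviors.Length; i++)
        {
            var id = behaviors[i] is IConnectorSketch
                ? InvokerIds.StageConnector
                : InvokerIds.Behavior;
            parts[i] = new PartSketch(id);
        }
        return parts;
    }

    // Run time: index both arrays with the same position and switch on the id.
    public static string Describe(object[] behaviors, PartSketch[] parts, int index) =>
        parts[index].InvokerId switch
        {
            InvokerIds.Behavior => $"behavior:{behaviors[index].GetType().Name}",
            InvokerIds.StageConnector => $"connector:{behaviors[index].GetType().Name}",
            _ => throw new InvalidOperationException($"Unknown invoker id {parts[index].InvokerId}")
        };
}
```

In this sketch `Describe` stands in for the real invocation. The point is only that `behaviors[index]` and `parts[index]` are read with the same index and that the classification work happens in `Build`, not per message.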

Pipeline Frame Safety

  • The mutable pipeline frame (Index, RangeEnd) is safe in this design because it is invocation-local state carried on the current behavior context (context bag), not global/static state.
  • In other words, each pipeline invocation operates on its own frame data; there is no cross-invocation sharing by design.
  • As long as pipeline contexts are not reused concurrently across invocations (current model), mutating the frame is thread-safe and correct.

Cost of Adding a New Pipeline

This change does add a small amount of explicit wiring when introducing a brand-new pipeline. That is worth calling out, but it should be seen in context:

  • Adding a new pipeline already requires touching multiple extension points today (registration, model/build wiring, diagnostics/visualization, and tests/approvals).
  • With this PR, the extra work is mainly adding invoker-id mappings/wiring for the new stage transitions.
  • Historically, this is a low-frequency operation: over many years, only a small number of new pipelines were introduced (notably recoverability and audit), while existing core pipelines stayed largely stable.

So the tradeoff is intentional:

  • Slightly more explicit setup for rare "new pipeline type" work.
  • Better runtime characteristics, simpler execution model, and improved AOT/trimming compatibility for the common path.

Benchmarks

Benchmark code: https://github.com/danielmarbach/MicroBenchmarks (branches starting with bare-metal)


BenchmarkDotNet v0.15.8, macOS Tahoe 26.2 (25C56) [Darwin 25.2.0]
Apple M3 Max, 1 CPU, 14 logical and 14 physical cores
.NET SDK 10.0.101
  [Host]     : .NET 10.0.1 (10.0.1, 10.0.125.57005), Arm64 RyuJIT armv8.0-a
  DefaultJob : .NET 10.0.1 (10.0.1, 10.0.125.57005), Arm64 RyuJIT armv8.0-a

Execution

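| Method      | PipelineDepth | Mean      | Error    | StdDev   | Ratio | Allocated | Alloc Ratio |
|-------------|---------------|-----------|----------|----------|-------|-----------|-------------|
| Trampo      | 10            | 24.67 ns  | 0.114 ns | 0.106 ns | 0.82  | -         | NA          |
| Expressions | 10            | 30.05 ns  | 0.172 ns | 0.160 ns | 1.00  | -         | NA          |
| Trampo      | 20            | 48.42 ns  | 0.558 ns | 0.522 ns | 0.75  | -         | NA          |
| Expressions | 20            | 64.99 ns  | 0.467 ns | 0.437 ns | 1.00  | -         | NA          |
| Trampo      | 40            | 111.00 ns | 0.649 ns | 0.607 ns | 0.77  | -         | NA          |
| Expressions | 40            | 144.73 ns | 1.599 ns | 1.496 ns | 1.00  | -         | NA          |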

Throwing

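| Method      | PipelineDepth | Mean     | Error     | StdDev    | Ratio | RatioSD | Gen0   | Allocated | Alloc Ratio |
|-------------|---------------|----------|-----------|-----------|-------|---------|--------|-----------|-------------|
| Expressions | 10            | 7.633 μs | 0.1514 μs | 0.1682 μs | 1.00  | 0.03    | 0.1526 | 1.31 KB   | 1.00        |
| Trampo      | 10            | 7.679 μs | 0.1285 μs | 0.1202 μs | 1.01  | 0.03    | 0.1526 | 1.3 KB    | 0.99        |
| Trampo      | 20            | 7.591 μs | 0.1021 μs | 0.0955 μs | 0.99  | 0.02    | 0.1526 | 1.3 KB    | 0.99        |
| Expressions | 20            | 7.704 μs | 0.1426 μs | 0.1333 μs | 1.00  | 0.02    | 0.1526 | 1.31 KB   | 1.00        |
| Current     | 40            | 7.450 μs | 0.1426 μs | 0.1334 μs | 1.00  | 0.02    | 0.1526 | 1.31 KB   | 1.00        |
| Expressions | 40            | 7.528 μs | 0.0712 μs | 0.0631 μs | 1.01  | 0.02    | 0.1526 | 1.3 KB    | 0.99        |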

Warmup

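| Method      | PipelineDepth | Mean         | Error      | StdDev     | Ratio | Gen0    | Gen1   | Allocated | Alloc Ratio |
|-------------|---------------|--------------|------------|------------|-------|---------|--------|-----------|-------------|
| Trampo      | 10            | 2.249 μs     | 0.0152 μs  | 0.0142 μs  | 0.003 | 2.4338  | 0.0229 | 19.89 KB  | 0.42        |
| Expressions | 10            | 765.779 μs   | 5.5135 μs  | 5.1573 μs  | 1.000 | 4.8828  | 1.9531 | 46.85 KB  | 1.00        |
| Trampo      | 20            | 4.230 μs     | 0.0328 μs  | 0.0307 μs  | 0.003 | 4.8370  | 0.0610 | 39.54 KB  | 0.45        |
| Expressions | 20            | 1,518.139 μs | 13.7502 μs | 12.8619 μs | 1.000 | 9.7656  | 3.9063 | 88.34 KB  | 1.00        |
| Trampo      | 40            | 8.419 μs     | 0.0592 μs  | 0.0554 μs  | 0.003 | 9.6436  | 0.1678 | 78.81 KB  | 0.46        |
| Expressions | 40            | 2,983.006 μs | 19.4129 μs | 18.1588 μs | 1.000 | 19.5313 | 7.8125 | 171.81 KB | 1.00        |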

Alternatives Considered

  • Class-based pipeline part objects (polymorphic dispatch): considered, but rejected for hot-path execution due to extra indirection and weaker inlining characteristics compared to static delegate/id-based dispatch.
  • Source-generated invokers: would likely work, but rejected to avoid generator complexity and build coupling; current design achieves AOT/trimming goals without source gen.
  • Runtime codegen/expression-tree compilation: rejected for AOT/trimming constraints.

See https://github.com/danielmarbach/PipelinePlayground and branches starting with bare-metal and invokers

@danielmarbach changed the title from "Pipeline" to "Replace expression-tree pipeline execution with trampoline parts" Feb 10, 2026
@danielmarbach danielmarbach marked this pull request as ready for review February 10, 2026 17:31
@danielmarbach (Contributor, Author) commented

Open question / assumption to validate:

  • My assumption is that our public pipeline extensibility contract is limited to:
    • adding behaviors to supported existing stages,
    • replacing existing stage connectors,
    • replacing existing fork connectors.
  • I do not believe introducing entirely new stage context transitions is a supported public scenario today, since stage graph/context evolution is framework-owned.
  • If that assumption is correct, strict invoker mapping is aligned with the documented model; fallback for unknown transitions is then an internal compatibility choice, not a required public contract guarantee.
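To make "strict invoker mapping" concrete, here is a minimal sketch under stated assumptions: a fixed table keyed by the (from, to) context types of a stage transition, which throws for anything it does not know. The context types, the id values, and `StrictInvokerMap` itself are hypothetical and only illustrate the strict-versus-fallback choice discussed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical context types standing in for framework-owned stage contexts.
public sealed class TransportContextSketch { }
public sealed class PhysicalContextSketch { }
public sealed class LogicalContextSketch { }

public static class StrictInvokerMap
{
    const byte TransportToPhysical = 1;
    const byte PhysicalToLogical = 2;

    // Every supported transition is listed explicitly; nothing is derived at run time.
    static readonly Dictionary<(Type From, Type To), byte> Known = new()
    {
        [(typeof(TransportContextSketch), typeof(PhysicalContextSketch))] = TransportToPhysical,
        [(typeof(PhysicalContextSketch), typeof(LogicalContextSketch))] = PhysicalToLogical,
    };

    // Strict: an unknown transition fails fast instead of falling back to
    // a generic invoker created at run time.
    public static byte Resolve(Type from, Type to) =>
        Known.TryGetValue((from, to), out var id)
            ? id
            : throw new NotSupportedException(
                $"No invoker is registered for the transition {from.Name} -> {to.Name}.");
}
```

A fallback for unknown transitions would replace the `throw` with some form of run-time generic instantiation, which is the part that works against the AOT/trimming goals.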

https://docs.particular.net/nservicebus/pipeline/steps-stages-connectors

@mikeminutillo may know more. In theory I could add a fallback invocation, but that would require reintroducing generic method creation, which I currently think is unnecessary.
