[DataFlow runtime] Phase B2 — decouple the target engine from the sglang version by maocheng23 · Pull Request #632 · sgl-project/SpecForge

maocheng23 · 2026-07-01T03:07:54Z

Phase B (domain abstractions) — 2/3. Stacked on #631 (B1). This is the core decoupling.

Before this, both SGLangEagle3TargetEngine and SGLangDFlashTargetEngine imported ~20 sglang internals directly and each carried its own near-duplicate _extend forward — the two copies had even drifted to different sglang API versions (eagle3 = module-level prepare_mlp_sync_batch_raw(attn_cp_size=); dflash = the removed Scheduler.prepare_mlp_sync_batch_raw(spec_algorithm=)). A sglang bump touched every subclass and the copies could silently diverge.

This extracts every sglang internal, plus the single capture forward, into one version-pinned boundary, sglang_backend/capture.py::SGLangCaptureBackend. The algorithm engines now compose that backend and import zero sglang internals (enforced by a pure-AST test). A sglang bump now touches one file. Net −592 lines in the engine files.
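The composition described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `CaptureBackend`, `FakeCaptureBackend`, `ShapingEngine`, and `capture` are hypothetical names, plain Python lists stand in for tensors, and the real `SGLangCaptureBackend` methods may take different arguments.

```python
from dataclasses import dataclass
from typing import List, Protocol


class CaptureBackend(Protocol):
    """Hypothetical interface: the only layer allowed to touch the runtime."""

    def extend(self, input_ids: List[List[int]]) -> List[List[float]]:
        ...


class FakeCaptureBackend:
    """Stand-in for a version-pinned backend; needs no inference runtime."""

    def extend(self, input_ids: List[List[int]]) -> List[List[float]]:
        # Pretend each token id is its own "hidden state".
        return [[float(tok) for tok in seq] for seq in input_ids]


@dataclass
class ShapingEngine:
    """Algorithm-side engine: holds a backend and only reshapes its output."""

    backend: CaptureBackend

    def capture(self, input_ids: List[List[int]]) -> List[List[float]]:
        hidden = self.backend.extend(input_ids)
        # Output shaping only (here: drop the last position of each sequence).
        return [seq[:-1] for seq in hidden]


engine = ShapingEngine(backend=FakeCaptureBackend())
shaped = engine.capture([[1, 2, 3], [4, 5]])
```

Because the engine depends only on the `extend` interface, a runtime version bump would change the backend class and leave the shaping code untouched.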

Behavior:

- Byte-identical on the test configs (TP=1/2, dp=1): require_mlp_sync is False so the unified mlp-sync branch is skipped identically; construction, req building, the forward, split/shard logic, and pool-clear ordering are transplanted verbatim. import specforge stays sglang-optional (lazy import in from_pretrained).
- Two deliberate, flagged changes: (1) DFlash mlp-sync is unified onto the eagle3 0.5.9 signature, because its old Scheduler.* call was latent-broken for dp>1; (2) a stray debug print() in DFlash set_capture_layers is dropped.
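The "lazy import in from_pretrained" idea can be illustrated with a small sketch. Everything here is assumed for illustration: `LazyEngine` and the `runtime_name` parameter are hypothetical and are not taken from SpecForge. The point is only that the heavy dependency is imported when an engine is built, not when the package is imported.

```python
import importlib
from types import ModuleType


class LazyEngine:
    """Hypothetical engine whose runtime dependency is resolved on construction."""

    def __init__(self, model_path: str, runtime: ModuleType) -> None:
        self.model_path = model_path
        self.runtime = runtime

    @classmethod
    def from_pretrained(cls, model_path: str, runtime_name: str = "sglang") -> "LazyEngine":
        # Import here, not at module top level, so importing this file
        # never requires the optional runtime to be installed.
        try:
            runtime = importlib.import_module(runtime_name)
        except ImportError as exc:
            raise ImportError(
                f"backend {runtime_name!r} is not installed; install it to use this engine"
            ) from exc
        return cls(model_path, runtime)
```

Passing a module that is always available (for example `runtime_name="json"`) exercises the success path without the real runtime installed.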

Also adds the sglang_server backend (SGLangServerEagle3TargetEngine): selectable via the factory, construction raises an actionable NotImplementedError until the live-capture depth is set by the O1.3 spike (docs/roadmap/online-disaggregation.md §O1.3).
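A "selectable but not yet constructible" backend might look like the sketch below. The names `make_target_engine`, `LocalEngineStub`, and `ServerEngineStub` are hypothetical and the error text is invented; the factory in this PR and its real signature may differ.

```python
class LocalEngineStub:
    """Placeholder for an in-process engine."""

    name = "sglang"


class ServerEngineStub:
    """Placeholder for a server-backed engine that is registered but unfinished."""

    name = "sglang_server"

    def __init__(self) -> None:
        # Fail at construction with a message that tells the caller what to do.
        raise NotImplementedError(
            "backend 'sglang_server' is registered but not implemented yet; "
            "use backend='sglang' until live capture is specified"
        )


_BACKENDS = {"sglang": LocalEngineStub, "sglang_server": ServerEngineStub}


def make_target_engine(backend: str):
    """Look up the backend by name, then construct it."""
    try:
        engine_cls = _BACKENDS[backend]
    except KeyError:
        raise ValueError(
            f"unknown backend {backend!r}; choose from {sorted(_BACKENDS)}"
        ) from None
    return engine_cls()
```

Keeping the name in the registry lets callers and config files refer to the backend now, while an unknown name still produces a distinct `ValueError`.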

New test: tests/test_runtime/test_sglang_capture_backend.py (AST decoupling invariant + sglang_server selectability).
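A pure-AST decoupling check of this kind can be sketched with the standard `ast` module. This is an assumed shape, not the contents of test_sglang_capture_backend.py; `imported_roots` and `imports_sglang` are hypothetical helper names. The source is parsed rather than imported, so the check runs without sglang installed.

```python
import ast
from typing import Set


def imported_roots(source: str) -> Set[str]:
    """Return the top-level package names imported anywhere in the source."""
    roots: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are skipped.
            roots.add(node.module.split(".")[0])
    return roots


def imports_sglang(source: str) -> bool:
    """True if the source contains any absolute import of the sglang package."""
    return "sglang" in imported_roots(source)


engine_like = "import torch\nfrom .sglang_backend.capture import SGLangCaptureBackend\n"
backend_like = "def build():\n    from sglang.srt.server_args import ServerArgs\n"
```

Because `ast.walk` visits nested nodes, an import placed inside a function body is detected as well, while a relative import of a local package that merely has "sglang" in its name is not flagged.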

Validation

Full tests/test_runtime 214 OK (2 skip, 1 xfail) on 8×H200; the hf-vs-sglang-vs-custom capture parity test builds a real SGLang runner through SGLangCaptureBackend at TP=2 and matches the HF reference — 2 OK. Adversarial review: 0 confirmed defects.

🤖 Generated with Claude Code

…ang version

Extract EVERY sglang internal + the duplicated extend/capture forward into one version-pinned boundary, `sglang_backend/capture.py::SGLangCaptureBackend`, and have the algorithm engines COMPOSE it instead of embedding it:

SGLangCaptureBackend (the only place that imports sglang.srt.* for capture)
· build(): ServerArgs / ModelConfig / SGLangRunner wiring (unified)
· _forward_extend(): the single ScheduleBatch/ForwardBatch capture forward
· _maybe_prepare_mlp_sync_batch(): ONE (0.5.9) prepare_mlp_sync signature
· extend / extend_vlm / extend_dflash / get_rope_index / set_eagle3_capture_layers

SGLangEagle3TargetEngine / SGLangDFlashTargetEngine now hold a backend and do only torch-side output shaping; they import ZERO sglang internals (verified by tests/test_runtime/test_sglang_capture_backend.py, a pure-AST invariant).

Why: before this, both sglang engines imported ~20 sglang symbols and each carried its own near-duplicate `_extend`; the two copies had drifted to DIFFERENT sglang API versions (eagle3 = module-level prepare_mlp_sync_batch_raw(attn_cp_size=); dflash = the removed Scheduler.prepare_mlp_sync_batch_raw(spec_algorithm=)). A sglang bump touched every subclass and the copies could silently diverge. Now a bump touches one file; "put the pieces together" (capture backend + shaping + adapter) instead of tangling the version into each algorithm.

Behavior:
- Byte-identical on the test configs (TP=1/2, dp=1): require_mlp_sync is False so the unified mlp-sync branch is skipped identically; construction, req building, the forward, splitting/shard logic, and pool-clear ordering are transplanted verbatim (`import specforge` stays sglang-optional via lazy import in from_pretrained; the engine forward is still under @torch.no_grad).
- Two deliberate, flagged changes: (1) DFlash's mlp-sync now uses the same 0.5.9 signature as eagle3, because its old Scheduler.* call was latent-broken for dp>1; (2) dropped a stray debug print() in DFlash set_capture_layers.

Also adds the `sglang_server` backend (SGLangServerEagle3TargetEngine): selectable via get_eagle3_target_model(backend="sglang_server"), construction raises an actionable NotImplementedError until the live-capture depth is set by the O1.3 spike (docs/roadmap/online-disaggregation.md §O1.3).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

gemini-code-assist · 2026-07-01T03:07:57Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

maocheng23 mentioned this pull request Jul 1, 2026

[DataFlow runtime] Phase B3 — domain Trainer wrapping the runtime seam #633

Merged

refactor: clarify sglang eagle3 capture entrypoints

3c768c6

maocheng23 marked this pull request as ready for review July 1, 2026 08:22

maocheng23 requested review from FlamingoPg, FrankLeeeee, shuaills and sleepcoo as code owners July 1, 2026 08:22

Base automatically changed from dataflow-up-24-target-engine to dataflow-up-16-zerocopy July 3, 2026 02:03

jiapingW self-requested a review July 3, 2026 02:12

jiapingW approved these changes Jul 3, 2026


jiapingW merged commit ac8f878 into dataflow-up-16-zerocopy Jul 3, 2026
1 check passed

jiapingW deleted the dataflow-up-25-sglang-capture-backend branch July 3, 2026 02:12

maocheng23 mentioned this pull request Jul 4, 2026

Merge DataFlow runtime branch into main #648

Open

jiapingW merged 2 commits into dataflow-up-16-zerocopy from dataflow-up-25-sglang-capture-backend
