[DataFlow runtime] DFlash end-to-end on the composable launch (offline + online) #628
… W3′ naming
Review fixes (verified against the files):
- Status (confirmed): stop calling the in-review composable-launch stack (#627/#628/#629) "landed"/"DONE"/"done". Split the genuinely-merged spine from the in-review stack in §1; one consistent "in review" label in §1/Phase A/success table and across the roadmap (README, Phase A). Leave the spine's "landed" wording (it is merged).
- Module placement (confirmed): Evaluator/EvalCache are top-level domain managers (specforge/eval/), not specforge/runtime/eval/; fix the eval-and-breadth.md outlier to match plan.md §2.3 and domain-refactor.md.
- W3′ naming (confirmed): SGLangServerEngine is ONE engine with two feature transports (capture-into-FeatureStore for W3/O1.3, inline-HTTP for the light W3′); disambiguate in §2.2, the workload table and §G2 rather than overloading one name.
- O1.3 spike (reviewer's premise refuted: it is already an explicit 🔴 gate): added the valid narrow point instead. The spike scopes only the sglang_server slice of Phase B; the de-EAGLE3 extraction and domain Trainer carry no engine risk.
Additional contradictions found by a completeness sweep and fixed:
- StrategySpec registry: plan.md said it "stays in runtime/training unchanged" but §6 + Phase E move it; clarify the per-step strategy seam stays, the registry converges into training/strategies/.
- TargetEngine source: extracted from modeling/target/*TargetModel (adapters wrap it), not "absorbs runtime/inference adapters".
- Draft package: models/drafts is the target layout; note today's modeling/draft/ + real filenames.
- Dependency graph: align domain-refactor (E depended on {C,D}) with README (D→E, C parallel).
- Drop the up-19/up-20 branch tags that only appeared in the online doc.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Code review
No high-confidence issues found. Checked for bugs in the DFlash adapter, offline transform/collate path, strategy registration, and online rollout-to-train wiring.
…e + online)
DFlash now trains through the runtime via a StrategySpec + a DFlashAdapter, with
ZERO launch.py changes (the spec seam from the previous commit carries it).
- registry.py: dflash spec — offline reader (OfflineManifestReader with dflash
feature_keys, no aux/target swap), per-sample transform, padding collate; online
via DFlashAdapter; supports_online=True.
- inference/dflash_adapter.py (new): wraps generate_dflash_data, emits
{input_ids, hidden_states, loss_mask}; verify_capture self-skips the eagle3
aux/target checks (different feature names + __aux_layer_ids__=None).
- tests/_fixtures.py: write_offline_files_dflash + build_dflash (tiny Qwen3 target
-> DFlash draft + TargetEmbeddingsAndHead -> OnlineDFlashModel).
- tests/test_dflash_launch.py + test_dflash_online_launch.py (new, GPU): offline
and online dflash train end-to-end through FSDP.
- tests/test_strategy_registry.py: dflash-fully-wired assertions.
DFlash is online-only in production (no offline dumper exists yet;
prepare_hidden_states.py is eagle3-only), so the offline path is exercised with
synthetic fixtures while online is its real workflow.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
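The adapter contract described in this commit message can be illustrated with a small, dependency-free sketch. Only the emitted keys {input_ids, hidden_states, loss_mask}, the `__aux_layer_ids__` marker and the name `verify_capture` come from the commit message; the function signature, the eagle3 key names and the check logic are assumptions made for illustration, not the code in dflash_adapter.py.

```python
from typing import Any, Mapping

# Keys the commit message says the DFlash adapter emits.
DFLASH_KEYS = frozenset({"input_ids", "hidden_states", "loss_mask"})
# Hypothetical eagle3 feature names, used only to illustrate the skipped checks.
EAGLE3_KEYS = frozenset({"input_ids", "aux_hidden_states", "target", "loss_mask"})


def verify_capture(sample: Mapping[str, Any]) -> str:
    """Validate one captured sample and report which check set ran.

    Assumption: a sample whose ``__aux_layer_ids__`` entry is absent or None
    is a DFlash capture, so the eagle3 aux/target checks are skipped for it.
    """
    # Dunder-style keys are treated as metadata, not features.
    features = {k for k in sample if not k.startswith("__")}
    if sample.get("__aux_layer_ids__") is None:
        missing = DFLASH_KEYS - features
        if missing:
            raise ValueError(f"dflash capture missing keys: {sorted(missing)}")
        n = len(sample["input_ids"])
        if len(sample["hidden_states"]) != n or len(sample["loss_mask"]) != n:
            raise ValueError("dflash features disagree on sequence length")
        return "dflash"
    missing = EAGLE3_KEYS - features
    if missing:
        raise ValueError(f"eagle3 capture missing keys: {sorted(missing)}")
    return "eagle3"
```

The point of the sketch is the branch order: the DFlash sample never reaches the eagle3 checks, so the two strategies can share one verification entry point without sharing feature names.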
457b20a to 8faf111
6945bfe to a6b8b7d
Addressed review feedback (self-review pass).
Deferred: extracting the length-grouped batching shared with Validated: full |
What
DFlash trains end-to-end (offline + online) through the composable launch from the parent PR, via a StrategySpec entry + a DFlashAdapter, with ZERO launch.py changes.
Changes
- registry.py: dflash spec with an offline reader (OfflineManifestReader with dflash feature_keys, no aux/target swap), per-sample transform, padding collate; online via DFlashAdapter; supports_online=True.
- specforge/runtime/inference/dflash_adapter.py (new): wraps generate_dflash_data, emits {input_ids, hidden_states, loss_mask}. verify_capture self-skips the eagle3 aux/target checks (different feature names + __aux_layer_ids__=None).
- tests/_fixtures.py: write_offline_files_dflash + build_dflash (tiny Qwen3 target → DFlash draft + TargetEmbeddingsAndHead → OnlineDFlashModel).
- tests/test_dflash_launch.py + test_dflash_online_launch.py (new, GPU): offline and online dflash train end-to-end through FSDP.
Note
DFlash is online-only in production today (no offline dumper exists; prepare_hidden_states.py is eagle3-only), so the offline path is exercised with synthetic fixtures while online is its real workflow.
Testing
Part of the 197 tests OK suite run at the stack tip (sci-h200 / H200).
Stacked on the composable-launch PR. Part 2/3.
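To make the "ZERO launch.py changes" claim concrete, here is a minimal sketch of a spec registry with a padding collate. It assumes a simplified `StrategySpec` with only the fields shown; `REGISTRY`, `register`, `pad_collate` and `launch` are hypothetical names, and plain Python lists stand in for the tensors the real collate would pad.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Sample = Dict[str, list]
Batch = Dict[str, List[list]]


def pad_collate(samples: Sequence[Sample], pad_id: int = 0) -> Batch:
    """Right-pad every per-token feature to the longest sample in the batch."""
    max_len = max(len(s["input_ids"]) for s in samples)
    # Hidden size of the toy features; assumes the first sample is non-empty.
    width = len(samples[0]["hidden_states"][0])
    batch: Batch = {"input_ids": [], "hidden_states": [], "loss_mask": []}
    for s in samples:
        pad = max_len - len(s["input_ids"])
        batch["input_ids"].append(s["input_ids"] + [pad_id] * pad)
        batch["hidden_states"].append(
            s["hidden_states"] + [[0.0] * width for _ in range(pad)]
        )
        # Padded positions get mask 0 so they never contribute to the loss.
        batch["loss_mask"].append(s["loss_mask"] + [0] * pad)
    return batch


@dataclass(frozen=True)
class StrategySpec:
    """Simplified stand-in for the real spec (illustrative fields only)."""
    name: str
    feature_keys: tuple
    collate: Callable[[Sequence[Sample]], Batch]
    supports_online: bool = False


REGISTRY: Dict[str, StrategySpec] = {}


def register(spec: StrategySpec) -> None:
    if spec.name in REGISTRY:
        raise ValueError(f"strategy {spec.name!r} already registered")
    REGISTRY[spec.name] = spec


# Adding a strategy is one registry entry; the launch function is untouched.
register(StrategySpec(
    name="dflash",
    feature_keys=("input_ids", "hidden_states", "loss_mask"),
    collate=pad_collate,
    supports_online=True,
))


def launch(strategy: str, samples: Sequence[Sample], online: bool = False) -> Batch:
    """Strategy-agnostic entry point: it only ever talks to the spec."""
    spec = REGISTRY[strategy]
    if online and not spec.supports_online:
        raise ValueError(f"strategy {strategy!r} does not support online mode")
    picked = [{k: s[k] for k in spec.feature_keys} for s in samples]
    return spec.collate(picked)
```

Because `launch` reads everything strategy-specific from the spec, a new strategy only needs a new `register(...)` call, which is the property the PR description attributes to the spec seam.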
🤖 Generated with Claude Code