[DataFlow runtime 7/7] Integration: launcher + end-to-end equivalence gates#600
Merged
jiapingW merged 1 commit on Jun 25, 2026
ea463fc to d005a13
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
d005a13 to 7a81ce5
jiapingW approved these changes on Jun 25, 2026
This was referenced Jun 26, 2026
DataFlow runtime — part 7/7 (integration). Stacked on #599 — true-stacked: this PR's base is the previous PR's branch, so the diff below shows only this layer.
Turns the runtime into a thin launcher and adds the end-to-end equivalence gates.
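As a rough illustration of what an equivalence gate checks, the sketch below compares two stand-in loss functions bit for bit and skips a GPU-only test when CUDA is unavailable. Everything named here (`old_loss`, `new_loss`, `cuda_available`, `TestEquivSketch`) is hypothetical and is not the PR's code; the real gates compare the old and new training paths.

```python
import importlib.util
import struct
import unittest


def cuda_available() -> bool:
    """Hypothetical helper: True only if torch is installed and sees a GPU."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


def old_loss(pred, target):
    """Stand-in for the old path: mean squared error, index-based loop."""
    total = 0.0
    for i in range(len(pred)):
        total += (pred[i] - target[i]) ** 2
    return total / len(pred)


def new_loss(pred, target):
    """Stand-in for the new path: same operations in the same order."""
    total = 0.0
    for p, t in zip(pred, target):
        total += (p - t) ** 2
    return total / len(pred)


def bits(x: float) -> bytes:
    """Raw IEEE-754 bytes, so the comparison is bit-exact rather than approximate."""
    return struct.pack("<d", x)


class TestEquivSketch(unittest.TestCase):
    def test_per_batch_loss_is_bit_exact(self):
        batches = [
            ([0.1, 0.7, 0.2], [0.0, 1.0, 0.0]),
            ([0.3, 0.3, 0.4], [0.0, 0.0, 1.0]),
        ]
        for pred, target in batches:
            self.assertEqual(bits(old_loss(pred, target)),
                             bits(new_loss(pred, target)))

    @unittest.skipUnless(cuda_available(), "requires CUDA")
    def test_gpu_only_gate(self):
        # Placeholder for a check that only makes sense on a GPU machine.
        import torch
        self.assertTrue(torch.cuda.is_available())
```

Saved as a test module, this would run under `python -m unittest`. On a machine without CUDA the second test is reported as skipped rather than failed.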
What
- `specforge/runtime/launch.py`: `build_offline_eagle3_runtime` assembles `OfflineManifestReader → DataFlowController → LocalFeatureStore → FeatureDataLoader → Eagle3TrainStrategy → TrainerController/Core → FSDP`.
- `scripts/train_eagle3_dataflow.py`: thin offline launcher; reuses `train_eagle3`'s model/data builders (no training logic in the script).
- Tests (`@skipUnless(cuda)` for the GPU ones): `test_equiv_offline_eagle3` (old `run_forward` vs new `Eagle3TrainStrategy.forward_loss`, bit-exact per-batch loss), `test_equiv_online_eagle3`, `test_equiv_trainer_split`, `test_offline_launch_fsdp`, `test_checkpoint_resume`, `test_extraction_vs_hf_reference`, plus `_fixtures.py`.
- Fix `args.target_batch_size` in the dataflow launcher (it was read before being set → crash); harden `destroy_distributed()` against `None`/already-destroyed groups so a successful run does not exit non-zero on teardown.
- Docs: `runtime/README.md`, `runtime/ARCHITECTURE.md`.

How to run the full 7B old-vs-new offline comparison
Results — Qwen2.5-7B, 200 steps, HF backend, seed 0 (offline)
Old and new converge to the same point (loss ≈ 4.15, acc ≈ 0.7, acceptance ≈ 0.23, grad ≈ 5). Per-step values are not bit-identical because the two paths iterate samples in a different order and report loss slightly differently; `test_equiv_offline_eagle3` isolates the per-batch math as bit-exact.

Part of an 8-PR stack adding the DataFlow runtime (M1–M4 + integration). Verified on current `main`: imports + full `tests/test_runtime` pass.
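To show how per-step curves can differ while the per-batch math stays bit-exact, here is a minimal sketch in which two paths run the same loss function over the same batches in different orders. The dataset, `batch_loss`, and both orderings are made up for illustration. The sketch models only the ordering difference, not the difference in how loss is reported.

```python
import random


def batch_loss(batch):
    """Stand-in for the per-batch math shared by both paths (hypothetical)."""
    total = 0.0
    for x in batch:
        total += (x - 0.5) ** 2
    return total / len(batch)


rng = random.Random(0)
# Hypothetical dataset: batch id -> batch of floats.
dataset = {i: [rng.random() for _ in range(8)] for i in range(20)}

old_order = list(dataset)            # old path: ids 0, 1, 2, ...
new_order = list(dataset)
random.Random(1).shuffle(new_order)  # new path: same ids, different order

old_curve = [batch_loss(dataset[i]) for i in old_order]
new_curve = [batch_loss(dataset[i]) for i in new_order]

# Compared step by step, the logged curves need not line up.
steps_matching = sum(a == b for a, b in zip(old_curve, new_curve))

# Matched by batch id, every loss is identical to the last bit.
old_by_id = dict(zip(old_order, old_curve))
new_by_id = dict(zip(new_order, new_curve))
assert old_by_id == new_by_id

print(f"{steps_matching}/{len(old_curve)} steps match position-wise")
```

This is why a step-by-step diff of two training logs is a weak equivalence signal, and why a gate that feeds both paths the same batch is the stronger check.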