feat: dataclass args for accelerated MoE tuning #390
base: main
Conversation
Thanks for making a pull request! 😃
Review thread on tuning/config/acceleration_configs/acceleration_framework_config.py (outdated, resolved)
Tested using the new flag on a Granite 3 3B MoE model; inference testing is up next.

Regular MoE tuning
Tested this branch without --fast_moe.
Training logs:
Location:

Fast MoE
And with --fast_moe.
Training logs:
Location:

Results
We see a 2.48x speedup.
After running the checkpoint utils on the branch Fabian created for safetensors, vLLM inference ran as expected.
Post-processing completed with this script (thanks again Fabian!):
FastMoE model saved in:
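For reference, a minimal sketch of the kind of vLLM smoke test described above, assuming vLLM is installed; the checkpoint path and prompt below are placeholders, not values from this PR:

```python
# Minimal vLLM smoke test against the post-processed checkpoint.
# Assumptions: vLLM is installed and CHECKPOINT_DIR points at the
# safetensors checkpoint produced by the post-processing script.
from vllm import LLM, SamplingParams

CHECKPOINT_DIR = "/path/to/postprocessed/fast-moe-checkpoint"  # placeholder

llm = LLM(model=CHECKPOINT_DIR)
params = SamplingParams(temperature=0.0, max_tokens=64)

outputs = llm.generate(["Write a haiku about mixture-of-experts models."], params)
print(outputs[0].outputs[0].text)
```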
Description of the change
This PR adds one dataclass argument to enable accelerated MoE for sft_trainer.py, via the new fms-acceleration accelerated-moe plugin, and allows accelerated MoE full fine-tuning with the --fast_moe flag.
--fast_moe enables a technique to train Mixture of Experts (MoE) models in parallel instead of sequentially. With this flag, we expect a major speedup in training time and a decrease in memory usage on MoE models.
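As an illustration only, such a dataclass argument might look like the sketch below; the class and field names are assumptions in the style of the existing configs under tuning/config/acceleration_configs/, not the PR's actual code:

```python
# Illustrative sketch of a dataclass-style acceleration argument.
# Assumptions: class/field names (FastMoe, FastMoeConfig, ep_degree) are
# hypothetical stand-ins; the real definition lives in
# tuning/config/acceleration_configs/.
from dataclasses import dataclass
from typing import Optional


@dataclass
class FastMoeConfig:
    # Hypothetical knob: number of ways to parallelize the experts.
    ep_degree: int = 1


@dataclass
class FastMoe:
    # When set (e.g. via --fast_moe), the fms-acceleration accelerated-moe
    # plugin would be activated for MoE training.
    fast_moe: Optional[FastMoeConfig] = None
```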
Related issue number
How to verify the PR
This PR is a work in progress and requires more testing, as well as the official release of fms-acceleration-moe. To verify, run tuning on an MoE model with fast_moe enabled and compare against a run without fast_moe (see the parsing sketch below).
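A quick way to sanity-check that the new argument is wired into CLI parsing, sketched here with transformers.HfArgumentParser; the FastMoe dataclass and its field are the same hypothetical stand-ins as above, not the PR's actual definition:

```python
# Sanity check that --fast_moe is picked up by HF-style argument parsing.
# Assumptions: FastMoe and its integer field are illustrative stand-ins.
from dataclasses import dataclass

from transformers import HfArgumentParser


@dataclass
class FastMoe:
    # Hypothetical: treat the flag's value as an expert-parallel degree.
    fast_moe: int = 1


parser = HfArgumentParser((FastMoe,))
(fast_moe_args,) = parser.parse_args_into_dataclasses(args=["--fast_moe", "2"])
print(fast_moe_args)  # FastMoe(fast_moe=2)
```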
Was the PR tested