Support pre-training 8*V100 (32G) gpus with xformers #411
Conversation
Thank you for your contribution! It looks nice to me, and thank you for providing the training logs as well. One minor: can you remove
OK, I'll modify it as soon as possible and submit it.
I've completed the modification and submitted it.
No response yet. Is there anything else that needs to be modified?
Thanks for your effort. I'd like to ask about the GPU requirements: after applying xformers as FastChat does, could LLaVA 13B be fine-tuned on 8 V100s?
I have tested it and it runs out of GPU memory (OOM). I'll test again tomorrow and see whether adjusting the parameters helps.
Test results on 8*V100 (32G) GPUs with xformers: pre-training works for 7B, but fine-tuning 7B does not (OOM).
Thanks! I think I will just use LoRA. Thanks for your effort and patience!
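For readers following this thread, here is a generic sketch of what LoRA fine-tuning looks like with the PEFT library; it is not LLaVA's own training script, and the base model name and hyperparameters below are illustrative assumptions. Because the base weights stay frozen, only the small adapter matrices carry gradients and optimizer state, which is why LoRA can fit where full-parameter fine-tuning OOMs.

```python
# A generic LoRA sketch using the PEFT library; the base model name and
# hyperparameters here are illustrative assumptions, not LLaVA's own recipe.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "lmsys/vicuna-7b-v1.5",              # placeholder language backbone
    torch_dtype=torch.float16,
)

lora_config = LoraConfig(
    r=16,                                # rank of the low-rank update
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Base weights stay frozen; only the adapter matrices receive gradients and
# optimizer state, which is what shrinks the memory footprint.
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()       # typically well under 1% of all params
```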
Sorry for the late response, and thank you for the modification. I will review and merge this week.
Is there any way to pre-train on 4*V100 (32G) GPUs?
Hi, is there a way to fine-tune 7B on 8*V100s now, by any chance? Thanks!
Could you please confirm which version of xformers you used?
So there's no way to do full-parameter fine-tuning on V100s, right?
FlashAttention does not support V100 GPUs, so LLaVA cannot be trained on V100s, yet most of the cards people have access to are still V100s. This PR brings the FastChat solution to LLaVA, replacing FlashAttention with xFormers. As in FastChat (https://github.com/lm-sys/FastChat), you can now train LLaVA on V100s.
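For reference, here is a minimal, self-contained sketch of the idea (not the exact code merged in this PR): xformers.ops.memory_efficient_attention computes the same causal attention as the naive implementation, but blockwise, without ever materializing the full attention matrix. In the actual patch this call is substituted into the LLaMA attention forward pass, FastChat-style; the standalone comparison below only illustrates the replacement.

```python
# A minimal sketch, assuming xformers is installed and a CUDA GPU is available;
# this illustrates the replacement, it is not the code merged in this PR.
import torch
import xformers.ops as xops


def naive_causal_attention(q, k, v):
    # Reference implementation: materializes the full (seq_len x seq_len)
    # attention matrix, which is what blows up memory at long sequence lengths.
    b, s, h, d = q.shape
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))            # (b, h, s, d)
    scores = q @ k.transpose(-1, -2) / d ** 0.5                 # (b, h, s, s)
    causal = torch.triu(torch.ones(s, s, dtype=torch.bool, device=q.device), 1)
    scores = scores.masked_fill(causal, float("-inf"))
    out = torch.softmax(scores, dim=-1) @ v                     # (b, h, s, d)
    return out.transpose(1, 2)                                  # (b, s, h, d)


def xformers_causal_attention(q, k, v):
    # Same math, computed blockwise without storing the full attention matrix.
    return xops.memory_efficient_attention(q, k, v, attn_bias=xops.LowerTriangularMask())


if __name__ == "__main__":
    # Tensors are laid out as (batch, seq_len, num_heads, head_dim),
    # the layout memory_efficient_attention expects.
    q = torch.randn(1, 256, 8, 64, device="cuda", dtype=torch.float16)
    k, v = torch.randn_like(q), torch.randn_like(q)
    diff = (naive_causal_attention(q, k, v) - xformers_causal_attention(q, k, v)).abs().max()
    print(f"max abs difference: {diff.item():.4f}")  # small fp16 rounding error
```

Because the quadratic attention matrix is never stored, attention activation memory grows roughly linearly with sequence length, which is what lets pre-training fit in 32G of V100 memory.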