mistral-chat CUDA out of memory error #14
Not a Mistral official product, and shameless promotion, but if you're having OOM issues, have a go with Unsloth :) You get 70%+ memory reduction, 2x faster training, and no accuracy degradation! https://github.com/unslothai/unsloth Mistral v0.3 7B via a free Colab: https://colab.research.google.com/drive/1_yNCks4BTD5zOnjozppphh5GzMFaMKq_?usp=sharing
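
For reference, a minimal sketch of the Unsloth path mentioned above, following its README; the model id and hyperparameters here are illustrative assumptions, and argument names may differ across Unsloth versions:

```python
# Hedged sketch of Unsloth fine-tuning setup (not this repo's API).
from unsloth import FastLanguageModel

# 4-bit loading is where most of the memory saving comes from.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-v0.3-bnb-4bit",  # assumed model id
    max_seq_length=8192,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank and target modules are illustrative values.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)
```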
Reducing the sequence length to under 8000 helped me.
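
For context, activation memory during training grows with sequence length, so capping it is often the cheapest OOM fix. A hedged illustration of the idea using the Hugging Face tokenizer (not this repo's training config; the model id is an assumption, and gated repos need an HF token):

```python
# Illustrative only: cap sequence length at tokenization time so activation
# memory (which scales with seq_len) stays bounded during training.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
batch = tokenizer(
    ["a long training document ..."],
    truncation=True,
    max_length=8000,   # the cap reported to avoid the OOM here
    return_tensors="pt",
)
```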
Hi, thanks. Of course. But the main problem is that the 80 MB LoRA model becomes 10 GB and gets me to a memory error. The base model is 13.5 GB on disk and on GPU. How come the LoRA model becomes so big?
LoRA is not the cause of your error.
Hello @mosh98, you mean reducing the sequence length while training? For sure it helps during training. Any other updates on this issue? I get the same CUDA error while trying to run inference with the fine-tuned model. I think the problem arises from the load_lora method, because just like @yuvalshachaf said, the LoRA model becomes 10 GB when loading!
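
For anyone reproducing this, a hedged sketch of the inference path being discussed, based on the mistral-inference / mistral-finetune tutorials; the paths are placeholders, and the dtype argument is an assumption that may not exist in your installed version:

```python
# Hedged sketch: load the base model and then the LoRA adapter, the step
# where the OOM above is reported.
import torch
from mistral_inference.transformer import Transformer

model = Transformer.from_folder(
    "path/to/base_model",   # ~13.5 GB of half-precision weights
    dtype=torch.float16,    # assumption: keep fp16 so load_lora does not upcast
)
model.load_lora("path/to/lora/lora.safetensors")  # reported to balloon to ~10 GB
```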
Hello, I have the same problem. Inference with Nemo-Instruct is OK on an A100, but with the fine-tuned model, using the inference method described in the tutorial with model.load_lora, I get an OOM on the same A100. How did you solve it, if you indeed solved it? Maybe merging the adapter with PEFT before inference is a solution?
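
A hedged sketch of that PEFT merge idea: merge the adapter into the base weights offline, then serve the merged checkpoint with no adapter overhead at load time. The model id and paths are placeholders, and this assumes the adapter was saved in PEFT format (mistral-finetune adapters may need conversion first):

```python
# Merge a LoRA adapter into the base model with PEFT before inference.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Nemo-Instruct-2407",  # assumed model id
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
merged = PeftModel.from_pretrained(base, "path/to/adapter").merge_and_unload()
merged.save_pretrained("path/to/merged_model")  # reload this for plain inference
```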
Same issue here, did anyone find a way?
I am getting a memory error using the model I trained from the tutorial, any idea? I'm using an A10G GPU.