Plain pytorch LLaMA implementation (no fairscale, use as many GPUs as you want) #179
Comments
Great work!
You can run vanilla-llama on 1, 2, 4, 8 or 100 GPUs.
Run 7B model on 1 GPU (1070, 8GB)
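Fitting the 7B model on an 8 GB card implies CPU offloading (or quantization): in fp16 the weights alone need about 13 GiB. A quick back-of-the-envelope check:

```python
# Rough memory math for the 7B model (2 bytes/param in fp16, 1 in int8).
params = 7e9
print(f"fp16: {params * 2 / 2**30:.1f} GiB")  # ~13.0 GiB -> does not fit in 8 GB
print(f"int8: {params * 1 / 2**30:.1f} GiB")  # ~6.5 GiB  -> fits, barely
# So on a GTX 1070 some layers must live in CPU RAM (offloading) or be quantized.
```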
Yes, it doesn't load the weights in load_checkpoint_and_dispatch at all. device_map = infer_auto_device_map(model) gives {'': 'cpu'}.
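For readers hitting the same thing, here is a minimal, self-contained sketch of the accelerate loading flow under discussion (a tiny stand-in model instead of LLaMA, a placeholder checkpoint.pt, and one visible CUDA GPU assumed — this is not the actual vanilla-llama code). The point is where device_map comes from and why {'': 'cpu'} is a red flag:

```python
import torch
from torch import nn
from accelerate import init_empty_weights, infer_auto_device_map, load_checkpoint_and_dispatch

# Stand-in architecture so the sketch runs; the real code builds the LLaMA module here.
def build_model():
    return nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))

# Save a checkpoint once so load_checkpoint_and_dispatch has something to read.
torch.save(build_model().state_dict(), "checkpoint.pt")

# 1. Materialize the architecture without allocating real weights (meta device).
with init_empty_weights():
    model = build_model()

# 2. Ask accelerate where each module should live. With a working GPU setup this
#    maps modules to cuda:0, spilling the rest to "cpu"; a map of {'': 'cpu'}
#    like the one above means no GPU memory was offered or detected.
device_map = infer_auto_device_map(model, max_memory={0: "7GiB", "cpu": "30GiB"})
print(device_map)

# 3. Stream the checkpoint into the dispatched modules. If this step is skipped
#    or fails, the parameters stay on "meta" and any later copy raises
#    "Cannot copy out of meta tensor; no data!".
model = load_checkpoint_and_dispatch(model, "checkpoint.pt", device_map=device_map)
```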
Hi @veelion, your error is weird. Try converting the weights again; it looks like something went wrong in that step. I have tested the conversion script on a cluster, so I have never experienced memory problems. Try again and let me know if there are any out-of-memory errors. If that is the case, please open an issue on vanilla-llama and I will try to make the conversion script use less RAM. @yokie121 vanilla-llama uses all the available GPUs by default via accelerate.
Hi @galatolofederico, I'm experiencing the same error as @veelion. I was able to load the weights, merged by my own script (which differs a bit), and device_map looked fine, but I'm still getting NotImplementedError: Cannot copy out of meta tensor; no data! We cannot open an issue in your repo because issues are disabled there :)
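For what it's worth, that exception is the generic PyTorch meta-tensor failure rather than anything specific to the conversion: a tensor on the meta device has a shape but no storage, so any copy out of it fails until a checkpoint actually fills the parameters in. A two-line repro:

```python
import torch

t = torch.empty(3, 3, device="meta")  # shape only, no storage behind it
t.to("cpu")  # NotImplementedError: Cannot copy out of meta tensor; no data!
```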
Hi @galatolofederico, I have tried converting the weights many times, but I still get the same error as before. I followed @randaller's method to print device_map, and everything is on cpu ({'': 'cpu'}), just as above.
Sorry, I didn't notice it 🤦. I have enabled the issues now!
Maybe it would be a good idea to also release a LLaMA version without fairscale layers. It is possible to run the 65B version using just 2 A100-SXM-80GB GPUs, but this code forces you to use 8 GPUs no matter what.
Here is a vanilla PyTorch implementation of LLaMA (and a script to convert the weights): https://github.com/galatolofederico/vanilla-llama
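For context, a rough sketch of what such a weight-conversion step has to do: the 65B checkpoint ships as 8 fairscale model-parallel shards, and merging them means concatenating each slice back along the dimension it was split on. The per-layer rules below (column-parallel layers on dim 0, row-parallel layers and the embedding on dim 1, norms replicated) follow the common community merge scripts and should be double-checked against the model definition — this is not the repo's actual script:

```python
import torch

def merge_shards(shard_paths):
    # Load every model-parallel shard onto CPU.
    shards = [torch.load(p, map_location="cpu") for p in shard_paths]
    merged = {}
    for name in shards[0]:
        tensors = [s[name] for s in shards]
        if "norm" in name or name.endswith("rope.freqs"):
            merged[name] = tensors[0]  # replicated across shards: keep one copy
        elif any(k in name for k in ("wq", "wk", "wv", "w1", "w3", "output")):
            merged[name] = torch.cat(tensors, dim=0)  # column-parallel split
        else:
            merged[name] = torch.cat(tensors, dim=1)  # row-parallel / embedding split
    return merged

# e.g.: merged = merge_shards([f"65B/consolidated.{i:02d}.pth" for i in range(8)])
```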