Llava-llama is huge. OOM. #114
Comments
If you can install bitsandbytes, you can use the NF4 4-bit quantization option, making it about 4 times smaller.
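The "4 times smaller" estimate can be sanity-checked with some back-of-the-envelope arithmetic. A minimal sketch, assuming an 8B-parameter text encoder (the parameter count is an assumption, not stated in the thread):

```python
PARAMS = 8e9  # assumed parameter count for the llava-llama text encoder

def model_gib(bytes_per_param: float) -> float:
    """Approximate weight size in GiB, ignoring quantization overhead
    (scales, block metadata) and activations."""
    return PARAMS * bytes_per_param / 1024**3

fp16 = model_gib(2.0)  # fp16/bf16: 2 bytes per weight
nf4 = model_gib(0.5)   # 4-bit (NF4): 0.5 bytes per weight

print(f"fp16: {fp16:.1f} GiB, nf4: {nf4:.1f} GiB, ratio: {fp16 / nf4:.0f}x")
```

In practice NF4 stores per-block scaling factors, so the real on-disk and in-VRAM sizes are slightly larger than this idealized 4x reduction.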
Installed bitsandbytes with: bitsandbytes-0.45.0-py3-none-win_amd64.whl
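For reference, this is roughly how NF4 loading looks with the Hugging Face transformers + bitsandbytes stack. A sketch only: the model id is a hypothetical stand-in (the thread never names the exact checkpoint), and ComfyUI wraps this differently internally.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization config for bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute still runs in bf16
)

# "some-org/llava-llama-text-encoder" is a placeholder model id
model = AutoModelForCausalLM.from_pretrained(
    "some-org/llava-llama-text-encoder",
    quantization_config=bnb_config,
    device_map="auto",  # let accelerate place layers on the GPU
)
```

The weights are quantized on the fly at load time, so disk files stay full precision while VRAM usage drops to roughly a quarter.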
Any advice?
correct fix:
Now all that's left is to wait for HunyuanVideo support in https://github.com/KONAKONA666/q8_kernels for speed, and it would be great.
I saw there's an LTX Q8 version; do you have any idea if it's already usable in ComfyUI? It's so confusing.
I managed to install it (https://github.com/KONAKONA666/q8_kernels) on Windows together with https://github.com/KONAKONA666/LTX-Video in a separate venv (not ComfyUI). The speed really increased, by more than 2x. But the lack of STG support and the need for constant, slow loading/unloading of the necessary models into memory negate the speed advantage.
Thanks! Yeah, back to the main topic.
Is there a chance to use something smaller?