Replies: 1 comment
You should be able to use all available quantization strategies on a 4090.
As the title says: the docs list the A100, A10G, and T4 as supported. Can TGI run normally on a single RTX 4090? The Docker container starts successfully, but VRAM consumption is extremely high (a 7B model takes 23 GB of VRAM). I don't know whether bitsandbytes quantization can run on this hardware. If there is a better solution, please let me know. Thanks.
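For a rough sense of scale, here is a minimal back-of-the-envelope sketch of how much memory the weights alone would need at different precisions. It assumes exactly 7e9 parameters and counts only the raw weights; KV cache, activations, and any quantization overhead are ignored, so actual usage will be higher. The function name `weight_memory_gib` is made up for illustration and is not part of TGI.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GiB (assumption: no overhead)."""
    return num_params * bits_per_param / 8 / 2**30


# Assumed parameter count for a "7B" model; real checkpoints differ slightly.
params = 7e9

# Weight-only estimates per precision, rounded to one decimal place.
estimates = {
    name: round(weight_memory_gib(params, bits), 1)
    for name, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]
}

for name, gib in estimates.items():
    print(f"{name}: ~{gib} GiB for weights")
```

Under these assumptions the fp16 weights come to roughly 13 GiB, so a 23 GB reading probably includes memory beyond the weights, such as cache reserved by the server. That is a guess and is not confirmed in this thread.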