Description
I am using Hugging Face Candle for SmolLM local inference. Candle is a fast-growing, optimized ML framework written in Rust that runs across different devices (CPU, CUDA, Metal). We could add it here as a supported local-inference option. I can raise a PR; SmolLM already works in Candle since its architecture is the same as Llama's, so the existing examples apply (a minimal loading sketch follows the links below):
https://github.com/huggingface/candle/tree/main/candle-examples/examples/quantized
https://github.com/huggingface/candle/tree/main/candle-examples/examples/llama
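For context, here is a minimal sketch of loading a quantized SmolLM GGUF through Candle's `quantized_llama` model, which is the same code path the linked quantized example takes. The model path is a hypothetical placeholder, and exact API signatures may differ slightly between Candle versions:

```rust
use candle_core::quantized::gguf_file;
use candle_core::Device;
use candle_transformers::models::quantized_llama::ModelWeights;

fn main() -> anyhow::Result<()> {
    // Hypothetical local path to a SmolLM checkpoint converted to GGUF.
    let model_path = "smollm-1.7b-instruct-q4_k_m.gguf";

    // Read the GGUF container, then build the model. Because SmolLM uses
    // the Llama architecture, quantized_llama can load its weights as-is.
    let mut file = std::fs::File::open(model_path)?;
    let content = gguf_file::Content::read(&mut file)?;
    let device = Device::Cpu; // or Device::new_cuda(0)? / Device::new_metal(0)?
    let _model = ModelWeights::from_gguf(content, &mut file, &device)?;

    println!("SmolLM weights loaded.");
    Ok(())
}
```

The full generation loop (tokenizer setup, sampling via `LogitsProcessor`) is shown end to end in the quantized example linked above.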