Closed
Description
I have trained a Falcon-7B model with QLoRA, but the inference time is too high. So I want to use vLLM to speed up inference. To load the model I used this snippet:
llm = LLM(model="/content/trained-model/")
But I am getting an error:
OSError: /content/trained-model/ does not appear to have a file named config.json. Checkout
'https://huggingface.co//content/trained-model//None' for available files.
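For context, the directory that QLoRA training produces typically contains only the adapter weights (adapter_config.json, adapter_model files), not the base model's config.json, which is why vLLM cannot load it directly. A minimal sketch of one likely workaround, merging the adapter into the base model first, is below; the base model name tiiuae/falcon-7b and all paths are assumptions based on my setup:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model the adapter was trained on (assumed: tiiuae/falcon-7b).
base = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b", trust_remote_code=True)

# Attach the QLoRA adapter and fold its weights into the base model.
model = PeftModel.from_pretrained(base, "/content/trained-model/")
merged = model.merge_and_unload()

# Save a full standalone checkpoint; this writes config.json as well.
merged.save_pretrained("/content/merged-model/")
AutoTokenizer.from_pretrained("tiiuae/falcon-7b").save_pretrained("/content/merged-model/")

# vLLM can then load the merged checkpoint:
# from vllm import LLM
# llm = LLM(model="/content/merged-model/")
```

Does this kind of merge step look right, or does vLLM support loading the adapter directory some other way?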