Closed
Using the simple Python script on the "supported-models" page, I was able to successfully generate output from TheBloke/Wizard-Vicuna-13B-Uncensored-HF, but h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b generates garbage.
I'm running CUDA 11.7.1 on RHEL 8.4 with an NVIDIA A100-SXM-80GB.
Here's the script:
import sys
from vllm import LLM
llm = LLM(model=sys.argv[1])
output = llm.generate("Hello, my name is")
print(output)
Here's output from TheBloke:
prompt='Hello, my name is'
text="Bastian Mehl and I'm going to talk about how we can solve"
And here's output from h2oai:
prompt='Hello, my name is'
text='\u0442\u0435 Business up t");ymbol\u7532 _ itsardervesag t beskrevs t \u201c
Here's the full output:
Loading h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b
INFO 06-27 10:46:22 llm_engine.py:59] Initializing an LLM engine with config: model='/tmp/h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b', dtype=torch.float16, use_dummy_weights=False, download_dir=None, use_np_weights=False, tensor_parallel_size=1, seed=0)
INFO 06-27 10:46:22 tokenizer_utils.py:30] Using the LLaMA fast tokenizer in 'hf-internal-testing/llama-tokenizer' to avoid potential protobuf errors.
INFO 06-27 10:51:57 llm_engine.py:128] # GPU blocks: 3808, # CPU blocks: 327
Processed prompts: 100%|██████████| 1/1 [00:00<00:00, 3.17it/s]
[RequestOutput(request_id=0, prompt='Hello, my name is', prompt_token_ids=[1, 15043, 29892, 590, 1024, 338], outputs=[CompletionOutput(index=0, text='\u0442\u0435 Business up t");ymbol\u7532 _ itsardervesag t beskrevs t \u201c', token_ids=[730, 15197, 701, 260, 1496, 2789, 31843, 903, 967, 538, 20098, 351, 260, 7718, 260, 1346], cumulative_logprob=-90.18120217323303, logprobs={}, finish_reason=length)], finished=True)]
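For anyone trying to reproduce this, here is a minimal sketch of how the short prompt/text pairs can be pulled out of a result list like the one above. It assumes the attribute names (`prompt`, `outputs`, `text`, `token_ids`) match what the repr shows. `summarize` and the stand-in `sample` object are hypothetical helpers for illustration, not part of vLLM.

```python
from types import SimpleNamespace


def summarize(request_outputs):
    """Return (prompt, text, token_ids) for the first completion of each request."""
    rows = []
    for req in request_outputs:
        first = req.outputs[0]
        rows.append((req.prompt, first.text, list(first.token_ids)))
    return rows


# Stand-in mirroring the fields visible in the RequestOutput repr above
# (an assumption about the object layout). With vLLM installed, pass the
# return value of llm.generate(...) instead of this sample.
sample = [
    SimpleNamespace(
        prompt="Hello, my name is",
        outputs=[
            SimpleNamespace(
                text='\u0442\u0435 Business up t");ymbol\u7532 _ itsardervesag t beskrevs t \u201c',
                token_ids=[730, 15197, 701, 260, 1496, 2789, 31843, 903,
                           967, 538, 20098, 351, 260, 7718, 260, 1346],
            )
        ],
    )
]

for prompt, text, token_ids in summarize(sample):
    print(f"prompt={prompt!r}")
    print(f"text={text!r}")
    print(f"num_tokens={len(token_ids)}")
```

Keeping the token IDs next to the text makes it easier to decode the same IDs with a different tokenizer later, should the tokenizer turn out to be relevant.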