garbage output from h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b #281
This seems like an issue with the tokenizer. We are adding support for custom tokenizers (#111). In the meantime, you can try directly modifying the function below to use the correct tokenizer: vllm/vllm/engine/tokenizer_utils.py, line 13 (at commit 4026a04).
Thanks, I'll give that a try! BTW I just tried
It was an easy fix:
I added an extra "or" to test for the dash variant, and all is well!
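(The modified code isn't reproduced in this copy of the thread. A rough sketch of the change described, assuming the function selected the slow tokenizer based on an "open_llama" substring check on the model name; the function name and original condition here are assumptions, not from the source:)

```python
def _uses_open_llama_tokenizer(model_name: str) -> bool:
    # Assumed original check: only the underscore variant ("open_llama").
    # Added "or" for the dash variant used by fine-tunes such as
    # "h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b".
    return "open_llama" in model_name or "open-llama" in model_name
```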
In hindsight, this is better:
Ah, never mind! I just looked at the changes to allow a custom tokenizer, and this whole test goes away. :)
@tjedwards We've added a new tokenizer_mode argument. Try out the following: llm = LLM(model="openlm-research/open_llama_13b", tokenizer_mode="slow")
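(For context, a complete run with that flag might look like the following sketch, based on vLLM's documented LLM / SamplingParams API; the prompt is illustrative:)

```python
from vllm import LLM, SamplingParams

# Use the slow (SentencePiece-based) tokenizer, which avoids the corrupted
# output seen with the fast tokenizer for OpenLLaMA-derived models.
llm = LLM(model="openlm-research/open_llama_13b", tokenizer_mode="slow")

sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
outputs = llm.generate(["The capital of France is"], sampling_params)
print(outputs[0].outputs[0].text)
```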
Using the simple Python script on the "supported-models" page, I was able to successfully generate output from TheBloke/Wizard-Vicuna-13B-Uncensored-HF, but h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b generates garbage. I'm running CUDA 11.7.1 on RHEL 8.4 with an NVIDIA A100-SXM-80GB.
Here's the script:
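(The script itself isn't reproduced in this copy of the thread. Based on the description, it was presumably close to the basic example in vLLM's supported-models/quickstart docs, roughly:)

```python
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The capital of France is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Swapping the model name to "TheBloke/Wizard-Vicuna-13B-Uncensored-HF"
# produces sensible text; the h2oai model below produces garbage.
llm = LLM(model="h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b")

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")
```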
Here's output from TheBloke:
And here's output from h2oai:
Here's the full output: