
garbage output from h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b #281

Closed
@tjedwards

Description

Using the simple Python script from the "supported-models" page, I was able to successfully generate output from TheBloke/Wizard-Vicuna-13B-Uncensored-HF, but h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b generates garbage.

I'm running CUDA 11.7.1 on RHEL 8.4 with an NVIDIA A100-SXM-80GB.

Here's the script:

import sys
from vllm import LLM
llm = LLM(model=sys.argv[1])
output = llm.generate("Hello, my name is")
print(output)
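One possible workaround to try, assuming the garbage comes from vLLM silently swapping in the 'hf-internal-testing/llama-tokenizer' (visible in the log output below): vLLM's LLM constructor also accepts a tokenizer argument, so you can pin it back to the model's own tokenizer files. A sketch, untested on this model (the build_llm_kwargs helper is mine, not part of vLLM):

```python
import sys

def build_llm_kwargs(model_path):
    # Hypothetical helper: pass the model path as the tokenizer path too,
    # overriding vLLM's automatic switch to
    # 'hf-internal-testing/llama-tokenizer'.
    return {"model": model_path, "tokenizer": model_path}

try:
    from vllm import LLM  # needs a CUDA-capable vLLM install
except ImportError:
    LLM = None

if LLM is not None and len(sys.argv) > 1:
    llm = LLM(**build_llm_kwargs(sys.argv[1]))
    print(llm.generate("Hello, my name is"))
```

If open-llama really does use a different vocabulary than the standard LLaMA tokenizer, forcing the model's own tokenizer here would be the first thing to check.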

Here's output from TheBloke:

prompt='Hello, my name is'
text="Bastian Mehl and I'm going to talk about how we can solve"

And here's output from h2oai:

prompt='Hello, my name is'
text='те Business up t");ymbol甲 _ itsardervesag t beskrevs t “'

Here's the full output:

Loading h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b
INFO 06-27 10:46:22 llm_engine.py:59] Initializing an LLM engine with config: model='/tmp/h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-13b', dtype=torch.float16, use_dummy_weights=False, download_dir=None, use_np_weights=False, tensor_parallel_size=1, seed=0)
INFO 06-27 10:46:22 tokenizer_utils.py:30] Using the LLaMA fast tokenizer in 'hf-internal-testing/llama-tokenizer' to avoid potential protobuf errors.
INFO 06-27 10:51:57 llm_engine.py:128] # GPU blocks: 3808, # CPU blocks: 327
Processed prompts: 100%|██████████████████████████████| 1/1 [00:00<00:00,  3.17it/s]
[RequestOutput(request_id=0, prompt='Hello, my name is', prompt_token_ids=[1, 15043, 29892, 590, 1024, 338], outputs=[CompletionOutput(index=0, text='те Business up t");ymbol甲 _ itsardervesag t beskrevs t “', token_ids=[730, 15197, 701, 260, 1496, 2789, 31843, 903, 967, 538, 20098, 351, 260, 7718, 260, 1346], cumulative_logprob=-90.18120217323303, logprobs={}, finish_reason=length)], finished=True)]
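For what it's worth, the symptom above looks exactly like token ids being decoded against the wrong vocabulary: the ids themselves are plausible, but every one maps to an unrelated token. A toy sketch of that failure mode (not vLLM code; both vocabularies are invented for illustration):

```python
# Two made-up vocabularies: the model "speaks" vocab_a, but its ids are
# decoded against vocab_b, so every id resolves to an unrelated token.
vocab_a = {"Hello": 0, ",": 1, "my": 2, "name": 3, "is": 4, "Alice": 5}
vocab_b = ["те", "Business", "up", "甲", "_", "beskrevs"]

# Encode a sensible sentence with vocab_a ...
ids = [vocab_a[tok] for tok in ["Hello", ",", "my", "name", "is", "Alice"]]

# ... then decode the same ids with vocab_b: structurally valid output,
# semantically garbage — much like the mixed-script text in the report.
decoded = " ".join(vocab_b[i] for i in ids)
print(decoded)  # те Business up 甲 _ beskrevs
```

If that's what is happening here, the fix would be on the tokenizer-selection side rather than in the model weights.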
