Closed
Description
I wonder whether vLLM supports Chinese or other languages. I can successfully run inference with an English prompt, but when I use a Chinese prompt, the following exception is raised:
```
INFO 06-27 11:11:16 tokenizer_utils.py:30] Using the LLaMA fast tokenizer in 'hf-internal-testing/llama-tokenizer' to avoid potential protobuf errors.
INFO 06-27 11:13:57 llm_engine.py:128] # GPU blocks: 247, # CPU blocks: 327
Processed prompts: 0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/lustre/sunyuhan/./scripts/testvllm.py", line 38, in <module>
    outputs = llm.generate(prompts, sampling_params)
  File "/mnt/cache/sunyuhan/miniconda3/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 114, in generate
    return self._run_engine(use_tqdm)
  File "/mnt/cache/sunyuhan/miniconda3/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 134, in _run_engine
    step_outputs = self.llm_engine.step()
  File "/mnt/cache/sunyuhan/miniconda3/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 242, in step
    self._decode_sequences(seq_groups)
  File "/mnt/cache/sunyuhan/miniconda3/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 259, in _decode_sequences
    new_token, new_output_text = detokenize_incrementally(
  File "/mnt/cache/sunyuhan/miniconda3/lib/python3.10/site-packages/vllm/engine/tokenizer_utils.py", line 68, in detokenize_incrementally
    output_text = tokenizer.convert_tokens_to_string(output_tokens)
  File "/mnt/cache/sunyuhan/miniconda3/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 536, in convert_tokens_to_string
    return self.backend_tokenizer.decoder.decode(tokens)
TypeError: argument 'tokens': 'NoneType' object cannot be converted to 'PyString'
```
I would also like to know how to solve this problem.
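For anyone debugging this: the traceback shows `convert_tokens_to_string` receiving a `None` entry in its token list. Below is a minimal diagnostic sketch that reproduces the failing call chain outside of vLLM, using only standard `transformers` calls and the same tokenizer that vLLM reports loading; the Chinese prompt is an arbitrary example.

```python
# Diagnostic sketch: check whether the tokenizer round-trips Chinese text.
# Assumes the 'hf-internal-testing/llama-tokenizer' tokenizer from the log;
# the prompt string is just an example.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "hf-internal-testing/llama-tokenizer", use_fast=True
)

text = "你好，世界"  # "Hello, world"
ids = tok.encode(text)
tokens = tok.convert_ids_to_tokens(ids)
print(tokens)  # any None entry here reproduces the TypeError above

# convert_tokens_to_string crashes on None, so filter before decoding.
# Note that fast tokenizers return None for ids outside their vocabulary,
# e.g. when a model with an extended Chinese vocabulary is paired with the
# stock LLaMA tokenizer.
safe_tokens = [t for t in tokens if t is not None]
print(tok.convert_tokens_to_string(safe_tokens))
```

If the first `print` shows `None` entries for ids the model produces, the tokenizer's vocabulary likely does not match the model's, which would explain why only non-English prompts trigger the crash.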