
Huggingface error for models with arbitrary model name #103

Closed

Description

@strangiato

When executing guidellm against a vLLM instance that has an arbitrary served model name set, guidellm errors out with a Hugging Face error saying it cannot access the tokenizer_config.json for that model.

Reproducing the issue

Deploy a vLLM instance with any model and set the following argument:

--served-model-name=my-model
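
For example, a full vLLM launch command might look like the following (a sketch; the underlying Hugging Face model here is only an illustration):

vllm serve mistralai/Mistral-7B-Instruct-v0.2 \
  --served-model-name my-model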

Run a guidellm test against the endpoint:

guidellm \
  --target "http://localhost:8000/v1" \
  --model "my-model" \
  --data-type emulated \
  --data "prompt_tokens=512,generated_tokens=128"

Results

guidellm errors out with a 401 on the tokenizer_config.json for my-model, since my-model is not a valid Hugging Face model name.

requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/my-model/resolve/main/tokenizer_config.json
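
The failure can be reproduced outside of guidellm, since the tokenizer lookup ultimately goes through transformers' AutoTokenizer (a minimal sketch, assuming only the transformers package is installed):

from transformers import AutoTokenizer

# AutoTokenizer resolves any name that is not a local folder against the
# Hugging Face Hub, so an arbitrary served model name triggers a Hub lookup
# for https://huggingface.co/my-model/resolve/main/tokenizer_config.json
# and raises (RepositoryNotFoundError, wrapped into OSError) -- the same call
# guidellm makes in guidellm/backend/base.py via
# AutoTokenizer.from_pretrained(self.model).
AutoTokenizer.from_pretrained("my-model")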

Stack Trace

The following is an example of a full stack trace of the error (captured from a deployment where the served model name was set to granite rather than my-model):

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/utils/_http.py", line 409, in hf_raise_for_status
    response.raise_for_status()
  File "/opt/app-root/lib64/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/granite/resolve/main/tokenizer_config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/utils/hub.py", line 424, in cached_files
    hf_hub_download(
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 961, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 1068, in _hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 1596, in _raise_on_head_call_error
    raise head_call_error
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 1484, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 1401, in get_hf_file_metadata
    r = _request_wrapper(
        ^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 285, in _request_wrapper
    response = _request_wrapper(
               ^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/file_download.py", line 309, in _request_wrapper
    hf_raise_for_status(response)
  File "/opt/app-root/lib64/python3.11/site-packages/huggingface_hub/utils/_http.py", line 459, in hf_raise_for_status
    raise _format(RepositoryNotFoundError, message, response) from e
huggingface_hub.errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-67eb10fa-305b679009abf7055fe388ff;30339271-9f91-49ff-8324-c347a6b5da16)

Repository Not Found for url: https://huggingface.co/granite/resolve/main/tokenizer_config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
Invalid username or password.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/guidellm/main.py", line 239, in generate_benchmark_report
    tokenizer_inst = backend_inst.model_tokenizer()
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/guidellm/backend/base.py", line 173, in model_tokenizer
    return AutoTokenizer.from_pretrained(self.model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 910, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 742, in get_tokenizer_config
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/utils/hub.py", line 266, in cached_file
    file = cached_files(path_or_repo_id=path_or_repo_id, filenames=[filename], **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/transformers/utils/hub.py", line 456, in cached_files
    raise EnvironmentError(
OSError: granite is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/bin/guidellm", line 8, in <module>
    sys.exit(generate_benchmark_report_cli())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/guidellm/main.py", line 171, in generate_benchmark_report_cli
    generate_benchmark_report(
  File "/opt/app-root/lib64/python3.11/site-packages/guidellm/main.py", line 241, in generate_benchmark_report
    raise ValueError(
ValueError: Could not load model's tokenizer, --tokenizer must be provided for request generation
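
As the final ValueError indicates, the current workaround is to pass the tokenizer explicitly with --tokenizer, pointing it at the real Hugging Face name behind the served model (the repo id below is a placeholder):

guidellm \
  --target "http://localhost:8000/v1" \
  --model "my-model" \
  --tokenizer "<actual-huggingface-org/model-name>" \
  --data-type emulated \
  --data "prompt_tokens=512,generated_tokens=128"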

Why is this important

OpenShift AI sets the --served-model-name argument to the name of the ServingRuntime the user provides when deploying a vLLM instance; it does not use the actual Hugging Face model name. As a result, no model deployed with OpenShift AI can be load tested with guidellm unless the user knows to customize the --served-model-name argument and set it to the correct Hugging Face name.
