
AutoTokenizer.from_pretrained crashed: AttributeError: 'dict' object has no attribute 'model_type' (fixed) #42395

@skadiUnderTides

Description


System Info

Environment

  • transformers==4.57.2
  • python==3.10

Problem

In tokenization_utils_base.py (around line 2419), _config is loaded from tokenizer_config.json with:

_config = json.load(f)
transformers_version = _config.get("transformers_version")

So _config is a dict.
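
As a quick standalone illustration (the values below are made up, only the access pattern matters): attribute access on a plain dict raises exactly this error, while .get() does not.

import json

_config = json.loads('{"model_type": "llama", "transformers_version": "4.57.2"}')

print(_config.get("model_type"))  # prints llama (dict-safe lookup)
print(_config.model_type)         # AttributeError: 'dict' object has no attribute 'model_type'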

A few lines below, the logic in 4.57.2 does:

if transformers_version and version.parse(transformers_version) <= version.parse("4.57.2"):
    if _is_local and _config.model_type not in [
        "mistral",
        "mistral3",
        "voxstral",
        "ministral",
        "pixtral",
    ]:
        return tokenizer

Here _config.model_type treats _config as an object with attributes, but it is a plain dict, so evaluating the condition raises:

AttributeError: 'dict' object has no attribute 'model_type'

Suggested fix

Replace the attribute access with a dict-safe lookup, e.g.:

if transformers_version and version.parse(transformers_version) <= version.parse("4.57.2"):
    if _is_local and _config.get("model_type") not in [
        "mistral",
        "mistral3",
        "voxstral",
        "ministral",
        "pixtral",
    ]:
        return tokenizer
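
Note that _config.get("model_type") returns None when the key is missing instead of raising, so tokenizer configs without a model_type entry also take the non-Mistral early-return path.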

Who can help?

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction
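
A minimal sketch of how to hit the crash, assuming a locally saved checkpoint; the directory name is a placeholder:

from transformers import AutoTokenizer

# Hypothetical local directory: any checkpoint saved with save_pretrained whose
# tokenizer_config.json records "transformers_version": "4.57.2" (or older)
# reaches the affected branch in tokenization_utils_base.py.
tokenizer = AutoTokenizer.from_pretrained("./my-local-checkpoint")
# AttributeError: 'dict' object has no attribute 'model_type'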

Expected behavior
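
AutoTokenizer.from_pretrained should return the tokenizer for such local checkpoints instead of raising AttributeError.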
