Closed
Description
System Info
Environment
- transformers==4.57.2
- python==3.10
Problem
In tokenization_utils_base.py (around line 2419), _config is loaded from tokenizer_config.json with:
_config = json.load(f)
transformers_version = _config.get("transformers_version")
So _config is a dict.
A few lines below, the logic in 4.57.2 does:
if transformers_version and version.parse(transformers_version) <= version.parse("4.57.2"):
    if _is_local and _config.model_type not in [
        "mistral",
        "mistral3",
        "voxstral",
        "ministral",
        "pixtral",
    ]:
        return tokenizer
Here _config.model_type assumes _config is an object with attributes, but it is actually a dict, which causes:
AttributeError: 'dict' object has no attribute 'model_type'
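The failure does not depend on transformers itself and can be shown with the standard library alone. Below is a minimal sketch, assuming a made-up tokenizer_config.json payload (the JSON string and its values are hypothetical, chosen only for illustration):

```python
import json

# Hypothetical stand-in for the contents of tokenizer_config.json;
# the values are illustrative, not taken from a real checkpoint.
raw = '{"transformers_version": "4.57.2", "model_type": "llama"}'

# A JSON object deserializes to a plain dict, not an object with attributes.
_config = json.loads(raw)
assert isinstance(_config, dict)

# Attribute access on a dict, as in the check quoted above, raises.
try:
    _config.model_type
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)

# Key lookup with .get() does not raise; it returns None if the key is absent.
model_type = _config.get("model_type")
missing = {}.get("model_type")
```

Because .get() returns None for a missing key, a config without "model_type" would simply fall through the "not in [...]" test rather than raise.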
Suggested fix
Replace .model_type with a dict-safe access, e.g.:
if transformers_version and version.parse(transformers_version) <= version.parse("4.57.2"):
    if _is_local and _config.get("model_type") not in [
        "mistral",
        "mistral3",
        "voxstral",
        "ministral",
        "pixtral",
    ]:
        return tokenizer
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Expected behavior