Closed
Description
System Info
- transformers version: 4.32.0.dev0
- Platform: Linux-5.10.179-171.711.amzn2.x86_64-x86_64-with-glibc2.26
- Python version: 3.9.16
- Huggingface_hub version: 0.15.1
- Safetensors version: 0.3.1
- Accelerate version: 0.22.0.dev0
- Accelerate config: not found
- PyTorch version (GPU?): 2.0.1+cu118 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Right now, any command that passes --deepspeed /path/to/json fails with the following error:
--deepspeed: invalid Dict value
This was reported in #24549 (comment), but the merged fix #24574 does not resolve it.
The deepspeed: Union[str, Dict] field in training_args.py still raises the error when deepspeed is passed as a string. This seems to be a limitation of Python dataclasses.
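The failure mode can be illustrated with plain argparse, which HfArgumentParser builds on. This is only a sketch: it assumes the Union[str, Dict] field ends up being registered with a dict-like converter, which is a guess at the cause and not confirmed from the transformers source. RaisingParser and parse_with are hypothetical helpers introduced for the illustration.

```python
import argparse


class RaisingParser(argparse.ArgumentParser):
    """Hypothetical helper: raise instead of exiting so the message can be inspected."""

    def error(self, message):
        raise ValueError(message)


def parse_with(type_func, argv):
    # Register --deepspeed with the given converter, mimicking what a
    # dataclass-to-argparse mapping might do (assumption, see lead-in).
    parser = RaisingParser()
    parser.add_argument("--deepspeed", type=type_func, default=None)
    return parser.parse_args(argv)


argv = ["--deepspeed", "/path/to/json"]

# With a dict converter, argparse calls dict("/path/to/json"), which raises,
# and argparse reports the value as invalid.
try:
    parse_with(dict, argv)
    failure = None
except ValueError as exc:
    failure = str(exc)

# With a str converter, the same command line parses and the path is kept as-is.
ok = parse_with(str, argv)

print(failure)
print(ok.deepspeed)
```

If the assumption holds, the string path is rejected before any DeepSpeed code runs, which would match the error shown above.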
Expected behavior
The --deepspeed flag should accept a string, such as the path to the JSON config file.