Description
Describe the bug
The training script raises the following error when the --validation_prompt argument is passed:
Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
Reproduction
"accelerate launch --num_processes 1 --num_machines 1 "
"/diffusers/examples/dreambooth/train_dreambooth_lora_hidream.py "
f"--pretrained_model_name_or_path={model_path} "
f"--instance_data_dir={train_images_dir} "
f"--output_dir={output_dir} "
f"--instance_prompt='{instance_prompt}' "
f"--resolution={resolution} "
f"--train_batch_size={train_batch_size} "
"--gradient_accumulation_steps=2 "
f"--learning_rate={learning_rate} "
f"--max_train_steps={max_train_steps} "
f"--rank={rank} "
"--lr_scheduler=constant_with_warmup "
"--lr_warmup_steps=100 "
"--gradient_checkpointing "
"--cache_latents "
"--validation_epochs=100 "
"--use_8bit_adam "
"--mixed_precision=bf16 "
"--seed=0 "
f"--validation_prompt="a photo of a {instance_prompt}" "
"--allow_tf32 "
Logs
System Info
A100-80GB
python - 3.12.6
diffusers - 0.34.0.dev0
nvidia-cublas-cu12 | 12.6.4.1
nvidia-cuda-cupti-cu12 | 12.6.80
nvidia-cuda-nvrtc-cu12 | 12.6.77
nvidia-cuda-runtime-cu12 | 12.6.77
nvidia-cudnn-cu12 | 9.5.1.17
nvidia-cufft-cu12 | 11.3.0.4
nvidia-curand-cu12 | 10.3.7.77
nvidia-cusolver-cu12 | 11.7.1.2
nvidia-cusparse-cu12 | 12.5.4.2
nvidia-cusparselt-cu12 | 0.6.3
nvidia-nccl-cu12 | 2.21.5
nvidia-nvjitlink-cu12 | 12.6.85
nvidia-nvtx-cu12 | 12.6.77
peft | 0.15.2
torch | 2.6.0+cu126
torchvision | 0.21.0+cu126
transformers | 4.51.3