I found that although this is single-step diffusion, memory consumption during training is still quite large. Since I only have RTX 3090 cards, I am recording some of my settings here, hoping they help anyone in the same situation.
1. `--train_batch_size` needs to be set to 1.
2. `--rank_lora` needs to be reduced from 64 to 16 (as sketched below, the adapter size scales linearly with the rank).
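A rough, hypothetical illustration of why lowering the LoRA rank helps (the 1280 x 1280 layer width is an assumption for illustration, not taken from this repo): a LoRA adapter adds two low-rank matrices per adapted layer, so its trainable parameters, and the matching gradients and optimizer states, scale linearly with the rank.

```python
# Hypothetical sketch: LoRA adds A (d_in x r) and B (r x d_out) per adapted
# layer, so the trainable parameter count grows linearly with the rank r.
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    return d_in * rank + rank * d_out

# Assumed 1280 x 1280 projection, purely for illustration:
for r in (64, 16):
    print(r, lora_params(1280, 1280, r))  # 64 -> 163840 params; 16 -> 40960 params
```

Dropping the rank from 64 to 16 therefore cuts the adapter parameters per layer to a quarter. My full launch command is below.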
nohup accelerate launch --config_file config/config.yaml --gpu_ids 0,1,2,3 --num_processes 4 --main_process_port 57079 --mixed_precision="fp16" train/train.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--teacher_lora_path=$TEACHER_MODEL_NAME \
--train_batch_size=1 --rank=64 --rank_vae=64 --rank_lora=16 \
--num_train_epochs=200 --checkpointing_steps=5000 --validation_steps=500 --max_train_steps=200000 \
--learning_rate=5e-06 --learning_rate_reg=1e-06 --lr_scheduler="cosine_with_restarts" --lr_warmup_steps=3000 \
--seed=43 --use_default_prompt --use_teacher_lora --use_random_bias \
--output_dir=$OUTPUT_DIR \
--report_to="wandb" --log_code --log_name=$LOG_NAME \
--gradient_accumulation_steps=1 \
--resume_from_checkpoint="latest" \
--guidance_scale=7.5 > $OUTPUT_LOG 2>&1 &
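One more note on these flags: with 4 GPUs, a per-GPU batch size of 1, and no gradient accumulation, the effective global batch size is only 4. A minimal sketch of the arithmetic, assuming standard data-parallel training (one process per GPU), which is what this launch command implies:

```python
# Assumed data-parallel setup: effective global batch size per optimizer step.
num_gpus = 4           # --num_processes 4
per_gpu_batch = 1      # --train_batch_size=1
grad_accum = 1         # --gradient_accumulation_steps=1
effective_batch = num_gpus * per_gpu_batch * grad_accum
print(effective_batch)  # 4
```

If training becomes unstable at such a small batch, raising `--gradient_accumulation_steps` increases the effective batch size without increasing per-GPU memory, at the cost of slower wall-clock steps.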