This issue aims to draw attention to an unusual training setup: I found that several places in the model and training files only work for batch size = 1.
In train_wan_t2v.py, function training_step:
torch.randint(0, self.pipe.scheduler.num_train_timesteps, (1,))
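Because the sampled timestep tensor has shape (1,), every sample in a larger batch would share the same timestep. A minimal sketch of a batch-aware version (the variable names and the `num_train_timesteps` value are illustrative, not taken from the repository):

```python
import torch

batch_size = 4
num_train_timesteps = 1000  # assumed scheduler setting, for illustration

# Sample one timestep per sample instead of a single shared timestep.
timesteps = torch.randint(0, num_train_timesteps, (batch_size,))
print(timesteps.shape)  # torch.Size([4])
```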
In models.wan_video_dit.py, line 252:
self.modulation = nn.Parameter(torch.randn(1, 2, dim) / dim**0.5)
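For reference, a leading dimension of 1 does broadcast against a batched tensor in a plain addition, so whether this parameter actually breaks for batch size > 1 depends on how the surrounding code indexes it. A quick standalone check (the `t_mod` tensor here is hypothetical, standing in for whatever per-sample conditioning the parameter is combined with):

```python
import torch

dim = 16
batch_size = 4

# Parameter shaped as in wan_video_dit.py, leading dim fixed at 1.
modulation = torch.randn(1, 2, dim) / dim**0.5

# Hypothetical per-sample conditioning tensor of shape (batch, 2, dim).
t_mod = torch.randn(batch_size, 2, dim)

out = modulation + t_mod  # broadcasts over the batch dimension
print(out.shape)  # torch.Size([4, 2, 16])
```

If the code instead reshapes or chunks the result while assuming a batch dimension of 1, that would confirm the batch-size-1 restriction described above.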