Description
Something is setting args.learning_rate to None in flux_train.py.
I added a few debug log statements to narrow it down, and it's happening somewhere in this block around line 333 of flux_train.py:
if args.blockwise_fused_optimizers:
    ... something in here
logger.info("args.blockwise before learn_rate = " + str(args))
# prepare dataloader
# strategies are set here because they cannot be referenced in another process. Copy them with the dataset
# some strategies can be None
train_dataset_group.set_current_strategies()
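One way to pin down exactly which line overwrites the value, instead of bracketing it with logger.info calls: replace the parsed argparse.Namespace with a subclass that reports any assignment of None to learning_rate. This is a debugging sketch of my own, not code from flux_train.py; the TracedNamespace name and the print-based reporting are assumptions.

```python
import argparse
import traceback

class TracedNamespace(argparse.Namespace):
    """Namespace that reports any write of None to learning_rate."""

    def __setattr__(self, name, value):
        if name == "learning_rate" and value is None:
            # Show the call site of the offending assignment.
            print("learning_rate set to None at:")
            traceback.print_stack(limit=6)
        super().__setattr__(name, value)

# Simulate the situation from the issue: the value starts out set,
# then some later code clobbers it with None.
args = TracedNamespace(learning_rate=1e-4)
args.learning_rate = None  # stand-in for the unknown culprit line
```

In the real script you would copy the parsed args into a TracedNamespace (e.g. `args = TracedNamespace(**vars(args))`) right after argument parsing, then run training; the printed stack trace points at the line that performs the None assignment.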
Somewhere between the if args.blockwise_fused_optimizers block (around line 333 in my copy) and the dataloader preparation (around line 390), something sets args.learning_rate to None. Later, when Adafactor sets initial_lr, it throws an error because the learning rate argument is no longer set.
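Until the root cause is found, a fail-fast check just before the optimizer is built would turn the confusing Adafactor initial_lr error into a pointed one. This helper is hypothetical (require_learning_rate is not part of flux_train.py), shown only to illustrate the guard:

```python
import argparse

def require_learning_rate(args):
    """Raise a clear error if learning_rate was clobbered upstream.

    Hypothetical guard: call it right before creating the optimizer so
    the failure names the real problem instead of surfacing later
    inside the optimizer's initial_lr handling.
    """
    if getattr(args, "learning_rate", None) is None:
        raise ValueError(
            "args.learning_rate is None; it was overwritten somewhere "
            "between the blockwise_fused_optimizers block and the "
            "dataloader preparation"
        )
    return args.learning_rate

# Demo: an intact namespace passes through unchanged.
good = argparse.Namespace(learning_rate=1e-4)
print(require_learning_rate(good))
```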