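MLP-only layers (enabled through `mlp_only_layers`) insert a `None` into the list of router logits. When such layers are enabled in the model, these `None` entries need to be filtered out, either before the list is passed to the aux_loss function or inside it.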
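A minimal sketch of the filtering step described above, assuming the router logits arrive as one entry per layer and that MLP-only layers contribute `None`. The helper name `drop_missing_router_logits` is hypothetical and not part of transformers, and the stand-in values are plain lists rather than real tensors:

```python
from typing import Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def drop_missing_router_logits(router_logits: Sequence[Optional[T]]) -> Tuple[T, ...]:
    """Return only the per-layer router logits that are not None.

    Hypothetical helper for illustration; not a transformers API.
    """
    return tuple(
        layer_logits for layer_logits in router_logits if layer_logits is not None
    )


# Stand-in values: layer 0 is treated as an MLP-only layer, so its entry is None.
per_layer_logits = (None, [[0.1, 0.9]], [[0.7, 0.3]])
filtered = drop_missing_router_logits(per_layer_logits)
print(len(filtered))
```

Something along these lines could run either right before the aux_loss call or as the first step inside it; which place is preferable is a design choice for the maintainers.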
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("Qwen/Qwen3-30B-A3B")  # or any qwen3_moe model
config.update({"mlp_only_layers": [0]})  # or any non-empty list
model = AutoModelForCausalLM.from_config(config)
_ = model(INPUT_TOKENS, output_router_logits=True)  # raises an error in the aux_loss function
Expected behavior
The model should output router logits and not raise an exception.