I noticed that the current version only works with SFTTrainer. Does it also work with GRPOTrainer? If not, how should I modify it?