Closed
Description
Reproduction
Hello!
I think there is a small bug. I was trying to work out the difference between the whiten_rewards and normalize_rewards parameters in the RLOOConfig object, and after inspecting the code for the RLOOTrainer class I found that it is not used. Hence, I think it should probably be removed.
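For anyone who wants to repeat this kind of check, here is a minimal sketch of one way to look for config options that a trainer never reads. It parses source text with Python's `ast` module and reports option names that never appear as an attribute access. Everything in it is a stand-in: `TOY_TRAINER_SOURCE`, `ToyTrainer`, `used_option`, `unused_option`, and the helper `unreferenced_fields` are hypothetical names, not part of the library. For a real class, the source text could presumably be obtained with `inspect.getsource`.

```python
import ast

# Hypothetical stand-in for a trainer's source code; NOT the real RLOOTrainer.
TOY_TRAINER_SOURCE = '''
class ToyTrainer:
    def __init__(self, args):
        self.args = args

    def step(self, rewards):
        # Only one of the two toy options is ever read.
        if self.args.used_option:
            mean = sum(rewards) / len(rewards)
            rewards = [r - mean for r in rewards]
        return rewards
'''


def unreferenced_fields(field_names, source):
    """Return the names in field_names that are never accessed as an attribute in source."""
    accessed = {
        node.attr
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Attribute)
    }
    return [name for name in field_names if name not in accessed]


# Hypothetical option names, chosen only for illustration.
result = unreferenced_fields(["used_option", "unused_option"], TOY_TRAINER_SOURCE)
print(result)
```

This only catches direct attribute access such as `args.some_option`. An option read through `getattr` with a string, or consumed in a parent class, would be reported as unreferenced even though it is used, so a hit is a hint to inspect the code rather than proof.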
Thank you for your help and the codebase! It is super helpful.
System Info
I can see this in the codebase.
Checklist
- I have checked that my issue isn't already filed (see open issues)
- I have included my system information
- Any code provided is minimal, complete, and reproducible (more on MREs)
- Any code provided is properly formatted in code blocks (no screenshots, more on code blocks)
- Any traceback provided is complete