Always allow ref_model=None
#2047
Comments
Quoting from the other issue:
Is there an explanation for why
I believe that this may be due to the implementation being carried out in multiple stages: first the initial version, followed by PEFT support, then integration with DeepSpeed... It's probably a good time to re-think it as a whole.
In that case, I think it makes sense to just fix the other issue first because the fix for that issue is an equality check, right?
ref_model=None
Perhaps you should give it a try. It's difficult to assess the changes involved.
Implemented for Online DPO in #2041. It can probably be taken as a reference.
Feature request
When optimising with a reference model, the reference model is in most cases the same as the trained model. The user should only have to specify a ref model when they don't want to use the trained model.
Currently this is possible, but only when using PEFT, which is very counter-intuitive. And even in this situation, if you want to provide a ref model that is different from the trained model, you have to set force_use_ref_model, which is even more counter-intuitive.
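To make the requested behaviour concrete, here is a minimal sketch of the fallback rule described above. It is not TRL code: `ToyModel` and `resolve_ref_model` are hypothetical names used only for illustration, and the rule itself (use a frozen copy of the trained model when `ref_model` is `None`) is an assumption about how this could work, not the library's actual implementation.

```python
import copy
from typing import Optional


class ToyModel:
    """Hypothetical stand-in for a trainable model (not a TRL or PyTorch class)."""

    def __init__(self, weights):
        self.weights = list(weights)
        self.trainable = True


def resolve_ref_model(model: ToyModel, ref_model: Optional[ToyModel] = None) -> ToyModel:
    """Return the reference model to use during training.

    Assumed rule (illustrative only): when ref_model is None, fall back to a
    frozen copy of the trained model; otherwise use the model the caller passed.
    """
    if ref_model is None:
        # Snapshot the trained model's initial weights.
        ref_model = copy.deepcopy(model)
    # The reference model is never updated during training.
    ref_model.trainable = False
    return ref_model


policy = ToyModel([0.1, 0.2])
ref = resolve_ref_model(policy)  # no ref_model given: frozen copy of the policy
policy.weights[0] = 0.5  # simulate a training step on the policy only
custom = resolve_ref_model(policy, ToyModel([1.0, 1.0]))  # explicit reference
```

With a rule like this, the common case needs no extra argument, and passing a different reference model needs no extra flag, whether or not PEFT is in use.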
Currently
Proposed
Motivation
Make the library more intuitive to use.
Your contribution
For sure ;)