-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🔀 Add MergeModelCallBack
#2282
🔀 Add MergeModelCallBack
#2282
Conversation
@lewtun |
Thanks a lot @August-murr for the work. Can you add documentation, and test? |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
I've already added most of the docs, as for the tests, unfortunately I won't be able to do it for a few days and if nobody else added them, I'll do it later. |
The tests I added validate the success of the merge and I could expand it if necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this callback @August-murr - the implementation is very clean! I've left some minor comments on the PR and then I think it's good to merge (pun intended :))
Can you also fix the tests - it seems to be an issue with the path creation on Windows. Also can you add the callback to the docs here: https://github.com/huggingface/trl/blob/main/docs/source/callbacks.mdx
removing ## from docs Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
removing ## from docs Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
adding types Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Another question that came up during the review: why have a new configuration class when we can use the mergekit one directly? I'm afraid of confusing the user, tempted to use : from mergekit import MergeConfiguration
from trl import MergeModelCallback
merge_callback = MergeModelCallback(MergeConfiguration()) |
Actually, ease of use for the user was the reason why I had to write the class in mergekit_utils since mergekit uses a yaml file to get it's Merge config, which is easier to implement but more complicated for the user. and if you wanted to use from mergekit.config import MergeConfiguration
merge_config_dict = {
"dtype": "float16",
"merge_method": "linear",
"models": [
{"model": "path_to_model_1", "parameters": {"weight": 0.4}},
{"model": "path_to_model_2", "parameters": {"weight": 0.6}},
],
}
config = MergeConfiguration.model_validate(merge_config_dict) As you add more parameters to the configuration, the dictionary becomes increasingly nested. The current implementation, although harder to maintain, simplifies everything for the user: from trl.mergekit_utils import MergeConfig
config = MergeConfig("linear")
config.policy_model_weight = 0.4
config.target_model_weight = 0.6 |
That makes sense. |
I'll figure it out. |
The main issue with using Mergekit's class MergeConfiguration(BaseModel):
merge_method: str
slices: Optional[List[OutputSliceDefinition]] = None
models: Optional[List[InputModelDefinition]] = None
parameters: Optional[Dict[str, ParameterSetting]] = None
base_model: Optional[ModelReference] = None
dtype: Optional[str] = None
tokenizer_source: Union[
Literal["union"], Literal["base"], ModelReference, None
] = None
tokenizer: Optional[TokenizerConfig] = None
chat_template: Optional[str] = None
out_dtype: Optional[str] = None If someone wanted to set up the configuration manually, they’d either need to:
Neither option is user-friendly. I admit the current implementation looks messy, but the alternative would create more complications for the user. Maybe in future versions, the Mergekit team will make |
@qgallouedec |
… into MergeModelCallBack
LGTM thanks!
|
@qgallouedec |
Ah thanks, I was debugging, but I don't have access to windows vm right now (explains fa5bafe). Any idea how to solve it? |
…eanup errors with temp dir
Found a solution with a57d88a |
@qgallouedec |
No worry, thanks a lot for this nice addition! |
What does this PR do?
Fixes #2241
Since the focus was on replicating the checkpoint merging methods from the paper, I have covered only Linear, TIES, SLERP, and DARE-TIES merging methods. These were the ones used in the paper and were primarily tested with the DPOTrainer.
Please provide feedback on any issues and suggest improvements regarding the structure of the files in this PR, as well as the documentation, to enhance clarity.
Here's an example of usage:
since it's an optional dependency
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines.
Who can review?
@lewtun
@qgallouedec