Closed
Feature request
Add a MergeModelCallback that merges the reference model with the current policy and optionally pushes the merged checkpoint to the Hub. The merge could run at the end of each step or epoch, at the end of training, or both. Implementation-wise, we could use Arcee's mergekit library and include it as an optional dependency: https://github.com/arcee-ai/mergekit
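To make the idea concrete, here is a rough, dependency-free sketch of what such a callback could look like. Everything in it is an assumption for illustration: the class name MergeModelCallbackSketch, the alpha and merge_at_every_epoch parameters, the push_fn hook, and the use of plain dicts of floats as stand-ins for model state dicts. The merge is a simple linear interpolation of weights, not mergekit, and the hook names only mirror the on_epoch_end / on_train_end signatures of transformers.TrainerCallback.

```python
from typing import Callable, Dict, Mapping, Optional

# Stand-in for a model state dict: parameter name -> value.
# Real models would hold tensors; floats keep the sketch dependency-free.
Weights = Mapping[str, float]


def linear_merge(policy: Weights, reference: Weights, alpha: float = 0.5) -> Dict[str, float]:
    """Return alpha * policy + (1 - alpha) * reference for every parameter."""
    if policy.keys() != reference.keys():
        raise ValueError("policy and reference must share the same parameter names")
    return {
        name: alpha * policy[name] + (1.0 - alpha) * reference[name]
        for name in policy
    }


class MergeModelCallbackSketch:
    """Hypothetical callback; a real one would subclass transformers.TrainerCallback."""

    def __init__(
        self,
        reference: Weights,
        alpha: float = 0.5,
        merge_at_every_epoch: bool = False,
        push_fn: Optional[Callable[[Dict[str, float]], None]] = None,
    ) -> None:
        self.reference = reference
        self.alpha = alpha
        self.merge_at_every_epoch = merge_at_every_epoch
        self.push_fn = push_fn  # hypothetical hook, e.g. an upload to the Hub
        self.merged: Optional[Dict[str, float]] = None

    def _merge(self, policy: Weights) -> None:
        # Merge the current policy with the fixed reference, then optionally push.
        self.merged = linear_merge(policy, self.reference, self.alpha)
        if self.push_fn is not None:
            self.push_fn(self.merged)

    def on_epoch_end(self, args, state, control, **kwargs) -> None:
        # Only merges on epoch end when explicitly enabled.
        if self.merge_at_every_epoch:
            self._merge(kwargs["model"])

    def on_train_end(self, args, state, control, **kwargs) -> None:
        # Always merges once training has finished.
        self._merge(kwargs["model"])
```

In an actual implementation, the model passed to the hooks would be the policy module rather than a dict, the merge step would presumably be delegated to mergekit, and push_fn would wrap whatever upload mechanism the trainer already uses.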
Motivation
Several papers report that model merging can improve performance non-trivially, especially when the models share the same architecture:
- https://arxiv.org/abs/2410.10801
- https://arxiv.org/abs/2406.16768 (for reward models)
Your contribution
Open to the community!