Add model merging callback #2241

Closed
@lewtun

Description

Feature request

Add a `MergeModelCallback` that merges the reference model with the current policy and optionally pushes the merged checkpoint to the Hub. Merging could run at the end of each step or epoch, at the end of training, or both. Implementation-wise, we could use Arcee's mergekit library and include it as an optional dependency: https://github.com/arcee-ai/mergekit
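As a rough illustration of the idea (not the proposed implementation), the sketch below linearly interpolates a policy's parameters with a reference model's parameters when training ends. Everything here is an assumption made for illustration: `linear_merge`, `MergeOnTrainEnd`, `policy_weight`, and `push_fn` are hypothetical names, plain Python floats stand in for weight tensors, and simple linear interpolation stands in for whatever merge method mergekit would be configured to use. A real callback would presumably subclass `transformers.TrainerCallback` and delegate the merge to mergekit.

```python
from typing import Callable, Dict, Mapping, Optional

State = Dict[str, float]


def linear_merge(
    policy_state: Mapping[str, float],
    reference_state: Mapping[str, float],
    policy_weight: float = 0.5,
) -> State:
    """Interpolate two parameter mappings that share the same keys.

    Values are plain floats here; any type supporting scalar
    multiplication and addition would work the same way.
    """
    if not 0.0 <= policy_weight <= 1.0:
        raise ValueError("policy_weight must be in [0, 1]")
    if policy_state.keys() != reference_state.keys():
        raise ValueError("models must have identical parameter names")
    return {
        name: policy_weight * policy_state[name]
        + (1.0 - policy_weight) * reference_state[name]
        for name in policy_state
    }


class MergeOnTrainEnd:
    """Hypothetical stand-in for the proposed MergeModelCallback."""

    def __init__(
        self,
        reference_state: Mapping[str, float],
        policy_weight: float = 0.5,
        push_fn: Optional[Callable[[State], None]] = None,
    ) -> None:
        self.reference_state = dict(reference_state)
        self.policy_weight = policy_weight
        # push_fn is a placeholder for an optional "push to Hub" step.
        self.push_fn = push_fn
        self.merged_state: Optional[State] = None

    def on_train_end(self, policy_state: Mapping[str, float]) -> State:
        # Merge the current policy with the frozen reference weights.
        self.merged_state = linear_merge(
            policy_state, self.reference_state, self.policy_weight
        )
        if self.push_fn is not None:
            self.push_fn(self.merged_state)
        return self.merged_state


# Usage: equal-weight merge of two toy "models".
callback = MergeOnTrainEnd({"w": 3.0, "b": 1.0}, policy_weight=0.5)
merged = callback.on_train_end({"w": 1.0, "b": 3.0})
# merged == {"w": 2.0, "b": 2.0}
```

Keeping the merge function separate from the hook makes it easy to call the same logic at step end, epoch end, or the end of training, and to swap in a different merge method later.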

Motivation

Various papers show that model merging can non-trivially improve performance, especially if the models share the same architecture.

Your contribution

Open to the community!

Metadata

Assignees: No one assigned

Labels:
- ✨ enhancement: New feature or request
- 🧒 good second issue: Good for contributors with basic project familiarity
