We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The aim is for all trainers to apply the same procedure in their init function:
BCOTrainer
CPOTrainer
DPOTrainer
GKDTrainer
SFTTrainer
IterativeSFTTrainer
KTOTrainer
NashMDTrainer
OnlineDPOTrainer
ORPOTrainer
PPOTrainer
RewardTrainer
RLOOTrainer
"dataset_text_field"
dataset_text_field
"text"
XPOTrainer
get_formatting_func_from_dataset
docs/dataset_format.mdx
The text was updated successfully, but these errors were encountered:
No branches or pull requests
The aim is for all trainers to apply the same procedure in their init function:
Support todo:
Standard dataset
BCOTrainer
CPOTrainer
DPOTrainer
GKDTrainer
(same asSFTTrainer
)IterativeSFTTrainer
KTOTrainer
NashMDTrainer
OnlineDPOTrainer
ORPOTrainer
PPOTrainer
RewardTrainer
[RewardTrainer] Tokenize inputs within trainer #2102RLOOTrainer
SFTTrainer
(could be previously achieved via"dataset_text_field"
) Defaultdataset_text_field
to"text"
#2078; 🔬 SFT simplification #2405XPOTrainer
Conversational dataset
BCOTrainer
BCOTrainer
conversational dataset support #2107CPOTrainer
Conversational dataset support forCPOTrainer
#2144DPOTrainer
Conversational dataset support forDPOTrainer
#2131GKDTrainer
IterativeSFTTrainer
KTOTrainer
Conversational dataset support forKTOTrainer
#2248NashMDTrainer
Conversational dataset support for Online DPO #2075OnlineDPOTrainer
Conversational dataset support for Online DPO #2075ORPOTrainer
Conversational dataset support forORPOTrainer
#2184PPOTrainer
RewardTrainer
[RewardTrainer] Tokenize inputs within trainer #2102RLOOTrainer
SFTTrainer
(yes, viaget_formatting_func_from_dataset
for now, needs refactoring); refactor in 🔬 SFT simplification #2405XPOTrainer
Conversational dataset support for Online DPO #2075Misc
docs/dataset_format.mdx
The text was updated successfully, but these errors were encountered: