generated from fastai/nbdev_template
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Open
Labels
Description
The aim is for all trainers to apply the same procedure in their init function:
- if needed, apply the chat template, then
- if needed, tokenize.
Support todo:
Standard dataset
-
BCOTrainer -
CPOTrainer -
DPOTrainer -
GKDTrainer(same asSFTTrainer) -
removedIterativeSFTTrainer -
KTOTrainer -
NashMDTrainer -
OnlineDPOTrainer -
ORPOTrainer -
PPOTrainer -
RewardTrainer[RewardTrainer] Tokenize inputs within trainer #2102 -
RLOOTrainer🔥 [Refactor] RLOOTrainer #3801 -
SFTTrainer(could be previously achieved via"dataset_text_field") Defaultdataset_text_fieldto"text"#2078; 🔬 SFT simplification #2405 -
XPOTrainer
Conversational dataset
-
BCOTrainerBCOTrainerconversational dataset support #2107 -
CPOTrainerConversational dataset support forCPOTrainer#2144 -
DPOTrainerConversational dataset support forDPOTrainer#2131 -
GKDTrainer -
removedIterativeSFTTrainer -
KTOTrainerConversational dataset support forKTOTrainer#2248 -
NashMDTrainerConversational dataset support for Online DPO #2075 -
OnlineDPOTrainerConversational dataset support for Online DPO #2075 -
ORPOTrainerConversational dataset support forORPOTrainer#2184 -
PPOTrainer -
RewardTrainer[RewardTrainer] Tokenize inputs within trainer #2102 -
RLOOTrainer🔥 [Refactor] RLOOTrainer #3801 -
SFTTrainer(yes, viaget_formatting_func_from_datasetfor now, needs refactoring); refactor in 🔬 SFT simplification #2405 -
XPOTrainerConversational dataset support for Online DPO #2075
Misc
- Update
docs/dataset_format.mdx