optimizer_cls_and_kwargs
PPOTrainer
RLOOTrainer
_save_checkpoint
LogCompletionsCallback
eval_dataset
bos_token_id
F.log(F.sigmoid(log_odds)
F.logsigmoid(log_odds)
log_reports.py
max_new_tokens
torch_dtype
KTOTrainer
setup_chat_format
processing_class
tokenizer
get_batch_sample
num_items_in_batch
compute_loss
False
remove_unused_columns
[SFT/DPO/Reward]ScriptArguments
ScriptArguments
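
The two log-sigmoid entries in the list (`F.logsigmoid(log_odds)` and the nested `F.log`/`F.sigmoid` form) appear to refer to replacing a two-step log-of-sigmoid with a fused call. A likely motivation, which is an assumption here and not stated in the source, is numerical stability. The sketch below illustrates the idea with plain Python `math` instead of PyTorch; the function names `naive_log_sigmoid` and `stable_log_sigmoid` are hypothetical and exist only for this example.

```python
import math


def naive_log_sigmoid(x: float) -> float:
    """Two-step version: compute sigmoid(x), then take its log."""
    # Evaluate the sigmoid without overflowing math.exp for either sign of x.
    if x >= 0:
        s = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)  # underflows to 0.0 for very negative x
        s = e / (1.0 + e)
    # math.log(0.0) raises ValueError; return -inf to mimic what
    # floating-point tensor libraries typically produce in this case.
    return math.log(s) if s > 0.0 else float("-inf")


def stable_log_sigmoid(x: float) -> float:
    """Fused version: log(sigmoid(x)) = min(x, 0) - log1p(exp(-|x|))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    for value in (2.0, 0.0, -800.0):
        print(value, naive_log_sigmoid(value), stable_log_sigmoid(value))
```

For moderate inputs the two functions agree to within rounding. For a very negative input such as `-800.0`, the intermediate sigmoid underflows to zero, so the two-step version loses the value entirely while the fused form stays finite.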