minor KTO setting changes + KL batch size #2153

Merged
merged 15 commits into from
Oct 6, 2024

Conversation

kawine
Contributor

@kawine kawine commented Oct 2, 2024

What does this PR do?

  • change the default learning rate from 5e-7 to 1e-6; for models 7B and up (which is what most people use, with or without LoRA), this works better with the default beta=0.1 than lr=5e-7, which can lead to slow learning
  • add explanations in the docs on choosing the learning rate
  • (important) change the batch size of the KL data to be the per-step batch size, not the effective batch size; the current implementation leads to worse results when gradient accumulation is used
  • allow dropout to be used, like in DPOTrainer
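
To make the KL batch size point concrete, here is a minimal sketch of the two sizes involved. It is not the code in kto_trainer.py: the helper names are hypothetical, and the within-chunk rotation is only an assumed stand-in for how mismatched prompt/completion pairs for the KL estimate might be built.

```python
# Illustrative sketch only; all function names here are hypothetical.

def effective_batch_size(per_device_batch_size, grad_accum_steps, num_processes=1):
    """Examples consumed per optimizer update (includes gradient accumulation)."""
    return per_device_batch_size * grad_accum_steps * max(1, num_processes)


def per_step_batch_size(per_device_batch_size, num_processes=1):
    """Examples consumed per forward/backward pass (no gradient accumulation)."""
    return per_device_batch_size * max(1, num_processes)


def rotate_completions(completions, batch_size):
    """Assumed KL pairing: shift completions by one position within each chunk,
    so every prompt is paired with a completion from another example in the
    same chunk of `batch_size` items."""
    mismatched = []
    for start in range(0, len(completions), batch_size):
        chunk = completions[start:start + batch_size]
        mismatched.extend(chunk[-1:] + chunk[:-1])
    return mismatched


completions = ["a", "b", "c", "d", "e", "f", "g", "h"]

# per_device_train_batch_size=2, gradient_accumulation_steps=4, one process
step_size = per_step_batch_size(2)           # 2
update_size = effective_batch_size(2, 4)     # 8

# Chunking by the per-step size keeps each mismatched pair inside one step.
by_step = rotate_completions(completions, step_size)
# Chunking by the effective size pairs examples across the whole accumulated batch.
by_update = rotate_completions(completions, update_size)
```

With gradient accumulation turned off the two sizes coincide, which would be consistent with the difference only showing up when accumulation is used.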

cc @kashif @qgallouedec

@kashif
Collaborator

kashif commented Oct 2, 2024

in the kto_config.py docstrings can you kindly add:

disable_dropout (`bool`, *optional*, defaults to `True`):
            Whether to disable dropout in the model.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@kawine
Contributor Author

kawine commented Oct 6, 2024

> in the kto_config.py docstrings can you kindly add:
>
> disable_dropout (`bool`, *optional*, defaults to `True`):
>             Whether to disable dropout in the model.

done!

@kashif kashif added the 🏋 KTO Related to KTO label Oct 6, 2024
@kashif kashif merged commit f05c3fa into huggingface:main Oct 6, 2024
9 checks passed