Issues: huggingface/trl
[Tracking issue] Integrate native liger-kernel losses
#2495 opened Dec 17, 2024 by qgallouedec
DeepSpeed with trl
Labels: 🐛 bug, 🚀 deepspeed, 🏋 DPO, ⏳ needs more info
#2490 opened Dec 16, 2024 by sagie-dekel · 7 of 9 tasks
KeyError in DPO Trainer, evaluation_loop
Labels: 🐛 bug, 🏋 DPO
#2473 opened Dec 13, 2024 by qingjianbuyi · 7 of 9 tasks
Packing in DPOTrainer
Labels: 🏋 DPO, ✨ enhancement
#2469 opened Dec 13, 2024 by zhc7
Add the possibility to skip prepare_model_for_kbit_training
Labels: 🏋 DPO, ✨ enhancement
#2459 opened Dec 10, 2024 by hugoabonizio
Probably a mistake in DPOTrainer when computing/logging grad_norm
Labels: 🏋 DPO, ❓ question
#2456 opened Dec 10, 2024 by AIR-hl · 7 of 9 tasks
Out of Memory Error: DPO Trainer
Labels: 🏋 DPO, ❓ question
#2452 opened Dec 9, 2024 by gp-1108 · 7 of 9 tasks
DPO with unsloth: TypeError: empty_like(): argument 'input' (position 1) must be Tensor, not NoneType
Labels: 🐛 bug, 🏋 DPO, 🦥 unsloth, 👁️ VLM
#2438 opened Dec 4, 2024 by davidszwjx · 7 of 9 tasks
DPO training with 'logits/chosen': nan, 'logits/rejected': nan
Labels: 🐛 bug, 🏋 DPO, ⏳ needs more info
#2435 opened Dec 4, 2024 by ZengQQQ · 7 of 9 tasks
Let DPOTrainer support padding_free
Labels: 🏋 DPO, ✨ enhancement, 🧒 good second issue, 🙋 help from community wanted
#2422 opened Dec 1, 2024 by fzyzcjy
DPO does not work for FIM task with non-instruct model
Labels: 🏋 DPO, ❓ question
#2382 opened Nov 22, 2024 by AML14 · 7 of 9 tasks
DPO train issue: max steps from 1000 to 996349
Labels: 🐛 bug, 🏋 DPO
#2355 opened Nov 14, 2024 by seTalent · 8 of 9 tasks
DPO Training DataLoader is not shuffled
Labels: 🏋 DPO, ✨ enhancement
#2337 opened Nov 7, 2024 by kaiwenw · 4 tasks
Support for MiniCPM-V Reinforcement Learning with Direct Preference Optimization (DPO)
Labels: 🏋 DPO, ❓ question, 👁️ VLM
#2326 opened Nov 5, 2024 by DarioPTWR
Conflict between latest version of Transformers.Trainer and DPOTrainer.get_batch_samples
Labels: 🐛 bug, 🏋 DPO
#2275 opened Oct 24, 2024 by lucasdegeorge · 2 of 4 tasks
[Trainer] Changing the dataset dynamically during training
Labels: 🏋 DPO, ❓ question
#2227 opened Oct 14, 2024 by ilyasoulk
Handling of "auto" in deepspeed config causes crash under Zero3
Labels: 🐛 bug, 🚀 deepspeed, 🏋 DPO, 🙋 help from community wanted
#2154 opened Oct 2, 2024 by Ben-Schneider-code · 2 of 4 tasks
Always allow ref_model=None
Labels: 🏋 DPO, ✨ enhancement, 🧒 good second issue, 🙋 help from community wanted
#2047 opened Sep 10, 2024 by qgallouedec
[Question] Why TR-DPO default alpha and tau don't match the values suggested in the paper?
Labels: 🏋 DPO, 👶 good first issue, 🙋 help from community wanted
#1991 opened Aug 28, 2024 by qgallouedec
DPO models generate multiple / corrupted responses
Labels: 🏋 DPO, 🙋 help from community wanted
#1025 opened Nov 22, 2023 by Devy99
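Several of the issues above (NaN logits, the liger-kernel loss integration, TR-DPO hyperparameters) revolve around the same core quantity. As background, here is a minimal pure-Python sketch of the standard DPO sigmoid loss for a single preference pair; the function name and scalar interface are illustrative only, not trl's actual (batched, tensorized) API.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO sigmoid loss for one preference pair (scalar sketch).

    logits = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    loss   = -log(sigmoid(logits))
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) == log(1 + exp(-x)); branch for numerical stability
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# Policy prefers the chosen response more than the reference does,
# so the loss falls below log(2) ~= 0.6931 (the value at parity).
print(round(dpo_loss(-10.0, -12.0, -11.0, -11.0), 4))  # → 0.5981
```

The log-probabilities are sequence-level sums over response tokens; if any of them is already NaN or inf (e.g. from an overflowing forward pass), the loss is NaN too, which is where issues like #2435 typically start.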