Issues: huggingface/trl

Issues list (filtered by the 🏋 DPO label)

DeepSpeed with trl
  Labels: 🐛 bug, 🚀 deepspeed, 🏋 DPO, ⏳ needs more info
  #2490 opened Dec 16, 2024 by sagie-dekel

KeyError in DPO Trainer, evaluation_loop
  Labels: 🐛 bug, 🏋 DPO
  #2473 opened Dec 13, 2024 by qingjianbuyi

Packing in DPOTrainer
  Labels: 🏋 DPO, ✨ enhancement
  #2469 opened Dec 13, 2024 by zhc7

DPOTrainer log metrics are not gathered and averaged across ranks
  Labels: 🐛 bug, 🏋 DPO
  #2468 opened Dec 13, 2024 by zhc7

Probable mistake in DPOTrainer when computing/logging grad_norm
  Labels: 🏋 DPO, ❓ question
  #2456 opened Dec 10, 2024 by AIR-hl

Out of Memory Error: DPO Trainer
  Labels: 🏋 DPO, ❓ question
  #2452 opened Dec 9, 2024 by gp-1108

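For out-of-memory reports like #2452 above, a hedged sketch of the usual memory-saving knobs. Field names come from transformers.TrainingArguments and trl.DPOConfig; the values are placeholders, and exact availability varies by TRL version.

```python
# Sketch: common memory-saving settings for DPO training.
# DPO materializes policy and reference logits for both the chosen and the
# rejected completion, so its activation cost is a multiple of plain SFT.
from trl import DPOConfig  # DPOConfig subclasses transformers.TrainingArguments

args = DPOConfig(
    output_dir="dpo-out",
    per_device_train_batch_size=1,   # each sample expands to a chosen/rejected pair
    gradient_accumulation_steps=16,  # recover the effective batch size
    gradient_checkpointing=True,     # trade recompute for activation memory
    bf16=True,                       # half-precision activations
    max_length=1024,                 # truncate prompt + completion
    max_prompt_length=512,           # truncate the prompt alone
)
```
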
DPO with Unsloth: TypeError: empty_like(): argument 'input' (position 1) must be Tensor, not NoneType
  Labels: 🐛 bug, 🏋 DPO, 🦥 unsloth, 👁️ VLM
  #2438 opened Dec 4, 2024 by davidszwjx

DPO training with 'logits/chosen': nan, 'logits/rejected': nan
  Labels: 🐛 bug, 🏋 DPO, ⏳ needs more info
  #2435 opened Dec 4, 2024 by ZengQQQ

Let DPOTrainer support padding_free
  Labels: 🏋 DPO, ✨ enhancement, 🧒 good second issue, 🙋 help from community wanted
  #2422 opened Dec 1, 2024 by fzyzcjy

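The padding_free request in #2422 is about dropping pad tokens entirely. A toy illustration of the idea, not TRL code: sequences are flattened into a single row and boundaries are encoded in position_ids, the scheme used by transformers' DataCollatorWithFlattening.

```python
# Toy illustration of padding-free batching (not TRL's implementation).
# Instead of padding every sequence to the batch maximum, concatenate them
# and reset position_ids at each boundary; attention kernels that honor
# position_ids resets then treat each reset as the start of a new sequence.
seqs = [[5, 6, 7], [8, 9], [10, 11, 12, 13]]

input_ids = [tok for seq in seqs for tok in seq]
position_ids = [pos for seq in seqs for pos in range(len(seq))]

print(input_ids)     # [5, 6, 7, 8, 9, 10, 11, 12, 13]  (no pad tokens)
print(position_ids)  # [0, 1, 2, 0, 1, 0, 1, 2, 3]      (zeros mark boundaries)
```
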
DPO does not work for a FIM task with a non-instruct model
  Labels: 🏋 DPO, ❓ question
  #2382 opened Nov 22, 2024 by AML14

DPO training issue: max_steps jumps from 1000 to 996349
  Labels: 🐛 bug, 🏋 DPO
  #2355 opened Nov 14, 2024 by seTalent

DPO training DataLoader is not shuffled
  Labels: 🏋 DPO, ✨ enhancement
  #2337 opened Nov 7, 2024 by kaiwenw

Support for MiniCPM-V reinforcement learning with Direct Preference Optimization (DPO)
  Labels: 🏋 DPO, ❓ question, 👁️ VLM
  #2326 opened Nov 5, 2024 by DarioPTWR

[Trainer] Changing the dataset dynamically during training
  Labels: 🏋 DPO, ❓ question
  #2227 opened Oct 14, 2024 by ilyasoulk

Handling of "auto" in deepspeed config causes crash under Zero3 🐛 bug Something isn't working 🚀 deepspeed Related to deepspeed 🏋 DPO Related to DPO 🙋 help from community wanted Open invitation for community members to contribute
#2154 opened Oct 2, 2024 by Ben-Schneider-code
2 of 4 tasks
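A possible workaround for #2154, assuming the crash stems from unresolved "auto" placeholders: pass an explicit ZeRO-3 config dict instead of "auto" values. The keys are standard DeepSpeed config fields; the values are placeholders and must stay in sync with the training arguments.

```python
# Hedged sketch: replace "auto" placeholders with explicit values.
# transformers accepts a dict (or a JSON file path) via the `deepspeed`
# training argument; explicit values sidestep HF's "auto" resolution.
from trl import DPOConfig

ds_config = {
    "zero_optimization": {"stage": 3},
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": 1,  # explicit instead of "auto"
    "gradient_accumulation_steps": 16,    # explicit instead of "auto"
}

# Keep the TrainingArguments consistent with the DeepSpeed values above,
# otherwise HF raises a config-mismatch error at startup.
args = DPOConfig(
    output_dir="dpo-out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    bf16=True,
    deepspeed=ds_config,
)
```
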
Always allow ref_model=None
  Labels: 🏋 DPO, ✨ enhancement, 🧒 good second issue, 🙋 help from community wanted
  #2047 opened Sep 10, 2024 by qgallouedec

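For context on #2047: TRL already accepts ref_model=None in the PEFT case, where the frozen base weights act as the implicit reference model. A minimal sketch; the model and dataset names are placeholders from the TRL docs, and the processing_class argument name follows recent TRL releases.

```python
# Sketch: DPO without a separate reference model, via PEFT adapters.
# With a peft_config, DPOTrainer can score the reference policy by simply
# disabling the adapters, so no second copy of the weights is needed.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "Qwen/Qwen2-0.5B-Instruct"  # placeholder model
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

trainer = DPOTrainer(
    model=model,
    ref_model=None,  # valid here because peft_config is provided
    args=DPOConfig(output_dir="dpo-out"),
    train_dataset=train_dataset,
    processing_class=tokenizer,
    peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),
)
trainer.train()
```
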
[Question] Why don't TR-DPO's default alpha and tau match the values suggested in the paper?
  Labels: 🏋 DPO, 👶 good first issue, 🙋 help from community wanted
  #1991 opened Aug 28, 2024 by qgallouedec

DPO models generate multiple/corrupted responses
  Labels: 🏋 DPO, 🙋 help from community wanted
  #1025 opened Nov 22, 2023 by Devy99