Issues: huggingface/trl

Issues list

Code migration suggestions
#2296 opened Oct 30, 2024 by MonolithFoundation
OOM when finetuning Llama3.2-90B on 8xA100 80GB
#2294 opened Oct 29, 2024 by maximilianmordig
2 of 4 tasks
wrong objective/entropy in RLOOTrainer (labels: 🐛 bug, 🏋 RLOO)
#2281 opened Oct 25, 2024 by serendipity800
1 of 4 tasks
Helper function for getting reward model and judge (labels: ✨ enhancement)
#2271 opened Oct 24, 2024 by qgallouedec
KTOTrainer Memory Leakage (labels: 🐛 bug, 🏋 KTO)
#2268 opened Oct 24, 2024 by Isaaclgz
2 of 4 tasks
Significant Difference between torchrun launch and accelerate launch (labels: ❓ question)
#2262 opened Oct 21, 2024 by SinclairCoder
2 of 4 tasks
OOM when unwrap_model_for_generation (labels: 🐛 bug)
#2250 opened Oct 18, 2024 by hlnchen
2 of 4 tasks
Add model merging callback (labels: ✨ enhancement, 🧒 good second issue)
#2241 opened Oct 16, 2024 by lewtun
online DPO evaluation (labels: 🐛 bug, 🏋 Online DPO)
#2228 opened Oct 14, 2024 by woshizouguo
1 of 4 tasks
[Trainer] Changing the dataset dynamically during training (labels: 🏋 DPO, ❓ question)
#2227 opened Oct 14, 2024 by ilyasoulk
[GKD] 0 loss (labels: 🐛 bug, 🏋 GKD)
#2217 opened Oct 10, 2024 by nivibilla
2 of 4 tasks
[GKD] mismatch in tensors when stacking log probs (labels: 🐛 bug, 🏋 GKD)
#2215 opened Oct 10, 2024 by nivibilla
2 of 4 tasks
Indefinite waiting time while training using reward_modeling.py (labels: 🐛 bug, ⏳ needs more info, 🏋 Reward)
#2212 opened Oct 10, 2024 by himanshushukla12
2 of 4 tasks
RuntimeError: probability tensor contains either inf, nan or element < 0 (labels: 🐛 bug, 📱 cli)
#2205 opened Oct 9, 2024 by himanshushukla12
2 of 4 tasks
Has anyone face with problems that DPO rewards accuracy stuck at 0.5 and the loss stuck at 0.6 to 0.8? (labels: 🏋 DPO, ⏳ needs more info)
#2194 opened Oct 7, 2024 by Thewillman
Does AutoModelForCausalLMWithValueHead get abandoned in PPOv2Trainer ? (labels: 🐛 bug)
#2188 opened Oct 6, 2024 by Sino-Huang
2 of 4 tasks
add support for closed source model for Generalized Knowledge Distillation Trainer (labels: ✨ enhancement, 🏋 GKD, ⏳ needs more info)
#2179 opened Oct 5, 2024 by imrankh46
Gradient accumulation yields worse results than the equivalent batch size (labels: ⏳ needs more info, ❓ question)
#2175 opened Oct 4, 2024 by benjamin-marie
Handling of "auto" in deepspeed config causes crash under Zero3 (labels: 🐛 bug, 🚀 deepspeed, 🏋 DPO, 🙋 help wanted)
#2154 opened Oct 2, 2024 by Ben-Schneider-code
2 of 4 tasks