generated from fastai/nbdev_template
Issues: huggingface/trl
Issues list
OOM when finetuning Llama3.2-90B on 8xA100 80GB
#2294
opened Oct 29, 2024 by
maximilianmordig
2 of 4 tasks
AttributeError: 'TrainingArguments' object has no attribute 'model_init_kwargs'
#2291
opened Oct 29, 2024 by
MonolithFoundation
wrong objective/entropy in RLOOTrainer
🐛 bug
Something isn't working
🏋 RLOO
Related to RLOO
#2281
opened Oct 25, 2024 by
serendipity800
1 of 4 tasks
Feature Request: String-Based Comparison Reward model for RLOOTrainer
✨ enhancement
New feature or request
🏋 RLOO
Related to RLOO
#2280
opened Oct 25, 2024 by
HiroshigeAoki
Conflict between last version of Transformers.Trainer and DPOTrainer.get_batch_samples
🐛 bug
Something isn't working
🏋 DPO
Related to DPO
#2275
opened Oct 24, 2024 by
lucasdegeorge
2 of 4 tasks
Helper function for getting reward model and judge
✨ enhancement
New feature or request
#2271
opened Oct 24, 2024 by
qgallouedec
KTOTrainer Memory Leakage
🐛 bug
Something isn't working
🏋 KTO
Related to KTO
#2268
opened Oct 24, 2024 by
Isaaclgz
2 of 4 tasks
Significant Difference between torchrun launch and accelerate launch
❓ question
Seeking clarification or more information
#2262
opened Oct 21, 2024 by
SinclairCoder
2 of 4 tasks
OOM when unwrap_model_for_generation
🐛 bug
Something isn't working
#2250
opened Oct 18, 2024 by
hlnchen
2 of 4 tasks
Add model merging callback
✨ enhancement
New feature or request
🧒 good second issue
Good for contributors with basic project familiarity
#2241
opened Oct 16, 2024 by
lewtun
online DPO evaluation
🐛 bug
Something isn't working
🏋 Online DPO
Related to Online DPO
#2228
opened Oct 14, 2024 by
woshizouguo
1 of 4 tasks
[Trainer] Changing the dataset dynamically during training
🏋 DPO
Related to DPO
❓ question
Seeking clarification or more information
#2227
opened Oct 14, 2024 by
ilyasoulk
[GKD] 0 loss
🐛 bug
Something isn't working
🏋 GKD
Related to GKD
#2217
opened Oct 10, 2024 by
nivibilla
2 of 4 tasks
[GKD] mismatch in tensors when stacking log probs
🐛 bug
Something isn't working
🏋 GKD
Related to GKD
#2215
opened Oct 10, 2024 by
nivibilla
2 of 4 tasks
Indefinite waiting time while training using reward_modeling.py
🐛 bug
Something isn't working
⏳ needs more info
Additional information or clarification is required to proceed
🏋 Reward
Related to Reward modelling
#2212
opened Oct 10, 2024 by
himanshushukla12
2 of 4 tasks
RuntimeError: probability tensor contains either inf, nan or element < 0
🐛 bug
Something isn't working
📱 cli
Related to the Command-line interface
#2205
opened Oct 9, 2024 by
himanshushukla12
2 of 4 tasks
Has anyone face with problems that DPO rewards accuracy stuck at 0.5 and the loss stuck at 0.6 to 0.8?
🏋 DPO
Related to DPO
⏳ needs more info
Additional information or clarification is required to proceed
#2194
opened Oct 7, 2024 by
Thewillman
A bug when SFT mistral-7B using DataCollatorForCompletionOnlyLM
🏋 SFT
Related to SFT
#2192
opened Oct 7, 2024 by
smartliuhw
2 of 4 tasks
Does AutoModelForCausalLMWithValueHead get abandoned in PPOv2Trainer ?
🐛 bug
Something isn't working
#2188
opened Oct 6, 2024 by
Sino-Huang
2 of 4 tasks
add support for closed source model for Generalized Knowledge Distillation Trainer
✨ enhancement
New feature or request
🏋 GKD
Related to GKD
⏳ needs more info
Additional information or clarification is required to proceed
#2179
opened Oct 5, 2024 by
imrankh46
Drop GPT2 in our test in favour of a more recent instruct model
👶 good first issue
Good for newcomers
#2177
opened Oct 4, 2024 by
qgallouedec
Gradient accumulation yields worse results than the equivalent batch size
⏳ needs more info
Additional information or clarification is required to proceed
❓ question
Seeking clarification or more information
#2175
opened Oct 4, 2024 by
benjamin-marie
[CGPO] Add support for Constrained Generative Policy Optimization
✨ enhancement
New feature or request
#2156
opened Oct 2, 2024 by
gaetanlop
3 tasks
Handling of "auto" in deepspeed config causes crash under Zero3
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
🏋 DPO
Related to DPO
🙋 help wanted
Open invitation for community members to contribute
#2154
opened Oct 2, 2024 by
Ben-Schneider-code
2 of 4 tasks