generated from fastai/nbdev_template
Issues: huggingface/trl
KTOTrainer should work when actual batch size==1
✨ enhancement
New feature or request
🏋 KTO
Related to KTO
#2554
opened Jan 10, 2025 by
starmpcc
Option to disable unwrapping model for generation in PPO/RLOO/OnlineDPO
✨ enhancement
New feature or request
🏋 Online DPO
Related to Online DPO
🏋 PPO
Related to PPO
🏋 RLOO
Related to RLOO
#2529
opened Dec 28, 2024 by
dawidm
Direct Q-Function Optimization
✨ enhancement
New feature or request
#2526
opened Dec 28, 2024 by
catherinelee274
Integrate OREO into TRL and HF
✨ enhancement
New feature or request
#2525
opened Dec 28, 2024 by
August-murr
3 tasks done
Soft Actor-Critic (SAC) Trainer
✨ enhancement
New feature or request
#2517
opened Dec 23, 2024 by
AMindToThink
3 tasks
Spectrum training support
✨ enhancement
New feature or request
🏋 SFT
Related to SFT
#2504
opened Dec 19, 2024 by
ggbetz
[Tracking issue] Integrate native liger-kernel losses
✨ enhancement
New feature or request
🧒 good second issue
Good for contributors with basic project familiarity
#2495
opened Dec 17, 2024 by
qgallouedec
5 tasks
Packing in DPOTrainer
🏋 DPO
Related to DPO
✨ enhancement
New feature or request
#2469
opened Dec 13, 2024 by
zhc7
Probably a more reasonable method of packing
✨ enhancement
New feature or request
🧒 good second issue
Good for contributors with basic project familiarity
🙋 help from community wanted
Open invitation for community members to contribute
🏋 SFT
Related to SFT
#2466
opened Dec 12, 2024 by
AIR-hl
Add the possibility to skip prepare_model_for_kbit_training
🏋 DPO
Related to DPO
✨ enhancement
New feature or request
#2459
opened Dec 10, 2024 by
hugoabonizio
Specific peft trainers to address catastrophic forgetting
✨ enhancement
New feature or request
#2428
opened Dec 3, 2024 by
arivero
Add gen_text Argument for Custom Text Generation During Fine-tuning
✨ enhancement
New feature or request
⏳ needs more info
Additional information or clarification is required to proceed
#2415
opened Nov 29, 2024 by
dame-cell
Add use_dora and init_lora_weights to ModelConfig and get_peft_config
✨ enhancement
New feature or request
#2406
opened Nov 28, 2024 by
hommayushi3
RLOO Trainer do not support peft lora
✨ enhancement
New feature or request
🙋 help from community wanted
Open invitation for community members to contribute
⚡ PEFT
Related to PEFT
🏋 RLOO
Related to RLOO
#2404
opened Nov 28, 2024 by
harvinyou
7 of 9 tasks
eos_token config in PPOTrainer
✨ enhancement
New feature or request
👶 good first issue
Good for newcomers
🙋 help from community wanted
Open invitation for community members to contribute
🏋 PPO
Related to PPO
#2387
opened Nov 23, 2024 by
kechunFIVE
adding DRO trainer
✨ enhancement
New feature or request
🙋 help from community wanted
Open invitation for community members to contribute
#2383
opened Nov 22, 2024 by
morLev
2 of 3 tasks
DPO Training DataLoader is not shuffled
🏋 DPO
Related to DPO
✨ enhancement
New feature or request
#2337
opened Nov 7, 2024 by
kaiwenw
4 tasks
Using a different ref_model from model leads to incorrect results
✨ enhancement
New feature or request
❓ question
Seeking clarification or more information
#2307
opened Nov 1, 2024 by
DarshanDeshpande
2 of 4 tasks
Feature Request: String-Based Comparison Reward model for RLOOTrainer
✨ enhancement
New feature or request
🏋 RLOO
Related to RLOO
#2280
opened Oct 25, 2024 by
HiroshigeAoki
Helper function for getting reward model and judge
✨ enhancement
New feature or request
#2271
opened Oct 24, 2024 by
qgallouedec
[CGPO] Add support for Constrained Generative Policy Optimization
✨ enhancement
New feature or request
#2156
opened Oct 2, 2024 by
gaetanlop
3 tasks
[Data] Implement dataset mixer for combining datasets in training
🗃️ data
Related to data
✨ enhancement
New feature or request
#2112
opened Sep 24, 2024 by
lewtun
[Reward Modelling] Add support for process / stepwise supervision
✨ enhancement
New feature or request
🙋 help from community wanted
Open invitation for community members to contribute
🏋 Reward
Related to Reward modelling
#2110
opened Sep 24, 2024 by
lewtun