Pull requests: huggingface/trl

Pull requests list

🦦 Validate vllm_mode param in GRPO
#3866 opened Aug 7, 2025 by sergiopaniego
3 of 5 tasks
Implement DPOP
#3864 opened Aug 7, 2025 by 1485840691 Draft
🔮 Native VLM support for SFTTrainer
#3862 opened Aug 7, 2025 by qgallouedec
5 tasks
[GRPO]: Fix BOS duplication bug when using VLLM
#3856 opened Aug 6, 2025 by pramodith Draft
3 of 5 tasks
[Callbacks] BEMA
#3855 opened Aug 6, 2025 by kashif
Update profiling.py: fix scoping problems for wandb and mlflow
#3845 opened Aug 4, 2025 by markshinyounglee
5 tasks done
dynamic temperature
#3844 opened Aug 4, 2025 by shirinyamani Draft
5 tasks
Add py.typed
#3841 opened Aug 4, 2025 by cyyever
[GSPO]: Refactor _compute_loss
#3835 opened Aug 1, 2025 by pramodith
2 of 5 tasks
[CPO] Add AlphaPO method via CPOTrainer
#3824 opened Jul 31, 2025 by kashif
Fix SFTTrainer token accuracy computation with PromptEncoder
#3821 opened Jul 31, 2025 by zk-quantum
5 tasks done
support GSPO-token
#3820 opened Jul 31, 2025 by hjh0119
Adding support for different losses which are now supported by Liger
#3815 opened Jul 31, 2025 by Manan17
1 of 5 tasks
Rloo final
#3801 opened Jul 29, 2025 by shirinyamani
5 tasks
Add dataset mixer
#3791 opened Jul 28, 2025 by lewtun
1 of 5 tasks
[GRPO] update transformer version for CB
#3786 opened Jul 28, 2025 by kashif
Add vLLM server mode support to OnlineDPOTrainer
#3783 opened Jul 27, 2025 by vaelev
6 tasks done
Dynamic sampling option in GRPO trainer based on DAPO paper
#3758 opened Jul 23, 2025 by almeidava93
2 of 5 tasks