Road to v1 #4374

This issue tracks the tasks that must be completed before we reach v1. The list is evolving and is updated as discussions and work progress.

## Documentation

- [x] Remove `how_to_train.md` #4267 
- [x] Remove `using_llama_models.md` #4268
- [x] Remove `logging.md` #4269
- [x] #4376
- [x] #4375 
- [x] #4377 
- [x] #4378
- [x] #4379 
- [x] #4381
- [x] #4382
- [x] #4383
- [ ] #4384
- [ ] #4385
- [x] #4386
- [x] #4396
- [x] #4397
- [ ] #4407

## Examples

- [x] #4399
- [x] #4404

## Tests

- [x] #4401

## Main codebase

- [x] Add accuracy reward to the `trl.rewards` module https://github.com/huggingface/trl/pull/4270
- [x] Add an option (defaulting to `True`) to use `RichProgressCallback` in scripts (`trl.scripts`).
- [x] #4398
- [x] Remove `log_example_reports.py` #4241
- [x] Remove `commands` directory #4258 
- [x] Remove `examples/research_projects` #4258
- [ ] Remove `trl.extra.dataset_formatting` #4242
- [ ] #4387
- [ ] Remove `BestOfNSampler` #4291
- [x] #4380
- [x] #4403
- [ ] Refactor DPO to align implementation with SFT (WIP in #3906)
- [ ] Tool calling for GRPO/RLOO (WIP in #4300)
- [ ] ~Async generation for Online methods~
- [ ] Bump transformers to v5
- [ ] #4402

## Moving experimental features to experimental submodule

For trainers, this move was discussed in #4223.

- [x] Move BCO to experimental submodule #4312
- [ ] Move KTO to experimental submodule
- [x] Move Nash-MD to experimental submodule #4477
- [x] Move ORPO to experimental submodule #4480
- [x] #4472
- [x] Move PPO to experimental submodule #4482 
- [x] Move PRM to experimental submodule #4483 
- [ ] Move RLOO to experimental submodule
- [x] Move XPO to experimental submodule #4485 
- [ ] #4395
- [x] #4400
- [ ] Move Winrate callback to experimental #4558 



