AReaL 2025 Enhancement Milestone Tracker
Introduction
This issue tracks the major planned enhancements for AReaL through Oct 31, 2025. Our development roadmap is organized into two main categories to help contributors understand where they can make the most impact.
The Ongoing sections contain features under active development by the core AReaL team. These items are our immediate priorities.
The Planned but not in progress sections list features that we have concrete implementation plans for but currently lack the bandwidth to pursue. We welcome community contributions for these items! If you're interested in any of these planned features, please reach out to discuss implementation details.
Backends
Done
N/A
Ongoing
- Megatron training backend support ([Feature] Support the Megatron training backend #256)
- SGLang large expert parallelism (EP) inference support ([Feature] Support using SGLang inference with multi-node instance and expert parallelism #259)
- End-to-end MoE RL training with large EP inference and Megatron expert parallelism
- Ulysses context parallelism & tensor parallelism for the FSDP backend ([Feature] Support ulysses sequence parallelism and tensor parallelism for FSDP backend #258); see the sketch after this list
- Single-controller mode ([Feature] Add single-controller mode #260)
- Remote vLLM inference engine (feat: support NPU and vLLM #351)
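
To make the Ulysses item above concrete, here is a minimal sketch of the core sequence-parallel all-to-all, written against plain torch.distributed. This is not AReaL's implementation; the function name and tensor layout are illustrative.

```python
import torch
import torch.distributed as dist

def ulysses_seq_to_head_alltoall(x: torch.Tensor, group=None) -> torch.Tensor:
    """Ulysses-style all-to-all before attention.

    Input is sharded along the sequence dimension:  [B, S/P, H, D].
    Output is sharded along the head dimension:     [B, S, H/P, D],
    so each rank runs full-sequence attention on its slice of heads.
    """
    world = dist.get_world_size(group)
    b, s_local, h, d = x.shape
    assert h % world == 0, "num_heads must be divisible by the group size"
    # Send head-group i of the local sequence shard to rank i.
    inputs = [t.contiguous() for t in x.chunk(world, dim=2)]
    outputs = [torch.empty_like(inputs[0]) for _ in range(world)]
    dist.all_to_all(outputs, inputs, group=group)
    # Each received chunk is another rank's sequence shard of our head group;
    # concatenating them in rank order rebuilds the full sequence.
    return torch.cat(outputs, dim=1)
```

After attention, the inverse all-to-all (heads back to sequence) restores the original layout.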
Planned but not in progress
- RL training with SGLang/vLLM pipeline parallelism
- Distributed weight resharder for Megatron training backend
- Multi-LLM training (different agents with different parameters)
- Local SGLang inference engine with inference/training colocation (hybrid engine)
- Detailed profiling of the FSDP and training backend for best performance under different scales
Usability
Done
- OpenAI-compatible client support (feat: support openai-compatible rollout and add an unittest for prepare_mb_list #248)
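
Since the OpenAI-compatible client is done, here is a minimal usage sketch with the official openai Python package; the base URL, API key, and model name below are placeholders, not AReaL defaults.

```python
from openai import OpenAI

# Placeholder endpoint: any OpenAI-compatible server, e.g. one started by AReaL.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="my-model",  # placeholder model name
    messages=[{"role": "user", "content": "Solve: 2 + 2 = ?"}],
    temperature=1.0,
    max_tokens=256,
)
print(resp.choices[0].message.content)
```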
Ongoing
- Support RLOO (RLOO algorithm to issue261 #354)
Planned but not in progress
- Support training GPT-oss and Seed-oss models
- Support running distributed training and debugging in Jupyter notebooks
- Provide benchmarking configuration examples ([Feature] Adding more implementation/configuration examples #261):
  - DAPO (FEAT: Decoupled CLIP ratio (DAPO Trick-I) #285, FEAT: Dynamic_Sampling (DAPO Trick-II) #294, FEAT: Overlong_Reward_Penalty (DAPO Trick-III) #295); see the decoupled-clip sketch after this list
  - Bradley-Terry reward modeling ([Feature] Add support for Reward Model fine-tuning #331)
  - PPO with critic models
  - REINFORCE++
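
As referenced in the DAPO item above, here is a minimal sketch of the decoupled CLIP ratio (Trick-I): the lower and upper clip bounds are set independently, so a larger upper bound leaves low-probability tokens more room to grow. The function name is hypothetical; the default bounds follow the DAPO paper, not necessarily AReaL's configuration.

```python
import torch

def dapo_decoupled_clip_loss(logprobs, old_logprobs, advantages,
                             eps_low=0.2, eps_high=0.28):
    # Token-level importance ratio between the current and behavior policies.
    ratio = torch.exp(logprobs - old_logprobs)
    # Decoupled bounds: clip below at 1 - eps_low, above at 1 + eps_high.
    clipped = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high)
    # Standard PPO pessimistic min, negated for gradient descent.
    return -torch.minimum(ratio * advantages, clipped * advantages).mean()
```

With eps_high == eps_low this reduces to the usual PPO clipped objective; raising eps_high alone is the entire trick.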
Documentation
Done
- OpenAI-compatible client documentation (doc: add for writing workflows with the openai-compatible client #254)
- Out-of-memory (OOM) troubleshooting guide ([Doc] add best practices doc, including debugging and handling OOM #287)
- AReaL debugging best practices ([Doc] add best practices doc, including debugging and handling OOM #287; Document a new example of examining rollout results in both Transformers and the inference engine #361):
  - LLM server-only debugging - How to launch LLM servers independently and debug agent workflows
  - Mock data and `torchrun` debugging - Creating synthetic data and using `torchrun` for algorithm debugging (see the sketch after this list)
  - Training-free evaluation experiments - Running evaluations without training or additional GPUs
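
For the mock-data item above, here is a minimal sketch of what such a debugging script might look like; the file name, vocabulary size, and batch shapes are all placeholders.

```python
# debug_rollout.py -- hypothetical file name; launch with, e.g.:
#   torchrun --nproc_per_node=2 debug_rollout.py
import torch
import torch.distributed as dist

dist.init_process_group("nccl" if torch.cuda.is_available() else "gloo")
rank = dist.get_rank()

# Synthetic "rollout" batch: random token ids and rewards stand in for real
# inference output, so the training path can be exercised without LLM servers.
mock_batch = {
    "input_ids": torch.randint(0, 32000, (4, 128)),
    "rewards": torch.randn(4),
}
print(f"[rank {rank}] mock batch shapes:",
      {k: tuple(v.shape) for k, v in mock_batch.items()})
dist.destroy_process_group()
```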
Planned but not in progress
- AReaL performance tuning guide
- How to split training and inference devices (see the sketch after this list)
- How to set parallelism strategies for training and inference
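
For the device-splitting item above, a generic sketch of one common approach: pin training and inference processes to disjoint GPU sets via CUDA_VISIBLE_DEVICES before launch. The script names and the 4+4 split are placeholders; AReaL's actual launchers may handle this differently.

```python
import os
import subprocess

# Illustrative only: give inference GPUs 0-3 and training GPUs 4-7.
inference_env = {**os.environ, "CUDA_VISIBLE_DEVICES": "0,1,2,3"}
training_env = {**os.environ, "CUDA_VISIBLE_DEVICES": "4,5,6,7"}

subprocess.Popen(["python", "launch_inference_server.py"], env=inference_env)
subprocess.Popen(["python", "launch_training.py"], env=training_env)
```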