refactor(sunjx): refactor dataset and reward module #13

Open
Jiaxuan-Sun wants to merge 6 commits into opendilab:main from
Jiaxuan-Sun:refactor/dataset-reward-module

@Jiaxuan-Sun
Contributor

1. Dataset Module Refactoring (lightrft/datasets/)

Modified:

  • __init__.py: Refactored imports with unified interfaces and improved optional dependency handling

Added:

  • config.py: DatasetConfig class

    • Unified configuration for train/eval/pretrain datasets
    • Auto-normalization of data_path and data_probs (supports string/list)
    • Factory methods: for_train(), for_eval(), for_pretrain()
    • Parameter validation
  • loader.py: DatasetLoader class

    • Unified loading interface for train/eval/pretrain datasets
    • Automatic handling of blending_datasets parameters
    • Support for PromptDatasetVL and SFTDatasetVL
    • Consistent logging
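
For reviewers, the normalization and factory behavior described for DatasetConfig can be pictured roughly as below. This is a minimal sketch, not the code in this PR: the field names other than data_path and data_probs, the comma-separated string format, the uniform default for data_probs, and the specific validation checks are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Union


def _as_list(value, cast):
    """Turn a comma-separated string or any sequence into a list of `cast` items."""
    if isinstance(value, str):
        # Assumption: multiple entries in a string are separated by commas.
        return [cast(item.strip()) for item in value.split(",") if item.strip()]
    return [cast(item) for item in value]


@dataclass
class DatasetConfig:
    """Hypothetical stand-in for the unified train/eval/pretrain config."""

    data_path: Union[str, Sequence[str]]
    data_probs: Union[str, Sequence[float], None] = None
    split: str = "train"  # assumed field; one of "train", "eval", "pretrain"

    def __post_init__(self):
        # Auto-normalization: both fields end up as plain lists.
        self.data_path = _as_list(self.data_path, str)
        if not self.data_path:
            raise ValueError("data_path must name at least one dataset")
        if self.data_probs is None:
            # Assumed default: sample all datasets uniformly.
            count = len(self.data_path)
            self.data_probs = [1.0 / count] * count
        else:
            self.data_probs = _as_list(self.data_probs, float)
        # Parameter validation (illustrative checks only).
        if len(self.data_probs) != len(self.data_path):
            raise ValueError("data_probs must have one entry per data_path")
        if any(prob < 0 for prob in self.data_probs):
            raise ValueError("data_probs must be non-negative")
        if self.split not in ("train", "eval", "pretrain"):
            raise ValueError(f"unknown split: {self.split!r}")

    @classmethod
    def for_train(cls, data_path, data_probs=None):
        return cls(data_path, data_probs, split="train")

    @classmethod
    def for_eval(cls, data_path, data_probs=None):
        return cls(data_path, data_probs, split="eval")

    @classmethod
    def for_pretrain(cls, data_path, data_probs=None):
        return cls(data_path, data_probs, split="pretrain")
```

Under these assumptions, DatasetConfig.for_train("a.jsonl,b.jsonl", "0.5,0.5") and DatasetConfig.for_train(["a.jsonl", "b.jsonl"], [0.5, 0.5]) produce the same normalized config, which is the point of accepting both string and list inputs.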

2. Reward Module (lightrft/reward/)

Added:

  • __init__.py: Module entry point with unified exports

  • base.py: BaseReward abstract base class

    • Unified compute() method signature
    • Consistent return format: (rewards, metrics)
  • rule.py: RuleReward class

    • Rule-based reward implementation
    • Format checking (e.g., <think> tags, \boxed{} notation)
    • Accuracy verification using mathruler grader
    • Registry pattern for custom rule types
    • Built-in rules: default, geo3k_*, gsm8k_*
  • model.py: Reward model implementations

    • SingleRewardModel: Single reward model wrapper with auto load/offload
    • MultiRewardModel: Multiple reward model ensemble with recipe-based aggregation
    • Supports standard PyTorch models and custom engines (e.g., SGLang)
  • manager.py: RewardManager class

    • Unified manager for all reward types
    • Auto-selection of reward implementation (rule/single/multi)
    • from_config() factory method
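
To make the compute() contract and the rule registry concrete, here is a minimal sketch under stated assumptions. It is not the implementation in this PR: register_rule, the "boxed_exact" rule name, the 0.5/0.5 weighting, and the metric key are hypothetical, plain Python lists stand in for whatever tensor type the real module returns, and an exact string match stands in for the mathruler grader.

```python
import re
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Tuple

# Registry mapping a rule name to a scoring function (response, reference) -> float.
RuleFn = Callable[[str, str], float]
_RULES: Dict[str, RuleFn] = {}


def register_rule(name: str):
    """Decorator that adds a scoring function to the registry (hypothetical helper)."""
    def decorator(fn: RuleFn) -> RuleFn:
        _RULES[name] = fn
        return fn
    return decorator


class BaseReward(ABC):
    """Every reward type exposes compute() and returns (rewards, metrics)."""

    @abstractmethod
    def compute(
        self, responses: List[str], references: List[str]
    ) -> Tuple[List[float], Dict[str, float]]:
        ...


_THINK = re.compile(r"<think>.*?</think>", re.DOTALL)
_BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


@register_rule("boxed_exact")
def boxed_exact(response: str, reference: str) -> float:
    # Format check: the response contains a <think>...</think> span.
    format_ok = bool(_THINK.search(response))
    # Accuracy check: exact string match on the first \boxed{} answer.
    # (Stand-in for a real grader; nested braces are not handled.)
    match = _BOXED.search(response)
    answer_ok = match is not None and match.group(1).strip() == reference.strip()
    # Assumed weighting, chosen only for illustration.
    return 0.5 * format_ok + 0.5 * answer_ok


class RuleReward(BaseReward):
    def __init__(self, rule: str = "boxed_exact"):
        if rule not in _RULES:
            raise KeyError(f"unknown rule type: {rule!r}")
        self._score = _RULES[rule]

    def compute(self, responses, references):
        rewards = [self._score(r, ref) for r, ref in zip(responses, references)]
        mean = sum(rewards) / len(rewards) if rewards else 0.0
        return rewards, {"reward_mean": mean}
```

With this shape, a custom rule type is just another function decorated with register_rule, and callers such as a manager class can treat rule-based and model-based rewards uniformly because both return the same (rewards, metrics) pair.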

@puyuan1996 added the refactor label (Cleanup, formatting, or restructuring of existing code.) on Jan 4, 2026
@puyuan1996 mentioned this pull request on Jan 23, 2026 (1 task)

2 participants