feat(grpo): add pluggable reward functions / verifier registry

`reward.py` currently exposes only `binary_task_success` and offers no extension point. RL training use cases need custom per-task verifiers and a way to compose multiple rewards.

The current reward is hardcoded at `rollout_collector.py:140`.

**Proposed design:**
- Implement a reward function protocol matching TRL's `reward_funcs` pattern (list of callables)
- Support a `TaskVerifierRegistry` for registering task-specific verification functions
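
A minimal sketch of how the two pieces could fit together. It assumes reward callables take `(prompts, completions, **kwargs)` and return one float per completion, which is how TRL's `reward_funcs` are commonly written. The `task_type` and `reference` keyword arguments, the `register` / `as_reward_func` methods, and the `combine_rewards` helper are hypothetical names chosen for illustration and are not taken from the existing code.

```python
from typing import Callable, Dict, List, Optional, Protocol, Sequence


class RewardFunc(Protocol):
    """Assumed callable shape: one float per completion."""

    def __call__(
        self, prompts: Sequence[str], completions: Sequence[str], **kwargs
    ) -> List[float]: ...


# Hypothetical verifier signature: (completion, reference) -> pass/fail.
Verifier = Callable[[str, str], bool]


class TaskVerifierRegistry:
    """Maps a task type name to a task-specific verification function."""

    def __init__(self) -> None:
        self._verifiers: Dict[str, Verifier] = {}

    def register(self, task_type: str) -> Callable[[Verifier], Verifier]:
        """Decorator that registers a verifier under `task_type`."""

        def decorator(fn: Verifier) -> Verifier:
            if task_type in self._verifiers:
                raise ValueError(f"verifier already registered for {task_type!r}")
            self._verifiers[task_type] = fn
            return fn

        return decorator

    def as_reward_func(self, default: float = 0.0) -> RewardFunc:
        """Wrap the registry as a single reward callable.

        Assumes per-sample `task_type` and `reference` lists are passed as
        keyword arguments; unknown task types receive `default`.
        """

        def reward(
            prompts: Sequence[str],
            completions: Sequence[str],
            task_type: Optional[Sequence[str]] = None,
            reference: Optional[Sequence[str]] = None,
            **kwargs,
        ) -> List[float]:
            n = len(completions)
            tasks = task_type if task_type is not None else [None] * n
            refs = reference if reference is not None else [""] * n
            scores = []
            for completion, task, ref in zip(completions, tasks, refs):
                verifier = self._verifiers.get(task)
                if verifier is None:
                    scores.append(default)
                else:
                    scores.append(1.0 if verifier(completion, ref) else 0.0)
            return scores

        return reward


def combine_rewards(
    reward_funcs: Sequence[RewardFunc], weights: Optional[Sequence[float]] = None
) -> RewardFunc:
    """Compose several reward callables into one weighted sum."""
    ws = list(weights) if weights is not None else [1.0] * len(reward_funcs)
    if len(ws) != len(reward_funcs):
        raise ValueError("weights must match reward_funcs in length")

    def combined(prompts: Sequence[str], completions: Sequence[str], **kwargs) -> List[float]:
        totals = [0.0] * len(completions)
        for fn, w in zip(reward_funcs, ws):
            for i, score in enumerate(fn(prompts, completions, **kwargs)):
                totals[i] += w * score
        return totals

    return combined


# Usage: register a toy verifier and expose the registry as a reward callable.
registry = TaskVerifierRegistry()


@registry.register("exact_match")
def exact_match(completion: str, reference: str) -> bool:
    return completion.strip() == reference.strip()


verifier_reward = registry.as_reward_func()
```

With this shape, the hardcoded call in the rollout collector could be replaced by a configurable list of callables, and the existing `binary_task_success` could be kept as the default entry in that list.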

feat(grpo): add pluggable reward functions / verifier registry #38
