refactor: split RL trainer into optional in-repo verifiers-rl package#843
Merged
Cursor Bugbot has reviewed your changes and found 4 potential issues.
60abf7c to 760b693
Motivation
- […] `verifiers` by moving RL training/inference code into an optional package.
- […] (`vf-rl`, `vf-train`, `vf-vllm`) while making RL tooling installable on demand.

Description
- […] `packages/verifiers-rl` with its own `pyproject.toml`, `verifiers_rl` module, RL code (`rl.trainer`, `rl.inference`, `scripts`) and entrypoints (`vf-rl`, `vf-train`, `vf-vllm`).
- […] `verifiers_rl.rl.*` and update internal imports in the new package accordingly.
- […] `verifiers/rl/*` that proxy to `verifiers_rl` and emit clear install guidance (`uv add verifiers-rl`) when the optional package is absent.
- […] `verifiers/scripts/*` with thin wrappers that import the new package and show install hints when missing, and update `pyproject.toml` to point `vf-vllm` at the shim wrapper.
- […] `verifiers.__getattr__` lazy-import mapping to point RL symbols to `verifiers_rl` and to raise a package-specific install hint (`verifiers-rl`) for RL-related names.
- […] `rl` extra and RL-specific build settings from core `pyproject.toml` and update documentation to recommend `uv add verifiers-rl`.
- […] `.github/workflows/publish-verifiers-rl.yml` to support programmatic builds/publishing of the new package.

Testing
- […] `uv run ruff check --fix .` which completed with all checks passing.
- […] `import verifiers` works in a fresh virtualenv via `uv run python -c "import verifiers; print('ok')"`.
- […] `verifiers.RLConfig`/`verifiers.RLTrainer` and observing the `AttributeError` messaging directing users to `verifiers-rl`.
- […] `uv build packages/verifiers-rl` successfully (wheel and sdist produced).
- […] `uv pip install -e packages/verifiers-rl` in this environment and the editable install failed while building `flash-attn` in isolation due to a missing `torch` build-time requirement; this is an environment/build isolation issue and not a correctness or lint failure of the refactor.

Codex Task
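The `AttributeError` check in the testing notes depends on a module-level `__getattr__` (PEP 562) that resolves RL names lazily and turns a missing optional package into an install hint. The following is a minimal sketch of that pattern, not the code from this PR: the helper `make_lazy_getattr`, the stand-in module `demo_core`, and the `"module:attribute"` target strings (including the assumed location of `RLConfig` and `RLTrainer` under `verifiers_rl.rl.trainer`) are hypothetical and chosen only for illustration.

```python
import importlib
import types

# Hypothetical mapping: public name -> "module:attribute" in the optional package.
# The exact module paths are an assumption, not taken from the PR.
_RL_LAZY_IMPORTS = {
    "RLConfig": "verifiers_rl.rl.trainer:RLConfig",
    "RLTrainer": "verifiers_rl.rl.trainer:RLTrainer",
}


def make_lazy_getattr(module_name, lazy_imports, install_hint):
    """Build a PEP 562 module-level __getattr__ that imports names on first access."""

    def __getattr__(name):
        target = lazy_imports.get(name)
        if target is None:
            # Unknown name: behave like a normal missing attribute.
            raise AttributeError(f"module {module_name!r} has no attribute {name!r}")
        module_path, _, attr = target.partition(":")
        try:
            module = importlib.import_module(module_path)
        except ImportError as exc:
            # Optional package is absent: surface an install hint instead.
            raise AttributeError(
                f"{module_name}.{name} requires an optional package. "
                f"Install it with: {install_hint}"
            ) from exc
        return getattr(module, attr)

    return __getattr__


if __name__ == "__main__":
    # Stand-in module so the sketch runs without either package installed.
    demo = types.ModuleType("demo_core")
    demo.__getattr__ = make_lazy_getattr(
        "demo_core", _RL_LAZY_IMPORTS, "uv add verifiers-rl"
    )
    try:
        demo.RLTrainer
    except AttributeError as err:
        print(err)
```

Raising `AttributeError` rather than `ImportError` keeps `hasattr()` and `getattr(..., default)` working on the core module when the optional package is absent, which is one plausible reason to surface the hint this way.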
Note
Medium Risk
Moderate risk due to packaging/import path refactors and new release automation; breakage would primarily surface as missing RL symbols/CLIs or incorrect install guidance rather than core runtime behavior.
Overview
Extracts the RL trainer/inference implementation into a new optional package, `packages/verifiers-rl`, with its own `pyproject.toml`, `verifiers_rl` module, scripts (`vf-rl`, `vf-train`, `vf-vllm`), and docs.

Core `verifiers` drops the `rl` extra and rewires `verifiers.__getattr__` lazy imports to target `verifiers_rl`, emitting a dedicated install hint (`uv add verifiers-rl`) when RL symbols are accessed without the optional package. Existing `verifiers/rl/*` modules and `verifiers/scripts/{rl,train}.py` are replaced with thin proxy shims, and `vf-vllm` now points to a new `verifiers/scripts/vllm.py` wrapper.

Adds a GitHub Actions workflow to build/publish `verifiers-rl` on `verifiers-rl-v*` tags (with tag↔version validation), updates style CI to `uv sync` without RL extras, and refreshes training docs to point users at `verifiers-rl` and the new source locations.

Written by Cursor Bugbot for commit 3dbac23.
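The tag↔version validation in the publish workflow can be pictured as a small pre-publish check. The sketch below is an assumption about how such a check might look, not the workflow's actual script: the function names are hypothetical, and `read_project_version` assumes the version is declared statically under `[project].version` in `packages/verifiers-rl/pyproject.toml`, which the PR does not confirm.

```python
TAG_PREFIX = "verifiers-rl-v"  # matches the `verifiers-rl-v*` tag pattern


def version_from_tag(tag: str) -> str:
    """Extract the version part of a release tag such as 'verifiers-rl-v0.1.0'."""
    name = tag.removeprefix("refs/tags/")  # tolerate a full Git ref
    if not name.startswith(TAG_PREFIX) or len(name) == len(TAG_PREFIX):
        raise ValueError(f"tag {tag!r} does not look like {TAG_PREFIX}<version>")
    return name[len(TAG_PREFIX):]


def check_tag_matches_version(tag: str, package_version: str) -> None:
    """Raise ValueError if the tag's version differs from the package version."""
    tagged = version_from_tag(tag)
    if tagged != package_version:
        raise ValueError(
            f"tag version {tagged!r} does not match package version {package_version!r}"
        )


def read_project_version(pyproject_path: str) -> str:
    """Read a static [project].version (assumed layout; requires Python 3.11+)."""
    import tomllib

    with open(pyproject_path, "rb") as fh:
        return tomllib.load(fh)["project"]["version"]
```

In a workflow step, something like `check_tag_matches_version(os.environ["GITHUB_REF"], read_project_version("packages/verifiers-rl/pyproject.toml"))` would fail the job before publishing when the two disagree.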