Support for token-in vLLM endpoint #626

Merged

mikasenghaas merged 84 commits into main from tok-in-out on Dec 16, 2025
Conversation

@mikasenghaas (Member) commented Dec 12, 2025

Description

This PR integrates the custom token-in /v1/chat/completions/tokens endpoint from PRIME-RL's inference server (introduced in #1422) with verifiers, so that PRIME-RL can do multi-turn RL without mismatches caused by retokenization.

The main changes are:

  • Make interleaved_rollouts (and any other extra env kwargs) configurable via vf-eval
  • If interleaved_rollouts is configured, get_model_response correctly sets up the prompt tokens, sampling args, and client to make a request to the custom endpoint

We decided on the following defaults for reliably building prompt tokens:

  • Use vLLM for tokenization (API server capacity can be scaled with --api-server-count so tokenization does not become a bottleneck)
  • Tokenize the env_response in isolation and compute suffix tokens (tokens the chat template inserts between messages but that the LLM never produced) once on dummy messages, caching the result for later use. This should be safe in 99.9% of cases.
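The suffix-token caching described above can be sketched as follows. This is an illustrative toy, not the PR's implementation: apply_chat_template and tokenize are stand-ins (the real code calls vLLM for tokenization), and the helper names are hypothetical.

```python
from functools import lru_cache

def apply_chat_template(messages: list[dict]) -> str:
    # Toy chat template: wraps each message in role tags.
    return "".join(f"<|{m['role']}|>{m['content']}<|end|>" for m in messages)

def tokenize(text: str) -> list[str]:
    # Character-level "tokens" keep the sketch self-contained.
    return list(text)

@lru_cache(maxsize=None)
def cached_wrapper(role: str) -> tuple[list[str], list[str]]:
    # Compute the template's prefix/suffix tokens once on a dummy message,
    # using a marker to locate where the content lands, and cache the result.
    marker = "XCONTENTX"
    templated = apply_chat_template([{"role": role, "content": marker}])
    prefix, suffix = templated.split(marker)
    return tokenize(prefix), tokenize(suffix)

def env_response_tokens(message: dict) -> list[str]:
    # Tokenize the message content in isolation and splice in the cached
    # wrapper tokens, avoiding a full retokenization of the conversation.
    prefix, suffix = cached_wrapper(message["role"])
    return prefix + tokenize(message["content"]) + suffix
```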

Examples

Default behavior is unaffected, e.g. running math-python against the OpenAI API:

uv run vf-eval math-python -n1 -r1 -v 

To use the token-in endpoint, start a custom vLLM server from PRIME-RL:

uv run inference --model.name Qwen/Qwen3-4B-Instruct-2507 --enable-auto-tool-choice --tool-call-parser hermes --enable-log-requests
uv run vf-eval math-python -n1 -r1 -b http://localhost:8000/v1 -m Qwen/Qwen3-4B-Instruct-2507 -v -x '{"interleaved_rollouts": true}' 
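The -x value is a JSON object. A minimal sketch of how such a flag might be parsed (hypothetical helper, not the PR's actual code):

```python
import json

def parse_extra_env_kwargs(raw: str) -> dict:
    # Parse the JSON string passed on the command line, e.g.
    # '{"interleaved_rollouts": true}', rejecting non-object values.
    value = json.loads(raw)
    if not isinstance(value, dict):
        raise ValueError("--extra-env-kwargs must be a JSON object")
    return value
```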

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes


Note

Adds interleaved rollouts using a vLLM token-in endpoint with pre-tokenized prompts, and introduces CLI-configurable extra environment kwargs applied at runtime.

  • Core (Environment):
    • Implement interleaved rollouts path in get_model_response using custom /v1/chat/completions/tokens with pre-tokenized prompt_ids and normalized sampling args.
    • Add overlong-prompt error handler decorator and refactor arg resolution/sampling normalization.
    • New setters: set_kwargs, set_interleaved_rollouts (with warning).
  • Token utilities:
    • New verifiers/utils/token_utils.py with tokenize_vllm, get_prompt_ids, and prepare_sampling_args_for_token_prompts (cached suffix handling, overlap logic, tokens client copy).
  • CLI/Config:
    • Add --extra-env-kwargs to vf-eval; plumb through EvalConfig.extra_env_kwargs and apply via vf_env.set_kwargs in run_evaluation.
  • EnvGroup:
    • Add set_interleaved_rollouts to propagate to sub-envs.
  • Types:
    • Make State.client and State.model required (non-optional).
  • Tests:
    • Update tests/test_eval_cli.py to include extra_env_kwargs arg and validate sampling args precedence.

Written by Cursor Bugbot for commit 75fa695.

@mikasenghaas mikasenghaas requested a review from snimu December 15, 2025 16:37
@mikasenghaas mikasenghaas marked this pull request as ready for review December 15, 2025 20:34
@snimu (Contributor) left a comment:

Looks really great to me :)

@willccbb willccbb marked this pull request as draft December 15, 2025 22:48
@willccbb (Member) commented:

@mikasenghaas Another thought -- I'm not sure how much sense it makes to have the tokenizer pool managed at the verifiers layer, seems spiritually similar to inference DP which is auto-managed by vLLM / prime-rl... ideally there is always only a single tokenizer endpoint available, and if replication is needed to manage load, this can be behind the endpoint

if config.extra_env_kwargs:
    logger.info(f"Setting extra environment kwargs: {config.extra_env_kwargs}")
    for k, v in config.extra_env_kwargs.items():
        setattr(vf_env, k, v)
Bug: EnvGroup sub-environments miss interleaved_rollouts propagation

Using setattr to set extra_env_kwargs bypasses the set_interleaved_rollouts method in EnvGroup. When an EnvGroup is loaded and interleaved_rollouts is set via extra_env_kwargs, only the group's attribute is updated, but sub-environments remain with interleaved_rollouts=False. Since EnvGroup.rollout() delegates to sub-environments, and each sub-environment's get_model_response checks its own self.interleaved_rollouts, the token-in feature silently won't work for EnvGroup environments.
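A sketch of the suggested direction: route extra env kwargs through setter methods when they exist, so an EnvGroup can propagate the flag to its sub-environments. The class and helper names below are simplified stand-ins for the verifiers types, not the actual fix.

```python
class Env:
    def __init__(self):
        self.interleaved_rollouts = False

    def set_interleaved_rollouts(self, value: bool) -> None:
        self.interleaved_rollouts = value

class EnvGroup(Env):
    def __init__(self, envs: list[Env]):
        super().__init__()
        self.envs = envs

    def set_interleaved_rollouts(self, value: bool) -> None:
        super().set_interleaved_rollouts(value)
        for env in self.envs:  # propagate to sub-envs, unlike bare setattr
            env.set_interleaved_rollouts(value)

def apply_extra_env_kwargs(vf_env: Env, kwargs: dict) -> None:
    # Prefer a set_<name> method when the env defines one; fall back to setattr.
    for k, v in kwargs.items():
        setter = getattr(vf_env, f"set_{k}", None)
        if callable(setter):
            setter(v)
        else:
            setattr(vf_env, k, v)
```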


@lru_cache(maxsize=None)
def get_tokens_client(client: AsyncOpenAI) -> AsyncOpenAI:
    logger.debug("Lazily copying OpenAI client for requests to /tokenize API")
    url_without_v1 = str(client.base_url).replace("/v1/", "")
Bug: URL manipulation fails without trailing slash

The replace("/v1/", "") operation only works when the base URL includes a trailing slash after /v1. If a user configures their vLLM server with base_url="http://localhost:8000/v1" (no trailing slash), the replacement doesn't match and the URL remains unchanged. The tokenize request would then be sent to /v1/tokenize instead of /tokenize, causing the request to fail with a confusing 404 or routing error.
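A slash-agnostic way to strip the /v1 segment, sketched as a standalone helper (name hypothetical):

```python
def strip_v1(base_url: str) -> str:
    # Drop any trailing slash first, then remove a trailing "/v1" segment,
    # so both ".../v1" and ".../v1/" map to the server root.
    return base_url.rstrip("/").removesuffix("/v1")
```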


cursor[bot] left a comment (marked as outdated).

@mikasenghaas mikasenghaas merged commit 4de7908 into main Dec 16, 2025
5 checks passed