
Conversation


@jthomy commented Jan 6, 2026

What does this PR do?

Do not pass the instantiated tokenizer in the config via Ray. This commit fixes an issue with Deepseek V3.1's tokenizer in the SFT Ray trainer.

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward
    • If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

Deepseek V3.1 successfully instantiates and trains with the SFT Ray trainer.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

No API changes.

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Avoid instantiating the tokenizer before passing the resulting object to the Ray remote worker.
TrainingWorkerConfig now accepts a union type for model_config, which may be debatable; a sketch of the resulting pattern follows.
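
For illustration, a minimal sketch of the pattern (worker and field names are hypothetical, not verl's actual API): only the plain DictConfig travels through Ray, and the tokenizer is constructed inside the worker process, which sidesteps pickling the Deepseek V3.1 tokenizer on the driver.

import ray
from omegaconf import DictConfig, OmegaConf


@ray.remote
class SFTWorker:
    def __init__(self, model_cfg: DictConfig):
        # The DictConfig is plain data and pickles cleanly; the tokenizer
        # (which may not pickle, e.g. Deepseek V3.1's) is built here,
        # inside the worker process.
        from transformers import AutoTokenizer

        self.tokenizer = AutoTokenizer.from_pretrained(
            model_cfg.path, trust_remote_code=True
        )


# Driver side: ship only the raw config, never the instantiated tokenizer.
model_cfg = OmegaConf.create({"path": "deepseek-ai/DeepSeek-V3.1"})
worker = SFTWorker.remote(model_cfg)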

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request effectively resolves a pickling issue with the Deepseek V3.1 tokenizer in Ray by deferring the instantiation of the model configuration until it's inside the worker process. This is achieved by passing a DictConfig instead of the instantiated dataclass. The implementation is sound and correctly addresses the problem. I have one suggestion to ensure type hint consistency and maintain compatibility with older Python versions.

class TrainingWorkerConfig(BaseConfig):
    model_type: str = None  # model type (language_model/value_model)
-   model_config: HFModelConfig = None
+   model_config: HFModelConfig | DictConfig = None

Severity: high

The | operator for type hints was introduced in Python 3.10. To maintain consistency with the rest of the codebase, which uses types like Optional from the typing module, and to ensure compatibility with Python versions prior to 3.10, it is recommended to use Union from typing instead.

Please update the type hint as follows:

from typing import Any, Literal, Optional, Union

...

    model_config: Union[HFModelConfig, DictConfig] = None

You will need to add Union to the import from typing at the top of the file.
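
As an aside, a hedged sketch of how worker-side code might normalize the union-typed field back into a concrete HFModelConfig after it crosses the Ray boundary (the helper below is an assumption for illustration, not code from this PR):

from omegaconf import DictConfig, OmegaConf
# HFModelConfig itself lives in the verl codebase; its import is omitted here.


def resolve_model_config(model_config):
    # If a raw DictConfig arrived through Ray, materialize the real config
    # (and with it the tokenizer) inside the worker process.
    if isinstance(model_config, DictConfig):
        # Assumed constructor call; the actual HFModelConfig API may differ.
        model_config = HFModelConfig(**OmegaConf.to_container(model_config, resolve=True))
    return model_config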

@vermouth1992 (Collaborator)

Good idea! Shall we introduce a defer load tokenizer option instead of modifying the config semantics directly?

@vermouth1992 (Collaborator)

Let me draft a PR that defers tokenizer loading.
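
A rough sketch of what such a defer-load option could look like (class, field, and flag names here are hypothetical; the follow-up PR may be shaped differently): the config stays a dataclass, but the tokenizer is only built on first access inside the worker when the flag is set.

from dataclasses import dataclass, field


@dataclass
class HFModelConfigSketch:
    path: str = ""
    # When True, skip tokenizer construction until first access, so nothing
    # unpicklable travels through Ray to the worker.
    defer_load_tokenizer: bool = False
    _tokenizer: object = field(default=None, repr=False)

    def __post_init__(self):
        if not self.defer_load_tokenizer:
            _ = self.tokenizer  # eager load, mirroring the current behaviour

    @property
    def tokenizer(self):
        # Lazily construct the tokenizer on first access inside the worker.
        if self._tokenizer is None:
            from transformers import AutoTokenizer

            self._tokenizer = AutoTokenizer.from_pretrained(
                self.path, trust_remote_code=True
            )
        return self._tokenizer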

