I want to use the prefix_allowed_tokens_fn function of Hugging Face model.generate(). Which part of vLLM's source code should I modify? #415

Closed
@zoubaihan

Hello. In Hugging Face transformers' original model.generate() method, we can set the function parameter prefix_allowed_tokens_fn to restrict which tokens may be generated at each step. I want to use this function in vLLM, just as I did with the original model.generate(), to control the generation process. Could you please tell me which part of the source code I should modify so that model generation obeys my custom prefix_allowed_tokens_fn?
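To make the desired behavior concrete, here is a minimal sketch of what prefix_allowed_tokens_fn does to the next-token scores. In transformers the function is called as `fn(batch_id, input_ids)` and returns the list of token ids allowed next; everything else is masked out. The sketch uses plain Python lists instead of tensors, and the names `mask_logits`, `greedy_pick`, and `only_even_after_zero`, along with the 6-token vocabulary, are hypothetical and only for illustration. It does not use any vLLM API.

```python
import math
from typing import Callable, List, Sequence

# Same call shape as transformers' prefix_allowed_tokens_fn:
# (batch_id, token ids generated so far) -> allowed next-token ids.
PrefixFn = Callable[[int, Sequence[int]], List[int]]


def mask_logits(
    logits: List[float],
    batch_id: int,
    input_ids: Sequence[int],
    prefix_fn: PrefixFn,
) -> List[float]:
    """Set the score of every token not returned by prefix_fn to -inf."""
    allowed = set(prefix_fn(batch_id, input_ids))
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def greedy_pick(logits: List[float]) -> int:
    """Return the index of the highest score (greedy decoding)."""
    return max(range(len(logits)), key=logits.__getitem__)


def only_even_after_zero(batch_id: int, input_ids: Sequence[int]) -> List[int]:
    """Hypothetical rule over a 6-token vocabulary:
    after token 0, only even token ids are allowed."""
    vocab = range(6)
    if input_ids and input_ids[-1] == 0:
        return [t for t in vocab if t % 2 == 0]
    return list(vocab)


logits = [0.1, 2.0, 0.3, 1.5, 0.2, 0.0]

# Previous token is 3: no restriction applies, so the best token overall wins.
unconstrained = greedy_pick(mask_logits(logits, 0, [3], only_even_after_zero))

# Previous token is 0: odd ids are masked, so the best even id wins.
constrained = greedy_pick(mask_logits(logits, 0, [3, 0], only_even_after_zero))
```

Supporting this in another engine would presumably mean applying an equivalent mask to the logits of each sequence just before sampling.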
