
[RFC]: Disallow extra vocab for LoRA #23474

@WoosukKwon


Motivation.

Currently, vLLM allows each LoRA adapter to define its own additional vocabulary:

vllm/vllm/config/__init__.py

Lines 2456 to 2460 in 65197a5

lora_extra_vocab_size: int = 256
"""Maximum size of extra vocabulary that can be present in a LoRA adapter
(added to the base model vocabulary)."""
lora_vocab_padding_size: ClassVar[int] = current_platform\
    .get_lora_vocab_padding_size()
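
For context, a minimal offline-inference sketch of how this knob is exposed today (the model name and adapter path are placeholders; the keyword argument mirrors the engine argument / --lora-extra-vocab-size CLI flag):

# Minimal sketch (placeholder model and adapter paths): the extra-vocab budget is
# a server-wide engine argument, and each adapter may bring its own added tokens.
from vllm import LLM
from vllm.lora.request import LoRARequest

llm = LLM(
    model="meta-llama/Llama-2-7b-hf",  # placeholder base model
    enable_lora=True,
    lora_extra_vocab_size=256,         # the knob this RFC proposes to remove
)

outputs = llm.generate(
    "Hello, my name is",
    lora_request=LoRARequest("my-adapter", 1, "/path/to/adapter"),  # placeholder adapter
)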

However, this introduces significant complexity because:

  1. We can no longer assume a single tokenizer per model (since each LoRA adapter can have its own tokenizer).
  2. The size of the unembedding (LM head) layer becomes ambiguous, since it depends on which adapters are loaded (see the sketch after this list).
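
To illustrate point 2, a rough sketch (the function and padding rule below are illustrative assumptions based on the config snippet above, not vLLM's actual logic): the unembedding size has to cover the base vocabulary plus whatever extra tokens a loaded adapter might add, rounded up to the platform padding size, so the same base model can end up with different unembedding sizes depending on the adapters in use.

def padded_lm_head_size(base_vocab: int,
                        adapter_extra_vocabs: list[int],
                        lora_vocab_padding_size: int = 256) -> int:
    # Illustrative only: base vocab plus the largest per-adapter extra vocab,
    # rounded up to a multiple of the padding size.
    extra = max(adapter_extra_vocabs, default=0)
    total = base_vocab + extra
    return -(-total // lora_vocab_padding_size) * lora_vocab_padding_size

padded_lm_head_size(32000, [])          # 32000 -- no adapters with extra vocab
padded_lm_head_size(32000, [128, 200])  # 32256 -- grows with the loaded adapters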

Proposed Change.

Since this feature appears to be rarely used, I propose removing it. Going forward, vLLM will assume that all LoRA adapters for a given model share the same vocabulary.
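
For concreteness, a hypothetical sketch of what this could mean at adapter-load time (the helper below is not existing vLLM code): adapters that add tokens beyond the base vocabulary would be rejected rather than growing the embedding and unembedding layers.

# Hypothetical check (not existing vLLM code): reject adapters that ship extra tokens.
def validate_lora_vocab(base_vocab_size: int, adapter_num_embedding_rows: int) -> None:
    if adapter_num_embedding_rows > base_vocab_size:
        raise ValueError(
            "LoRA adapters with extra vocabulary are not supported; "
            "adapters must use the base model's tokenizer and vocabulary."
        )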

Feedback Period.

1 week

CC List.

@jeejeelee

Any Other Things.

No response
