Motivation.
Currently, vLLM allows each LoRA adapter to define its own additional vocabulary:
Lines 2456 to 2460 in 65197a5:

```python
lora_extra_vocab_size: int = 256
"""Maximum size of extra vocabulary that can be present in a LoRA adapter
(added to the base model vocabulary)."""
lora_vocab_padding_size: ClassVar[int] = current_platform\
    .get_lora_vocab_padding_size()
```
However, this introduces significant complexity because:
- We can no longer assume a single tokenizer per model (since each LoRA adapter can have its own tokenizer).
- The size of the unembedding (lm_head) layer becomes ambiguous, since it depends on each adapter's extra vocabulary (see the sketch below).
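
For illustration, here is a minimal sketch of the sizing issue. It is not vLLM's actual code; the base vocabulary size and the padding rule are assumptions, with the defaults taken from the snippet above:

```python
# Minimal sketch, NOT vLLM's implementation: with per-adapter extra vocab,
# the unembedding (lm_head) has to be allocated for the largest vocabulary
# any adapter may bring, then padded to the platform's granularity.

BASE_VOCAB_SIZE = 32_000        # assumed base model vocabulary
LORA_EXTRA_VOCAB_SIZE = 256     # default shown in the snippet above
LORA_VOCAB_PADDING_SIZE = 256   # assumed platform padding granularity


def padded_unembedding_rows(base_vocab: int, extra_vocab: int,
                            padding: int) -> int:
    """Rows the lm_head must hold so any adapter's extra tokens fit."""
    total = base_vocab + extra_vocab
    # Round up to a multiple of the padding granularity.
    return ((total + padding - 1) // padding) * padding


# Without extra vocab the layer is simply BASE_VOCAB_SIZE rows; with it,
# the size depends on lora_extra_vocab_size and the padding rule.
print(padded_unembedding_rows(BASE_VOCAB_SIZE, LORA_EXTRA_VOCAB_SIZE,
                              LORA_VOCAB_PADDING_SIZE))  # -> 32256
```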
Proposed Change.
Since this feature appears to be rarely used, I propose removing it. Going forward, vLLM will assume that all LoRA adapters for a given model share the same vocabulary.
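As a rough illustration of the post-change behavior (the function and error message below are hypothetical, not vLLM's actual API), an adapter that ships added tokens would simply be rejected rather than merged into an enlarged embedding/unembedding:

```python
# Hypothetical sketch only; names and messages are illustrative, not vLLM's API.
def validate_lora_vocab(base_vocab_size: int, adapter_vocab_size: int) -> None:
    """Reject adapters whose embedding rows exceed the base vocabulary."""
    if adapter_vocab_size > base_vocab_size:
        raise ValueError(
            "LoRA adapters with additional vocabulary are not supported; "
            "all adapters must share the base model's vocabulary "
            f"(adapter has {adapter_vocab_size} rows, base has {base_vocab_size})."
        )


# Example: an adapter that added 256 tokens on top of a 32000-token base
# model would now fail to load.
try:
    validate_lora_vocab(base_vocab_size=32_000, adapter_vocab_size=32_256)
except ValueError as e:
    print(e)
```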
Feedback Period.
1 week
CC List.
Any Other Things.
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.