
Conversation

@benchislett (Collaborator) commented May 29, 2025

The previous PR incorrectly checks the loaded_weights and will always set draft_id_to_target_id to None (sorry!).

This PR fixes the bug by checking for the updated name instead of 'd2t', checking for inclusion instead of exact match (more reliable in case the weight gets moved in the future), and adding an assertion to prevent regression.

  • Confirmed that the previous code hits the assertion and fails:
ERROR 05-29 15:17:48 [core.py:502]     assert logits.shape[1] == self.config.vocab_size, \
ERROR 05-29 15:17:48 [core.py:502] AssertionError: Expected logits to have shape (*, 128256), but got torch.Size([1, 32000])
  • Previous acceptance on sample data (4 requests from MT_Bench):
--------------------------------------------------
mean acceptance length: 1.12
--------------------------------------------------
acceptance at token 0:0.11
acceptance at token 1:0.01
acceptance at token 2:0.00
acceptance at token 3:0.00
  • New acceptance on the same data:
--------------------------------------------------
mean acceptance length: 2.99
--------------------------------------------------
acceptance at token 0:0.77
acceptance at token 1:0.55
acceptance at token 2:0.38
acceptance at token 3:0.29
  • To reproduce:
VLLM_USE_V1=1 python3 examples/offline_inference/eagle.py --method eagle3 --max_num_seqs 1 --num_spec_tokens 4 --dataset mt_bench.jsonl --num_prompts 4 --enforce_eager

Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of tests to quickly catch errors. You can run other CI tests on top of it by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@ekagra-ranjan (Contributor) commented:

Thank you for the PR!

Could you share more on what t2d, d2t, and draft_id_to_target_id are? If they are present in the yuhuili/EAGLE3-LLaMA3.1-Instruct-8B weights, could you share some examples of the parameter names?

```diff
 loaded_weights = loader.load_weights(model_weights.items())

-if 'd2t' not in loaded_weights:
+if not any('draft_id_to_target_id' in name for name in loaded_weights):
```
Member commented:

Why wouldn't:

```python
if 'draft_id_to_target_id' not in loaded_weights:
```

work? Isn't it just a set of weight names?

Collaborator (Author) commented:

I mentioned this in the description, but the idea here is to be a little more robust to future changes. Weight names are prefixed by the module they live on, such that an mlp weight would look something like model.layers.0.mlp.weight, etc. It happens that draft_id_to_target_id lives at the top level module (LlamaForCausalLM) similar to the lm_head, but if it were refactored to be a part of LlamaModel or if it were made into a class (in which case the weight might be draft_id_to_target_id.mapping or similar), this check would no longer succeed.

I am open to discussion if this seems too generic of a comparison, but I think it is a reasonable way of checking loaded weight names.
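
A minimal sketch of the difference, with hypothetical weight names:

```python
# Hypothetical example: suppose the mapping later moves under a submodule.
loaded_weights = {"lm_head.weight", "model.draft_id_to_target_id"}

# Exact match breaks as soon as the weight picks up a module prefix:
print("draft_id_to_target_id" in loaded_weights)   # False

# Substring inclusion still finds it wherever it lives:
print(any("draft_id_to_target_id" in name
          for name in loaded_weights))              # True
```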

Member commented:

> I mentioned this in the description,

I'm sorry I missed it!

> but the idea here is to be a little more robust to future changes. Weight names are prefixed by the module they live on, such that an mlp weight would look something like model.layers.0.mlp.weight, etc. It happens that draft_id_to_target_id lives at the top level module (LlamaForCausalLM) similar to the lm_head, but if it were refactored to be a part of LlamaModel or if it were made into a class (in which case the weight might be draft_id_to_target_id.mapping or similar), this check would no longer succeed.
>
> I am open to discussion if this seems too generic of a comparison, but I think it is a reasonable way of checking loaded weight names.

I'm not really experienced enough to comment on how likely such moves are, etc. But I do wonder whether and why this case is special. Wouldn't your rationale apply to all weights, in which case we would have weight-loading utility functions that handle this sort of matching generically?

It at least needs a comment; otherwise it becomes something that people are fearful of changing in the future because there might be some model out there that relies on it.

Collaborator (Author) commented:

I pushed a diff to use the AutoWeightsLoader feature for this. I think this case is somewhat unique as it enables an "optional" weight parameter, which I understand to be a bit unconventional. Let me know if this is satisfactory, or if you want additional clarification.

@benchislett (Collaborator, Author) commented:

@ekagra-ranjan These mappings enable a smaller draft_vocab_size. For yuhuili/EAGLE3-LLaMA3.1-Instruct-8B, the draft_vocab_size is 32k. In this case, it is a subset of the total target vocab (of size ~128k). Since the logits are dense, their indices need to correspond to the sampled token ids. The d2t map is an index mapping which correlates the index of a logit in the smaller vocab (of size 32k) with the corresponding index in the target vocab.
I believe t2d is the inverse mapping, and is not needed for inference.

When implementing EAGLE3, I renamed d2t to draft_id_to_target_id for clarity in our implementation. When loading weights, the d2t weight is loaded into the member for draft_id_to_target_id.

The recent PR aimed to make this optional, to support future EAGLE3-style models which do not use a smaller vocab size for the draft model. The bug is simply that when disabling the mapping, I was checking if a weight named d2t was loaded, instead of the re-named draft_id_to_target_id.
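
For concreteness, a rough sketch of how such an index map can expand the dense draft logits back to the full target vocab, assuming d2t stores per-index offsets (target_id = draft_id + d2t[draft_id]); the toy sizes and scatter below are illustrative, not vLLM's exact code:

```python
import torch

# Toy sizes standing in for draft_vocab_size=32000 and vocab_size=128256.
draft_vocab, target_vocab = 4, 10

# Assumed layout: d2t[i] is the offset from draft index i to its target
# token id, so draft ids 0..3 map to target ids 0, 3, 5, 9 here.
d2t = torch.tensor([0, 2, 3, 6])

draft_logits = torch.randn(1, draft_vocab)

# Scatter the dense draft logits into a full-width tensor so that sampled
# indices line up with real target token ids.
targets = torch.arange(draft_vocab) + d2t
full_logits = draft_logits.new_full((1, target_vocab), float("-inf"))
full_logits[:, targets] = draft_logits

assert full_logits.shape[1] == target_vocab  # the shape the new assertion checks
```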

@ekagra-ranjan (Contributor) commented:

@benchislett - could you please share where this mapping comes from and how to generate it? I did a quick search in the EAGLE3 paper and could not find it.

@benchislett (Collaborator, Author) commented:

@ekagra-ranjan it is undocumented in the paper. Its inclusion in the Hugging Face weights and the "draft_vocab_size" config value are the indicators of what it does.

It is present in the EAGLE repository; you can find it in multiple places by searching for "d2t". Here is a discussion thread on the EAGLE repository where it is explained in more detail:
@WoosukKwon added the ready label (ONLY add when PR is ready to merge/full CI is needed) on May 30, 2025
Comment on lines 217 to +220:

```python
if self.draft_id_to_target_id is None:
    assert logits.shape[1] == self.config.vocab_size, \
        "Expected logits to have shape " \
        f"(*, {self.config.vocab_size}), but got {logits.shape}"
```
Collaborator commented:

A dumb question: Now that we don't set self.draft_id_to_target_id = None, how can this branch be taken?

Collaborator commented:

@benchislett ^ Could you please answer this question?

@WoosukKwon (Collaborator) commented:

Merging this to fix the bug.
@benchislett Could you please add a test about this? We do have a basic correctness test for eagle (test_eagle_correctness), but it seems the test is too loose to capture this bug.
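
For reference, one shape such a regression test could take: run the reproduce command from the PR description and assert a floor on the printed acceptance stats. The threshold and plumbing here are illustrative, not the actual vLLM test suite:

```python
import os
import re
import subprocess

def test_eagle3_acceptance_length_floor():
    # Same command as the "To reproduce" step in the PR description.
    cmd = [
        "python3", "examples/offline_inference/eagle.py",
        "--method", "eagle3", "--max_num_seqs", "1",
        "--num_spec_tokens", "4", "--dataset", "mt_bench.jsonl",
        "--num_prompts", "4", "--enforce_eager",
    ]
    env = {**os.environ, "VLLM_USE_V1": "1"}
    out = subprocess.run(cmd, env=env, capture_output=True,
                         text=True, check=True).stdout

    match = re.search(r"mean acceptance length:\s*([0-9.]+)", out)
    assert match is not None, "acceptance stats not found in script output"
    # With the bug the mean collapsed to ~1.12; the fix restores ~2.99,
    # so any floor between the two catches a regression.
    assert float(match.group(1)) >= 2.0
```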

@WoosukKwon merged commit 1bc86a3 into vllm-project:main on Jun 1, 2025 (74 checks passed)
amitm02 pushed a commit to amitm02/vllm that referenced this pull request Jun 1, 2025
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
Signed-off-by: amit <amit.man@gmail.com>
minpeter pushed a commit to minpeter/vllm that referenced this pull request Jun 24, 2025
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
Signed-off-by: minpeter <kali2005611@gmail.com>
googlercolin pushed a commit to googlercolin/vllm that referenced this pull request Aug 29, 2025
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>