🚀 The feature, motivation and pitch
Recently, we read a paper in which the vLLM team proposed a method called SmartSpec.
We believe this work, which dynamically adjusts the speculation length inside a production LLM serving system, is more practical than existing studies on dynamic speculation length.
The idea could be applied to the current vLLM speculative decoding path with Batch Expansion enabled, and it should also carry over to future versions of vLLM once Batch Expansion is removed.
(I am curious whether the SmartSpec experiments were run on vLLM with Batch Expansion enabled. 🤔)
I wonder whether SmartSpec will be implemented in the main repository in the near future.
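To make the request concrete, here is a minimal, purely illustrative sketch of the core idea as we understand it: pick the speculation length that maximizes expected goodput (accepted tokens per unit of step time) given the observed acceptance rate and the current batch size. The helper names and the cost model below are hypothetical placeholders, not SmartSpec or vLLM code; the paper's actual formulation differs.

```python
# Toy sketch of goodput-based speculation-length selection.
# NOT the SmartSpec implementation; cost model and names are illustrative assumptions.

def expected_accepted_tokens(k: int, alpha: float) -> float:
    """Expected tokens emitted per verification step for a chain draft of length k,
    assuming a per-token acceptance rate alpha (standard speculative decoding result)."""
    if alpha == 1.0:
        return k + 1
    return (1.0 - alpha ** (k + 1)) / (1.0 - alpha)

def estimate_step_time(k: int, batch_size: int) -> float:
    """Hypothetical cost model: draft cost grows with k, target verification cost
    grows with batch_size * (k + 1). A real system would profile these numbers."""
    draft_cost = 0.3 * k
    verify_cost = 1.0 + 0.02 * batch_size * (k + 1)
    return draft_cost + verify_cost

def choose_speculation_length(alpha: float, batch_size: int, max_k: int = 8) -> int:
    """Pick the k (0 = no speculation) that maximizes tokens per unit time."""
    best_k, best_goodput = 0, 0.0
    for k in range(0, max_k + 1):
        goodput = expected_accepted_tokens(k, alpha) / estimate_step_time(k, batch_size)
        if goodput > best_goodput:
            best_k, best_goodput = k, goodput
    return best_k

print(choose_speculation_length(alpha=0.8, batch_size=4))    # small batch -> longer speculation (k=3 here)
print(choose_speculation_length(alpha=0.8, batch_size=256))  # large batch -> verification dominates, k=0 here
```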
Alternatives
No response
Additional context
No response
Yes, we implemented SmartSpec on top of vLLM with batch expansion in a fork. We will integrate SmartSpec into vLLM very soon. The first step is to remove batch expansion (#5691). In the meantime, we also need community effort to improve speculative decoding performance (#4630) and to implement tree-style speculative decoding (#4978).
SmartSpec (#4565) itself is very lightweight and can be implemented quickly. After all of the above steps, we should see performance similar to what is described in the paper.
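For readers following along, here is a small conceptual sketch of why removing batch expansion matters for a dynamic speculation length. This is not vLLM's actual scorer code, and the function and argument names are illustrative assumptions.

```python
# Conceptual sketch (not vLLM code) of "batch expansion": scoring k draft tokens
# with a target model that only returns logits for the last position of each row.

from typing import List

def expand_batch(prefixes: List[List[int]], drafts: List[List[int]]) -> List[List[int]]:
    """For each sequence, emit k+1 rows: the prefix extended by 0..k draft tokens.
    Running the target model over the expanded batch yields the logits needed to
    verify every draft token and sample one bonus token."""
    expanded = []
    for prefix, draft in zip(prefixes, drafts):
        for i in range(len(draft) + 1):
            expanded.append(prefix + draft[:i])
    return expanded

# A batch of 2 sequences with k=3 draft tokens each becomes 2 * (3 + 1) = 8 rows,
# so memory and scheduling overhead grow with k. Scoring draft tokens in place
# (the direction of #5691) avoids this blow-up, which is part of what makes a
# dynamically chosen k cheap to exploit.
rows = expand_batch(prefixes=[[1, 2], [3]], drafts=[[10, 11, 12], [20, 21, 22]])
assert len(rows) == 8
```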