Closed as not planned
Description
🚀 The feature, motivation and pitch
According to the documentation (https://docs.vllm.ai/en/stable/models/spec_decode.html), speculative decoding does not currently support tensor parallelism (TP) for the speculative model.
However, the lack of TP support greatly limits the choice of speculative models. I would like to know why speculative models have no TP support. I am trying to read and modify this part of the code, but I don't understand why the scorer model can use TP while the speculative model cannot. What system design considerations are behind this?
Alternatives
No response
Additional context
No response