feat: support speculative decoding. #31

SanftMonster · 2023-11-10T18:44:11Z

Support speculative decoding, currently only for v5.2 models on the cuda backend.

I tested the example and it runs well, but only a 1.5B v5.2 model is available so far, so further testing is needed once models of other scales are released. The modifications don't break the current functionality of cuda (I'm not sure about ncnn). I'd suggest reviewing this now rather than waiting for the model release.
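For reviewers unfamiliar with the technique, here is a minimal CPU-only sketch of one round of greedy speculative decoding. It is not the code in this PR: `NextTokenFn` and `speculative_step` are hypothetical names, both models are reduced to "return the greedy next token" callbacks, and the target model is called once per position for clarity, whereas a real implementation would typically verify all drafted positions in a single batched forward pass.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical interface: given the tokens so far, return the greedy next token.
using NextTokenFn = std::function<int(const std::vector<int>&)>;

// One round of greedy speculative decoding (illustrative sketch only).
// The draft model proposes k tokens; the target model checks them in order.
// Matching tokens are kept; at the first mismatch the target's token is kept
// and the rest of the draft is discarded.
// Returns how many tokens were appended to `tokens` (between 1 and k).
std::size_t speculative_step(std::vector<int>& tokens,
                             const NextTokenFn& draft,
                             const NextTokenFn& target,
                             std::size_t k) {
  assert(k > 0);

  // Draft phase: the small model extends a copy of the sequence by k tokens.
  std::vector<int> proposal = tokens;
  for (std::size_t i = 0; i < k; ++i) {
    proposal.push_back(draft(proposal));
  }

  // Verify phase: the target model decides what is actually appended.
  const std::size_t start = tokens.size();
  for (std::size_t i = 0; i < k; ++i) {
    const int expected = target(tokens);
    tokens.push_back(expected);
    if (expected != proposal[start + i]) {
      break;  // draft diverged from the target at this position
    }
  }
  return tokens.size() - start;
}
```

With greedy verification the output is identical to what the target model alone would produce; the speedup comes from accepting several drafted tokens per target pass when the two models agree.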

TODO:

Python support
Verification with models of different scales

SanftMonster · 2023-11-10T18:45:12Z

model.h

  std::vector<Tensor> _embd_weights;

-private:
+public:


I'm confused: compilation fails if I keep this private. Could you please take a look? @daquexian
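One possible cause, assuming the new speculative decoding code reads a member declared under `private:` from outside the class, is an ordinary access violation. The sketch below shows two ways to keep such a member private; `Model`, `_state`, `state()` and both helper functions are hypothetical stand-ins, not the actual declarations in model.h.

```cpp
#include <cstddef>
#include <vector>

struct Tensor {};  // hypothetical stand-in for the project's Tensor type

class Model {
 public:
  explicit Model(std::size_t n) : _state(n) {}

  // Option 1: expose a read-only accessor instead of the member itself.
  const std::vector<Tensor>& state() const { return _state; }

 private:
  std::vector<Tensor> _state;  // hypothetical private member

  // Option 2: grant access to one specific free function.
  friend std::size_t state_size_via_friend(const Model& m);
};

// Uses only the public accessor, so it compiles with _state kept private.
std::size_t state_size_via_accessor(const Model& m) { return m.state().size(); }

// Declared a friend above, so it may touch the private member directly.
std::size_t state_size_via_friend(const Model& m) { return m._state.size(); }
```

If neither pattern matches the failing call site, the compiler error message should name the exact member and the function that tries to access it.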

SanftMonster added 2 commits November 10, 2023 01:56

feat: support speculative decoding.

906e0c4

fix: errors in speculative decoding chat.

45503e0
