[V1][Spec Decode][Ngram] 1.35x gain -> 1.95x gain on InstructCoder with prompt fix #18971
So far, InstructCoder (#15303) is the only dataset on which we can benchmark a realistic editing task where ngram shines. Earlier, we got only a 1.35x gain even though the ngram match rate was expected to be high, since we are editing code (see the dataset).
On investigation, I found that the ngram matches were not happening because the prompt was not set up correctly: instead of outputting the edited code, the model either continued the old code or wrote an essay about it. To get high match rates, I changed the prompt to
Just output the code, do not include any explanation.
which further increased the ngram match rate, since the model was still writing a few lines about its changes before writing the code. I also tried a more descriptive prompt, but it gave the same speedup.
With these changes, we go from a 1.35x gain to a 1.95x gain.
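To illustrate why the prompt matters so much here, below is a minimal sketch of ngram (prompt-lookup) drafting. It is not vLLM's actual proposer: the function name `propose_ngram`, the parameters `n` and `k`, and the whitespace "tokens" are all assumptions made for illustration. The idea is that drafts can only be proposed when the last few generated tokens already appear earlier in the context, which happens when the model reproduces the code from the prompt and does not happen when it writes prose about it.

```python
from typing import List, Optional


def propose_ngram(context: List[str], n: int = 3, k: int = 4) -> Optional[List[str]]:
    """Hypothetical helper: draft up to k tokens by finding an earlier
    occurrence of the last n tokens of the context."""
    if len(context) <= n:
        return None
    suffix = context[-n:]
    # Scan earlier start positions, most recent first. The range excludes
    # the position of the suffix itself, so any hit is a genuine earlier match.
    for start in range(len(context) - n - 1, -1, -1):
        if context[start:start + n] == suffix:
            # The tokens that followed the earlier occurrence become the draft.
            return context[start + n:start + n + k]
    return None


# Toy "tokens" (whitespace-split); a real system works on token ids.
prompt = "def add ( a , b ) : return a + b".split()

# Case 1: the model starts reproducing the code from the prompt.
code_ctx = prompt + "def add ( a".split()
code_draft = propose_ngram(code_ctx)

# Case 2: the model writes an explanation instead of code.
essay_ctx = prompt + "This function adds two".split()
essay_draft = propose_ngram(essay_ctx)

print(code_draft)   # draft copied from the prompt
print(essay_draft)  # no match, so nothing to draft
```

In the first case the suffix `add ( a` is found in the prompt and the following tokens are offered as a draft; in the second case there is no match, so decoding falls back to one token per step.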
Vanilla
start server
bench
Result
Ngram
start server
bench
Before this PR
After this PR
All benchmarks were run at batch size 4 (BS4).
TPOT
cc: @CXIAAAAA @LiuXiaoxuanPKU