@@ -95,6 +95,24 @@ become available.
<td style="text-align: center;">✅</td>
<td><code>lmms-lab/LLaVA-OneVision-Data</code>, <code>Aeala/ShareGPT_Vicuna_unfiltered</code></td>
</tr>
+ <tr>
+ <td><strong>HuggingFace-MTBench</strong></td>
+ <td style="text-align: center;">✅</td>
+ <td style="text-align: center;">✅</td>
+ <td><code>philschmid/mt-bench</code></td>
+ </tr>
+ <tr>
+ <td><strong>HuggingFace-Blazedit</strong></td>
+ <td style="text-align: center;">✅</td>
+ <td style="text-align: center;">✅</td>
+ <td><code>vdaita/edit_5k_char</code>, <code>vdaita/edit_10k_char</code></td>
+ </tr>
+ <tr>
+ <td><strong>Spec Bench</strong></td>
+ <td style="text-align: center;">✅</td>
+ <td style="text-align: center;">✅</td>
+ <td><code>wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl</code></td>
+ </tr>
<tr>
<td><strong>Custom</strong></td>
<td style="text-align: center;">✅</td>
@@ -239,6 +257,43 @@ vllm bench serve \
--num-prompts 2048
```

+ ### Spec Bench Benchmark with Speculative Decoding
+
+ ```bash
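+ # Start the target model with n-gram (prompt lookup) speculative decoding:
+ # it drafts up to 5 tokens per step by matching 2-5 token n-grams from the prompt.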
+ VLLM_USE_V1=1 vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
+ --speculative-config $'{"method": "ngram",
+ "num_speculative_tokens": 5, "prompt_lookup_max": 5,
266
+ "prompt_lookup_min": 2}'
267
+ ```
268
+
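+ The benchmark commands below send requests to this running server.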
+ Prompts come from the [SpecBench dataset](https://github.com/hemingkx/Spec-Bench).
+
+ Run all categories:
+
+ ```bash
+ # Download the dataset using:
+ # wget https://raw.githubusercontent.com/hemingkx/Spec-Bench/refs/heads/main/data/spec_bench/question.jsonl
+
+ vllm bench serve \
+ --model meta-llama/Meta-Llama-3-8B-Instruct \
+ --dataset-name spec_bench \
+ --dataset-path "<YOUR_DOWNLOADED_PATH>/data/spec_bench/question.jsonl" \
+ --num-prompts -1
+ ```
+
+ Available categories include `[writing, roleplay, reasoning, math, coding, extraction, stem, humanities, translation, summarization, qa, math_reasoning, rag]`.
+
+ Run only a specific category like "summarization":
+
+ ```bash
+ vllm bench serve \
+ --model meta-llama/Meta-Llama-3-8B-Instruct \
+ --dataset-name spec_bench \
+ --dataset-path "<YOUR_DOWNLOADED_PATH>/data/spec_bench/question.jsonl" \
+ --num-prompts -1 \
+ --spec-bench-category "summarization"
+ ```
+
### Other HuggingFaceDataset Examples

```bash
@@ -295,6 +350,18 @@ vllm bench serve \
--num-prompts 80
```

+ `vdaita/edit_5k_char` or `vdaita/edit_10k_char`:
+
+ ```bash
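+ # The distance bounds below filter the dataset, dropping samples whose edits are
+ # nearly identical to, or a near-total rewrite of, the original text.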
+ vllm bench serve \
+ --model Qwen/QwQ-32B \
+ --dataset-name hf \
+ --dataset-path vdaita/edit_5k_char \
+ --num-prompts 90 \
+ --blazedit-min-distance 0.01 \
+ --blazedit-max-distance 0.99
+ ```
+
### Running With Sampling Parameters

When using OpenAI-compatible backends such as `vllm`, optional sampling