
Commit 21002ac

Author: sanchit-gandhi
Commit message: update doc
Parent: 719a80a

File tree: 1 file changed (+3, -2 lines)


docs/source/en/model_doc/whisper.md

Lines changed: 3 additions & 2 deletions
````diff
@@ -72,7 +72,7 @@ Here is a step-by-step guide to transcribing an audio sample using a pre-trained
 ' Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel.'
 ```
 
-Whisper is compatible with the following optimisations:
+Whisper is compatible with the following optimisations for both short and long-form generation:
 - [PyTorch Scaled Dot Product Attention (SDPA)](../perf_infer_gpu_one#pytorch-scaled-dot-product-attention): flash attention and memory-efficient attention kernels. Enabled by default for `torch>=2.1.1`.
 - [Flash Attention 2](../perf_infer_gpu_one#flashattention-2): improved implementation of flash attention through better parallelism and work partitioning.
 - [torch.compile](../llm_optims#static-kv-cache-and-torchcompile): JIT-compile the forward pass to dispatch to efficient fused kernels.
````
```diff
@@ -101,7 +101,8 @@ As an example, the following code snippet enables SDPA and `torch.compile` for up
 ... ).input_features
 
 >>> # Compile the forward pass
->>> _ = model.generate(input_features)
+>>> for _ in range(2):
+>>>     model.generate(input_features)
 
 >>> # Generate token ids using compiled graph (fast!)
 >>> predicted_ids = model.generate(input_features)
```
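The change above replaces a single warm-up `generate` call with a two-iteration loop: `torch.compile` compiles lazily on the first call(s), so running the forward pass a couple of times up front moves the one-off compilation cost out of the subsequent, timed generation. A minimal stand-alone sketch of that warm-up pattern, using a toy `jit_compile` wrapper as a stand-in for `torch.compile` (all names here are illustrative, not transformers or PyTorch APIs):

```python
import time

def jit_compile(fn):
    """Toy stand-in for torch.compile: 'compilation' happens lazily on the
    first call, so early calls pay a one-off cost."""
    state = {"compiled": None}
    def wrapper(x):
        if state["compiled"] is None:
            time.sleep(0.05)          # stand-in for graph capture + codegen
            state["compiled"] = fn    # cache the "compiled" artifact
        return state["compiled"](x)
    return wrapper

# Illustrative model: doubles each input feature.
generate = jit_compile(lambda features: [f * 2 for f in features])

# Warm-up: run the forward pass a couple of times so the lazy
# compilation (triggered on the first call) is out of the way.
for _ in range(2):
    generate([1, 2, 3])

# Later calls dispatch straight to the cached ("compiled") function.
start = time.perf_counter()
result = generate([1, 2, 3])
elapsed = time.perf_counter() - start

print(result)               # [2, 4, 6]
print(elapsed < 0.05)       # True: no compilation cost after warm-up
```

The same reasoning applies to the documented Whisper snippet: only after the warm-up loop do calls to `model.generate` hit the compiled graph, which is why the doc labels the final call "fast!".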
