Commit 858205a

Add workaround to recover the perf for quantized vit in torch.compile (#926)
Add a temporary workaround to recover the performance of quantized ViT under torch.compile.

Summary: We recently found a performance drop in quantized ViT caused by #898 (comment). This PR adds a temporary fix until we figure out the longer-term one. Ideally we should figure out why the tensor subclass check fails in torch.compile (https://github.com/pytorch/pytorch/blob/e4d294221b140fdbb49a64f297bc60c9fcc2f80e/torch/nn/modules/activation.py#L1286) and fix that.

Test Plan: python tutorials/quantize_vit/run_vit_b_quant.py
1 parent 653efe9 commit 858205a

File tree

1 file changed: +3 −0 lines changed

tutorials/quantize_vit/run_vit_b_quant.py

```diff
@@ -36,6 +36,9 @@
 if not TORCH_VERSION_AT_LEAST_2_5:
     unwrap_tensor_subclass(model)
 
+# temporary workaround to recover the perf with quantized model under torch.compile
+torch.backends.mha.set_fastpath_enabled(False)
+
 model = torch.compile(model, mode='max-autotune')
 
 # Must run with no_grad when optimizing for inference
```
