Commit
Split quantize_pt2 to allow calling the same APIs in testing and regular flows (#4505)

Summary:
Pull Request resolved: #4505

Splitting `quantize_pt2` into two steps: `convert_pt2` and `fuse_pt2`. Convert returns the converted model after `convert_pt2e`, which allows getting reference outputs for testing. Fuse returns the final fused graph. Both calls should always use the same quantizer. Note that we will probably split the convert step again to allow calibration in a follow-up diff.

`quantize_pt2` is still the one-liner API, for anything that doesn't require converted reference outputs (so mostly for e2e testing).

The main benefit is that we can use the same API everywhere now, and things like decomposing SDPA and any other ATen IR passes that need to run before quantization can be done in one location (in `convert_pt2`).

Reviewed By: dulinriley

Differential Revision: D60544102

fbshipit-source-id: 7866d26c6ed05cb8a8bf02eb7920a7adbac5f03a
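
To illustrate the two-step split described in this commit, here is a minimal, self-contained sketch of the call pattern. It does not use the real implementation: `ToyQuantizer`, `ToyModel`, the function signatures, and the rounding behavior are all hypothetical stand-ins, and only the names `convert_pt2`, `fuse_pt2`, and `quantize_pt2` come from the commit message.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ToyQuantizer:
    """Hypothetical stand-in for a quantizer: just a name and a step size."""
    name: str = "toy"
    scale: float = 0.5


@dataclass
class ToyModel:
    """Hypothetical stand-in for a graph module: a callable plus a pass log."""
    fn: Callable[[float], float]
    stages: List[str] = field(default_factory=list)

    def __call__(self, x: float) -> float:
        return self.fn(x)


def convert_pt2(model: ToyModel, quantizer: ToyQuantizer) -> ToyModel:
    # Assumed behavior: run pre-quantization passes here, in one place,
    # then return a converted model usable for reference outputs.
    stages = model.stages + ["decompose", f"convert[{quantizer.name}]"]
    scale = quantizer.scale

    def converted_fn(x: float) -> float:
        # Toy fake-quantization: snap the output to a multiple of `scale`.
        return round(model.fn(x) / scale) * scale

    return ToyModel(converted_fn, stages)


def fuse_pt2(converted: ToyModel, quantizer: ToyQuantizer) -> ToyModel:
    # Assumed behavior: fusion changes the graph but not the numerics.
    return ToyModel(converted.fn, converted.stages + [f"fuse[{quantizer.name}]"])


def quantize_pt2(model: ToyModel, quantizer: Optional[ToyQuantizer] = None) -> ToyModel:
    # One-liner path: both steps, sharing a single quantizer instance.
    quantizer = quantizer or ToyQuantizer()
    return fuse_pt2(convert_pt2(model, quantizer), quantizer)


# Testing flow: keep the converted model to obtain a reference output.
quantizer = ToyQuantizer()
float_model = ToyModel(lambda x: 2.0 * x)
converted = convert_pt2(float_model, quantizer)
reference = converted(0.6)
fused = fuse_pt2(converted, quantizer)
assert fused(0.6) == reference

# Regular flow: the one-liner runs the same passes in the same order.
assert quantize_pt2(float_model, quantizer).stages == fused.stages
```

Passing the same `quantizer` object to both steps mirrors the requirement that convert and fuse use the same quantizer. Keeping the pre-quantization passes inside `convert_pt2` is what lets the testing flow and the one-liner share them.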