Description
🚀 The feature, motivation and pitch
First surfaced in #1057: the replace_attention_with_custom_sdpa_attention function, used when exporting models in torchchat, can be replaced with the equivalent API provided in ExecuTorch: https://github.com/pytorch/executorch/blob/main/examples/models/llama2/source_transformation/sdpa.py
Task: Replace the torchchat implementation with ExecuTorch's, then delete the now-defunct code from torchchat.
Alternatives
No response
Additional context
No response
RFC (Optional)
No response