This is a tracker issue for all the different ways we can accelerate training / inference with activation sparsity in TorchAO.
## Inference
- Accelerate memory-bound bs=1 decode use cases with a selective weight-loading kernel, like the ones described in TEAL / CATS (see the sketch below).

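As a reference point, here is a minimal PyTorch sketch of the idea (the function name and threshold are mine, not from TEAL / CATS): for a bs=1 matvec, only the weight columns corresponding to nonzero activations contribute to the output, so a fused kernel can skip loading the rest from memory.

```python
import torch

def selective_matvec(W: torch.Tensor, x: torch.Tensor, threshold: float = 0.1) -> torch.Tensor:
    """Eager-mode sketch of TEAL / CATS-style selective weight loading."""
    # Magnitude thresholding: treat small activations as exactly zero.
    mask = x.abs() > threshold
    # Only the weight columns of surviving activations contribute to the
    # output; a real kernel would skip loading W[:, ~mask] from memory
    # entirely. This eager version still materializes the gather, so it
    # demonstrates the math rather than the speedup.
    return W[:, mask] @ x[mask]
```
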
- Accelerate compute-bound bs=n prefill use cases with 2:4 activation sparsity, as we outlined in https://arxiv.org/pdf/2503.16672
- Add fast fused sparsification + fp8 rowwise + srelu kernels (2:4 activation sparsity packing kernels #2012); an unfused reference sketch follows this list.
  - David also apparently has a Triton kernel that does this, so we should benchmark both and see which is faster.
- Add rowwise-fp8 + 2:4 sparse CUTLASS kernel (Add CUTLASS-based row-wise scaled sparse FP8 kernel #1671)
- Add performance tuning configs for the above kernel (Add config selection for row-wise scaled FP8 sparse CUTLASS-based kernel #1940)
- Add transposed support to the rowwise-fp8 sparse CUTLASS kernel. The above kernel assumes that the weight is 2:4 sparse. Since 2:4 sparsity is only supported for the first operand, I'm using the fact that $xW^T = (Wx^T)^T$ to use the kernel for activation sparsity, but this means the output of the kernel is in col-major format instead of row-major (a short layout demo follows below).
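
For reference, here is an unfused eager-mode sketch of what a fused sparsification + fp8 rowwise + srelu kernel needs to compute (the function name and the fp8 dtype choice are my assumptions; the real kernel fuses all three steps into a single pass over the activations):

```python
import torch

def srelu_fp8_rowwise_24_reference(x: torch.Tensor):
    """Unfused reference: squared ReLU -> 2:4 prune -> rowwise fp8 quant."""
    # Squared ReLU, which naturally drives most activations to zero.
    x = torch.relu(x).square()
    # 2:4 sparsification: keep the 2 largest values in every group of 4.
    g = x.reshape(-1, 4)
    idx = g.topk(2, dim=-1).indices  # all values are >= 0 after squared ReLU
    mask = torch.zeros_like(g, dtype=torch.bool).scatter_(-1, idx, True)
    x = (g * mask).reshape(x.shape)
    # Rowwise fp8 scaling: one scale per row so each row uses the full range.
    fp8_max = torch.finfo(torch.float8_e4m3fn).max
    scale = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-12) / fp8_max
    return (x / scale).to(torch.float8_e4m3fn), scale
```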
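
The layout issue from the last item can be demonstrated in a few lines: computing $xW^T$ through the transposed product leaves the result with col-major strides, so downstream row-major kernels need an extra copy, which transposed support would eliminate.

```python
import torch

x = torch.randn(8, 16)   # activations (these would be the 2:4 sparse operand)
W = torch.randn(32, 16)  # dense weight

out = (W @ x.t()).t()    # same values as x @ W.t()
assert torch.allclose(out, x @ W.t(), rtol=1e-4, atol=1e-5)
print(out.is_contiguous())      # False: the result has col-major strides
out_rm = out.contiguous()       # the extra copy a transposed kernel avoids
```
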
## Training
- Activation compression to accelerate 2:4 sparse training (#1920, activation sparsity + compression #2076) has an implementation that I need to benchmark / review; a sketch of the idea follows this list.
- Implement the custom sparse training kernels outlined in our ICLR paper. Lower priority for now.
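
For context on the first item, here is a minimal sketch of the compression idea as I understand it (helper names are mine, not from #2076): once activations are 2:4 sparse, the backward pass only needs the two kept values per group of four plus their positions, roughly halving activation memory. A real implementation would pack the indices into 2-bit metadata; this just shows the structure.

```python
import torch

def compress_24(x: torch.Tensor):
    """Store only the 2 kept values per group of 4, plus their positions."""
    # Assumes x is already 2:4 sparse (at most 2 nonzeros per group of 4).
    g = x.reshape(-1, 4)
    idx = g.abs().topk(2, dim=-1).indices   # positions of the kept values
    vals = g.gather(-1, idx)                # the kept values themselves
    return vals, idx.to(torch.uint8)

def decompress_24(vals: torch.Tensor, idx: torch.Tensor, shape) -> torch.Tensor:
    """Scatter the saved values back into a dense tensor for backward."""
    g = torch.zeros(vals.shape[0], 4, dtype=vals.dtype, device=vals.device)
    g.scatter_(-1, idx.long(), vals)
    return g.reshape(shape)
```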