2:4 Sparsity acceleration does not deliver any benefit. #3236

Open
@Moritz-Tho123

Description

In the conclusion of the tutorial for 2:4 sparsity here, the claimed speedup of 2:4 sparsity over dense execution is given as 1.3x-2.0x. However, the values actually shown in the terminal output of the tutorial's dense and sparse sections give the following table:

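| Batch size | compile | Dense | Sparse | Speedup (dense / sparse) |
|---:|:---:|---:|---:|---:|
| 4 | n | 9.56 | 16.77 | 0.57x |
| 4 | y | 8.98 | 9.49 | 0.95x |
| 16 | n | 31.86 | 62.27 | 0.51x |
| 16 | y | 30.83 | 34.29 | 0.90x |
| 64 | n | 123.97 | 243.16 | 0.51x |
| 64 | y | 104.98 | 133.49 | 0.79x |
| 256 | n | 476.03 | 1195.23 | 0.40x |
| 256 | y | 397.13 | 542.3 | 0.73x |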
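
For anyone who wants to check the arithmetic, here is a minimal sketch of how the Speedup column above can be recomputed. It assumes the column is simply the dense timing divided by the sparse timing, so a value below 1.0 means the sparse run was slower. The names `rows` and `speedup` are only illustrative, and the timings are copied from the table without assuming a unit.

```python
# Timings copied from the table above: (batch size, compile flag, dense, sparse).
rows = [
    (4, "n", 9.56, 16.77),
    (4, "y", 8.98, 9.49),
    (16, "n", 31.86, 62.27),
    (16, "y", 30.83, 34.29),
    (64, "n", 123.97, 243.16),
    (64, "y", 104.98, 133.49),
    (256, "n", 476.03, 1195.23),
    (256, "y", 397.13, 542.3),
]


def speedup(dense: float, sparse: float) -> float:
    """Ratio dense / sparse; below 1.0 means the sparse run took longer."""
    return dense / sparse


# One rounded speedup per table row, in the same order as the table.
speedups = [round(speedup(dense, sparse), 2) for _, _, dense, sparse in rows]

for (bs, compiled, _, _), ratio in zip(rows, speedups):
    print(f"bs={bs:<4} compile={compiled}  speedup={ratio:.2f}x")

# Count how many configurations show the sparse run ahead of the dense one.
faster = sum(1 for ratio in speedups if ratio > 1.0)
print(f"sparse faster than dense in {faster} of {len(rows)} configurations")
```

Under that assumption the recomputed ratios match the Speedup column, and none of the eight configurations reaches 1.0x.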

As the table shows, the sparse computation is slower than the dense one in every configuration. I reran these experiments with torch 2.5.1+cu2.4 on a single H100 and observed similar results.

Why are the measured values so much worse than the claimed speedup?
