sparse-attention

Here are 19 public repositories matching this topic...

NVlabs / LongLive

LongLive: Real-time Interactive Long Video Generation

real-time interactive sparse-attention long-context efficient-tuning video-genenratio

Updated Nov 3, 2025
Python

lucidrains / native-sparse-attention-pytorch

Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper

deep-learning artificial-intelligence attention sparse-attention

Updated Aug 15, 2025
Python

svg-project / Sparse-VideoGen

[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention

wan diffusion diffusion-model sparse-attention efficientml hunyuan-video

Updated Oct 5, 2025
Python

mit-han-lab / radial-attention

[NeurIPS 2025] Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation

wan mochi diffusion-models sparse-attention efficientml hunyuan-video

Updated Nov 11, 2025
Python

flash-algo / flash-sparse-attention

Trainable fast and memory-efficient sparse attention

kernel sparse-attention flash-attention flash-sparse-attention

Updated Nov 13, 2025
Python

ByteDance-Seed / ShadowKV

[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

research high-throughput low-rank cpu-offload sparse-attention long-context llm-inference

Updated May 1, 2025
Python

XunhaoLai / native-sparse-attention-triton

Efficient Triton implementation of Native Sparse Attention.

natural-language-processing sparse-attention large-language-models

Updated May 23, 2025
Python

ByteDance-Seed / FlexPrefill

Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference

natural-language-processing research sparse-attention large-language-models

Updated Oct 13, 2025
Python

thu-nics / MoA

[CoLM'25] The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression"

model-compression sparse-attention large-language-models

Updated Jul 11, 2025
Python

thu-ml / SLA

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention

transformer video-generation mlsys inference-acceleration ai-infra linear-attention sparse-attention diffusion-transformer train-acceleration sparse-linear-attention

Updated Nov 12, 2025
Python

INV-WZQ / SparseD

[arXiv 2025] SparseD: Sparse Attention for Diffusion Language Models

efficiency sparse-attention diffusion-language-models

Updated Oct 7, 2025
Python

eezkni / SSIU

[TIP-2025] Official PyTorch implementation of "Structural Similarity-Inspired Unfolding for Lightweight Image Super-Resolution"

lightweight super-resolution sparse-attention

Updated Jul 8, 2025
Python

lim142857 / Sparsifiner

Demo code for the CVPR 2023 paper "Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers"

attention-mechanism fast-inference sparse-neural-networks low-rank vision-transformer efficient-transformers sparse-attention efficient-vision-transformers

Updated Jul 4, 2023
Python

wenhao728 / VORTA

Code implementation of the paper "VORTA: Efficient Video Diffusion via Routing Sparse Attention"

diffusion-models sparse-attention video-diffusion-model

Updated Oct 15, 2025
Python

ResponsibleAILab / DAM

Dynamic Attention Mask (DAM) generates adaptive sparse attention masks per layer and head for Transformer models, enabling long-context inference with lower compute and memory overhead without fine-tuning.

inference-optimization sparse-attention efficient-ai

Updated Jun 16, 2025
Python

Iron-Bound / native-sparse-attention

Building Native Sparse Attention

deep-learning sparse-attention flash-attention

Updated Feb 20, 2025
Python

sidcraftscode / Hydra

Toy Hydra prototypes: SSM + sparse attention + MoE + memory; synthetic benchmarks. Paper: https://arxiv.org/abs/2508.15099

benchmarking memory pytorch language-model pkm state-space-models mixture-of-experts sparse-attention long-context

Updated Oct 24, 2025
Python

moon23k / Efficient_Summarization

Text summarization modeling with three different attention types

text-summarization attention-mechanism sparse-attention

Updated May 29, 2024
Python

TokyozxcSpedy / benchmark_moe

🔧 Optimize MoE model inference performance with automated Triton kernel tuning in the vLLM framework for various architectures and hardware setups.

benchmarking benchmark memory pytorch moe language-model pkm state-space-models mixture-of-experts sparse-attention long-context vllm

Updated Nov 16, 2025
Python
