slowlyC

dusan slowlyC

AI infra

Pinned

flash-linear-attention Public

Forked from fla-org/flash-linear-attention

🚀 Efficient implementations of state-of-the-art linear attention models

Python

triton-lang/triton Public

Development repository for the Triton language and compiler

MLIR 18k 2.5k

sglang Public

Forked from sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.

Python

cutlass Public

Forked from NVIDIA/cutlass

CUDA Templates for Linear Algebra Subroutines

C++

Megatron-LM Public

Forked from NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

Python

pytorch Public

Forked from pytorch/pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python