
Starred repositories
A lightweight LLaMA-like LLM inference framework based on Triton kernels.
Diffusion Transformers (DiTs) trained on the MNIST dataset
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
[ECCV 2024] Official pytorch implementation of "Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts"
This repo implements Diffusion Transformers (DiT) in PyTorch and provides training and inference code for the CelebHQ dataset
📄 Awesome CV is a LaTeX template for your outstanding job application
This project aims to share the technical principles behind large language models along with hands-on experience (LLM engineering and LLM application deployment)
"代码随想录" (Code Thoughts) LeetCode problem-solving guide: a recommended order for 200 classic problems, 600k words of detailed illustrated explanations, video breakdowns of the hard parts, 50+ mind maps, with solutions in C++, Java, Python, Go, JavaScript, and other languages. No more feeling lost while learning algorithms! 🔥🔥 Take a look and you'll wish you had found it sooner! 🚀
The simplest online-softmax notebook for explaining Flash Attention (a minimal sketch of the online-softmax trick appears after this list)
An Open-source Platform for Inverse Lithography Technology Research
A tiny ring attention implementation for learning purposes (a simplified ring attention sketch also appears after this list)
Open-Sora: Democratizing Efficient Video Production for All
Adaptive Caching for Faster Video Generation with Diffusion Transformers
[NeurIPS'24 Spotlight, ICLR'25] To speed up long-context LLM inference, the attention is computed approximately with dynamic sparsity, reducing pre-filling inference latency by up to 10x on an …
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
VideoSys: An easy and efficient system for video generation
Development repository for the Triton language and compiler
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long-Context Transformer Model Training and Inference
* This is a draft version. * This repo's commit history is omitted due to double-blind paper-review requirements; it may be overwritten or deprecated later.
Multithreaded matrix multiplication and analysis based on OpenMP and Pthreads
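
For reference, here is a minimal sketch of the online-softmax trick that the Flash Attention notebook above explains: a single pass keeps a running maximum and a running sum of exponentials, rescaling the sum whenever the maximum grows. This is an illustrative numpy snippet, not code from that repo, and the name `softmax_online` is made up here.

```python
import numpy as np

def softmax_online(scores):
    """One-pass (online) softmax: track the running max and the running
    sum of exponentials, rescaling the sum whenever the max grows."""
    running_max = -np.inf
    running_sum = 0.0
    for s in scores:
        new_max = max(running_max, s)
        # Rescale the old sum to the new max, then add the new term.
        running_sum = running_sum * np.exp(running_max - new_max) + np.exp(s - new_max)
        running_max = new_max
    # The final max/sum match what a two-pass softmax would compute.
    return np.exp(np.asarray(scores) - running_max) / running_sum

x = np.random.randn(16)
assert np.allclose(softmax_online(x), np.exp(x - x.max()) / np.exp(x - x.max()).sum())
```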
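And a hedged single-process sketch of the ring attention idea from the learning repo above: K/V blocks are "rotated" around a ring of devices while each device accumulates attention for its own query block using the same online-softmax accumulators (running max, running normalizer, unnormalized output). The simulation below is plain numpy with made-up names (`ring_attention_sim`); real implementations overlap the rotation with compute via device-to-device communication.

```python
import numpy as np

def ring_attention_sim(q, k, v, num_devices=4):
    """Simulate ring attention in one process: each 'device' owns one query
    block plus online-softmax accumulators, and the K/V blocks are rotated
    so every query block eventually attends to every key/value block."""
    d = q.shape[-1]
    q_blocks = np.array_split(q, num_devices)
    k_blocks = np.array_split(k, num_devices)
    v_blocks = np.array_split(v, num_devices)

    m = [np.full((qb.shape[0], 1), -np.inf) for qb in q_blocks]  # running max
    l = [np.zeros((qb.shape[0], 1)) for qb in q_blocks]          # running normalizer
    acc = [np.zeros_like(qb) for qb in q_blocks]                 # unnormalized output

    for step in range(num_devices):
        for dev in range(num_devices):
            # K/V block resident on this device after `step` rotations.
            kb = k_blocks[(dev + step) % num_devices]
            vb = v_blocks[(dev + step) % num_devices]
            s = q_blocks[dev] @ kb.T / np.sqrt(d)                # local scores
            m_new = np.maximum(m[dev], s.max(axis=-1, keepdims=True))
            p = np.exp(s - m_new)
            scale = np.exp(m[dev] - m_new)                       # rescale old stats
            l[dev] = l[dev] * scale + p.sum(axis=-1, keepdims=True)
            acc[dev] = acc[dev] * scale + p @ vb
            m[dev] = m_new

    return np.concatenate([a / n for a, n in zip(acc, l)])

# Sanity check against full (non-causal) attention.
q, k, v = (np.random.randn(32, 8) for _ in range(3))
s = q @ k.T / np.sqrt(8)
p = np.exp(s - s.max(-1, keepdims=True))
ref = (p / p.sum(-1, keepdims=True)) @ v
assert np.allclose(ring_attention_sim(q, k, v), ref)
```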