Guangxuan-Xiao
Attention is all we need

Starred repositories

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Python 1,309 67 Updated Nov 1, 2024

The best OSS video generation models

Python 1,584 150 Updated Nov 1, 2024

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,059 202 Updated Oct 29, 2024

A bibliography and survey of the papers surrounding o1

TeX 511 21 Updated Nov 1, 2024

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Python 279 8 Updated Oct 16, 2024

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Python 330 14 Updated Oct 31, 2024

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Python 640 46 Updated Sep 27, 2024

Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"

Python 107 1 Updated Oct 30, 2024

A sparse attention kernel supporting mixed sparse patterns

C++ 50 Updated Oct 15, 2024

FlashInfer: Kernel Library for LLM Serving

Cuda 1,375 126 Updated Oct 31, 2024

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda 189 18 Updated Nov 1, 2024

Run PyTorch LLMs locally on servers, desktop and mobile

Python 3,338 213 Updated Nov 2, 2024

Helpful tools and examples for working with flex-attention

Python 446 21 Updated Oct 23, 2024

This repo contains the source code for: Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs

31 Updated Aug 14, 2024

The Memory layer for your AI apps

Python 22,602 2,083 Updated Nov 1, 2024

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

1,108 23 Updated Jul 31, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,741 113 Updated Oct 30, 2024

This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?

Python 667 44 Updated Oct 24, 2024

(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis

Python 491 27 Updated Sep 27, 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Python 426 22 Updated Sep 5, 2024

Python 115 10 Updated Jun 12, 2024

Pipeline Parallelism for PyTorch

Python 725 86 Updated Aug 21, 2024

Open-source code for the paper: Retrieval Head Mechanistically Explains Long-Context Factuality

Python 154 13 Updated Aug 2, 2024

The official Meta Llama 3 GitHub site

Python 26,928 3,047 Updated Aug 12, 2024

LLM training in simple, raw C/CUDA

Cuda 24,279 2,733 Updated Oct 2, 2024

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Python 4,200 310 Updated Oct 6, 2024

[NeurIPS 2024] SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challen…

Python 13,612 1,376 Updated Oct 31, 2024

Generate high-definition short videos with one click using large AI models.

Python 16,741 2,669 Updated Jul 26, 2024

🙌 OpenHands: Code Less, Make More

Python 33,439 3,827 Updated Nov 2, 2024

Code examples and resources for DBRX, a large language model developed by Databricks

Python 2,503 237 Updated May 1, 2024