Starred repositories
🐙 Octopus, an embodied vision-language model trained with RLEF that excels at embodied visual planning and programming.
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Codebase for Automated Creation of Digital Cousins for Robust Policy Learning
Official implementation of "Self-Improving Video Generation"
Heterogeneous Pre-trained Transformer (HPT) as a Scalable Policy Learner.
Accepted as a [NeurIPS 2024] Spotlight Presentation Paper.
[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
[ICML 2024] Official PyTorch implementation of "SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization"
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.
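As background for the quantization repos starred here, this is a minimal sketch of symmetric per-tensor INT8 quantization, the simplest low-precision scheme such libraries build on. The function names are illustrative and not taken from any of these repos:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: derive a scale from the
    absolute maximum, round to the nearest integer, clamp to [-127, 127]."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Map the integer codes back to floating point."""
    return q.astype(np.float32) * scale
```

Real libraries add per-channel scales, calibration over activation statistics, and fused hardware kernels, but the round-and-clamp core is the same.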
A high-throughput and memory-efficient inference and serving engine for LLMs
[TMLR 2024] Efficient Large Language Models: A Survey
Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models"
[ICLR 2024 Spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
LightSeq: A High Performance Library for Sequence Processing and Generation
An unofficial PyTorch implementation of Learned Step Size Quantization (LSQ), ICLR 2020.
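The core LSQ operation is a fake-quantizer with a learned step size s: clamp(round(v/s), Qn, Qp) * s. The following is a sketch of that forward pass only (the learned-gradient machinery is omitted, and the function name is illustrative, not from this repo):

```python
import numpy as np

def lsq_forward(v, s, n_bits=8, signed=True):
    """LSQ fake-quantization forward pass: scale by the learned step
    size s, round, clamp to the integer range, and rescale."""
    if signed:
        qn, qp = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    else:
        qn, qp = 0, 2 ** n_bits - 1
    v_bar = np.clip(np.round(v / s), qn, qp)  # integer code in [qn, qp]
    return v_bar * s  # fake-quantized value back in the real domain
```

During training, LSQ additionally treats s as a parameter and scales its gradient by 1/sqrt(N * Qp), which is what the paper's straight-through estimator adds on top of this forward pass.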
Official inference library for Mistral models
The official repo of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
The official implementation of the ICML 2023 paper OFQ-ViT
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.