A high-throughput and memory-efficient inference and serving engine for LLMs
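The entry above refers to vLLM. As a quick, hedged illustration of the kind of offline-inference workflow it supports, here is a minimal sketch assuming `vllm` is installed and a GPU is available; the model ID and sampling settings are placeholders, not recommendations:

```python
# Minimal offline-inference sketch with vLLM.
# Assumes `pip install vllm`; the model ID below is only an example placeholder.
from vllm import LLM, SamplingParams

prompts = [
    "Explain mixture-of-experts in one sentence.",
    "What does a top-k router do?",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

llm = LLM(model="facebook/opt-125m")  # swap in any supported model
outputs = llm.generate(prompts, sampling_params)

for out in outputs:
    print(out.prompt, "->", out.outputs[0].text)
```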
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
SGLang is a fast serving framework for large language models and vision language models.
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Phi4, ...) (AAAI 2025).
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
[TMM 2025] Mixture-of-Experts for Large Vision-Language Models
MoBA: Mixture of Block Attention for Long-Context LLMs
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
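Since the entry above points to the sparsely-gated MoE layer of Shazeer et al., the following is a simplified, hedged sketch of the core idea only (a softmax gate selects the top-k feed-forward experts per token and mixes their outputs); it is not the linked re-implementation, and the paper's noisy gating and load-balancing loss are omitted:

```python
# Simplified sparsely-gated MoE layer: a linear gate scores experts per token,
# the top-k experts are applied, and their outputs are mixed with renormalized
# gate weights. Illustrative sketch only; noisy gating and the auxiliary
# load-balancing loss from the paper are left out.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten tokens for routing
        tokens = x.reshape(-1, x.size(-1))
        logits = self.gate(tokens)                      # (tokens, num_experts)
        topk_vals, topk_idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(topk_vals, dim=-1)          # renormalize over the chosen experts

        out = torch.zeros_like(tokens)
        for slot in range(self.k):
            idx = topk_idx[:, slot]
            w = weights[:, slot].unsqueeze(-1)
            for e, expert in enumerate(self.experts):
                mask = idx == e
                if mask.any():
                    out[mask] += w[mask] * expert(tokens[mask])
        return out.reshape_as(x)


if __name__ == "__main__":
    moe = SimpleMoE(d_model=32, d_hidden=64)
    y = moe(torch.randn(2, 5, 32))
    print(y.shape)  # torch.Size([2, 5, 32])
```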
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
An open-source solution for full-parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, along with practical experience and conclusions gathered along the way.
Chinese Mixtral Mixture-of-Experts large language models (Chinese Mixtral MoE LLMs)
MoH: Multi-Head Attention as Mixture-of-Head Attention
[ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Ling is a MoE LLM provided and open-sourced by InclusionAI.
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI.