vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated Jul 15, 2025
sglang Public
Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Python Apache License 2.0 Updated Jul 5, 2025
dynamo Public
Forked from ai-dynamo/dynamo
A Datacenter Scale Distributed Inference Serving Framework
Rust Apache License 2.0 Updated Apr 18, 2025
torch2trt Public
Forked from NVIDIA-AI-IOT/torch2trt
An easy-to-use PyTorch to TensorRT converter
Python MIT License Updated Jan 3, 2025
llmc Public
Forked from ModelTC/LightCompress
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
Python Apache License 2.0 Updated Nov 27, 2024
ai-chatbot Public template
Forked from vercel/ai-chatbot
A full-featured, hackable Next.js AI chatbot built by Vercel
openai-node Public
Forked from openai/openai-node
The official Node.js / TypeScript library for the OpenAI API
TypeScript Apache License 2.0 Updated Oct 8, 2024
llm-inference-benchmark Public
Forked from ninehills/llm-inference-benchmark
LLM Inference benchmark
Python MIT License Updated Sep 30, 2024
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
C++ MIT License Updated Jul 11, 2024
sarathi-serve Public
Forked from microsoft/sarathi-serve
A low-latency & high-throughput serving engine for LLMs
Python Apache License 2.0 Updated Jun 27, 2024
stable-fast Public
Forked from chengzeyi/stable-fast
An ultra lightweight inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
Python MIT License Updated Apr 21, 2024
Taipy-Chatbot-Demo Public
Forked from Avaiga/demo-chatbot
A template to create any LLM Inference Web Apps using Python only
Python Updated Mar 14, 2024
stable_diffusion_compile Public
Compile Stable Diffusion to run faster
oneflow-diffusers Public
Forked from siliconflow/onediff
OneFlow backend for 🤗 Diffusers and ComfyUI
Python Updated Jan 8, 2024
stable diffusion webui docker
Shell Apache License 2.0 Updated Jan 3, 2024
WeChatMsg Public
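Forked from LC044/WeChatMsg
Extracts WeChat chat history and exports it to HTML, Word, and CSV documents for permanent storage; analyzes the chat history to generate an annual chat report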
StableTriton Public
Forked from arnavdantuluri/StableTriton
The first open source Triton inference engine for Stable Diffusion, specifically for SDXL
Python Apache License 2.0 Updated Nov 27, 2023
TensorRT-LLM Public
Forked from NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
C++ Apache License 2.0 Updated Oct 20, 2023
Paddle Public
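Forked from PaddlePaddle/Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle (飞桨) core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
C++ Apache License 2.0 Updated Oct 15, 2023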
transformer_framework Public
Forked from lessw2020/transformer_framework
Framework for plug and play of various transformers (vision and NLP) with FSDP
Python Apache License 2.0 Updated Oct 6, 2023