zhangjun

Zhang Jun zhangjun

38 followers · 443 following

Beijing
14:20 (UTC +08:00)
http://zhangjun.github.io

Achievements

x2 x3

Highlights

Developer Program Member

xllm Public

C++ Apache License 2.0 Updated Jul 19, 2025
vllm Public
Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python Apache License 2.0 Updated Jul 15, 2025
sglang Public
Forked from sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.

Python Apache License 2.0 Updated Jul 5, 2025
optimize-notes Public

gpu cuda deeplearning llm

Cuda Apache License 2.0 Updated Jul 1, 2025
my_notes Public

Daily stuff

gpu cuda

C++ 1 Updated Apr 29, 2025
dynamo Public
Forked from ai-dynamo/dynamo

A Datacenter Scale Distributed Inference Serving Framework

Rust Apache License 2.0 Updated Apr 18, 2025
xllm_ops Public

Python Updated Apr 14, 2025
tmp2 Public

C++ Updated Feb 25, 2025
tmp Public

Cuda Updated Feb 24, 2025
torch2trt Public
Forked from NVIDIA-AI-IOT/torch2trt

An easy to use PyTorch to TensorRT converter

Python MIT License Updated Jan 3, 2025
llmc Public
Forked from ModelTC/LightCompress

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python Apache License 2.0 Updated Nov 27, 2024
torchtune-example Public

torchtune, llm

Shell Updated Nov 27, 2024
ai-chatbot Public template
Forked from vercel/ai-chatbot

A full-featured, hackable Next.js AI chatbot built by Vercel

TypeScript 1 Other Updated Oct 30, 2024
openai-node Public
Forked from openai/openai-node

The official Node.js / Typescript library for the OpenAI API

TypeScript Apache License 2.0 Updated Oct 8, 2024
llm-inference-benchmark Public
Forked from ninehills/llm-inference-benchmark

LLM Inference benchmark

Python MIT License Updated Sep 30, 2024
llm_chat Public

Python Apache License 2.0 Updated Sep 28, 2024
triton Public
Forked from triton-lang/triton

Development repository for the Triton language and compiler

C++ MIT License Updated Jul 11, 2024
sarathi-serve Public
Forked from microsoft/sarathi-serve

A low-latency & high-throughput serving engine for LLMs

Python Apache License 2.0 Updated Jun 27, 2024
llm-tools Public

Go Updated Apr 29, 2024
stable-fast Public
Forked from chengzeyi/stable-fast

An ultra lightweight inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

Python MIT License Updated Apr 21, 2024
Taipy-Chatbot-Demo Public
Forked from Avaiga/demo-chatbot

A template to create any LLM Inference Web Apps using Python only

Python Updated Mar 14, 2024
stable_diffusion_compile Public

compile stable diffusion to run faster

inference stable-diffusion diffusers stable-diffusion-xl torch-dynamo

Python 1 Updated Mar 3, 2024
oneflow-diffusers Public
Forked from siliconflow/onediff

OneFlow backend for 🤗 Diffusers and ComfyUI

Python Updated Jan 8, 2024
stable-diffusion-webui-docker Public

stable diffusion webui docker

docker stable-diffusion stable-diffusion-webui stable-diffusion-docker

Shell Apache License 2.0 Updated Jan 3, 2024
WeChatMsg Public
Forked from LC044/WeChatMsg

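Extracts WeChat chat history and exports it to HTML, Word, or CSV documents for permanent storage; analyzes the chat history to generate an annual chat report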

Python 1 GNU General Public License v3.0 Updated Dec 3, 2023
StableTriton Public
Forked from arnavdantuluri/StableTriton

The first open source triton inference engine for Stable Diffusion, specifically for sdxl

Python Apache License 2.0 Updated Nov 27, 2023
TensorRT-LLM Public
Forked from NVIDIA/TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ Apache License 2.0 Updated Oct 20, 2023
Paddle Public
Forked from PaddlePaddle/Paddle

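PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)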

C++ Apache License 2.0 Updated Oct 15, 2023
transformer_framework Public
Forked from lessw2020/transformer_framework

framework for plug and play of various transformers (vision and nlp) with FSDP

Python Apache License 2.0 Updated Oct 6, 2023
zhangjun.github.io Public

https://zhangjun.github.io

Stylus 2 Updated Oct 3, 2023
