MenghuaZheng

Popular repositories

GCN Public

GCN inference acceleration

C++ 2
Awesome-LLM-Inference Public

Forked from DefTruth/Awesome-LLM-Inference

📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
llm_inference_engine_benchmark Public

This repository aims to evaluate various open-source inference frameworks, analyzing their strengths and weaknesses.

Python
HIP-Performance-Optmization-on-7900xtx Public

Forked from fsword73/HIP-Performance-Optmization-on-VEGA64

14 basic topics for VEGA64 performance optimization

C++
CUDA-Learn-Notes Public

Forked from DefTruth/CUDA-Learn-Notes

🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

Cuda
ROps Public