Stars
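Latest Claude Pro subscription tutorial: How do you register a Claude account? How do you subscribe to Claude Pro? How do you buy a native standalone Claude Pro account? How do you top up your existing Claude account? (Includes a tutorial on using Claude Code within China)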
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
Awesome LLM compression research papers and tools.
We want to create a repo to illustrate the usage of Transformers in Chinese.
A powerful toolkit for compressing large models including LLM, VLM, and video generation models.
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
Multilingual Medicine: Model, Dataset, Benchmark, Code