SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
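The core transform such compression toolkits apply is mapping float weights onto a low-bit integer grid. A minimal NumPy sketch of symmetric per-tensor INT8 quantization follows; the helper names are illustrative, not any library's actual API:

```python
# Symmetric per-tensor INT8 quantization: a minimal, library-agnostic sketch.
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float weights to int8 with a single per-tensor scale."""
    scale = max(float(np.abs(w).max()), 1e-12) / 127.0  # max magnitude -> +/-127
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 8).astype(np.float32)
q, scale = quantize_int8(w)
err = np.abs(w - dequantize_int8(q, scale)).max()
print(f"max abs quantization error: {err:.4f}")
```

Lower-bit formats (INT4, MXFP4, NVFP4) follow the same quantize/dequantize pattern with coarser grids and finer-grained (per-group) scales.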
Kernel Tuner
[DEPRECATED] Moved to ROCm/rocm-libraries repo
Alchemy Cat: 🔥 Config System for SOTA
Benchmark scripts for TVM
A Collective Knowledge crowd-tuning extension that lets users crowdsource experiments such as performance benchmarking, auto-tuning, and machine learning, using portable Collective Knowledge workflows across volunteer-provided Linux, Windows, macOS, and Android platforms. Includes a demo of DNN crowd-benchmarking and crowd-tuning.
A Generic Distributed Auto-Tuning Infrastructure
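At its core, the loop such infrastructures distribute looks like the sketch below: time every configuration in a search space and keep the fastest. The search space and the `run_kernel` stand-in are placeholders, not part of any listed project:

```python
# Minimal exhaustive auto-tuning loop over a small parameter search space.
import itertools
import time

search_space = {
    "block_size": [32, 64, 128, 256],
    "unroll": [1, 2, 4],
}

def run_kernel(block_size: int, unroll: int) -> None:
    """Stand-in for launching the kernel under test."""
    time.sleep(0.001 * block_size / (64 * unroll))  # fake workload

def autotune():
    best_cfg, best_time = None, float("inf")
    keys = list(search_space)
    for values in itertools.product(*search_space.values()):
        cfg = dict(zip(keys, values))
        start = time.perf_counter()
        run_kernel(**cfg)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg, best_time

print(autotune())
```

Real tuners replace the exhaustive loop with search strategies (random search, Bayesian optimization, evolutionary methods) once the space grows beyond a few thousand configurations.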
A parallel, aggregating CLI agent for fully automatic optimization of HPC code
This software package accompanies the paper "A Methodology for Comparing Auto-Tuning Optimization Algorithms" (https://doi.org/10.1016/j.future.2024.05.021), making the guidelines in the methodology easy to apply.
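A toy instance of the comparison such a methodology formalizes: give competing tuners the same evaluation budget on the same search space and average best-found costs over repeated runs. The cost surface here is synthetic, standing in for real kernel timings:

```python
# Fair comparison of two tuners under an equal evaluation budget.
import random

space = [(bs, u) for bs in (32, 64, 128, 256) for u in (1, 2, 4)]

def cost(cfg) -> float:
    bs, u = cfg  # synthetic cost surface standing in for a real kernel
    return abs(bs - 128) / 64.0 + abs(u - 4)

def random_search(budget: int) -> float:
    return min(cost(random.choice(space)) for _ in range(budget))

def hill_climb(budget: int) -> float:
    idx = random.randrange(len(space))
    best = cost(space[idx])
    for _ in range(budget - 1):
        nbr = (idx + random.choice((-1, 1))) % len(space)  # neighbor step
        if cost(space[nbr]) < best:
            idx, best = nbr, cost(space[nbr])
    return best

REPS, BUDGET = 100, 6
for name, tuner in (("random search", random_search), ("hill climbing", hill_climb)):
    mean = sum(tuner(BUDGET) for _ in range(REPS)) / REPS
    print(f"{name}: mean best cost {mean:.3f}")
```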
Autotuner for Spark applications
Autotuning Google Text-to-Speech
A package for automated kernel tuning with LLMs.
DHRT optimises resource allocation for serverless workloads with strict execution deadlines in heterogeneous Kubernetes clusters, tuning allocations dynamically from real-time metrics and using historical data to configure and schedule workloads according to their resource requirements and specified deadlines.
Library to compute auto-tuning and performance metrics.
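Two metrics commonly reported by such libraries are speedup over a baseline configuration and the fraction of the search-space optimum a tuner reached. A hedged sketch with made-up numbers, not the library's API:

```python
# Two common auto-tuning quality metrics.
def speedup(baseline_time: float, tuned_time: float) -> float:
    """How many times faster the tuned configuration is than the baseline."""
    return baseline_time / tuned_time

def fraction_of_optimum(tuned_time: float, optimal_time: float) -> float:
    """1.0 means the tuner found the best known configuration."""
    return optimal_time / tuned_time

print(speedup(12.0, 4.0))             # 3.0x faster than baseline
print(fraction_of_optimum(4.0, 3.5))  # reached 87.5% of the optimum
```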