SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
A simple pipeline for INT8 quantization based on TensorRT.
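For reference, a minimal sketch of what such a pipeline can look like with the classic TensorRT Python bindings and PyCUDA. The file name model.onnx, the input shape, and the random calibration batches are placeholders, not details taken from the repository above.

```python
import numpy as np
import pycuda.autoinit  # noqa: F401  # creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

class Calibrator(trt.IInt8EntropyCalibrator2):
    """Feeds calibration batches to TensorRT's INT8 entropy calibrator."""
    def __init__(self, batches):
        super().__init__()
        self.device_mem = cuda.mem_alloc(batches[0].nbytes)
        self.batches = iter(batches)

    def get_batch_size(self):
        return 1

    def get_batch(self, names):
        batch = next(self.batches, None)
        if batch is None:
            return None  # no more data: calibration is finished
        cuda.memcpy_htod(self.device_mem, np.ascontiguousarray(batch))
        return [int(self.device_mem)]

    def read_calibration_cache(self):
        return None  # no cache on the first run

    def write_calibration_cache(self, cache):
        pass  # a real pipeline would persist the cache here

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:         # placeholder model
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)
batches = [np.random.rand(1, 3, 224, 224).astype(np.float32)
           for _ in range(8)]               # stand-in calibration data
config.int8_calibrator = Calibrator(batches)

with open("model_int8.engine", "wb") as f:
    f.write(builder.build_serialized_network(network, config))
```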
Trustworthy onboard satellite AI via a PyTorch→ONNX→INT8 pipeline, with calibration, telemetry, and a PhiSat-2 EO tile-filter demo.
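The calibration step in a PyTorch→ONNX→INT8 flow like this is typically post-training static quantization. A minimal sketch with ONNX Runtime's quantization API follows; the file names, the input tensor name "input", and the random tiles are assumptions, not details of the project.

```python
import numpy as np
from onnxruntime.quantization import (CalibrationDataReader, QuantType,
                                      quantize_static)

class TileReader(CalibrationDataReader):
    """Feeds a handful of representative tiles to the calibrator."""
    def __init__(self, num_batches=16):
        self.data = iter(
            {"input": np.random.rand(1, 3, 256, 256).astype(np.float32)}
            for _ in range(num_batches))

    def get_next(self):
        return next(self.data, None)  # None ends calibration

quantize_static(
    "tile_filter.onnx",       # FP32 model exported from PyTorch (placeholder)
    "tile_filter_int8.onnx",  # quantized output
    TileReader(),
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
)
```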
👀 Apply YOLOv8, exported with ONNX or TensorRT (FP16, INT8), to a real-time camera feed.
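A minimal sketch of that workflow with the ultralytics package and OpenCV; the nano weights yolov8n.pt and webcam index 0 are assumptions.

```python
import cv2
from ultralytics import YOLO

onnx_path = YOLO("yolov8n.pt").export(format="onnx")  # one-time ONNX export
model = YOLO(onnx_path)  # ultralytics runs the ONNX model transparently

cap = cv2.VideoCapture(0)  # default webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)
    cv2.imshow("YOLOv8", results[0].plot())  # draw boxes on the frame
    if cv2.waitKey(1) & 0xFF == ord("q"):    # quit on 'q'
        break
cap.release()
cv2.destroyAllWindows()
```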
A LLaMA2-7B chatbot with memory, running on CPU and optimized using SmoothQuant, 4-bit quantization, or Intel® Extension for PyTorch with bfloat16.
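Of those three options, the bfloat16 path is the simplest to sketch. Below is an assumed, minimal example with Intel® Extension for PyTorch; the checkpoint name and prompt are placeholders, and the SmoothQuant and 4-bit paths typically go through a separate quantization toolchain such as Intel Neural Compressor instead.

```python
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-2-7b-chat-hf"  # assumed checkpoint name
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)
model = ipex.optimize(model.eval(), dtype=torch.bfloat16)  # CPU-side fusions

inputs = tok("What does INT8 quantization trade away?", return_tensors="pt")
with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
    out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```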
A TensorRT INT8 Python sample.
Up to 40x faster AI inference: ONNX-to-TensorRT optimization with FP16/INT8 quantization, multi-GPU support, and deployment.
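The deployment half of such a pipeline is deserializing the built engine and binding input/output buffers. A minimal sketch follows; the engine file name, tensor shapes, and binding order are assumptions, and the FP16 variant would simply be built with config.set_flag(trt.BuilderFlag.FP16).

```python
import numpy as np
import pycuda.autoinit  # noqa: F401  # creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("model.engine", "rb") as f:  # placeholder engine file
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

inp = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed input shape
out = np.empty((1, 1000), dtype=np.float32)              # assumed output shape
d_inp, d_out = cuda.mem_alloc(inp.nbytes), cuda.mem_alloc(out.nbytes)

cuda.memcpy_htod(d_inp, np.ascontiguousarray(inp))
context.execute_v2([int(d_inp), int(d_out)])  # binding order: input, output
cuda.memcpy_dtoh(out, d_out)
print(out.argmax())
```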
Quantize TinyLlama-1.1B-Chat from PyTorch to CoreML (float16, int8, int4) for efficient on-device inference on iOS 18+.
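A minimal sketch of the float16 conversion step with coremltools (coremltools 8+ for the iOS 18 target); the toy module below stands in for the traced TinyLlama graph, since tracing a full chat model requires fixed sequence shapes.

```python
import torch
import coremltools as ct

# Toy module standing in for the traced TinyLlama graph.
model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).eval()
example = torch.rand(1, 64)
traced = torch.jit.trace(model, example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=example.shape)],
    compute_precision=ct.precision.FLOAT16,    # the float16 variant
    minimum_deployment_target=ct.target.iOS18,
)
mlmodel.save("model_fp16.mlpackage")
```

The int8 and int4 variants would go through coremltools' weight-compression utilities (e.g. the coremltools.optimize.coreml module) rather than the compute_precision flag.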
A quantization framework under development.