Collaborator

@kylesayrs kylesayrs commented Oct 28, 2025

Purpose

  • Create a pathway which can quantize model weights without needing a model definition or the use of a calibration pipeline. Such a pathway provides fast and reliable support for models which:
    • Do not have a HF model definition yet
    • Have complications with sequential pipelines (very large vision towers, tracing failure, long calibration runtime)
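To illustrate the idea: a checkpoint can be treated as a flat mapping from tensor names to tensors, so choosing what to quantize needs only names and shapes, not a model class. The sketch below is a minimal illustration and not the code in this PR. The helper `select_weights` and its matching rule (2-D tensors named `<module>.weight` whose module is not in `ignore`) are assumptions, and the shapes are illustrative.

```python
from typing import Dict, List, Tuple


def select_weights(
    shapes: Dict[str, Tuple[int, ...]], ignore: List[str]
) -> List[str]:
    """Return names of tensors a weights-only pass would quantize.

    Hypothetical rule (an assumption, not the PR's logic): keep 2-D tensors
    named "<module>.weight" whose module name is not listed in `ignore`.
    """
    selected = []
    for name, shape in shapes.items():
        module, _, param = name.rpartition(".")
        if param == "weight" and len(shape) == 2 and module not in ignore:
            selected.append(name)
    return selected


# Toy checkpoint index: tensor name -> shape (values are illustrative)
checkpoint_shapes = {
    "model.embed_tokens.weight": (128256, 2048),
    "model.layers.0.self_attn.q_proj.weight": (2048, 2048),
    "model.layers.0.input_layernorm.weight": (2048,),
    "lm_head.weight": (128256, 2048),
}
to_quantize = select_weights(checkpoint_shapes, ["model.embed_tokens", "lm_head"])
```

Here only the attention projection is selected: the embedding and output head are ignored by name, and the 1-D norm weight is skipped by shape.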

Usage

model_free_ptq(
    model_stub="meta-llama/Llama-3.2-1B-Instruct",
    save_directory="Llama-3.2-1B-Instruct-FP8_block",
    scheme="FP8_BLOCK",
    ignore=["model.embed_tokens", "lm_head"],
    max_workers=15,
    device="cuda:0",
)
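For intuition about what a block-wise FP8 scheme computes, the following pure-Python sketch derives one scale per tile of a 2-D weight so that the tile's largest magnitude maps onto the FP8 E4M3 maximum (448). It is a sketch under stated assumptions, not the implementation behind `FP8_BLOCK`: the function `block_scales`, the tile size used, and the zero-tile guard are all illustrative, and the final cast to FP8 is omitted.

```python
from typing import List

FP8_E4M3_MAX = 448.0  # largest finite value representable in float8 E4M3


def block_scales(weight: List[List[float]], block: int) -> List[List[float]]:
    """Compute one scale per (block x block) tile of a 2-D weight.

    Each scale maps the tile's largest magnitude onto FP8_E4M3_MAX.
    Casting the scaled values to FP8 is intentionally omitted.
    """
    rows, cols = len(weight), len(weight[0])
    scales = []
    for r0 in range(0, rows, block):
        row_scales = []
        for c0 in range(0, cols, block):
            absmax = max(
                abs(weight[r][c])
                for r in range(r0, min(r0 + block, rows))
                for c in range(c0, min(c0 + block, cols))
            )
            # Guard against all-zero tiles to avoid a zero scale (assumed epsilon)
            row_scales.append(max(absmax, 1e-12) / FP8_E4M3_MAX)
        scales.append(row_scales)
    return scales


# A 2 x 4 toy weight split into two 2 x 2 tiles
weight = [
    [1.0, -2.0, 0.5, 0.25],
    [4.0, 0.0, -8.0, 1.0],
]
scales = block_scales(weight, block=2)
```

Because each scale depends only on the weight values in its tile, no activations or calibration data are needed, which is what makes a weights-only pathway possible for this kind of scheme.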

Testing

  • Added test_model_free_ptq_matches_oneshot which tests that saved tensors and configs exactly match between model_free_ptq and oneshot entrypoints for the same arguments. This test takes about 10 seconds to run.
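The shape of such an equivalence check can be sketched with the standard library alone. This is not the test added in this PR: `directories_match` is a hypothetical helper that compares two output directories byte for byte, which is stricter than comparing loaded tensors and parsed configs.

```python
import filecmp
import os
import tempfile


def directories_match(dir_a: str, dir_b: str) -> bool:
    """True if both directories hold the same file names with identical bytes."""
    names_a, names_b = sorted(os.listdir(dir_a)), sorted(os.listdir(dir_b))
    if names_a != names_b:
        return False
    # shallow=False forces a content comparison instead of trusting file stats
    _, mismatch, errors = filecmp.cmpfiles(dir_a, dir_b, names_a, shallow=False)
    return not mismatch and not errors


with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    # Write identical placeholder configs into both output directories
    for d in (a, b):
        with open(os.path.join(d, "config.json"), "w") as f:
            f.write('{"quant_method": "example"}')
    same = directories_match(a, b)

    # Change one side and confirm the mismatch is detected
    with open(os.path.join(b, "config.json"), "w") as f:
        f.write('{"quant_method": "changed-value"}')
    different = not directories_match(a, b)
```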

Future Extensions

  • Mixed-precision quantization (multiple recipes/targets)
  • Multi-GPU support (work is already parallelized by threads, but if GPU is the bottleneck we can split the work across GPUs)
  • Multi-process support (if Python processing is the bottleneck, we can replace multithreading with multiprocessing)
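The thread-based parallelism mentioned above can be sketched as follows. This is a minimal illustration, not the PR's code: `process_shards`, `fake_quantize_shard`, and the shard file names are hypothetical, and the per-shard work is a placeholder for a load, quantize, and save step.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def process_shards(
    shard_paths: List[str], process_one: Callable[[str], str], max_workers: int
) -> List[str]:
    """Apply `process_one` to each shard on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_one, shard_paths))


def fake_quantize_shard(path: str) -> str:
    # Placeholder for load -> quantize -> save; returns the output path
    return path.replace(".safetensors", ".quantized.safetensors")


shards = [f"model-{i:05d}-of-00003.safetensors" for i in range(1, 4)]
outputs = process_shards(shards, fake_quantize_shard, max_workers=2)
```

In the standard library, `ProcessPoolExecutor` exposes the same `map` interface, so a multi-process variant of this sketch would mainly need the worker function and its arguments to be picklable and the entry point to be guarded by `if __name__ == "__main__":`.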

@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite. Please only add the label once the PR is code complete and local testing has been performed.

@kylesayrs kylesayrs force-pushed the kylesayrs/weights-only branch from f4423c1 to 294a78a Compare October 30, 2025 20:21
@kylesayrs kylesayrs changed the base branch from main to 03_untie_fix October 31, 2025 02:41
Base automatically changed from 03_untie_fix to main October 31, 2025 16:22
@kylesayrs kylesayrs changed the title [PTQ] weights_ptq pathway for day-zero weight quantization support [Weights-only] weights_ptq pathway for day-zero weight quantization support Nov 3, 2025
@kylesayrs kylesayrs changed the title [Weights-only] weights_ptq pathway for day-zero weight quantization support [Weights-only] ptq_weights pathway for day-zero weight quantization support Nov 3, 2025
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs force-pushed the kylesayrs/weights-only branch from 1c56a75 to 6fe9db9 Compare November 3, 2025 16:32
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs marked this pull request as ready for review November 3, 2025 18:02
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs changed the title [Weights-only] ptq_weights pathway for day-zero weight quantization support [model_free_ptq] Add pathway for day-zero weight quantization support Nov 3, 2025
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>