@n1ck-guo (Contributor) commented Sep 11, 2025

  • Export to the llm_compressor format
  • Export to the auto_round:llm_compressor format
  • Extract the save function from all export files into export/utils.py and rename it to save_model

Signed-off-by: n1ck-guo <heng.guo@intel.com>
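
As a rough illustration of the third item, a shared save helper could look like the sketch below. The PR page does not show the real `save_model` signature, so the parameters, the `quantization_config.json` file name, and the return value are all assumptions made for this example. The only external call it relies on is the Hugging Face style `save_pretrained(save_dir, max_shard_size=..., safe_serialization=...)` method on the model object.

```python
import json
import os


def save_model(model, save_dir, max_shard_size="5GB",
               safe_serialization=True, quantization_config=None):
    """Hypothetical shared save helper (the signature is an assumption)."""
    os.makedirs(save_dir, exist_ok=True)
    # Let the model serialize its own weights and config.
    model.save_pretrained(
        save_dir,
        max_shard_size=max_shard_size,
        safe_serialization=safe_serialization,
    )
    # Optionally write quantization metadata next to the weights.
    # The file name below is illustrative, not taken from the PR.
    if quantization_config is not None:
        path = os.path.join(save_dir, "quantization_config.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump(quantization_config, f, indent=2)
    return save_dir
```

Each exporter (AWQ, AutoGPTQ, AutoRound, llm_compressor) would then call this one function instead of carrying its own copy of the save logic.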
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR adds support for the FP8_STATIC quantization scheme to export models in the llm_compressor format. The change enables static FP8 weight and activation quantization with specific configurations for compressed-tensors compatibility.

Key Changes

  • Adds FP8_STATIC scheme detection and format conversion to llm_compressor
  • Implements static FP8 quantization export with compressed-tensors configuration
  • Consolidates common save functionality across export modules
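
To make "static FP8" concrete, here is a minimal pure-Python sketch of per-tensor static quantization to the E4M3 format, whose largest finite value is 448. It is not the code from this PR: the function names are hypothetical, and a real exporter would work on tensors and take activation scales from calibration rather than from a Python list. "Static" here means the scale is computed once ahead of time and stored, instead of being recomputed for every input.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value of the E4M3 ("fn") format


def round_to_e4m3(x):
    """Round a float to the nearest E4M3 value, saturating at +/-448."""
    if x == 0.0 or math.isnan(x):
        return x
    if math.isinf(x):
        return math.copysign(FP8_E4M3_MAX, x)
    _, exp = math.frexp(abs(x))  # abs(x) = m * 2**exp with 0.5 <= m < 1
    # 3 mantissa bits; values below the smallest normal (2**-6)
    # share the subnormal step of 2**-9.
    step = 2.0 ** (max(exp - 1, -6) - 3)
    rounded = round(abs(x) / step) * step
    return math.copysign(min(rounded, FP8_E4M3_MAX), x)


def static_scale(values):
    """Per-tensor scale fixed ahead of time from the absolute maximum."""
    amax = max(abs(v) for v in values)
    return amax / FP8_E4M3_MAX if amax > 0 else 1.0


def quantize_dequantize(values, scale):
    """Map values onto the FP8 grid and back to show the rounding error."""
    return [round_to_e4m3(v / scale) * scale for v in values]
```

For example, `static_scale([0.5, -2.0, 0.1])` maps the largest magnitude (2.0) onto 448, and `quantize_dequantize` then shows how much precision each value loses under that fixed scale.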

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| auto_round/utils.py | Modified is_static_wfp8afp8 to accept string parameters for format detection |
| auto_round/export/utils.py | Added shared save function to reduce code duplication across export modules |
| auto_round/export/export_to_llmcompressor/export_to_static_fp.py | New module implementing FP8_STATIC export with compressed-tensors configuration |
| auto_round/export/export_to_llmcompressor/export.py | Added FP8_STATIC support to the main export dispatcher |
| auto_round/autoround.py | Added FP8_STATIC format detection and validation logic |
| auto_round/export/export_to_awq/export.py | Refactored to use shared save function |
| auto_round/export/export_to_autoround/export_to_fp8.py | Renamed class and refactored to use shared save function |
| auto_round/export/export_to_autoround/export.py | Refactored to use shared save function |
| auto_round/export/export_to_autogptq/export.py | Refactored to use shared save function |
| test/test_cpu/test_llmcompressor.py | Added test case for FP8_STATIC export validation |
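
The first entry says the scheme check now also accepts a string. The sketch below shows one way such a check could branch on its argument type. It is not the actual `is_static_wfp8afp8` implementation: the function name `is_static_fp8` and the attribute names `data_type`, `act_data_type`, and `act_dynamic` are assumptions chosen for illustration.

```python
def is_static_fp8(scheme_or_obj):
    """Hypothetical check; not the real is_static_wfp8afp8 logic."""
    # String input: match on the scheme name, case-insensitively.
    if isinstance(scheme_or_obj, str):
        return "fp8_static" in scheme_or_obj.lower()
    # Object input: inspect quantization attributes (names are assumed).
    weight_type = str(getattr(scheme_or_obj, "data_type", "") or "")
    act_type = str(getattr(scheme_or_obj, "act_data_type", "") or "")
    act_dynamic = getattr(scheme_or_obj, "act_dynamic", True)
    return (
        weight_type.startswith("fp8")
        and act_type.startswith("fp8")
        and act_dynamic is False
    )
```

Accepting both forms lets a caller test a user-supplied scheme name before any layer has been configured, and reuse the same check on configured objects later.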

@wenhuach21 (Contributor) commented Sep 11, 2025

Support it in the AutoRound format as well, and add nvfp4/fp8_static support on the vLLM side later.

n1ck-guo and others added 2 commits September 10, 2025 22:58
Signed-off-by: n1ck-guo <heng.guo@intel.com>
@wenhuach21 requested a review from yiliu30 September 11, 2025 05:56
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
yiliu30 and others added 3 commits September 19, 2025 01:14
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
@n1ck-guo merged commit 1089004 into main Sep 19, 2025
8 checks passed
@n1ck-guo deleted the hengguo/static_fp8 branch September 19, 2025 08:02
4 participants