optimize vram for gguf and add momentum #1031

wenhuach21 · 2025-11-13T07:59:35Z

No description provided.

Copilot

Pull Request Overview

This PR fixes an imatrix padding issue and optimizes VRAM usage for GGUF export. The changes introduce memory management improvements and add support for chunked quantization processing to handle large tensors more efficiently.

Key changes:

- Fixed imatrix padding so that the imatrix is padded to a multiple of the group size, using a fill value of 1e-5
- Added chunked processing for large tensors in the quantization search to reduce memory usage
- Consolidated memory cleanup utilities and added momentum parameter support
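The padding fix in the first item can be illustrated with a minimal sketch. It is not the code from this PR: the function name `pad_to_group_size` and its signature are hypothetical, and plain Python lists stand in for tensors. Only the fill value of 1e-5 and the idea of padding up to the group size come from the summary above.

```python
def pad_to_group_size(values, group_size, fill_value=1e-5):
    """Pad a 1-D sequence so its length is a multiple of group_size.

    Hypothetical helper for illustration only. The PR summary states that the
    imatrix is padded with 1e-5; everything else here is an assumption.
    """
    remainder = len(values) % group_size
    if remainder == 0:
        # Already aligned to the group size, so nothing is appended.
        return list(values)
    # Append just enough fill values to reach the next multiple of group_size.
    return list(values) + [fill_value] * (group_size - remainder)


# A 33-element imatrix row padded for group size 32 becomes 64 elements long.
padded = pad_to_group_size([0.5] * 33, group_size=32)
print(len(padded), padded[-1])
```

Making the fill value a parameter mirrors the "configurable padding value" entry listed for `auto_round/data_type/utils.py`, although the real parameter name is not shown in this conversation.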

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `auto_round/export/export_to_gguf/packing.py` | Added memory clearing and imatrix device handling in exception path |
| `auto_round/export/export_to_awq/utils.py` | Removed duplicate clear_memory function |
| `auto_round/data_type/utils.py` | Added configurable padding value parameter |
| `auto_round/data_type/int.py` | Fixed imatrix padding with appropriate fill value |
| `auto_round/data_type/gguf.py` | Added chunked processing for large tensors and refactored quantization search |
| `auto_round/compressors/base.py` | Added momentum parameter support |
| `auto_round/__main__.py` | Added momentum command-line argument |
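The chunked processing described for `auto_round/data_type/gguf.py` could look roughly like the sketch below. This is an assumption-laden illustration, not the PR's implementation: all function names, the candidate-scale search, the `qmax` value, and the `chunk_size` parameter are hypothetical. It uses plain Python lists, so it only shows the control flow; with a tensor library, the per-chunk intermediates are what would bound peak memory.

```python
def quantize_row(row, scale, qmax=7):
    """Round each value to a signed integer grid and map it back (illustrative)."""
    return [max(-qmax - 1, min(qmax, round(x / scale))) * scale for x in row]


def best_scale_for_row(row, weights, candidates):
    """Pick the candidate scale with the lowest importance-weighted squared error."""
    best_scale, best_err = None, float("inf")
    for scale in candidates:
        dequantized = quantize_row(row, scale)
        err = sum(w * (x - q) ** 2 for x, q, w in zip(row, dequantized, weights))
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale


def search_scales_chunked(rows, weights, candidates, chunk_size=2):
    """Run the scale search over a few rows at a time instead of all at once."""
    scales = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]  # only this slice is processed now
        scales.extend(best_scale_for_row(r, weights, candidates) for r in chunk)
    return scales


rows = [[0.1, -0.5, 0.9], [1.2, 0.3, -0.7], [0.0, 0.4, -0.2]]
weights = [1.0, 0.5, 2.0]           # stands in for imatrix importance values
candidates = [0.05, 0.1, 0.2, 0.4]  # hypothetical candidate scales
print(search_scales_chunked(rows, weights, candidates, chunk_size=2))
```

Because each row is searched independently, the chunk size changes only how much work is in flight at once, not the resulting scales.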

wenhuach21 and others added 2 commits November 13, 2025 15:58

fix imatrix pad issue

a13bdf0

[pre-commit.ci] auto fixes from pre-commit.com hooks

4e20199

for more information, see https://pre-commit.ci

wenhuach21 marked this pull request as draft November 13, 2025 08:02

wenhuach21 and others added 3 commits November 13, 2025 21:05

update

405bde7

Merge branch 'fix_imatrix' of https://github.com/intel/auto-round int…

11171c4

…o fix_imatrix

[pre-commit.ci] auto fixes from pre-commit.com hooks

886a6c8

for more information, see https://pre-commit.ci

wenhuach21 marked this pull request as ready for review November 13, 2025 13:16

wenhuach21 and others added 5 commits November 13, 2025 21:21

refine

e2d7e70

Merge branch 'fix_imatrix' of https://github.com/intel/auto-round int…

cd97899

…o fix_imatrix # Conflicts: # auto_round/data_type/gguf.py

clean

2130075

update

9ecf7e6

[pre-commit.ci] auto fixes from pre-commit.com hooks

ea310ec

for more information, see https://pre-commit.ci

wenhuach21 changed the title ~~fix imatrix pad issue~~ fix imatrix pad issue and optimize vram for gguf Nov 13, 2025

update

967af55

wenhuach21 requested review from Kaihui-intel and n1ck-guo November 13, 2025 13:37

Merge branch 'fix_imatrix' of https://github.com/intel/auto-round int…

5c5f72d

…o fix_imatrix

wenhuach21 requested a review from Copilot November 13, 2025 13:38

[pre-commit.ci] auto fixes from pre-commit.com hooks

356ee30

for more information, see https://pre-commit.ci

Copilot AI reviewed Nov 13, 2025

auto_round/__main__.py (outdated, resolved)

auto_round/data_type/gguf.py (resolved)

auto_round/data_type/gguf.py (resolved)

wenhuach21 and others added 10 commits November 13, 2025 21:42

Update auto_round/__main__.py

5f4d85c

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

update

a9fe211

Merge branch 'fix_imatrix' of https://github.com/intel/auto-round int…

77ea33f

…o fix_imatrix

[pre-commit.ci] auto fixes from pre-commit.com hooks

63ae0c2

for more information, see https://pre-commit.ci

Merge branch 'main' into fix_imatrix

6c289d0

refine comments

36d41af

[pre-commit.ci] auto fixes from pre-commit.com hooks

a3a19e2

for more information, see https://pre-commit.ci

Merge branch 'main' into fix_imatrix

e5fce1e

update readme

c584039

refine readme

267ff64

wenhuach21 changed the title ~~fix imatrix pad issue and optimize vram for gguf~~ optimize vram for gguf and add momentum Nov 14, 2025

refine

0bc902f

WeiweiZhang1 approved these changes Nov 14, 2025

wenhuach21 merged commit 58b3d90 into main Nov 14, 2025
16 of 22 checks passed

wenhuach21 deleted the fix_imatrix branch November 14, 2025 05:23

3 participants
