
Load external data in HQQ and RTN quantization passes#2380

Merged
xiaoyu-work merged 1 commit into microsoft:main from Lidang-Jiang:fix/onnx-quantization-external-data
Apr 8, 2026

Conversation

@Lidang-Jiang
Contributor

Summary

Fixes #2223

After the onnx-ir migration, ir.load() no longer loads external data by default. This caused OnnxHqqQuantization to produce invalid output models when the input model stores weights as external data — the original weight tensors retained stale external data references pointing to non-existent files in the output directory.

Changes:

  • Add ir.external_data.load_to_model(ir_model) after load_ir_model() in both OnnxHqqQuantization and OnnxBlockWiseRtnQuantization, consistent with the pattern already used in extract_adapters.py (line 104)
  • Add regression tests for both passes with external data models
Before

```
Created model with external data (12.0 MB)
Input model validation: PASSED

--- olive quantize --algorithm hqq ---
Exit code: 0
Output model validation: FAILED
Data of TensorProto ( tensor name: W1) should be stored in /tmp/tmpkh7vp8nn/quantized_hqq/model.onnx.data, but it is not regular file.

--- olive quantize --algorithm rtn ---
Exit code: 0
Output model validation: PASSED
```

After

```
Created model with external data (12.0 MB)
Input model validation: PASSED

--- olive quantize --algorithm hqq ---
Exit code: 0
Output model validation: PASSED

--- olive quantize --algorithm rtn ---
Exit code: 0
Output model validation: PASSED
```

Unit tests (13 passed)

```
test_hqq_quantization_pass PASSED
test_hqq_quantization_pass_produces_valid_output_when_model_has_external_data PASSED
test_rtn_quantization_pass_matmul[True] PASSED
test_rtn_quantization_pass_matmul[False] PASSED
test_rtn_quantization_pass_gather[True] PASSED
test_rtn_quantization_pass_gather[False] PASSED
test_rtn_quantization_pass_produces_valid_output_when_model_has_external_data PASSED
test_rtn_quantization_with_exclusion PASSED
test_rtn_quantization_gather_8bit[True] PASSED
test_rtn_quantization_gather_8bit[False] PASSED
test_rtn_quantization_gather_quantize_axis_forced_to_last_dim PASSED
test_rtn_quantization_shared_gather_weights PASSED
test_rtn_quantization_removes_unused_initializers PASSED

======================== 13 passed, 2 warnings in 2.78s ========================
```

Test plan

  • E2E: olive quantize --algorithm hqq on model with external data produces valid output
  • E2E: olive quantize --algorithm rtn on model with external data produces valid output
  • Unit test: test_hqq_quantization_pass_produces_valid_output_when_model_has_external_data
  • Unit test: test_rtn_quantization_pass_produces_valid_output_when_model_has_external_data
  • All 13 existing + new tests pass
  • ruff check clean on modified files

After the onnx-ir migration, `ir.load()` no longer loads external data
by default. This caused HQQ quantization to produce invalid output
models when the input model stores weights as external data, because
the original weight tensors retained stale external data references
that pointed to non-existent files in the output directory.

Fix by calling `ir.external_data.load_to_model()` after `load_ir_model()`
in both HQQ and RTN quantization passes, consistent with the pattern
already used in `extract_adapters.py`.

Fixes microsoft#2223

Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
@Lidang-Jiang
Contributor Author

@microsoft-github-policy-service agree [company="{your company}"]

@Lidang-Jiang
Contributor Author

@microsoft-github-policy-service agree

@jambayk requested a review from xiaoyu-work, April 7, 2026 21:14
@xiaoyu-work
Collaborator

@justinchuby can you confirm this logic? this PR is for #2223

Contributor

@justinchuby left a comment


Lgtm if you want all weights in memory

@xiaoyu-work merged commit 0c665dc into microsoft:main Apr 8, 2026
15 checks passed


Development

Successfully merging this pull request may close these issues.

[Bug] ONNX Quantization Broken Since v0.9.2 Due to Incorrect Weight Data Saving

3 participants