Use ONNX Rewriter and IR to simplify the mnb_to_qdq pass #1482

justinchuby · 2024-11-12T20:35:35Z

Describe your changes

Checklist before requesting a review

- Add unit tests for this change.
- Make sure all tests can pass.
- Update documents if necessary.
- Lint and apply fixes to your code by running `lintrunner -a`.
- Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
- Does this PR include example changes? If yes, please remember to update the example documentation in a follow-up PR.

(Optional) Issue link

Co-authored-by: Jambay Kinley <jambaykinley@microsoft.com>

olive/passes/onnx/mnb_to_qdq.py

+
+            # Add Logic handling input 3
+
+            unpacked_weight_arrays = _unpack_weights(


olive/passes/onnx/mnb_to_qdq.py

@@ -7,8 +7,10 @@
 from pathlib import Path
 from typing import TYPE_CHECKING, Any, Dict

+import ml_dtypes


olive/passes/onnx/mnb_to_qdq.py

+                matmul = op.Add(matmul, bias)
+            return matmul
+
+        replace_mat_mul_n_bits = orp.RewriteRule(


olive/passes/onnx/mnb_to_qdq.py

+                return False
+            g_idx = g_idx.constant_value.numpy()
+            trivial_g_idx = np.arange(k, dtype=np.int32) // block_size
+            if not np.array_equal(g_idx, trivial_g_idx):


olive/passes/onnx/mnb_to_qdq.py

+            g_idx = g_idx.constant_value.numpy()
+            trivial_g_idx = np.arange(k, dtype=np.int32) // block_size
+            if not np.array_equal(g_idx, trivial_g_idx):
+                # TODO: We can log why the pattern is not matched here
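The check in the snippet above compares `g_idx` against the group index each input channel would get if blocks were laid out contiguously. A minimal dependency-free sketch of the same comparison, mirroring the NumPy expression `np.arange(k) // block_size` with plain lists (the helper names `trivial_g_idx` and `is_trivial` are hypothetical and not part of the PR):

```python
def trivial_g_idx(k: int, block_size: int) -> list[int]:
    """Group index of each of the k input channels when blocks are contiguous."""
    return [i // block_size for i in range(k)]


def is_trivial(g_idx: list[int], k: int, block_size: int) -> bool:
    """True when g_idx carries no information beyond the contiguous block layout."""
    return list(g_idx) == trivial_g_idx(k, block_size)


k, block_size = 8, 4
contiguous = [0, 0, 0, 0, 1, 1, 1, 1]  # channels 0-3 in group 0, 4-7 in group 1
permuted = [0, 1, 0, 1, 0, 1, 0, 1]    # channels interleaved across groups

print(is_trivial(contiguous, k, block_size))  # True
print(is_trivial(permuted, k, block_size))    # False
```

In the diff, a non-trivial `g_idx` makes the check return `False`, so a node with a permuted layout is left unmatched rather than rewritten.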


jambayk · 2024-11-14T00:57:09Z

olive/passes/onnx/mnb_to_qdq.py

+            graph: ir.Graph = context.graph
+            return value in graph.initializers.values()
+
+        def mat_mul_n_bits_pattern_check(context, *, q_weight, g_idx, mat_mul_n_bits_out: ir.Value, **_) -> bool:


Does `q_weight` here bind to the input immediately before `g_idx`, or to whatever is named `q_weight` in the `mat_mul_n_bits_pattern` signature? The input before `g_idx` is `qzero`, which can be optional. We want to check the second input.

@gramalingam

The inputs of the pattern function (`mat_mul_n_bits_pattern`) are bound to values in the graph, and these values are passed in as keyword arguments to the rewrite function here. So the order here doesn't really matter, though I usually just copy-paste and use the same argument list for both.
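The binding-by-name behaviour described in this reply can be illustrated with a toy in plain Python. This is only a sketch of the idea, not the rewriter's actual implementation: `call_with_bindings` is a hypothetical helper, and the parameter names in `pattern` other than `q_weight` and `g_idx` are assumed for illustration.

```python
import inspect


def pattern(x, q_weight, scales, q_zero, g_idx):
    """Stand-in for a pattern function; only its parameter names matter here."""


def check(context, *, q_weight, g_idx, **_):
    # Keyword-only parameters: values arrive by name, so their order in this
    # signature is irrelevant, and unused bindings are absorbed by **_.
    return (q_weight, g_idx)


def call_with_bindings(pattern_fn, fn, context, matched_inputs):
    """Bind matched values to the pattern's parameter names, then call fn by keyword."""
    names = list(inspect.signature(pattern_fn).parameters)
    bindings = dict(zip(names, matched_inputs))
    return fn(context, **bindings)


# The optional fourth input (the zero point) is absent in this match.
result = call_with_bindings(pattern, check, None, ["X", "W", "S", None, "G"])
print(result)  # ('W', 'G')
```

Here `q_weight` receives the second matched input because that is the name it has in `pattern`, even though it is listed first in the signature of `check`.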

justinchuby · 2024-11-14T00:57:20Z

olive/passes/onnx/mnb_to_qdq.py

+            del node.meta["N"]
+
+        # TODO(justinchuby): Register and remove initializers
+        ir_model.opset_imports[""] = max(21, ir_model.opset_imports[""])


TODO: Use a more robust version conversion process
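The line under review raises the declared default-domain opset to at least 21 without ever lowering it. A minimal sketch of that behaviour on a plain dictionary, assuming `opset_imports` maps a domain name to a version with `""` as the default ONNX domain (the helper `bump_default_opset` is hypothetical):

```python
def bump_default_opset(opset_imports: dict[str, int], minimum: int = 21) -> None:
    """Raise the default-domain opset to `minimum` if it is lower; never downgrade."""
    opset_imports[""] = max(minimum, opset_imports[""])


older = {"": 17, "com.microsoft": 1}
newer = {"": 22, "com.microsoft": 1}

bump_default_opset(older)
bump_default_opset(newer)

print(older[""])  # 21
print(newer[""])  # 22
```

Note that this only changes the version the model declares. Existing nodes are not converted to the new opset, which is presumably the gap the TODO about a more robust version conversion refers to.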

Snap

4d9abba

Co-authored-by: Jambay Kinley <jambaykinley@microsoft.com>

justinchuby assigned jambayk Nov 12, 2024

github-advanced-security bot found potential problems Nov 12, 2024

update

2215c97

github-advanced-security bot found potential problems Nov 14, 2024

olive/passes/onnx/mnb_to_qdq.py

@@ -7,8 +7,10 @@
 from pathlib import Path
 from typing import TYPE_CHECKING, Any, Dict

+import ml_dtypes

Check notice: Code scanning / CodeQL: Unused import (Note)

Import of 'ml_dtypes' is not used.

update

1e1d0f2

github-advanced-security bot found potential problems Nov 14, 2024

olive/passes/onnx/mnb_to_qdq.py

+                matmul = op.Add(matmul, bias)
+            return matmul
+
+        replace_mat_mul_n_bits = orp.RewriteRule(

Check notice: Code scanning / CodeQL: Unused local variable (Note)

Variable replace_mat_mul_n_bits is not used.

update

4825082

jambayk reviewed Nov 14, 2024

xor

e694d1d

jambayk reviewed Nov 14, 2024

justinchuby commented Nov 14, 2024
