NXP backend: Create NeutronAtenPassManager with initial BatchNorm fusing passes #10579
Conversation
Dr. CI: 🔗 Helpful links — see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10579. Note: links to docs will display an error until the docs builds have been completed. As of commit 21ed973 with merge base 7d9dd46: ✅ you can merge normally (1 unrelated failure, flagged as FLAKY and likely due to flakiness present on trunk). This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "module: nxp" "release notes: nxp"
bn_node = node
if not _is_linear(bn_node.args[0]):
Here and above, you can be defensive and also make sure args len == 1. There is nothing stopping someone from adding a skip connection after conv/linear and before batchnorm, or some other op. Can't say I've seen it in practice though.
Added check for number of users of Conv/Linear op.
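For reference, a minimal sketch of the kind of guard being discussed here, assuming a torch.fx ATen graph (the predicate and helper names are illustrative, not the PR's exact code):

```python
import torch
from torch.fx import Node


def _is_conv_or_linear(node: Node) -> bool:
    # Illustrative predicate; the actual pass has its own _is_conv/_is_linear checks.
    return node.op == "call_function" and node.target in (
        torch.ops.aten.conv2d.default,
        torch.ops.aten.linear.default,
    )


def _safe_to_fuse(bn_node: Node) -> bool:
    """Only fuse when the BatchNorm consumes a Conv/Linear node whose output
    has no other users (e.g. no skip connection branching off before the BN)."""
    if not bn_node.args or not isinstance(bn_node.args[0], Node):
        return False
    producer = bn_node.args[0]
    if not _is_conv_or_linear(producer):
        return False
    # Folding BN into the producer's weights would change the value seen by
    # any other consumer of the Conv/Linear output, so require a single user.
    return len(producer.users) == 1
```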
import torch
from executorch.backends.nxp.aten_passes.fuse_batch_norm_with_conv_pass import ( |
nit: this can become unwieldy since there are a few other permutations of "batch norm" fusion; maybe just one file for fuse_batch_norm would be good.
Agree with you. Maybe we can create subdirectories in backend/transforms based on what dialect/IR the pass is targeting?
LGTM, but @digantdesai thoughts on the aten pass setup?
Force-pushed from d1913ae to 5d69743.
I have also disabled some tests that require changes from #10258. We'll update the PR that gets merged last.
program = torch.export.export_for_training(module, example_input, strict=True)
og_module = program.module()
pm = NeutronAtenPassManager()
Curious, why does it have to be before export? Typically this is done in neutron.preprocess(). The reason is that it is not easy to run these nodes if the Neutron partitioner fails to pick them up. There are several examples of this in Arm, XNNPACK, etc.
What export do you have in mind here? Export to a lowered program?
These passes are executed before quantization, on a graph with nodes in ATen IR. If we try to run them later, the pass will be much more complex because Quantize/Dequantize nodes will already be present in the graph.
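For context, a rough sketch of the ordering being described here, assuming the pass manager lives in the aten_passes package added by this PR and follows the torch.fx PassManager call convention (returning a PassResult with a graph_module attribute):

```python
import torch

# Import path assumed from this PR's layout; it may differ in the final code.
from executorch.backends.nxp.aten_passes.neutron_aten_pass_manager import (
    NeutronAtenPassManager,
)


class SmallConvNet(torch.nn.Module):
    """Hypothetical example model: Conv2d followed by BatchNorm2d."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3)
        self.bn = torch.nn.BatchNorm2d(8)

    def forward(self, x):
        return self.bn(self.conv(x))


example_input = (torch.randn(1, 3, 32, 32),)

# 1. Export to an ATen-level graph (no Quantize/Dequantize nodes yet).
program = torch.export.export_for_training(SmallConvNet(), example_input, strict=True)
graph_module = program.module()

# 2. Run the ATen passes so BatchNorm is already folded into Conv/Linear
#    weights when the quantizer attaches its observers.
graph_module = NeutronAtenPassManager()(graph_module).graph_module

# 3. PT2E quantization with the NeutronQuantizer (prepare_pt2e / convert_pt2e)
#    and lowering to the Neutron backend would follow from here.
```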
Yeah, not export_for_training but the export after quantization.
And yeah, typically backend-specific graph modifications are done in preprocess, unless they are done specifically for quantization purposes.
The rationale is that you can compose partitioners/backends together and not have to worry about someone else messing with the graph, or about failing to partition your modified graph. For example, Neutron + Cortex_m delegates can be composed by the user.
See other delegates doing BN fusion - https://github.com/search?q=repo%3Apytorch%2Fexecutorch+FuseBatchNorm&type=code
I guess you are following what the Cadence backend is doing; the main difference between Neutron and Cadence is that Cadence is not a delegate.
The BN + Conv fusion is done because Neutron does not (yet) have a kernel for BN, but typically BN is used with a Convolution or Matrix multiplication, where it can be fused. And because of quantization we do the fusion at the ATen graph level, so that the NeutronQuantizer already quantizes with respect to the fused batchnorm.
This reminds me of our discussion about quantizing the Conv/Matmul + BN + (ReLU) activation together as one "cluster" (i.e. not applying observers on Conv and ReLU separately). This would make it possible to move the batchnorm fusion even to the Edge dialect transformation in the NeutronBackend.
@skywall, can you please add such a remark to the passes, to be addressed with the quantizer update? And, as suggested by @digantdesai, is it possible to move the transformation to the NeutronQuantizer.transform_for_annotation method as an immediate resolution?
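For readers who want the concrete fold: in eval mode BatchNorm is an affine transform of the Conv output, so it can be folded statically into the Conv weights and bias. A minimal sketch using the stock PyTorch helper (not this PR's implementation, which rewrites the ATen graph directly):

```python
import torch
from torch.nn.utils import fuse_conv_bn_weights

conv = torch.nn.Conv2d(3, 8, kernel_size=3)
bn = torch.nn.BatchNorm2d(8).eval()  # fusion uses the frozen running statistics

# W' = W * gamma / sqrt(var + eps);  b' = (b - mean) * gamma / sqrt(var + eps) + beta
fused_w, fused_b = fuse_conv_bn_weights(
    conv.weight, conv.bias,
    bn.running_mean, bn.running_var, bn.eps, bn.weight, bn.bias,
)

# A single Conv with the fused parameters matches Conv -> BN in eval mode,
# which is why quantizing after fusion observes the "real" weights.
x = torch.randn(1, 3, 16, 16)
ref = bn(conv(x))
fused = torch.nn.functional.conv2d(x, fused_w, fused_b)
assert torch.allclose(ref, fused, atol=1e-5)
```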
Apart from that, the rationale of backend composition makes complete sense. I am curious how the composition of quantizers is expected to be used by the user. Let's say, when combining the Neutron & Cortex-M backends, will the user apply the backends in order of performance gain (e.g. Neutron first, cortex_m second) and apply the corresponding quantizers in the same order?
Yeah, OK for now; please create an issue, and put it in the code as a TODO. The rationale is that if you fuse before and for some reason can't lower to Neutron, there is no core-ATen op in the Edge IR for "conv+bn+relu", so it will fail as a missing op, whereas conv, bn, and relu individually will run OK.
conv_args.append(None)
weight_attr_name = conv_weight_node.target
_assign_attr(
you can do this without using internal methods, and insert new params and delete old params https://github.com/pytorch/executorch/blob/main/backends/transforms/utils.py#L65 (assuming you are doing this after export)
That would require propagating the exported program (via the constructor?) to the pass. I don't find that particularly clean, because passes are built upon GraphModules, as you've mentioned in some of the previous PRs. We would end up with ExportedProgram being passed via the constructor and GraphModule within the call() function, which doesn't sound right to me.
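For illustration, the split being objected to would look roughly like this (class name and body are hypothetical, just to show where ExportedProgram and GraphModule would end up):

```python
import torch
from torch.export import ExportedProgram
from torch.fx.passes.infra.pass_base import PassBase, PassResult


class FuseBatchNormWithConvPassSketch(PassBase):
    """Hypothetical shape of the pass if it needed the ExportedProgram:
    the program arrives via the constructor, the GraphModule via call()."""

    def __init__(self, exported_program: ExportedProgram):
        super().__init__()
        # Kept only so the pass could create/delete constant placeholders
        # through the backends/transforms helpers.
        self.exported_program = exported_program

    def call(self, graph_module: torch.fx.GraphModule) -> PassResult:
        # ... rewrite graph_module here, updating parameters via
        # self.exported_program ...
        return PassResult(graph_module, True)
```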
Yeah I guess this is for when you do run this pass in delegate.preprocess
Looks good. If we can move this either into NeutronQuantizer.transform_for_annotation (unless you are waiting for the other PR) or into Neutron preprocess, it will be ready to go.
Force-pushed from 5d69743 to 270b8af.
Force-pushed from 270b8af to 9f8f912.
Moved.
Force-pushed from 9f8f912 to 21ed973.
@digantdesai I just realized, BN + Conv fusion is already happening in
100% agree, I don't remember the rationale ATM, can you please create an issue and tag @jerryzh168 and me?
@@ -202,4 +205,5 @@ def __init__(self):
     def transform_for_annotation(
         self, model: torch.fx.GraphModule
     ) -> torch.fx.GraphModule:
-        return model
+        pass_runner = NeutronAtenPassManager()
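The hunk above is cut off in this view; presumably the method now runs the pass manager over the incoming module, along the lines of the following sketch (the return expression is an assumption, not visible in the diff):

```python
def transform_for_annotation(
    self, model: torch.fx.GraphModule
) -> torch.fx.GraphModule:
    pass_runner = NeutronAtenPassManager()
    # Assumed completion: run the ATen-level fusion passes and hand the
    # rewritten module back to the PT2E annotation flow.
    return pass_runner(model).graph_module
```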
Thanks!
LGTM, please create issues and leave them as TODOs. Thanks.
Arm test seems like an infra issue.
Summary
Add NeutronAtenPassManager, which by default runs passes for optimal model conversion. Initially, only passes for BatchNorm + Conv/Linear fusing are added. I have intentionally not moved the passes into backend/transforms because the added passes are executed on ATen ops and not Edge ops. Please let me know if I should follow a different approach.
This pass manager is planned to be included in Neutron backend's pipeline - https://github.com/pytorch/executorch/pull/10196/files#diff-b3b0844670da322bb4c245ae440e8c71378d631bb94334148322cdf758369baaR51
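For readers skimming the diff, the intended shape is roughly the following (the base class and the exact pass class name are assumptions; only the fuse_batch_norm_with_conv_pass module path appears elsewhere in this thread):

```python
from torch.fx.passes.infra.pass_manager import PassManager

# Class name assumed; the module path is the one imported in the tests above.
from executorch.backends.nxp.aten_passes.fuse_batch_norm_with_conv_pass import (
    FuseBatchNormWithConvPass,
)


class NeutronAtenPassManagerSketch(PassManager):
    """Sketch: a PassManager preloaded with the default ATen-level passes
    (BatchNorm + Conv fusion here; a Linear variant per the summary)."""

    def __init__(self, passes=None):
        super().__init__(passes=passes or [FuseBatchNormWithConvPass()])


# Usage: pm = NeutronAtenPassManagerSketch(); fused = pm(graph_module).graph_module
```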
cc @digantdesai @JakeStevens @robert-kalmar