Conversation

@zewenli98 (Collaborator)

Description

PyTorch decomposes `torch.ops.aten._native_batch_norm_legit.no_stats` into a long chain of small aten ops, which makes the converted model significantly slower. See the linked issue for details.

Fixes #3731
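For context, here is a rough, self-contained sketch of what this op computes in one shot: batch norm using the current batch's per-channel statistics, since there are no running stats. It is an illustration only, not the converter code from this PR, and the function name and shapes are made up. When the op is decomposed, the same math is spread across many small aten ops (mean, sub, var, rsqrt, mul, add, ...), which is where the slowdown comes from.

```python
# Illustrative reference only (not the PR's converter): what
# aten._native_batch_norm_legit.no_stats computes as a single fused op.
import torch

def batch_norm_no_stats_reference(x, weight, bias, eps=1e-5):
    # Normalize over every dimension except channels (dim 1),
    # using the batch's own statistics instead of running stats.
    reduce_dims = [d for d in range(x.dim()) if d != 1]
    mean = x.mean(dim=reduce_dims, keepdim=True)
    var = x.var(dim=reduce_dims, unbiased=False, keepdim=True)
    shape = [1, -1] + [1] * (x.dim() - 2)  # broadcast per-channel weight/bias
    out = (x - mean) / torch.sqrt(var + eps)
    return out * weight.reshape(shape) + bias.reshape(shape)

x = torch.randn(2, 3, 8, 8)
w, b = torch.rand(3), torch.rand(3)
ref = batch_norm_no_stats_reference(x, w, b)
# The aten op returns (output, save_mean, save_invstd); compare the output only.
out = torch.ops.aten._native_batch_norm_legit.no_stats(x, w, b, True, 0.1, 1e-5)[0]
print(torch.allclose(ref, out, atol=1e-5))  # True
```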

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that the relevant reviewers are notified

@zewenli98 zewenli98 self-assigned this Aug 7, 2025
@meta-cla meta-cla bot added the cla signed label Aug 7, 2025
@github-actions github-actions bot added the component: lowering, component: conversion, component: converters, component: api [Python], and component: dynamo labels Aug 7, 2025
@narendasan (Collaborator) left a comment

looks fine, is there a reason we use numpy instead of torch?

@zewenli98 (Collaborator, Author) commented Aug 7, 2025

> looks fine, is there a reason we use numpy instead of torch?

Because at that point the tensor would be a FakeTensor, which carries no real values, so the computation is done with numpy instead.
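To spell that out a bit (this is an illustration, not code from this PR): the dynamo conversion path runs under FakeTensorMode, where tensors carry only shape/dtype metadata and no actual storage, so doing the weight math with torch ops would not yield real values to bake into the engine. Operating on numpy arrays taken from the real, frozen constants avoids that.

```python
# Minimal illustration (not this PR's code) of why torch ops don't help here:
# under FakeTensorMode, results have metadata but no data to read back.
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

with FakeTensorMode():
    w = torch.ones(3)      # FakeTensor: correct shape/dtype, no storage
    scaled = w * 2.0       # still a FakeTensor; the values don't exist
    print(type(scaled))    # torch._subclasses.fake_tensor.FakeTensor
    # scaled.numpy() or scaled.item() would raise, there is nothing to read.
```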

@zewenli98 zewenli98 requested a review from narendasan August 9, 2025 00:00
@narendasan (Collaborator) left a comment

Lgtm

@zewenli98 zewenli98 merged commit 844cfd4 into main Aug 21, 2025
81 of 83 checks passed
@zewenli98 zewenli98 deleted the fix_raft branch January 12, 2026 21:12

Labels

cla signed; component: api [Python] (Issues re: Python API); component: conversion (Issues re: Conversion stage); component: converters (Issues re: Specific op converters); component: dynamo (Issues relating to the `torch.compile` or `torch._dynamo.export` paths); component: lowering (Issues re: The lowering / preprocessing passes)

Development

Successfully merging this pull request may close these issues.

🐛 [Bug] perf gap reduce on RAFT

3 participants