[microNPU] Fix Cascader code generation without StorageRewrite #13365
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment. Generated by tvm-bot
Looks great, thanks for tracking this down! I suppose we want to wait for #13369 before merging?
Yes, with these changes, the test will not be changed.
Extra memory was allocated for buffers: when the ReplaceOperators pass replaced the buffers for parts of the result with a buffer for the entire result, an allocation was still requested for each part, so the summed size was larger by a factor of the number of parts.
a97223f to 067a3c8
LGTM!
Thanks @alexey-yazev!
This PR fixes Cascader code generation without StorageRewrite (when striping is enabled) on Ethos-U NPU.
The problem was that, after the ReplaceOperators pass, the buffers for partial results of an operation were replaced with a buffer holding the result of the entire operation, and each replacement was still marked as needing allocation. Summed over all parts, the allocated size was larger than required by a factor of the number of parts.
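
The over-allocation described above can be illustrated with a toy model. This is only a sketch of the arithmetic, not TVM code: the `Buffer` class, the helper functions, the buffer names and the sizes are all hypothetical, and the de-duplicating helper is just one possible way to avoid the double counting, not necessarily what this PR does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buffer:
    """Hypothetical stand-in for a buffer that needs allocation."""
    name: str
    size_bytes: int


def total_allocated(allocations):
    """Sum the size of every allocation request, duplicates included."""
    return sum(buf.size_bytes for buf in allocations)


def total_allocated_once(allocations):
    """Sum sizes while counting each distinct buffer only once."""
    sizes_by_name = {buf.name: buf.size_bytes for buf in allocations}
    return sum(sizes_by_name.values())


# Assumed example: an output of 4096 bytes computed in 4 stripes.
num_parts = 4
full_result = Buffer("ofm", 4096)

# Before replacement: one small buffer per partial result.
partial_buffers = [
    Buffer(f"ofm_part_{i}", full_result.size_bytes // num_parts)
    for i in range(num_parts)
]

# After replacement: every partial buffer is swapped for the full buffer,
# but an allocation is still requested once per part.
replaced_buffers = [full_result for _ in partial_buffers]

naive_total = total_allocated(replaced_buffers)
deduplicated_total = total_allocated_once(replaced_buffers)
```

In this model `naive_total` is `num_parts` times the size of the full result, while `deduplicated_total` equals the size of the full result, which is the amount of memory actually needed.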