Arm backend: Allocate the scratch buffer runtime rather than in the pte #10714
Dr. CI: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10714
❌ 3 New Failures, 1 Cancelled Job as of commit 2e023a2 with merge base 2ec8678.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@@ -92,7 +92,7 @@ def vela_compile(tosa_flatbuffer: bytes, args: List[str], verbose: bool = False)
     if not isinstance(data["scratch_shape"][0], np.int64):
         raise RuntimeError("Expected scratch to be int64")
     block_length = int(data["scratch_shape"][0])
-    bin_blocks["scratch_data"] = b"\x00" * block_length
+    bin_blocks["scratch_size"] = struct.pack("<I", block_length)
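To illustrate the effect of the changed line, here is a minimal sketch assuming the consumer of the block reads the size back and allocates the buffer itself. The helper names `pack_scratch_size` and `unpack_scratch_size` and the example size are hypothetical and not part of the PR; only the `struct.pack("<I", ...)` call comes from the diff.

```python
import struct


def pack_scratch_size(block_length: int) -> bytes:
    """Encode the scratch size as a 4-byte little-endian unsigned int."""
    return struct.pack("<I", block_length)


def unpack_scratch_size(block: bytes) -> int:
    """Decode the size so the consumer can allocate the buffer itself."""
    (size,) = struct.unpack("<I", block)
    return size


# Hypothetical scratch size, chosen only for illustration.
block_length = 2_700_000

# Previous behaviour: the zero-filled buffer itself is stored in the block.
old_block = b"\x00" * block_length
# New behaviour: only the size is stored.
new_block = pack_scratch_size(block_length)

print(len(old_block), len(new_block))

# Stand-in for the runtime allocation: a zeroed buffer of the decoded size.
scratch = bytearray(unpack_scratch_size(new_block))
```

Note that the `"<I"` format holds an unsigned 32-bit value, so `struct.pack` raises `struct.error` for sizes above 2**32 - 1.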
Nice
Make sure the CI is green, thanks.
Yes, I am adding support for Dedicated_Sram for U85 and changing the default memory mode we test on U85. This is the proper fix for the failure we see for inception_v4. With this fix, we will place the NN and the scratch buffer in the DDR and use the SRAM as a cache. The reason for the failure is that the scratch buffer for inception_v4 is around 2.6-2.7 MB and we allocate it in the SRAM, but on the CS-300 we only have 2 MB of SRAM. Will update the PR soon.
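The size mismatch described above can be checked with a few lines of arithmetic. This is only a sketch: `scratch_placement` is a hypothetical helper, not part of ExecuTorch or Vela, and it assumes "MB" means 1024 * 1024 bytes.

```python
MIB = 1024 * 1024  # assumption: the sizes in the comment are binary megabytes


def scratch_placement(scratch_bytes: int, sram_bytes: int) -> str:
    """Return "SRAM" if the scratch buffer fits in SRAM, otherwise "DDR"."""
    return "SRAM" if scratch_bytes <= sram_bytes else "DDR"


sram_bytes = 2 * MIB  # SRAM size quoted for the CS-300

# Approximate inception_v4 scratch sizes quoted in the comment.
for scratch_mb in (2.6, 2.7):
    scratch_bytes = int(scratch_mb * MIB)
    shortfall = scratch_bytes - sram_bytes
    print(scratch_mb, scratch_placement(scratch_bytes, sram_bytes), shortfall)
```

Both quoted sizes exceed the 2 MB of SRAM, which is consistent with placing the scratch buffer in the DDR and using the SRAM as a cache.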
…Sram for Ethos-U85
This change lowers the size of the pte and allows you to allocate the scratch buffer in an array, usually in the SRAM, for more efficient memory usage on an MCU. Also, add support for the Dedicated_Sram memory mode in the runtime and make it the default memory mode for Ethos-U85.
Change-Id: I04cf9de49a6116141d402b9ad5ca4f21e2025236
2a35a48 to 2e023a2
The failed tests are unrelated.
Could you please review #10958 for a temp fix?
Hi @kirklandsign,
extern size_t ethosu_fast_scratch_size;
extern unsigned char* ethosu_fast_scratch;
Ok this is not the cleanest. Let me think of a better way to do this.
This change lowers the size of the pte and allows you to allocate the scratch buffer in an array, usually in the SRAM, for more efficient memory usage on an MCU.
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218