Share state in module #17719

Open
lucylq wants to merge 1 commit into main from lfq.module-shared-buffer-test

Conversation


@lucylq lucylq commented Feb 25, 2026

Summary

Taken from D82329513

Allow Module to share activation memory when share_memory_arenas=True.

In a PTE file, each method may have:
mem_id=1: activation memory
mem_id=2: shared memory (usually for shared state)

These are specified in program.execution_plan[method_index].non_const_buffer_sizes. Tensors with mem_id=2 point to the same memory offset across different methods.

In Module.cpp, freshly allocated memory is provided to each method for activation space. When share_memory_arenas=True, this PR instead creates a shared_buffer and passes it to each method rather than allocating fresh memory per method.

NOTE

  1. Regular activation memory is also shared (mem_id=1). The largest activation memory for each method is used.

  2. I believe this is safe with the existing concurrent execution. In TestConcurrentExecutionWithSharedProgram, it's the (immutable) program that is shared. N separate Modules hold a shared_ptr to the same program. So each module has its own memory buffers etc., and sharing is local to the Module and not across threads. I don't think concurrency within a single Module/between methods is supported.

  3. Running multiple methods concurrently in the same Module is unsafe, and this PR makes it even more unsafe.

RISKS

  1. Data invalidation across methods with shared memory
  auto m1 = module.execute("m1", {x1});
  auto m2 = module.execute("m2", {x2});

m1 and m2 will point to the same output buffer when memory is shared (activation memory is shared too); m1 must be copied to a separate buffer before executing m2, otherwise it will hold the result of m2.

Test plan

cmake .. \
-DEXECUTORCH_BUILD_TESTS=ON \
-DEXECUTORCH_BUILD_XNNPACK=ON \
-DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
-DEXECUTORCH_BUILD_EXTENSION_NAMED_DATA_MAP=ON \
-DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
-DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
-DEXECUTORCH_BUILD_KERNELS_PORTABLE=ON

cmake --build . --target extension_module_test

ctest -R extension_module_test --output-on-failure 


pytorch-bot bot commented Feb 25, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17719

Note: Links to docs will display an error until the docs builds have been completed.

❌ 6 New Failures

As of commit 31004a4 with merge base 4dadf24:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 25, 2026
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@lucylq lucylq requested a review from JacobSzwejbka February 25, 2026 23:22
@lucylq lucylq force-pushed the lfq.module-shared-buffer-test branch 2 times, most recently from a26fac8 to 7794f42 Compare February 25, 2026 23:42
@lucylq lucylq marked this pull request as ready for review February 25, 2026 23:43
@lucylq lucylq requested a review from larryliu0820 as a code owner February 25, 2026 23:43
Copilot AI review requested due to automatic review settings February 25, 2026 23:43
Copilot AI left a comment


Pull request overview

This pull request adds support for sharing memory arenas across methods in the Module class when share_memory_arenas=True. This feature is required for models exported with share_mutable_buffers=True, where methods need to access shared mutable state (e.g., KV cache in LLMs).

Changes:

  • Added share_memory_arenas parameter to all Module constructors (defaults to false for backward compatibility)
  • Refactored MethodHolder to extract memory planning fields into a new PlannedMemory struct that can be shared across methods
  • Added helper methods to compute maximum memory-planned buffer sizes across all methods when sharing is enabled

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
extension/module/module.h Added share_memory_arenas parameter to all constructors; extracted PlannedMemory struct; added helper method declarations
extension/module/module.cpp Implemented memory sharing logic; refactored load_method to use shared or per-method PlannedMemory
extension/module/test/module_test.cpp Added TestSharedMemoryBuffer test to validate shared state functionality
extension/module/test/CMakeLists.txt Added ModuleSharedState.pte generation and environment variable configuration


@lucylq lucylq requested a review from kimishpatel February 26, 2026 00:57
Comment on lines +331 to +336
if (!shared_planned_memory_) {
auto max_res = get_max_mem_planned_buffer_sizes();
ET_CHECK_OK_OR_RETURN_ERROR(max_res.error());
shared_planned_memory_ = make_planned_memory(max_res.get());
}
method_holder.planned_memory = shared_planned_memory_;
Contributor

This does mean that methods cannot be invoked in parallel as they may overwrite each others' arena. Document that explicitly

Contributor

Also if AOT memory planning accounted for it then should we not assert that the planned memory has the same mem_id or something for them to be shareable?

Contributor Author

@lucylq lucylq Feb 26, 2026

@kimishpatel

This does mean that methods cannot be invoked in parallel as they may overwrite each others' arena. Document that explicitly

Yeah that's right. This isn't a currently supported feature (unsafe to run multiple methods in separate threads within a Module, even without sharing), but sharing makes it harder to support. I'll add an extra comment in module.h

Also if AOT memory planning accounted for it then should we not assert that the planned memory has the same mem_id or something for them to be shareable?

Initially I was thinking we only share physical buffers when mem_id=2 (buffer marked as shareable AoT).

However, @JacobSzwejbka's original diff D82329513 shares both, and I don't see any issues with that besides output tensors needing to be copied into permanent memory after running each method, which I think is OK (users already have to do this when running the same method twice).

Contributor

i was just wondering if we should enforce that only buffers that are on the same mem_id can share memory

Contributor Author

Yeah I was wondering that too. It seems fine to share all of it, and we get some memory savings as well, wdyt?

Contributor

i think mem_id has specific meaning so I think asserting for that is good.

@lucylq lucylq force-pushed the lfq.module-shared-buffer-test branch from 7794f42 to 31004a4 Compare February 26, 2026 02:19