Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CUDA] Support memory reuse for dynamic shared memory #9341

Merged
merged 7 commits into from
Oct 31, 2021

Conversation

jinhongyii
Copy link
Contributor

This PR applies part of the algorithm of storage rewrite to detect memory reuse possibility for dynamic shared memory, and does rewrite to the body. This functionality is important in matmul because we may want to allocate the shared memory for data fetch and the shared memory for write back in the same location.

@jinhongyii
Copy link
Contributor Author

cc: @junrushao1994 @vinx13 @masahi

@masahi
Copy link
Member

masahi commented Oct 21, 2021

Hmm, for reuse on the constant-size dynamic shmem, I thought the existing storage_write pass would do the job, #8571 (comment). I wonder why it is not sufficient?

@vinx13
Copy link
Member

vinx13 commented Oct 21, 2021

I think this is to support reusing heterogenous data type allocs

@jinhongyii
Copy link
Contributor Author

jinhongyii commented Oct 22, 2021

@masahi

  1. storage_rewrite can't handle buffer with different dtypes
  2. storage_rewrite can't allocate a buffer in the place of another 2 buffers. For example, you can take a look at https://github.com/jinhongyii/tvm/blob/dyn-shared-mem/tests/python/unittest/test_tir_transform_merge_dynamic_shared_memory_allocations.py#L80-L104
    C_sh's allocation reuses the memory of A_sh and B_sh. However, if we use the original algorithm of storage rewrite, it can only allocate C_sh in the place of A_sh or B_sh. The result allocation size would be block * block * 6

@masahi
Copy link
Member

masahi commented Oct 22, 2021

Thanks @jinhongyii, that you are able to reduce the alloc size from block * block * 3 * 4 to block * block * 4 in the test is very cool!

@junrushao
Copy link
Member

Let's double check and get it in if there is no further issue :-)

@jinhongyii
Copy link
Contributor Author

@vinx13 @Hzfengsy comments are all addressed. Please take another look.

Copy link
Member

@vinx13 vinx13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise Lgtm

Copy link
Member

@Hzfengsy Hzfengsy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jinhongyii

@Hzfengsy Hzfengsy merged commit a6e90b9 into apache:main Oct 31, 2021
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 7, 2022
* reuse shared dyn

* minor

* format

* format

* address comment and fix

* address comment

* address comment
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022
* reuse shared dyn

* minor

* format

* format

* address comment and fix

* address comment

* address comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants