4 changes: 2 additions & 2 deletions include/mirage/persistent_kernel/runtime_header.h
Member

Maybe this is because __CUDA_ARCH__ wasn't passed into this header file.

Collaborator Author

@NorthmanPKU NorthmanPKU Jul 24, 2025

It seems __CUDA_ARCH__ is only defined while device code is being compiled, so it can only be used inside the implementation of GPU functions.

Collaborator Author

@NorthmanPKU NorthmanPKU Jul 24, 2025

I think we could call cudaGetDeviceProperties on the host side to query and set the maximum shared memory size, and use __CUDA_ARCH__ on the device side.
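A rough sketch of what that split might look like, in case it helps the discussion. Everything named here (`SMEM_HD`, `smem_budget_for_arch`, `host_smem_budget`, `device_smem_budget`) is hypothetical and not part of this PR. The table only uses the two values visible in the hunk (160 KB for arch >= 800, 96 KB otherwise); the branch that yields 224 KB is outside the visible hunk, so it is left out. The CUDA-specific parts are guarded by `__CUDACC__` so the file still compiles as plain C++.

```cpp
// Sketch only: all names below are hypothetical and not taken from the PR.
#if defined(__CUDACC__)
#include <cuda_runtime.h>
#define SMEM_HD __host__ __device__
#else
#define SMEM_HD
#endif

// Maps a compute capability in __CUDA_ARCH__ encoding (sm_80 -> 800) to a
// shared-memory budget in bytes. Only the two values visible in the diff
// hunk are used; other branches of the real header are omitted.
SMEM_HD constexpr int smem_budget_for_arch(int arch) {
  return arch >= 800 ? 160 * 1024 : 96 * 1024;
}

#if defined(__CUDACC__)
// Host side: ask the runtime which device is present instead of relying on
// __CUDA_ARCH__, which is undefined during the host compilation pass.
inline int host_smem_budget(int device) {
  cudaDeviceProp prop;
  if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
    return smem_budget_for_arch(0);  // conservative fallback on error
  }
  // prop.major / prop.minor give the compute capability, e.g. 8 and 0.
  return smem_budget_for_arch(prop.major * 100 + prop.minor * 10);
}

// Device side: __CUDA_ARCH__ is defined while nvcc compiles device code,
// so the macro can be read inside a __device__ function body.
__device__ inline int device_smem_budget() {
#if defined(__CUDA_ARCH__)
  return smem_budget_for_arch(__CUDA_ARCH__);
#else
  return smem_budget_for_arch(0);  // host pass; this body never runs there
#endif
}
#endif
```

Routing both sides through one table keeps the host and device values in agreement for a given GPU. An alternative on the host would be to read `prop.sharedMemPerBlockOptin` directly, though that reports the hardware limit, which may not match the rounded constants in this header.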

@@ -26,8 +26,8 @@ constexpr int MAX_SHARE_MEMORY_SIZE = 224 * 1024;
 constexpr int MAX_SHARE_MEMORY_SIZE = 96 * 1024;
 #elif defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 800
 constexpr int MAX_SHARE_MEMORY_SIZE = 160 * 1024;
-#else
-constexpr int MAX_SHARE_MEMORY_SIZE = 96 * 1024;
+#else // TODO: This always fall into else case
+constexpr int MAX_SHARE_MEMORY_SIZE = 160 * 1024;
 #endif
 
 typedef unsigned long long int TaskId;