[Relax] add sample_indices in sampling #16675

Merged: 1 commit, Mar 5, 2024

yongwww (Member) commented Mar 5, 2024

Add `sample_indices` to `multinomial_from_uniform` and `sample_top_p_top_k_from_sorted_prob`.
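For readers unfamiliar with the argument, here is a rough pure-Python sketch of what a `sample_indices` input could mean for multinomial sampling. It is not the TVM kernel: the function name `multinomial_from_uniform_ref` is hypothetical, the inputs are flat lists rather than the tensors the real op takes, and the assumed semantics are that `sample_indices[i]` picks which probability row the i-th uniform draw samples from.

```python
from bisect import bisect_right
from itertools import accumulate


def multinomial_from_uniform_ref(prob, uniform_sample, sample_indices):
    """Hypothetical reference, not the TVM implementation.

    prob:            list of probability rows, one per distribution
    uniform_sample:  list of n uniform draws in [0, 1)
    sample_indices:  list of n row indices (assumed meaning: which row
                     of `prob` the i-th draw samples from)
    Returns a list of n sampled token ids.
    """
    tokens = []
    for u, row in zip(uniform_sample, sample_indices):
        # Cumulative distribution of the selected row only.
        cdf = list(accumulate(prob[row]))
        # First token whose cumulative probability exceeds the draw.
        token = bisect_right(cdf, u)
        # Guard against rounding when the row sums to slightly under 1.
        tokens.append(min(token, len(cdf) - 1))
    return tokens


prob = [[0.1, 0.9], [0.8, 0.2]]
# Two draws from distribution 0, one from distribution 1.
tokens = multinomial_from_uniform_ref(prob, [0.05, 0.5, 0.5], [0, 0, 1])
```

Under these assumptions, the number of samples no longer has to equal the number of distributions, so any per-distribution buffer must be addressed through `sample_indices` rather than by the sample's own position.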

yongwww (Member, Author) commented Mar 5, 2024

cc: @MasterJH5574 @tqchen @vinx13

MasterJH5574 (Contributor) left a comment

LGTM. Thank you so much @yongwww!

@tqchen tqchen merged commit fe5a350 into apache:main Mar 5, 2024
19 checks passed
@yongwww yongwww deleted the sample_indices branch March 5, 2024 17:57
MasterJH5574 added a commit to MasterJH5574/tvm that referenced this pull request Mar 12, 2024
This PR fixes a typo in the sampling kernel of the top-p/top-k sampling op.
Prior to this PR, the kernel had an out-of-bounds global memory access
due to an oversight when introducing `sample_indices` in apache#16675.

Running the test directly or through pytest did not reveal this issue.
However, running it under NVIDIA's compute-sanitizer reports the illegal
memory access:
```
> compute-sanitizer --tool memcheck --print-limit=5 --launch-timeout 3600 python tests/python/relax/test_frontend_nn_op.py
========= COMPUTE-SANITIZER
========= Invalid __global__ read of size 8 bytes
=========     at 0x4e90 in get_index_from_sorted_kernel
=========     by thread (7,0,0) in block (0,0,0)
=========     Address 0x7fe35ac00238 is out of bounds
=========     and is 9 bytes after the nearest allocation at 0x7fe35ac00200 of size 48 bytes
=========     Saved host backtrace up to driver entry point at kernel launch time
...
```
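The numbers in the report above can be decoded with a little arithmetic. The sketch below is only an illustration: the helper `decode_oob_read` is hypothetical, and it assumes the buffer's elements are the same width as the 8-byte read. The final interpretation (a thread using its own index where an indirection was needed) is a guess consistent with the log, not something the log states.

```python
def decode_oob_read(address, alloc_base, alloc_size, access_size):
    """Hypothetical helper: locate a reported read relative to its allocation.

    Assumes elements are `access_size` bytes wide (an assumption; the
    report only gives the size of the read, not the element type).
    """
    offset = address - alloc_base
    return {
        "offset_bytes": offset,
        "element_index": offset // access_size,
        "num_elements": alloc_size // access_size,
        "in_bounds": offset + access_size <= alloc_size,
    }


# Values copied from the compute-sanitizer output above.
report = decode_oob_read(
    address=0x7FE35AC00238,
    alloc_base=0x7FE35AC00200,
    alloc_size=48,
    access_size=8,
)
```

With these values the read starts 56 bytes into a 48-byte allocation, which corresponds to element 7 of a 6-element buffer. The faulting thread is also thread 7, which is what one would expect if the thread index were used directly as the array index.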
tqchen pushed a commit that referenced this pull request Mar 12, 2024
Lunderberg pushed a commit to Lunderberg/tvm that referenced this pull request Mar 12, 2024
thaisacs pushed a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024