-
-
Notifications
You must be signed in to change notification settings - Fork 6.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BugFix][Kernel] Fix Illegal memory access in causal_conv1d in H100 #9838
[BugFix][Kernel] Fix Illegal memory access in causal_conv1d in H100 #9838
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀 |
update, and small fix in causal_conv1d kernel Signed-off-by: mzusman <mor.zusmann@gmail.com>
f6947db
to
162361f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix!
Verified that pytest tests/kernels/test_causal_conv1d.py
was hitting an illegal memory access, and that's now fixed on an H100
if ((offset + kWidth - 2) >= kNElts){ | ||
// do not load to index 1 if we're not gonna read from there | ||
reinterpret_cast<vec_t *>(x_vals_load)[1] = smem_exchange[last_thread + 1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain this a bit more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, added more explanation
@mzusman, is
|
between 2 chunks Signed-off-by: mzusman <mor.zusmann@gmail.com>
I've dug into those 2 failures, and it seems to fail on a specific edge case where the final state data is split between the two iterations in the kernel (in the most of the cases the data ends up only on the last I guess it was overlooked previously because it fails on specific on sequence lengths that depend on the chunk size, while our tests generate sequence lengths randomly (for the varlen part, more specifically). |
Signed-off-by: mzusman <mor.zusmann@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests are green for me now, thanks for the additional bug fix!
…_h100 Signed-off-by: mzusman <mor.zusmann@gmail.com>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com> Signed-off-by: Linkun Chen <github+anyscale@lkchen.net>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com> Signed-off-by: Loc Huynh <jc1da.3011@gmail.com>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com> Signed-off-by: Sumit Dubey <sumit.dubey2@ibm.com>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com> Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com>
…llm-project#9838) Signed-off-by: mzusman <mor.zusmann@gmail.com> Signed-off-by: s.kochetkov <s.m.kochetkov@tcsbank.ru>
test_selective_state_update_with_batch_indices
and match them to the constraints intest_selective_state_update_with_heads_with_batch_indices
- Without this change, it also fails in H100 as apparently the diff is a little higher there.CC @tlrmchlsmth