Added CUDA support for interpolation AA bilinear/bicubic #3842
Conversation
// Setup local buffers
scalar_t wx[256];
scalar_t wy[256];
scalar_t buffer1[256];
scalar_t buffer2[256];
There is an implicit assumption here that the resize scale factor is less than 127.
Otherwise, we could specify the dynamic shared memory size before launching the CUDA kernel, but we would then have to ensure that the device has enough shared memory...
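For reference, here is a rough sketch of that alternative. This is not the actual torchvision kernel: the kernel name, the uniform weights, and the `taps` estimate are all illustrative; only the dynamic shared memory sizing and the device-limit check are the point.

```cuda
#include <algorithm>
#include <cmath>
#include <cuda_runtime.h>

template <typename scalar_t>
__global__ void horizontal_pass_kernel(
    const scalar_t* in, scalar_t* out, int in_w, int out_w, float rwidth, int taps) {
  // Per-thread weight buffer carved out of dynamic shared memory; its size is
  // chosen by the host at launch time instead of a fixed `scalar_t wx[256]`.
  extern __shared__ unsigned char smem_raw[];
  scalar_t* wx = reinterpret_cast<scalar_t*>(smem_raw) + threadIdx.x * taps;

  int ox = blockIdx.x * blockDim.x + threadIdx.x;
  if (ox >= out_w) return;

  // Illustrative uniform weights; the real kernel evaluates the AA filter here.
  for (int k = 0; k < taps; ++k) {
    wx[k] = static_cast<scalar_t>(1) / taps;
  }

  int x0 = static_cast<int>(ox * rwidth);
  scalar_t acc = 0;
  for (int k = 0; k < taps; ++k) {
    acc += wx[k] * in[min(in_w - 1, x0 + k)];
  }
  out[ox] = acc;
}

template <typename scalar_t>
cudaError_t launch_horizontal_pass(
    const scalar_t* in, scalar_t* out, int in_w, int out_w,
    float rwidth, int interp_size) {
  // For antialiased downsampling the filter support grows with the scale,
  // so roughly interp_size * scale weights are needed per output pixel.
  int taps = interp_size * std::max(1, static_cast<int>(std::ceil(rwidth)));

  const int threads = 128;
  const int blocks = (out_w + threads - 1) / threads;
  size_t smem_bytes = static_cast<size_t>(threads) * taps * sizeof(scalar_t);

  // "Ensure that the device has enough shared memory": query the per-block
  // limit and bail out (or fall back to another path) before launching.
  int device = 0;
  int max_smem = 0;
  cudaGetDevice(&device);
  cudaDeviceGetAttribute(&max_smem, cudaDevAttrMaxSharedMemoryPerBlock, device);
  if (smem_bytes > static_cast<size_t>(max_smem)) {
    return cudaErrorInvalidConfiguration;
  }

  horizontal_pass_kernel<scalar_t><<<blocks, threads, smem_bytes>>>(
      in, out, in_w, out_w, rwidth, taps);
  return cudaGetLastError();
}
```

Note that requesting more than the default 48 KB of dynamic shared memory per block would additionally require opting in via `cudaFuncSetAttribute` with `cudaFuncAttributeMaxDynamicSharedMemorySize` on devices that support it.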
This is fine for now as long as we raise an error when this configuration is not satisfied, which is what we currently do.
We will need to fix this for PyTorch, though.
Thanks for the quick PR, Victor!
The PR looks good to merge; I just have one question first. Can you have a look and let me know what you think?
TORCH_CHECK(rheight < 127, "Max supported scale factor is 127");
TORCH_CHECK(rwidth < 127, "Max supported scale factor is 127");
Is the maximum supported scale factor different if we use bicubic?
Thanks for catching that; yes, for bicubic the max scale factor should be at most 63. In general the check should be:
TORCH_CHECK(rheight < 255 / interp_size, "Max supported scale factor is 127 (bilinear), 63 (bicubic)");
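For context, a small sketch of how the generalized check could be wired up; the `antialias_bicubic` flag is made up, while `rheight`, `rwidth`, and the error message come from the diff above.

```cuda
// interp_size is the number of filter taps per axis: 2 for bilinear, 4 for bicubic.
// The 256-element per-thread buffers hold roughly interp_size * scale weights,
// so the supported scale factor is bounded by 256 / interp_size.
const int interp_size = antialias_bicubic ? 4 : 2;  // hypothetical flag
TORCH_CHECK(rheight < 255 / interp_size && rwidth < 255 / interp_size,
    "Max supported scale factor is 127 (bilinear), 63 (bicubic)");
```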
Force-pushed from c76e8fa to 4e43b5c.
LGTM, thanks!
Can you rebase the PR so that we can proceed with merging it?
Summary:
* Added CUDA support for interpolation AA bilinear/bicubic
* Fixed code formatting error
* Fixed cuda tests on cpu-only builds
* Fixed cuda internal torch check for bicubic mode

Reviewed By: cpuhrsch
Differential Revision: D28538753
fbshipit-source-id: 8b6c16db07a05fdba4b35cb8f5ec077b50691fb7
Description: