[CUDA] Make PTXAS optimisation default to -O3 #5188
The changes LGTM as such; that said, would it be possible to commit the patch directly to LLORG?
Review requested in LLORG: https://reviews.llvm.org/D116583
@tra in LLORG has said:
I have provided our rationale for changing it in the review, but it seems that this needs a bit more discussion if we want the change to be made in LLORG. Let me know your thoughts. Feel free to give your opinion in https://reviews.llvm.org/D116583
@intel/dpcpp-clang-driver-reviewers, what is your opinion?
In general, I'm of the opinion that default optimization levels can be set to whatever we feel is proper for our product. Use of any disabling option like |
This behavior is unchanged by this PR. The only difference is the default ptxas optimization level when no -O level is specified. We thought this change would be beneficial because of a few bugs we encountered in ptxas/ptxjitcompiler at different opt levels. When no opt value was provided, these bugs caused JIT errors but no offline ptxas errors. These kinds of errors are more common than you'd think, and having different opt levels for ptxjitcompiler and ptxas makes them harder to track down.
Previously the PTX optimization defaulted to -O0.
The ptxjitcompiler defaults to -O3, so this change makes the optimization levels of ahead-of-time and JIT ptxas compilation the same.
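As a rough illustration of the behavior described above, here is a minimal sketch of how the ptxas flag could be chosen. This is not the actual clang driver code: `ptxasOptFlag` and its parameter are hypothetical names, and forwarding an explicit level unchanged is a simplifying assumption. Only the fallback (previously -O0, now -O3) reflects this PR.

```cpp
#include <optional>
#include <string>

// Hypothetical helper for illustration only; not the clang driver implementation.
// `userLevel` holds the level from a user-supplied -O<level> flag,
// or std::nullopt when no -O flag was passed on the command line.
std::string ptxasOptFlag(const std::optional<std::string>& userLevel) {
    if (userLevel) {
        // An explicit level is forwarded as-is (simplifying assumption;
        // a real driver may remap some levels before invoking ptxas).
        return "-O" + *userLevel;
    }
    // No -O flag given: default to -O3 so ahead-of-time ptxas matches
    // the ptxjitcompiler default. Before this change the fallback was -O0.
    return "-O3";
}
```

With this sketch, `ptxasOptFlag(std::nullopt)` yields `-O3`, while an explicit `-O0` from the user is still passed through as `-O0`.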