Improve DPC++ compilation times #243
This includes two changes that should improve CTS compilation times (intel/llvm#5127, intel/llvm#5178).
Co-authored-by: Alexey Bader <alexey.bader@intel.com>
Off the top of my head, here is where the hotspots currently are:
Even combined, all these tricks are unlikely to make a huge difference. Maybe some caching is possible here? We don't have to uplift compilers every day. And most tests don't change that often.
I would argue with that: 10 minutes is acceptable; 1 hour is half of the time it takes to build the compiler toolchain.
In an ideal world... But you're probably right, we should aim higher.
That is a good point. Maybe this is all moot if we simply add |
Updated to include recent changes. Unfortunately this run won't benefit from ccache, as the new compiler version invalidates the cache. Since that workflow now also builds the container image itself, I expect a very long run ;-).
I guess you don't need to build the compiler too often. You can keep a fixed container tag for a couple of weeks and have some automation to uplift it and warm up the cache.
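To illustrate why a compiler uplift invalidates the cache while a pinned container tag keeps it warm, here is a rough sketch of a compiler-cache key. It is a simplified model, not ccache's actual algorithm (ccache hashes more inputs), and the function name `cache_key` and the identifier `dpcpp-newer` are hypothetical, chosen only for illustration.

```python
import hashlib
from typing import Sequence


def cache_key(compiler_id: str, flags: Sequence[str], preprocessed_source: str) -> str:
    """Hypothetical, simplified cache key: hash of compiler identity, flags and source."""
    h = hashlib.sha256()
    for part in (compiler_id, "\0".join(flags), preprocessed_source):
        h.update(part.encode("utf-8"))
        h.update(b"\x00\x01")  # field separator so adjacent fields cannot run together
    return h.hexdigest()


source = "int main() { return 0; }"
flags = ["-fsycl", "-O2"]

# Same pinned compiler revision on two CI runs: the key is identical, so the cache hits.
pinned = cache_key("dpcpp-ec97c57", flags, source)
rebuilt_same = cache_key("dpcpp-ec97c57", flags, source)

# Uplifted compiler (hypothetical identifier): the key changes, so every entry misses.
uplifted = cache_key("dpcpp-newer", flags, source)

print(pinned == rebuilt_same)
print(pinned == uplifted)
```

Under this model, unchanged tests keep hitting the cache for as long as the container tag stays fixed, and a full rebuild is only paid once per uplift.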
Hmm.. It seems like the newer revision of DPC++ again broke |
Add `multi_ptr` back to DPC++ CI filter.
Never mind, turns out I introduced a bug during the previous back-merge. Everything seems to work now, so I'll go ahead and merge!
Moving discussions from #234 over to here. While DPC++ can now compile all but 5 test categories, we haven't removed them from the CI filter thus far because compilation times become rather long (over 2 hours). While we can probably still improve things somewhat on the CTS side (by reducing the number of template instantiations in non-extensive mode), there also appears to be some optimization potential on DPC++'s side. I've updated the CI container image to a newer version of DPC++ (ec97c57) that includes two recent performance improvements (intel/llvm#5127, intel/llvm#5178) and have also set the `__SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING__` macro, as suggested in #234.
Let's use this PR to keep track of improvements over time, until we hopefully get compilation times into an acceptable range (which I would say is around 1 hour).
cc @bader @alexbatashev