-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Optimize UR 2 PI convert #8737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Another option might be to implement a small |
This seems to end up doing linear search, where as "switch" will likely end up in an indirect access, which is faster. |
Yes, but then we can also optimize the pattern inside the compiler as it would be pretty straightforward. |
I don't see how it could potentially do better than a "switch" |
@@ -70,7 +81,7 @@ class ConvertHelper : public ReturnHelper { | |||
public: | |||
// Convert the value using a conversion map | |||
template <typename TypeUR, typename TypePI> | |||
pi_result convert(const std::unordered_map<TypeUR, TypePI> &Map) { | |||
pi_result convert(std::function<std::optional<TypePI>(TypeUR)> Func) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need the optional
stuff at all? A conversion should always be defined, and we should die
at line 99 if it is not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I see, you return std::nullopt when there is no conversion defined, and then die if it's conversion to bool is "false". Why not just die directly in the conversion switches directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved die as default in switch statement and remove std::optional
It could be exactly as performant as switch but more concise in the actual "mapping" part: https://godbolt.org/z/7zcrq3Wzb |
I don't feel strongly about it and will let PR author to decide. But IMHO paying (presumably) lots of compile-time for (a little) better readability is not worth it. Thanks for pointing this anyways! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smaslov-intel @LukaszJobczyk
could we not merge this? we are in the process of merging the UR L0, and this is very big change that would force to do a big rebase and fix conflicts. Let's do it after it.
dcf167b
to
be0efaf
Compare
be0efaf
to
4653e7f
Compare
06e682f
to
9064adc
Compare
Thanks for video about constexpr maps! |
Why? As I see you are adding new enums to that file, what is the difference between adding to map and adding to switch/case? |
9064adc
to
216054d
Compare
216054d
to
b8e49ae
Compare
b8e49ae
to
d32968f
Compare
8bacdc2
to
2397300
Compare
2397300
to
27fe1c1
Compare
this PR has this test failing:
but saw it in another PR and for me it fails also with baseline. Do you know of any issues with that test? |
It also fails in CUDA test suite |
We have #8847 for it.
Does it happen all the time or is it flaky? |
Today, it happened in every force push |
same for me. it is very deterministic. |
same it is happening with:
failing also with CUDA. do we have a known issue for this also? |
Seems like it is being fixed in #8860 |
27fe1c1
to
3c5ab9a
Compare
3c5ab9a
to
4785706
Compare
@LukaszJobczyk, could you resolve merge, conflicts, please? |
4785706
to
a2b3a8e
Compare
Done |
It looks like, some other PR went before this PR and we have another conflict. @LukaszJobczyk, could you take a look, please? |
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
a2b3a8e
to
3c787ee
Compare
Done |
Signed-off-by: Lukasz Jobczyk lukasz.jobczyk@intel.com