JIT: Re-enable acceleration of Vector512<long>.op_Multiply #111832
Merged
Actually, I think this needs an `Avx512DQ` ISA check.
If we're treating any of the `Vector512` methods other than `IsSupported` as intrinsic, that implies we have the full baseline AVX-512 set (F, DQ, BW, CD, VL). It's a bit confusing because some of the import paths assert or check that, but most don't. I'm actually cleaning up some of those redundant asserts in a different branch now.
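To make the "full baseline set" point concrete, here is a minimal sketch of an all-or-nothing check. It is not the JIT's code: the flag constants and `hasBaselineAvx512` are hypothetical names invented for illustration, and the only thing taken from the comment above is the list of ISAs (F, DQ, BW, CD, VL).

```cpp
#include <cstdint>

// Hypothetical bit flags standing in for the JIT's instruction-set tracking.
constexpr std::uint32_t kAvx512F  = 1u << 0;
constexpr std::uint32_t kAvx512DQ = 1u << 1;
constexpr std::uint32_t kAvx512BW = 1u << 2;
constexpr std::uint32_t kAvx512CD = 1u << 3;
constexpr std::uint32_t kAvx512VL = 1u << 4;

// The baseline set named in the comment above: F, DQ, BW, CD, VL.
constexpr std::uint32_t kBaselineAvx512 =
    kAvx512F | kAvx512DQ | kAvx512BW | kAvx512CD | kAvx512VL;

// True only when every ISA in the baseline set is enabled. Under this model,
// disabling any single member (DQ, for example) fails the whole check, so a
// separate per-method DQ test would be redundant.
constexpr bool hasBaselineAvx512(std::uint32_t enabled)
{
    return (enabled & kBaselineAvx512) == kBaselineAvx512;
}
```

With a check like this made once at the top of the import path, a method that needs DQ instructions does not have to ask for DQ again.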
Ah ok, I thought that `Vector512.IsHardwareAccelerated` only relies on `AVX512F`, but it looks like `DOTNET_EnableAVX512DQ=0` turns it off, so it's ok.
Right, `IsHardwareAccelerated` 😄

The rule is that `Vector512.IsHardwareAccelerated` will return false unless all of the following are satisfied:

- DOTNET_PreferredVectorBitWidth: >= 512
- DOTNET_PreferredVectorBitWidth: < 512

The rule for whether `Vector512` methods actually import as intrinsic is only that we have the baseline AVX-512 set, meaning `IsHardwareAccelerated` may return false, but all methods may actually be accelerated anyway.

So the fact that we're importing the methods for `Vector512` as intrinsic in the first place means the ISA requirements have already been met.

`Vector128` and `Vector256` are a bit different, because the baseline ISA requirement may not be enough to accelerate all methods.

`Vector256.IsHardwareAccelerated` returns true only if AVX2 is supported, but we attempt to import methods as intrinsic as long as AVX is supported. Since many of the methods require AVX2 for acceleration, they have an extra check for AVX2 and then fall back to managed if it's not available. Hence all the `(simdSize != 32) || compOpportunisticallyDependsOn(InstructionSet_AVX2)` checks.

Similar checks are not included for `Vector128`, because the base requirement is SSE2, so almost all methods can be accelerated, minus a few that require SSE4.1 and check for it explicitly.

Clear as mud, I know...
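A rough model of the per-size rules described above may help. This is only a sketch under stated assumptions: the flag constants, `MethodNeeds`, `attemptsIntrinsicImport`, and `acceleratesMethod` are hypothetical names, the baseline AVX-512 set is collapsed into a single flag, and the real importer works per intrinsic rather than through one shared function. Only the shape of the `(simdSize != 32) || AVX2` condition is taken from the comment.

```cpp
#include <cstdint>

// Hypothetical ISA flags; the baseline AVX-512 set is treated as one unit.
constexpr std::uint32_t kIsaSse2           = 1u << 0;
constexpr std::uint32_t kIsaSse41          = 1u << 1;
constexpr std::uint32_t kIsaAvx            = 1u << 2;
constexpr std::uint32_t kIsaAvx2           = 1u << 3;
constexpr std::uint32_t kIsaAvx512Baseline = 1u << 4;

constexpr bool hasIsa(std::uint32_t enabled, std::uint32_t isa)
{
    return (enabled & isa) == isa;
}

// What a given method needs beyond the baseline for its vector size.
struct MethodNeeds
{
    bool avx2For256;  // the 32-byte form needs AVX2, not just AVX
    bool sse41For128; // the 16-byte form needs SSE4.1, not just SSE2
};

// Baseline required before intrinsic import is attempted at all.
constexpr bool attemptsIntrinsicImport(unsigned simdSize, std::uint32_t enabled)
{
    return (simdSize == 16) ? hasIsa(enabled, kIsaSse2)
         : (simdSize == 32) ? hasIsa(enabled, kIsaAvx)
         : (simdSize == 64) ? hasIsa(enabled, kIsaAvx512Baseline)
                            : false;
}

// False means "fall back to the managed implementation".
constexpr bool acceleratesMethod(unsigned simdSize, std::uint32_t enabled, MethodNeeds m)
{
    if (!attemptsIntrinsicImport(simdSize, enabled))
        return false;
    // Same shape as the check quoted above: only 32-byte vectors need AVX2 here.
    if (m.avx2For256 && !((simdSize != 32) || hasIsa(enabled, kIsaAvx2)))
        return false;
    // A few 16-byte methods check for SSE4.1 explicitly.
    if (m.sse41For128 && (simdSize == 16) && !hasIsa(enabled, kIsaSse41))
        return false;
    return true;
}
```

In this model the 64-byte case has no extra per-method check, which mirrors the point that reaching the `Vector512` import path already implies the ISA requirements are met.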