Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CherryTrail that doesn't indicate xmm os support #92

Closed
wants to merge 2 commits into from
Closed

Support CherryTrail that doesn't indicate xmm os support #92

wants to merge 2 commits into from

Conversation

psiegl
Copy link
Contributor

@psiegl psiegl commented Jul 23, 2019

There was already a debate in #4 if cpus that do not set the XCR0 properly related to the xmm os support should be supported or not.
I stumpled across a CherryTail Atom that explicitly has this issue: XCR0 bit for xmm not set, but providing SSE* register set.

Now, on the one hand the debate above depicted "won't fix" as there won't be that many cpus with this issue. This is odd, because on the front page of https://github.com/google/cpu_features it is clearly depicted that a dev needs to take care about poor hardware implementations (example with SNB and AVX) in his/her own code. But in this case cpu_features tries to take control in case the bit in XCR0 for xmm os support is (not) set. I'd suggest to make the XCR0 xmm os support bit explicit to the dev so he/she can decide as well?
Hence, this patch removes so far the burden of not properly indicating SSE* on cpus that didn't set the XCR0 register properly, but still support SSE*.

@Mizux
Copy link
Collaborator

Mizux commented Jan 29, 2020

I feel it overkill to purely suppress the XCR0 xmm check, why don't creating a white list where we disable the XCR0 check if we know these architectures are known to no set this bit ?

something like this ?

bool force_sse_os_support = false;
// ATOM platform support sse but don't set the XCR0 bit...
if (GetX86Microarchitecture(&info) == X86Microarchitecture::INTEL_ATOM_SMT) {
  force_sse_os_support = true;
}
....
const bool have_sse_os_support = HasXmmOsXSave(xcr0_eax) || force_sse_os_support;

@psiegl
Copy link
Contributor Author

psiegl commented Jan 29, 2020

Sounds good to me. I just can not test it right now, but code looks good. Would you mind creating a patch?

gchatelet added a commit that referenced this pull request Oct 9, 2020
Fixes #4. This is based on #115 with a few modifications:
 - Removed use of __builtin_cpu_supports since it relies on cpuid and doesn't improve on the current situation,
 - Added detection for all of sse, sse2, sse3, ssse3, sse4_1 and sse4_2,
 - Added tests for Atom, Nehalem, and P3 processors,

Thx to @gadoofou87 for providing the original PR.
It also removes the need for #92

* Fix SSE detection on non-AVX CPUs
* Fixes typo
* Mock OSX sysctlbyname in tests
* Also update other tests
* FakeCpu is reset between each tests
* Fix conflicting name on Windows
* Disable pre AVX cpu sse detection tests on Windows
* Guard OS specific code with macros
* Fix missing import for tests
* Fix wrong function prototype
* Fix wrong mocking of P3 on Windows
* Completely guard OS specific parts in x86 tests
* Store DWORD instead unsigned long for x86 tests
@gchatelet
Copy link
Collaborator

This is now fixed by #135

@gchatelet gchatelet closed this Oct 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants