[OneDNN]Fix ocr error since pass avx512 command #64132
Your PR was submitted successfully. Thank you for your contribution to the open source project!
related issue: #11597
#ifdef PADDLE_WITH_DNNL
  if (!phi::backends::cpu::MayIUse(
          phi::backends::cpu::cpu_isa_t::avx512_core)) {
The restriction here is a bit too strict: self_dp_attention_kernel mostly uses the AVX512F ISA, but this condition also blocks the pass on machines that support only AVX512F.
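The difference between the two checks can be sketched with a small model. This is only an illustration under stated assumptions: the feature bits, `MayIUseModel`, and `ShouldSkipPass` are hypothetical names, not Paddle's or oneDNN's API, and `avx512_core` is assumed to require AVX512F plus the BW, VL, and DQ extensions.

```cpp
#include <cstdint>

// Hypothetical feature bits for illustration only; a real build would
// query the CPU at runtime instead of passing a mask around.
enum CpuFeature : std::uint32_t {
  kAvx2 = 1u << 0,
  kAvx512F = 1u << 1,
  kAvx512BW = 1u << 2,
  kAvx512VL = 1u << 3,
  kAvx512DQ = 1u << 4,
};

// The two ISA levels discussed in the review comment.
enum class Isa { avx512f, avx512_core };

// Assumption: avx512_core needs F + BW + VL + DQ, while avx512f needs F only.
constexpr std::uint32_t kCoreMask =
    kAvx512F | kAvx512BW | kAvx512VL | kAvx512DQ;

// Simplified stand-in for a MayIUse-style query on a given feature mask.
constexpr bool MayIUseModel(std::uint32_t features, Isa isa) {
  return isa == Isa::avx512f ? (features & kAvx512F) != 0
                             : (features & kCoreMask) == kCoreMask;
}

// The pass is skipped when the required ISA level is not available.
constexpr bool ShouldSkipPass(std::uint32_t features, Isa required) {
  return !MayIUseModel(features, required);
}

// A machine with only AVX2 + AVX512F is blocked by the avx512_core guard
// but allowed by the avx512f guard.
static_assert(ShouldSkipPass(kAvx2 | kAvx512F, Isa::avx512_core),
              "avx512_core guard blocks an AVX512F-only machine");
static_assert(!ShouldSkipPass(kAvx2 | kAvx512F, Isa::avx512f),
              "avx512f guard admits an AVX512F-only machine");
```

In this model, guarding on `avx512f` still skips the pass on AVX2-only machines, which is the case that triggered the illegal instruction error.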
Force-pushed from a2715c9 to 03877ef.
Thanks for the fix. Skipping the AVX512-implemented pass on non-AVX512 platforms will solve the illegal instruction issue. I am wondering about the expected performance on non-AVX512 platforms, since we have issues reporting unacceptably low speed on such platforms, e.g. PaddlePaddle/PaddleOCR#10346.
LGTM
Hi @jzhang533, as I mentioned in the issue, this is actually the same root cause behind the unexpected behavior. AMD seems to have its own mechanism for dealing with an incompatible higher ISA (which also makes the root cause hard to find, since the error is not reported directly). As for performance on non-AVX512 platforms: if we build and install Paddle on a non-AVX512 machine, the performance will be as expected, meaning it will be better with oneDNN/mkldnn enabled.
* fix ocr avx error since pass avx512 command
* replace 512 core to 512f, limit restrict
* skip avx2 in new pir pass
* fix
PR Category
Others
PR Types
Bug fixes
Description
Fix ocr error since pass avx512 command