Recent benchdnn matmul tolerance change is failing on aarch64 #2089
Comments
The rationale behind changing the threshold is that the matmul op uses integer filling and should provide a precise answer on precise inputs. If there are features that affect that statement, the threshold is adjusted accordingly. Based on the output value, it looks like cancellation is happening. The first thing to try would be changing the inexact 2.25 to an exact 2 or 4; that will likely resolve the problem. If it helps, I'll proceed with purging this scale value from the input files. If it doesn't, I will need your help to figure out where exactly the difference is coming from. Thanks.
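To illustrate why a power-of-two scale is "exact" while 2.25 is not, here is a minimal Python sketch. It rests on an assumption that is not confirmed anywhere in this thread: that one code path divides by the destination scale while another multiplies by a precomputed float32 reciprocal. The value 2.25 itself is representable in float32, but its reciprocal 1/2.25 is not, so the two paths can round differently. The helper names (`f32`, `scale_by_division`, `scale_by_reciprocal`, `mismatches`) are hypothetical and are not part of oneDNN or benchdnn.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def scale_by_division(acc, scale):
    # Reference-style path (assumed): divide the accumulator by the scale.
    return f32(f32(acc) / f32(scale))

def scale_by_reciprocal(acc, scale):
    # Hypothetical optimized path: multiply by a precomputed float32 reciprocal.
    inv = f32(1.0 / f32(scale))
    return f32(f32(acc) * inv)

def mismatches(scale, accumulators):
    """Return the accumulators for which the two paths disagree."""
    return [a for a in accumulators
            if scale_by_division(a, scale) != scale_by_reciprocal(a, scale)]

if __name__ == "__main__":
    accs = range(-100, 101)  # small integer accumulators, as with integer filling
    for scale in (2.0, 2.25, 4.0):
        print(scale, mismatches(scale, accs)[:5])
```

With scales of 2 or 4 the reciprocal is itself a power of two, so both paths produce identical results for every integer accumulator in the range. With 2.25 some accumulators (17, for instance) differ by one float32 ulp, and a later subtraction of nearly equal values could turn such a difference into a visible mismatch against an exact reference. Whether this is the actual mechanism behind the reported failure is not established here.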
Thanks for replying so quickly. Your suggestion seems spot-on, as changing the parameter to
@dzarukin Were you able to look into purging the test cases? Thanks.
I see, thanks for the update
Apologies for the bother but I just wanted to follow up on this since it keeps getting flagged in our internal CI.
This commit + this commit should resolve the issue.
Summary
The recent change introduced by a8b478b causes a failure on aarch64.
Version
oneDNN v3.7.0 (commit a8b478b)
Environment
flags:
fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs paca pacg dcpodp svei8mm svebf16 i8mm bf16 dgh rng
linux-6.5.0 22.04.1-Ubuntu
gcc-10, g++-10
3.22.1
CXX=g++-10 CC=gcc-10 cmake .. -DCMAKE_BUILD_TYPE=Release -DDNNL_AARCH64_USE_ACL=1 -DDNNL_BUILD_FOR_CI=ON -DDNNL_TEST_SET=NIGHTLY -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DONEDNN_BUILD_GRAPH=0 -DDNNL_ENABLE_JIT_PROFILING=0 -DDNNL_OMP_RUNTIME=1
Steps to reproduce
Failure only appears in Release build.
ONEDNN_VERBOSE=all ./build/tests/benchdnn/benchdnn --matmul --skip-impl=ref --dt=s8:s8:f32 --stag=ab --wtag=ab --dtag=ab --bia_dt=u8 --attr-scales=src:common:0.25+dst:common:2.25+wei:common:0.5 --attr-zero-points=src:common:1+dst:common:2+wei:common:-1 --attr-post-ops=sum 1x30:30x20
Observed behavior
Expected behavior
The returned value of 0 seems reasonably close to the expected 1.49012e-08. Could you share the rationale behind changing the threshold? Thank you. @dzarukin
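A small Python sketch of why a result of 0 against an expected 1.49012e-08 can fail a check even though the two look close. This is a generic absolute-versus-relative comparison, not benchdnn's actual comparison code, and the function names are hypothetical.

```python
def abs_diff(got, expected):
    """Absolute difference between the computed and the expected value."""
    return abs(got - expected)

def rel_diff(got, expected):
    """Difference relative to the larger magnitude (0 if both are zero)."""
    denom = max(abs(got), abs(expected))
    return 0.0 if denom == 0.0 else abs(got - expected) / denom

if __name__ == "__main__":
    got, expected = 0.0, 1.49012e-08  # values quoted in this issue
    print("absolute diff:", abs_diff(got, expected))
    print("relative diff:", rel_diff(got, expected))
```

The absolute difference is tiny, but the relative difference is 1 (a 100% error), so any check with a relative component, or with a zero threshold that expects exact results on integer-filled inputs, would flag this point.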