Please check the CI results:
attempt 1
attempt 2
Please use branch wangjial/benchgc_op
and see https://github.com/intel/graph-compiler/blob/wangjial/benchgc_op/scripts/correctness.sh for details
batch_matmul fail:
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ghrunner/.local/lib/python3.10/site-packages/benchgc/__main__.py", line 262, in <module>
    engine = compiler.compile_and_jit(module)
  File "/home/ghrunner/.local/lib/python3.10/site-packages/gc_mlir/graph_compiler.py", line 47, in compile_and_jit
    self.compile(module, ir_printing)
  File "/home/ghrunner/.local/lib/python3.10/site-packages/gc_mlir/graph_compiler.py", line 37, in compile
    pm.run(module.operation)
gc_mlir._mlir_libs._site_initialize.<locals>.MLIRError: Failure while executing pass pipeline:
error: unknown: 'linalg.transpose' op dim(result, 0) = 4 doesn't match dim(input, permutation[0]) = 64
note: unknown: see current operation:
%15 = "linalg.transpose"(%10, %14) <{permutation = array<i64: 1, 0, 2>}> ({
^bb0(%arg9: f32, %arg10: f32):
"linalg.yield"(%arg9) : (f32) -> ()
}) : (tensor<16x64x64xf32>, tensor<4x16x16xf32>) -> tensor<4x16x16xf32>
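To make the verifier message easier to read, here is a minimal Python sketch of the shape rule it refers to: for each result dimension `i`, `dim(result, i)` must equal `dim(input, permutation[i])`. The function name `transpose_shape_errors` is hypothetical and this is not the MLIR verifier itself; the shapes and permutation are copied from the failing op above.

```python
def transpose_shape_errors(input_shape, result_shape, permutation):
    """Return one message per mismatching dimension; an empty list means the shapes agree."""
    errors = []
    for i, p in enumerate(permutation):
        # Result dimension i must have the size of input dimension permutation[i].
        if result_shape[i] != input_shape[p]:
            errors.append(
                f"dim(result, {i}) = {result_shape[i]} doesn't match "
                f"dim(input, permutation[{i}]) = {input_shape[p]}"
            )
    return errors


# Shapes and permutation taken from the failing linalg.transpose op.
errors = transpose_shape_errors((16, 64, 64), (4, 16, 16), (1, 0, 2))
for message in errors:
    print(message)
```

Under this rule, a 16x64x64 input with permutation `[1, 0, 2]` would need a 64x16x64 result, so the 4x16x16 init tensor looks like it was sized for a different (possibly tiled) input than the one the transpose actually receives.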
matmul fail:
(0, 4): ref: 72393.0000000 res: -35198.0312500 abs_diff: 107591.0312500 rel_diff: 1.4862076
(0, 8): ref: 3704.0000000 res: 307348073856968022527436828704768.0000000 abs_diff: 307348073856968022527436828704768.0000000 rel_diff: 82977343712344206321601478656.0000000
(0, 9): ref: 52705.0000000 res: 53402.8203125 abs_diff: 697.8203125 rel_diff: 0.0132401
(0, 10): ref: -62068.0000000 res: 4360480028519042306210267136.0000000 abs_diff: 4360480028519042306210267136.0000000 rel_diff: 70253271883218220482560.0000000
(0, 11): ref: -30467.0000000 res: 15272181609856882034956304384.0000000 abs_diff: 15272181609856882034956304384.0000000 rel_diff: 501269625702365198811136.0000000
(0, 12): ref: 11023.0000000 res: 75022866827070021312876380160.0000000 abs_diff: 75022866827070021312876380160.0000000 rel_diff: 6806029988930553684951040.0000000
(0, 13): ref: 1945.0000000 res: 17751089905934782744747835392.0000000 abs_diff: 17751089905934782744747835392.0000000 rel_diff: 9126524324624791448322048.0000000
(0, 20): ref: -20481.0000000 res: nan abs_diff: nan rel_diff: nan
(0, 21): ref: 19575.0000000 res: nan abs_diff: nan rel_diff: nan
(0, 22): ref: 35530.0000000 res: nan abs_diff: nan rel_diff: nan
FAIL: linalg.matmul
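For anyone reading the mismatch lines above, the columns appear to follow `abs_diff = |ref - res|` and `rel_diff = abs_diff / |ref|`. This is an assumption inferred from the printed numbers, not taken from the benchgc source, and the helper name `diff_metrics` is hypothetical. A small sketch that recomputes two of the rows:

```python
import math


def diff_metrics(ref, res):
    """Recompute the abs_diff / rel_diff columns (assumed formulas, not benchgc's code)."""
    abs_diff = abs(ref - res)
    # Guard against division by zero; a NaN result propagates to both metrics.
    rel_diff = abs_diff / abs(ref) if ref != 0 else math.inf
    return abs_diff, rel_diff


# Values copied from the (0, 4) and (0, 20) rows of the log above.
abs_04, rel_04 = diff_metrics(72393.0, -35198.03125)
abs_20, rel_20 = diff_metrics(-20481.0, float("nan"))
print(abs_04, rel_04)
print(abs_20, rel_20)
```

The recomputed (0, 4) values agree with the logged `abs_diff` and `rel_diff`. The huge magnitudes and NaNs in the other rows are not rounding errors; they look like reads of uninitialized or overwritten memory.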
Neither issue is reproducible in my environment.
The matmul failure is random and may be related to memory corruption.