
[Tracker] All the issues related to the e2e SHARK test suite #812

pdhirajkumarprasad opened this issue Aug 27, 2024 · 4 comments
pdhirajkumarprasad commented Aug 27, 2024

Full ONNX FE tracker is at: #564

Running a model

In the alt_e2e test suite, set CACHE_DIR to the directory where models will be downloaded:

    export CACHE_DIR="/path/where/models/will/be/downloaded"

If building torch-mlir and IREE from source:

    source /path/to/iree-build/.env && export PYTHONPATH
    export PYTHONPATH=/path/to/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir:/path/to/torch-mlir/test/python/fx_importer:$PYTHONPATH
    export PATH=/path/to/iree-build/tools/:/path/to/torch-mlir/build/bin/:$PATH

Then run:

    python ./run.py --mode=cl-onnx-iree -v --torchtolinalg -t ModelName
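
To sweep several models with the same invocation, here is a minimal sketch that loops over the run.py command documented above and reports pass/fail per model; the model names in the list are placeholders, not real test names.

```python
# Minimal sketch: batch-run the documented run.py invocation for a few models
# and report pass/fail per model. The names in MODELS are placeholders.
import subprocess

MODELS = ["model_a", "model_b"]  # hypothetical test names

for name in MODELS:
    result = subprocess.run(
        ["python", "./run.py", "--mode=cl-onnx-iree", "-v",
         "--torchtolinalg", "-t", name],
        capture_output=True,
        text=True,
    )
    status = "PASS" if result.returncode == 0 else "FAIL"
    print(f"{name}: {status}")
```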

For onnx/models/

critical issues

import and setup failures

# | device | issue type | issue no | # models impacted | list of models | assignee | status
--- | --- | --- | --- | --- | --- | --- | ---
1 | N/A | missing weights (remove these) | #862 | 30 | model list | |
2 | N/A | cannot load model in ORT (remove?) | #862 | 1 | model list | |
3 | N/A | OOM during ORT | #862 | 3 | model list | |
4 | N/A | OOM import, missing dim_params, ORT PASS (see sketch below) | #860 #861 | 21 | model list | |
5 | N/A | unable to update opset version due to BatchNormalization, ORT PASS | #859 | 5 | model list | |
6 | N/A | unable to update opset version due to BatchNormalization, OOM import, ORT PASS | #859 #861 | 1 | model list | |
7 | N/A | duplicate metadata_prop keys, ORT PASS | #863 | 1 | model list | |
8 | N/A | OOM import, ORT PASS | #861 | 25 | model list | |
9 | N/A | No Azure Blob Found | #864 | 20 | model list | |
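
For row 4 above, a minimal sketch of how unnamed dynamic dimensions (missing dim_params) can be detected before import; the model path is a placeholder, and this is not the test suite's actual import code.

```python
# Minimal sketch: flag graph inputs whose dynamic dimensions carry neither a
# concrete dim_value nor a symbolic dim_param ("missing dim_params" above).
# The model path is a placeholder.
import onnx

model = onnx.load("model.onnx")
for inp in model.graph.input:
    dims = inp.type.tensor_type.shape.dim
    for i, dim in enumerate(dims):
        if not dim.HasField("dim_value") and not dim.HasField("dim_param"):
            print(f"{inp.name}: dimension {i} is dynamic but has no dim_param")
```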

onnx to torch

# | device | issue type | issue no | # models impacted | list of models | assignee | status
--- | --- | --- | --- | --- | --- | --- | ---
1 | CPU | 'util.initializer' op failed to inline into combined initializer | 18386 | 56 | modelList | @vivekkhandelwal1 |
2 | CPU | failed to legalize operation 'hal.interface.constant.load' | | 45 | modelList | @vinayakdsci |
3 | CPU | crash: mlir::PatternApplicator::matchAndRewrite | 867 | 41 | modelList | @zjgarvey |
4 | CPU | Crash | 866 | 22 | modelList | @vinayakdsci |
5 | CPU | 'memref.alloca' op expected no unbounded stack allocations | 18810 | 5 | modelList | @jinchen62 |
6 | CPU | 'torch.prim.If' op along control flow edge from Region #0 to parent results: source type #0 | 696 | 6 | modelList | @renxida |
7 | CPU | 'vector.transfer_write' op inferred mask type ('vector<1x1x4xi1>') and mask operand type ('vector<1x4x1xi1>') don't match | | 3 | modelList | |
8 | CPU | 'stream.async.dispatch' op has invalid Read access range [0 to 7375872 for 7375872] of resource %15 with size 150528; length > resource size | | 3 | modelList | |
9 | CPU | 'tensor.dim' op unexpected during shape cleanup; dynamic dimensions must have been resolved prior to leaving the flow dialect | | 1 | modelList | |
10 | CPU | operand #1 does not dominate this use | iree#18815 | 1 | modelList | @IanWood1 |
11 | CPU | failed to legalize operation onnx.NonZero | 820 | 1 | modelList | @renxida |
12 | CPU | type of return operand 0 ('!torch.vtensor<[?,384],f32>') doesn't match function result type ('!torch.vtensor<[1,384],f32>') | | 1 | modelList | @Shukla-Gaurav |
13 | CPU | torch.aten.convolution | | 1 | modelList | @PhaneeshB |
14 | CPU | boolean indexing ops: AtenNonzeroOp, AtenIndexTensorOp, AtenMaskedSelectOp | 3293 | | | @renxida |
15 | CPU | Add TorchToLinalg lowering for MaxUnpool operation | 718 | | | @jinchen62 |
16 | CPU | Fix Onnx.DFT Torch->Linalg lowering | 800 | | | @PhaneeshB |

torch to linalg

# | device | issue type | issue no | # models impacted | list of models | assignee | status
--- | --- | --- | --- | --- | --- | --- | ---
1 | CPU | 'linalg.generic' op inferred input/output operand | 825 | 11 | modelList | @zjgarvey |

iree-compile

IREE project tracker: https://github.com/orgs/iree-org/projects/8/views/3

# | device | issue type | issue no | # models impacted | list of models | assignee | status
--- | --- | --- | --- | --- | --- | --- | ---
1 | CPU | error: One or more operations with large vector sizes (8192 bytes) were found | 18677 | 22 | modelList | |
2 | GPU | error: 'vector.transfer_read' op Anchoring on transfer_read with unsupported number of elements | 18601 | 100+ | | |
3 | GPU | 'func.func' op uses 401920 bytes of shared memory; exceeded the limit of 65536 bytes | 18603 | 100+ | | |

iree runtime

# | device | issue type | issue no | # models impacted | list of models | assignee | status
--- | --- | --- | --- | --- | --- | --- | ---
1 | CPU | Abort | 18741 | 515+ | modelList | |

numerics

# | device | issue type | issue no | # models impacted | list of models | assignee
--- | --- | --- | --- | --- | --- | ---
1 | CPU | numeric, need_to_analyze | | 101 | modelList |
2 | | [numerics]: element at index 0 (0.332534) does not match the expected (0.308342); for LSTM ops | 18441 | 2 | |

IREE EP only issues

iree-compile fails with "ElementsAttr does not provide iteration facilities for type 'mlir::Attribute'" on int8 models at the QuantizeLinear op
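
A minimal sketch of the pattern the error points at: a tiny int8 model containing a single QuantizeLinear op, built with onnx.helper. Shapes, names, and the opset version are illustrative only, not taken from any failing model.

```python
# Minimal sketch: build a small int8 QuantizeLinear model of the kind that hits
# the ElementsAttr error above. All names, shapes, and the opset are illustrative.
import numpy as np
import onnx
from onnx import TensorProto, helper, numpy_helper

scale = numpy_helper.from_array(np.array(0.02, dtype=np.float32), name="scale")
zero_point = numpy_helper.from_array(np.array(0, dtype=np.int8), name="zero_point")

node = helper.make_node("QuantizeLinear", ["x", "scale", "zero_point"], ["y"])
graph = helper.make_graph(
    [node],
    "quantize_linear_repro",
    inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 4])],
    outputs=[helper.make_tensor_value_info("y", TensorProto.INT8, [1, 4])],
    initializer=[scale, zero_point],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 19)])
onnx.checker.check_model(model)
onnx.save(model, "quantize_linear_int8.onnx")
```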

low priority

Issue no 828: Turbine Camp
Issue no 797: Ops not in model

zjgarvey (Collaborator) commented

Can you update the model list links?

jinchen62 (Contributor) commented

Could you also attach the links to the issues you referred to, so we can check that all model paths are covered? Also, it seems #801 is not included, right?

pdhirajkumarprasad (Author) commented

@zjgarvey the model lists contain only the updated links.

@jinchen62 Yes, so far the report is based on the ONNX models in the e2e SHARK test suite.

jinchen62 (Contributor) commented Aug 29, 2024

@pdhirajkumarprasad I think it would be helpful to attach more details of the error messages.

I believe the onnx.Transpose failure under "onnx to torch" is the shape-inference issue I was dealing with. I fixed it by setting the opset version to 21 with a locally built torch-mlir in the SHARK test suite (llvm/torch-mlir#3593). @zjgarvey I realized this does not seem to work for the CI job, right? Any ideas?
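
For reference, a minimal sketch of the opset bump described above, using the ONNX version converter; the file paths are placeholders, and this is not the exact change from llvm/torch-mlir#3593.

```python
# Minimal sketch: upgrade a model's default-domain opset to 21 before import,
# as described in the comment above. File paths are placeholders.
import onnx
from onnx import version_converter

model = onnx.load("model.onnx")
current = max(
    (op.version for op in model.opset_import if op.domain in ("", "ai.onnx")),
    default=0,
)
if current < 21:
    model = version_converter.convert_version(model, 21)
    onnx.save(model, "model_opset21.onnx")
```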
