TPU-MLIR v1.9 Release

github-actions released this 15 Jul 14:40


Release Note

Enhancements:

Implemented output order preservation in converters like ONNX, Caffe, Torch, and TFLite.
Added support for resnet50-v2 bm1690 f8 regression.
Improved ILP group mlir file sequences for resnet50 training.
Updated chip libraries and performance AI for A2 profiling.
Added a new dump mode "COMB" and refined abs/relu conversions.

Bug Fixes:

Fixed issues with preprocess when source layout differs from target layout.
Addressed bugs in various operations like softmax, concat, and weight reorder in conv2d.
Resolved bugs in model training, model transformation, and various pattern issues.
Fixed bugs related to CUDA inference, matmul with bias, and multi-output calibration.

New Features:

Added support for multi-graph in TPULang.
Introduced new options in TPULang for inference and model deployment.
Implemented various optimizations and enhancements for dynamic operations and model transformations.

Documentation Updates:

Refined documentation for quick start quantization and user interface sections.
Updated backend information, docker image download methods, and model deployment details in the documentation.

Miscellaneous:

Improved performance for models such as vit and yolov5s, and on the bm1690 chip.
Introduced new functionality such as embedding multi-device slice and groupnorm train operations.
Added support for adaptive_avgpool inference and multiple Einsum modes.
