Push for Bert, MaskRCNN and Resnet-18 E2E support using Lazy Tensor Core #365

@ramiro050

This issue tracks the tasks needed for end-to-end (E2E) support for Bert, MaskRCNN, and Resnet-18 using the Lazy Tensor Core (LTC), lowering through torch-mlir.

At the moment, the main missing piece is op support in LTC. Below is a table of the LTC ops needed. The list of ops was determined by running two scripts: this MaskRCNN script, which generates this output, and this Resnet script, which generates this output. The missing ops are the ones with the aten:: prefix in the output of the scripts. For more information on how to set up and run the examples, see here.
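As a rough illustration of the filtering step described above, the sketch below collects every name with the aten:: prefix from a block of text. The helper name `missing_ltc_ops`, the regular expression, and the sample text are all assumptions made for this example; the real script output may be formatted differently.

```python
import re

# Assumed shape of an op name: "aten::" followed by an identifier and an
# optional ".overload" suffix, e.g. "aten::floor.out".
ATEN_OP = re.compile(r"aten::[A-Za-z0-9_]+(?:\.[A-Za-z0-9_]+)?")


def missing_ltc_ops(script_output: str) -> list[str]:
    """Return the sorted, de-duplicated aten:: op names found in the text."""
    return sorted(set(ATEN_OP.findall(script_output)))


# Made-up sample text, not real output from the scripts.
sample_output = """\
op: aten::floor.out
op: aten::index.Tensor
op: aten::floor.out
other: unrelated line
"""

print(missing_ltc_ops(sample_output))
```

Duplicates are collapsed because the same op can show up many times in a single run.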

LTC Ops Needed

Status column symbols:
- unimplemented
+ started
x finished

| Ops | Status | Owner | Model | Notes |
| --- | --- | --- | --- | --- |
| aten::_index_put_impl_ | - | | MaskRCNN | |
| aten::arange.start_out | + | silvasean | MaskRCNN | |
| aten::exp.out | x | ramiro050 | MaskRCNN | pytorch/pytorch#67213 |
| aten::floor.out | x | ramiro050 | MaskRCNN | pytorch/pytorch#66770 |
| aten::index.Tensor | + | ramiro050 | MaskRCNN | |
| aten::log2.out | x | ramiro050 | MaskRCNN | pytorch/pytorch#66771 |
| aten::max_pool2d_with_indices | + | vivekkhandelwal1 | MaskRCNN, Resnet | |
| aten::upsample_nearest2d.out | - | | MaskRCNN | |
| aten::mean.out | x | alanwaketan | Resnet | pytorch/pytorch#67174 |
| aten::sort | x | silvasean | Resnet | pytorch/pytorch#67053 |

torch-mlir ops Needed

Below is a list of ops needed on the torch-mlir side. This list was compiled by going over the ops detected by LTC when running this MaskRCNN script (output with list of ops detected can be found here), this Bert script (output), and the Resnet-18 model in the PyTorch benchmarks (instructions for setting it up with LTC), and checking which had lowerings in torch-mlir and which did not.
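The comparison described above amounts to a set difference between the ops LTC detects and the ops that already have lowerings. A minimal sketch follows, assuming both lists are available as plain strings; the function name `ops_without_lowering` and the sample data are hypothetical, and the split between detected and lowered ops here is illustrative only.

```python
def ops_without_lowering(detected, lowered):
    """Return the detected ops that have no lowering yet, sorted by name."""
    return sorted(set(detected) - set(lowered))


# Illustrative inputs only: "aten::example_op" is a placeholder, not a real op.
detected_ops = ["aten::stack", "aten::topk", "aten::example_op"]
lowered_ops = ["aten::example_op"]

print(ops_without_lowering(detected_ops, lowered_ops))
```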

Note: Bert and Resnet-18 are currently the only models covered for training. The ops needed for MaskRCNN training will be added soon.

Status column symbols:
- unimplemented
+ started
x finished

The full op lists, including finished ops, have been moved to #365 (comment). The table below contains only the ops still to be done, to keep the focus on remaining work.

| Op | Status | Owner | Model | Notes |
| --- | --- | --- | --- | --- |
| aten::bernoulli_ | + | pashu123 | Bert Training | rng op |
| aten::embedding_dense_backward | + | vivekkhandelwal1 | Bert Training | histogram |
| aten::native_layer_norm_backward | + | gprateek93 | Bert Training | PR546, PR570 |
| aten::nll_loss_backward | + | pashu123 | Bert Training, Resnet-18 Training | PR463 |
| aten::_copy_from | + | pashu123 | Resnet-18 Training, MaskRCNN Inference | torchscript baseline won't run (to be investigated) |
| aten::convolution_backward_overrideable | + | gpetters94 | Resnet-18 Training | |
| aten::max_pool2d_with_indices_backward | + | vivekkhandelwal1 | Resnet-18 Training | |
| aten::native_batch_norm_backward | + | Shukla-Gaurav | Resnet-18 Training | |
| aten::native_batch_norm | + | Shukla-Gaurav | Resnet-18 Training | PR563 |
| aten::random_.to | + | gprateek93 | Resnet-18 Training | rng op |
| aten::_copy_from_and_resize | + | gpetters94 | Resnet-18 Training, MaskRCNN Inference | |
| aten::convolution_overrideable | + | gpetters94 | Resnet-18 Training, MaskRCNN Inference | |
| aten::convolution | + | gpetters94 | Resnet-18 Training (through AOTAutograd) | |
| aten::convolution_backward | + | gpetters94 | Resnet-18 Training (through AOTAutograd) | |
| aten::max_pool2d_with_indices | + | vivekkhandelwal1 | Resnet-18 Training, MaskRCNN Inference | #518 |
| aten::_index_put_impl_ | - | | MaskRCNN Inference | histogram |
| aten::stack | - | pashu123 | MaskRCNN Inference | |
| aten::topk | - | gprateek93 | MaskRCNN Inference | |
| aten::upsample_nearest2d | - | gprateek93 | MaskRCNN Inference | |
| torchvision::nms | - | | MaskRCNN Inference | |
| torchvision::roi_align | - | | MaskRCNN Inference | |
