[POC] Mask RCNN implementation to support PyTorch / XLA #5404

Open: miladm wants to merge 4 commits into main

@miladm miladm commented Feb 10, 2022

Run Mask RCNN on PyTorch/XLA.

facebook-github-bot commented Feb 10, 2022

💊 CI failures summary and remediations

As of commit c8f7a12 (more details on the Dr. CI page):


  • 17/17 failures introduced in this PR

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build cmake_linux_cpu (1/2)

Step: "packaging/build_cmake.sh"

https://wiki.centos.org/yum-errors

(2/6): epel/x86_64/updatei 0% [                 ]  0.0 B/s |    0 B   --:-- ETA 
http://mirror.es.its.nyu.edu/epel/7/x86_64/repodata/7054bc90fda978065ad38663c2e7d148dd573288bb5a1622e06734322e84b8ac-updateinfo.xml.bz2: [Errno 12] Timeout on http://mirror.es.its.nyu.edu/epel/7/x86_64/repodata/7054bc90fda978065ad38663c2e7d148dd573288bb5a1622e06734322e84b8ac-updateinfo.xml.bz2: (28, 'Connection timed out after 30000 milliseconds')
Trying other mirror.

(2/6): updates/7/x86_64/primary_db                         |  13 MB   00:00     

(3/6): ius/x86_64/primary                                  | 100 kB   00:00     

(4/6): okay/7/x86_64/primary_db                            | 3.8 MB   00:00     

epel/x86_64/updateinfo         FAILED                                          

(5/6): epel/x86_64/primary 0% [                 ]  0.0 B/s |    0 B   --:-- ETA 
http://mirrors.vcea.wsu.edu/epel/7/x86_64/repodata/7054bc90fda978065ad38663c2e7d148dd573288bb5a1622e06734322e84b8ac-updateinfo.xml.bz2: [Errno 14] HTTP Error 404 - Not Found
Trying other mirror.
To address this issue please refer to the below wiki article 

https://wiki.centos.org/yum-errors

If above article doesn't help to resolve this issue please use https://bugs.centos.org/.


epel/x86_64/updateinfo         FAILED                                          

(5/6): epel/x86_64/primary 0% [                 ]  0.0 B/s |    0 B   --:-- ETA 
https://d2lzkl7pfhq30w.cloudfront.net/pub/epel/7/x86_64/repodata/7054bc90fda978065ad38663c2e7d148dd573288bb5a1622e06734322e84b8ac-updateinfo.xml.bz2: [Errno 14] HTTPS Error 404 - Not Found
Trying other mirror.

epel/x86_64/updateinfo         FAILED                                          

(5/6): epel/x86_64/primary 0% [                 ]  0.0 B/s |    0 B   --:-- ETA 
https://dl.fedoraproject.org/pub/epel/7/x86_64/repodata/7054bc90fda978065ad38663c2e7d148dd573288bb5a1622e06734322e84b8ac-updateinfo.xml.bz2: [Errno 14] HTTPS Error 404 - Not Found
Trying other mirror.

epel/x86_64/updateinfo         FAILED                                          

See CircleCI build type_check_python (2/2)

Step: "Check Python types statically"

Found 7 errors in 2 files (checked 226 source files)
            import torch_xla.debug.metrics as met
    ^
torchvision/models/detection/generalized_rcnn.py:122: error: "Tensor" not
callable  [operator]
            detections = self.transform.postprocess(detections_cpu, images...
                         ^
torchvision/models/detection/rpn.py:267: error: Argument 5 to "batched_nms" has
incompatible type "Callable[[], int]"; expected "int"  [arg-type]
    ....batched_nms(boxes, scores, lvl, self.nms_thresh, self.post_nms_top_n)
                                                         ^
Found 7 errors in 2 files (checked 226 source files)


Exited with code exit status 1
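
The `arg-type` error above reports a `Callable[[], int]` where an `int` was expected, which suggests that a zero-argument method was passed without being called. The sketch below reproduces that pattern in isolation. `ProposalHead` and `keep_top_n` are hypothetical stand-ins written for this illustration; they are not the torchvision classes or the `batched_nms` signature.

```python
from typing import Dict, List


class ProposalHead:
    """Hypothetical stand-in for a module whose top-n limit depends on the mode."""

    def __init__(self, train_top_n: int, test_top_n: int) -> None:
        self.training = False
        self._post_nms_top_n: Dict[str, int] = {
            "training": train_top_n,
            "testing": test_top_n,
        }

    def post_nms_top_n(self) -> int:
        # This is a method, so it has to be called to obtain the int.
        key = "training" if self.training else "testing"
        return self._post_nms_top_n[key]


def keep_top_n(scores: List[float], top_n: int) -> List[int]:
    """Return the indices of the top_n highest scores (toy stand-in for an NMS call)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_n]


head = ProposalHead(train_top_n=3, test_top_n=2)
scores = [0.1, 0.9, 0.5, 0.7]

# Passing the bound method itself is what a type checker would flag:
#   keep_top_n(scores, head.post_nms_top_n)   # Callable[[], int], expected int
# Calling the method passes the int that the parameter expects.
kept = keep_top_n(scores, head.post_nms_top_n())
```

Whether the fix in the PR should be to call the method or to store the value in a local variable first is a design choice for the author; the sketch only shows why the checker complains.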


15 failures not recognized by patterns:

| Job | Step |
|---|---|
| CircleCI unittest_linux_cpu_py3.8 | Run tests |
| CircleCI unittest_windows_cpu_py3.7 | Run tests |
| CircleCI binary_linux_conda_py3.9_cu102 | packaging/build_conda.sh |
| CircleCI unittest_linux_cpu_py3.7 | Run tests |
| CircleCI unittest_onnx | Run tests |
| CircleCI unittest_linux_cpu_py3.9 | Run tests |
| CircleCI unittest_torchhub | Run tests |
| CircleCI lint_c | Lint C code |
| CircleCI unittest_windows_cpu_py3.9 | Run tests |
| CircleCI binary_linux_conda_py3.7_cpu | packaging/build_conda.sh |
| CircleCI lint_python_and_config | Lint Python code and config files |
| CircleCI binary_linux_conda_py3.9_cpu | packaging/build_conda.sh |
| CircleCI unittest_windows_cpu_py3.8 | Run tests |
| CircleCI binary_linux_conda_py3.8_cpu | packaging/build_conda.sh |
| CircleCI unittest_prototype | Run tests |

This comment was automatically generated by Dr. CI (expand for details).

miladm commented Feb 10, 2022

The current implementation successfully runs on PyTorch/XLA.

Some performance optimizations for handling dynamic shapes remain; they can be supported once that feature is available in PyTorch/XLA.
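
For readers unfamiliar with the dynamic shape issue: detection models keep a data-dependent number of boxes, and a compiler that specializes on tensor shapes may need to recompile whenever that number changes. One common workaround is to pad results to a fixed length and carry a validity mask. The plain-Python sketch below illustrates that general idea only; `pad_to_fixed` is a hypothetical helper and is not taken from this PR.

```python
from typing import List, Tuple


def pad_to_fixed(indices: List[int], size: int, fill: int = -1) -> Tuple[List[int], List[bool]]:
    """Truncate or pad `indices` to exactly `size` entries.

    Returns the fixed-length list and a mask marking which entries are real.
    """
    kept = indices[:size]
    n_pad = size - len(kept)
    padded = kept + [fill] * n_pad
    mask = [True] * len(kept) + [False] * n_pad
    return padded, mask


# Three boxes survived filtering, but downstream code always sees length 5.
padded, mask = pad_to_fixed([4, 1, 7], size=5)
```

The trade-off is extra work on padded entries in exchange for a single, stable output shape.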

@miladm miladm changed the title [WIP] Mask RCNN implementation to support PyTorch / XLA [POC] Mask RCNN implementation to support PyTorch / XLA Feb 25, 2022