
[CodeCamp #18] add ops bbox_overlaps #2477

Merged (8 commits into open-mmlab:master, Jan 30, 2023)

Conversation

@enemy1205 (Contributor) commented Dec 7, 2022

Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.

Motivation

Add a CPU (C++) implementation of the `bbox_overlaps` op.

CPU version: (benchmark screenshot, 2023-01-14)

Pure Python version: (benchmark screenshot, 2023-01-14)
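For context, `bbox_overlaps` computes the IoU (or IoF) between two sets of boxes. A minimal pure-Python sketch of the aligned-IoU case is below; it is illustrative only, not the mmcv implementation, and it omits the non-aligned and `iof` modes.

```python
def aligned_iou(boxes1, boxes2, offset=0):
    """Illustrative aligned IoU for boxes in (x1, y1, x2, y2) format.

    Pairs box i of boxes1 with box i of boxes2, as in the aligned=True
    path of bbox_overlaps. Not the actual mmcv kernel.
    """
    ious = []
    for (ax1, ay1, ax2, ay2), (bx1, by1, bx2, by2) in zip(boxes1, boxes2):
        # Intersection rectangle (clamped at zero when boxes do not overlap)
        iw = max(min(ax2, bx2) - max(ax1, bx1) + offset, 0)
        ih = max(min(ay2, by2) - max(ay1, by1) + offset, 0)
        inter = iw * ih
        area_a = (ax2 - ax1 + offset) * (ay2 - ay1 + offset)
        area_b = (bx2 - bx1 + offset) * (by2 - by1 + offset)
        union = area_a + area_b - inter
        ious.append(inter / union if union > 0 else 0.0)
    return ious
```

For the box pair used in the unit test below, `[1, 1, 3, 3]` vs `[2, 2, 4, 4]`, the intersection is 1 and the union is 7, so the expected IoU is 1/7.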

Modification

Adds a C++ CPU kernel for `bbox_overlaps` (`mmcv/ops/csrc/pytorch/cpu/bbox_overlaps_cpu.cpp`) and dispatches to it from `mmcv/ops/bbox.py`.

BC-breaking (Optional)

No

Use cases (Optional)

Passes tests/test_ops/test_bbox.py

Checklist

Before PR:

  • I have read and followed the workflow indicated in the CONTRIBUTING.md to create this PR.
  • Pre-commit or linting tools indicated in CONTRIBUTING.md are used to fix the potential lint issues.
  • Bug fixes are covered by unit tests, the case that causes the bug should be added in the unit tests.
  • New functionalities are covered by complete unit tests. If not, please add more unit test to ensure the correctness.
  • The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

  • If the modification has potential influence on downstream or other related projects, this PR should be tested with some of those projects, like MMDet or MMCls.
  • CLA has been signed and all committers have signed the CLA in this PR.

@CLAassistant commented Dec 7, 2022

CLA assistant check
All committers have signed the CLA.

@grimoire (Member) commented Dec 8, 2022

Since we already have a pure PyTorch implementation, could you provide a benchmark for the speed-up of the new op?

@zhouzaida zhouzaida changed the title [CodeCamp#18] add ops bbox_overlaps [CodeCamp #18] add ops bbox_overlaps Dec 8, 2022
@enemy1205 (Contributor, Author) replied:

Since we already have a pure PyTorch implementation, could you provide a benchmark for the speed-up of the new op?

I tested it using pytest-benchmark as follows.
First, in tests/test_ops/test_bbox.py: since the two tests call the same function and only the parameters differ, I commented out the half-precision test so it would not be averaged in.

    @pytest.mark.parametrize('device', [
        'cpu',
        pytest.param(
            'cuda',
            marks=pytest.mark.skipif(
                not IS_CUDA_AVAILABLE, reason='requires CUDA support')),
        pytest.param(
            'mlu',
            marks=pytest.mark.skipif(
                not IS_MLU_AVAILABLE, reason='requires MLU support')),
        pytest.param(
            'mps',
            marks=pytest.mark.skipif(
                not IS_MPS_AVAILABLE, reason='requires MPS support'))
    ])
    def test_bbox_overlaps_float(self, device, benchmark):
        benchmark(self._test_bbox_overlaps, device, dtype=torch.float)

    # @pytest.mark.parametrize('device', [
    #     pytest.param(
    #         'cuda',
    #         marks=pytest.mark.skipif(
    #             not IS_CUDA_AVAILABLE, reason='requires CUDA support')),
    #     pytest.param(
    #         'mlu',
    #         marks=pytest.mark.skipif(
    #             not IS_MLU_AVAILABLE, reason='requires MLU support'))
    # ])
    # def test_bbox_overlaps_half(self, device,benchmark):
    #     benchmark(self._test_bbox_overlaps,device, dtype=torch.half)

Then, in mmcv/ops/bbox.py, at the end of the function, I switched between the C++ and pure PyTorch implementations by commenting out one or the other in turn:

    mode_dict = {'iou': 0, 'iof': 1}
    assert mode in mode_dict.keys()
    mode_flag = mode_dict[mode]
    # Either the boxes are empty or the length of boxes' last dimension is 4
    assert (bboxes1.size(-1) == 4 or bboxes1.size(0) == 0)
    assert (bboxes2.size(-1) == 4 or bboxes2.size(0) == 0)
    assert offset == 1 or offset == 0

    rows = bboxes1.size(0)
    cols = bboxes2.size(0)
    if aligned:
        assert rows == cols
        ious = bboxes1.new_zeros(rows)
    else:
        ious = bboxes1.new_zeros((rows, cols))

    if rows * cols == 0:
        return ious
    
    # Pure Pytorch
    # return _bbox_overlaps_cpu(
    #         bboxes1, bboxes2, mode=mode, aligned=aligned, offset=offset)
    
    # C++ version
    ext_module.bbox_overlaps(
        bboxes1, bboxes2, ious, mode=mode_flag, aligned=aligned, offset=offset)
    return ious
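For reference, the vectorized computation performed by the pure-PyTorch path (`_bbox_overlaps_cpu` above) can be sketched as follows. NumPy is used here instead of torch for a self-contained illustration; the function name and details are illustrative, not the actual mmcv code.

```python
import numpy as np

def vectorized_aligned_iou(bboxes1, bboxes2, offset=0):
    """Illustrative vectorized aligned IoU (NumPy stand-in for the
    pure-PyTorch path). Not the actual mmcv implementation."""
    b1 = np.asarray(bboxes1, dtype=np.float64)
    b2 = np.asarray(bboxes2, dtype=np.float64)
    # Element-wise intersection corners: row i of b1 with row i of b2
    lt = np.maximum(b1[:, :2], b2[:, :2])  # left-top
    rb = np.minimum(b1[:, 2:], b2[:, 2:])  # right-bottom
    wh = np.clip(rb - lt + offset, 0, None)
    inter = wh[:, 0] * wh[:, 1]
    area1 = (b1[:, 2] - b1[:, 0] + offset) * (b1[:, 3] - b1[:, 1] + offset)
    area2 = (b2[:, 2] - b2[:, 0] + offset) * (b2[:, 3] - b2[:, 1] + offset)
    union = area1 + area2 - inter
    # Guard against division by zero for degenerate boxes
    return inter / np.maximum(union, np.finfo(np.float64).eps)
```

The whole batch is processed with a handful of array operations, which is why this style scales well as the number of boxes grows.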

By executing pytest tests/test_ops/test_bbox.py, the results are as follows (benchmark screenshot): the top row is the C++ version and the bottom row the pure PyTorch version, so there is some speed-up.


    void bbox_overlaps_cpu(const Tensor boxes1, const Tensor boxes2, Tensor ious,
                           const int mode, const bool aligned, const int offset) {
      bbox_overlaps_cpu_kernel<float>(boxes1, boxes2, ious, mode, aligned, offset);
    }

A Member left a review comment on this code:

Assert the input and output datatype here.

@grimoire (Member) commented Dec 9, 2022

The data size in the unit test is relatively small. Have you benchmarked it on larger data (1000+ boxes)?

@enemy1205 (Contributor, Author) replied:

As you say, PyTorch handles large amounts of data far more efficiently than a C++ loop.
Test code:

        b1 = torch.tensor([[1.0 + i, 1.0 + i, 3.0 + i, 3.0 + i]
                           for i in range(100000)]).to(device).type(dtype)
        b2 = torch.tensor([[2.0 + i, 2.0 + i, 4.0 + i, 4.0 + i]
                           for i in range(100000)]).to(device).type(dtype)
        should_output = np.array([1 / 7] * 100000)
        out = bbox_overlaps(b1, b2, aligned=True)
        assert np.allclose(out.cpu().numpy(), should_output, 1e-2)

Elapsed time:

// Pure C++ loop
------------------------------------------------------ benchmark: 1 tests ------------------------------------------------------
Name (time in ms)                      Min       Max      Mean   StdDev    Median      IQR  Outliers     OPS  Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------
test_bbox_overlaps_float[cpu]     367.5079  456.5294  405.8569  36.7101  396.9816  60.0164       2;0  2.4639       5           1
--------------------------------------------------------------------------------------------------------------------------------
// C++  with openmp (#pragma omp parallel for)
------------------------------------------------------ benchmark: 1 tests ------------------------------------------------------
Name (time in ms)                      Min       Max      Mean   StdDev    Median      IQR  Outliers     OPS  Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------
test_bbox_overlaps_float[cpu]     368.4876  410.3432  381.1380  16.7261  375.3134  14.5139       1;1  2.6237       5           1
--------------------------------------------------------------------------------------------------------------------------------
// Pure Pytorch
------------------------------------------------------ benchmark: 1 tests ------------------------------------------------------
Name (time in ms)                      Min       Max      Mean   StdDev    Median      IQR  Outliers     OPS  Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------
test_bbox_overlaps_float[cpu]     191.3801  229.2963  204.5748  14.8945  200.2914  17.5259       1;0  4.8882       5           1
--------------------------------------------------------------------------------------------------------------------------------

So in practice, for large inputs the pure PyTorch version may indeed be more efficient.
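That conclusion can be checked with a quick self-contained micro-benchmark. The sketch below compares a per-box Python loop against a vectorized computation on the same 100000-box data as above, using NumPy in place of torch for portability; absolute times depend on hardware, so no expected timings are claimed.

```python
import timeit

import numpy as np

# Same synthetic data as the test above: box i is shifted by i,
# and every aligned pair has IoU 1/7.
n = 100000
b1 = np.stack([np.arange(n) + v for v in (1.0, 1.0, 3.0, 3.0)], axis=1)
b2 = np.stack([np.arange(n) + v for v in (2.0, 2.0, 4.0, 4.0)], axis=1)

def iou_loop(a, b):
    """Scalar loop over box pairs, analogous to the C++ kernel's structure."""
    out = np.empty(len(a))
    for i in range(len(a)):
        iw = max(min(a[i, 2], b[i, 2]) - max(a[i, 0], b[i, 0]), 0.0)
        ih = max(min(a[i, 3], b[i, 3]) - max(a[i, 1], b[i, 1]), 0.0)
        inter = iw * ih
        union = ((a[i, 2] - a[i, 0]) * (a[i, 3] - a[i, 1])
                 + (b[i, 2] - b[i, 0]) * (b[i, 3] - b[i, 1]) - inter)
        out[i] = inter / union
    return out

def iou_vec(a, b):
    """Vectorized equivalent, analogous to the pure PyTorch path."""
    wh = np.clip(np.minimum(a[:, 2:], b[:, 2:])
                 - np.maximum(a[:, :2], b[:, :2]), 0, None)
    inter = wh[:, 0] * wh[:, 1]
    union = ((a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
             + (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1]) - inter)
    return inter / union

t_loop = timeit.timeit(lambda: iou_loop(b1, b2), number=1)
t_vec = timeit.timeit(lambda: iou_vec(b1, b2), number=1)
print(f"loop: {t_loop:.3f}s  vectorized: {t_vec:.3f}s")
```

Both functions produce identical IoUs; the difference is purely how the batch is traversed, which mirrors the C++-loop-vs-PyTorch comparison in the benchmark above.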

mmcv/ops/bbox.py (review thread, outdated, resolved)
@grimoire (Member) left a review:

LGTM

mmcv/ops/csrc/pytorch/cpu/bbox_overlaps_cpu.cpp (review thread, outdated, resolved)
@zhouzaida zhouzaida merged commit 422816e into open-mmlab:master Jan 30, 2023
zhouzaida pushed a commit to zhouzaida/mmcv that referenced this pull request Mar 20, 2023
* add ops bbox_overlaps

* format code

* Return the pytorch version

* Intermediate modification

* Solve problems in parameter passing

* revise bug

* "add test case"
zhouzaida pushed a commit to zhouzaida/mmcv that referenced this pull request Mar 20, 2023 (same commit messages as above)
zhouzaida pushed a commit that referenced this pull request Mar 20, 2023 (same commit messages as above)
CokeDong pushed a commit to CokeDong/mmcv that referenced this pull request Apr 24, 2023 (same commit messages as above)
tyomj pushed a commit to tyomj/mmcv that referenced this pull request May 8, 2023 (same commit messages as above)
fwenguang pushed a commit to fwenguang/mmcv that referenced this pull request Dec 19, 2023 (same commit messages as above)
5 participants