Add a unified resize operation for detection and a resize op for recognition inference #295

Merged
merged 1 commit into from
May 24, 2023

Conversation

SamitHuang
Collaborator

@SamitHuang SamitHuang commented May 18, 2023

Thank you for your contribution to the MindOCR repo.
Before submitting this PR, please make sure:

Motivation

(Write your motivation for proposed changes here.)

Test Plan

(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)

Related Issues and PRs

(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)

@SamitHuang SamitHuang requested review from hadipash, zhtmike, HaoyangLee and Songyuanwei and removed request for hadipash May 18, 2023 06:51
@HaoyangLee HaoyangLee requested a review from liangxhao May 18, 2023 07:00
@SamitHuang SamitHuang changed the title from "Add a unified resize operation for detection and a resize for recognition inference" to "Add a unified resize operation for detection and a resize op for recognition inference" May 18, 2023
Comment on lines +281 to +291
padded_img = np.zeros((tar_h, tar_w, 3), dtype=np.uint8)
padded_img[:resize_h, :resize_w, :] = resized_img
data['image'] = padded_img
Collaborator

Suggested change
- padded_img = np.zeros((tar_h, tar_w, 3), dtype=np.uint8)
- padded_img[:resize_h, :resize_w, :] = resized_img
- data['image'] = padded_img
+ data['image'] = np.pad(data['image'], ((0, tar_h - resize_h), (0, tar_w - resize_w), (0, 0)))
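To check that the suggestion is a drop-in replacement, the two padding styles can be compared on a dummy image. This is a minimal sketch, not the PR's code: the helper names and the example sizes are made up for illustration, and it assumes `np.pad` receives the already resized image (in the suggestion above that means `data['image']` must hold the resized image at that point).

```python
import numpy as np


def pad_with_zero_canvas(resized_img, tar_h, tar_w):
    """Original style: allocate a zero canvas and copy the image into its top-left corner."""
    resize_h, resize_w = resized_img.shape[:2]
    padded_img = np.zeros((tar_h, tar_w, 3), dtype=np.uint8)
    padded_img[:resize_h, :resize_w, :] = resized_img
    return padded_img


def pad_with_np_pad(resized_img, tar_h, tar_w):
    """Suggested style: pad bottom and right with zeros (np.pad's default constant mode)."""
    resize_h, resize_w = resized_img.shape[:2]
    return np.pad(resized_img, ((0, tar_h - resize_h), (0, tar_w - resize_w), (0, 0)))


# Hypothetical example sizes, chosen only for the comparison.
rng = np.random.default_rng(0)
resized_img = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)

canvas_result = pad_with_zero_canvas(resized_img, 736, 1280)
np_pad_result = pad_with_np_pad(resized_img, 736, 1280)
same = np.array_equal(canvas_result, np_pad_result)
```

Both versions keep the `uint8` dtype and place the image in the top-left corner, so the choice is mostly about brevity.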

Comment on lines +290 to +301
data['polys'][:, :, 0] = data['polys'][:, :, 0] * scale_w
data['polys'][:, :, 1] = data['polys'][:, :, 1] * scale_h
Collaborator

Suggested change
- data['polys'][:, :, 0] = data['polys'][:, :, 0] * scale_w
- data['polys'][:, :, 1] = data['polys'][:, :, 1] * scale_h
+ data['polys'] = data['polys'] * [scale_w, scale_h]
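The one-line form relies on NumPy broadcasting over the last axis. The sketch below compares it with the per-axis version, assuming `polys` is an array of shape (num_polys, num_points, 2) holding (x, y) pairs; the sample values are invented. One detail worth noting: multiplying a `float32` array by a plain Python list promotes the result to `float64`, so the sketch wraps the scales in an array of the same dtype.

```python
import numpy as np

# Hypothetical data: one quadrilateral with (x, y) corner points.
polys = np.array([[[10.0, 20.0], [110.0, 20.0], [110.0, 60.0], [10.0, 60.0]]], dtype=np.float32)
scale_w, scale_h = 0.5, 2.0

# Per-axis version, as in the lines under review.
per_axis = polys.copy()
per_axis[:, :, 0] = per_axis[:, :, 0] * scale_w
per_axis[:, :, 1] = per_axis[:, :, 1] * scale_h

# Broadcast version: the length-2 scale vector is applied to every (x, y) pair.
broadcast = polys * np.array([scale_w, scale_h], dtype=polys.dtype)

# With a plain list the values match, but the dtype is promoted.
promoted = polys * [scale_w, scale_h]
```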

resize_h = self.tar_h

if self.keep_ratio==False:
assert self.tar_w is not None, 'Must specify target_width if keep_ratio is False'
Collaborator

Move assert inside __init__?
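A rough sketch of what moving the check into the constructor could look like. `RecResizeSketch` and its parameter names are hypothetical stand-ins, not the class from this PR. The sketch also swaps the `assert` for a `ValueError`, since asserts are skipped when Python runs with `-O`; that swap is a choice made here, not something the review asks for.

```python
class RecResizeSketch:
    """Hypothetical stand-in for the resize op under review; not MindOCR's class."""

    def __init__(self, target_height=32, target_width=None, keep_ratio=True):
        # Validate the configuration once, when the transform is built,
        # instead of on every call.
        if not keep_ratio and target_width is None:
            raise ValueError("Must specify target_width if keep_ratio is False")
        self.tar_h = target_height
        self.tar_w = target_width
        self.keep_ratio = keep_ratio


# A bad configuration now fails at construction time.
try:
    RecResizeSketch(keep_ratio=False)
    message = None
except ValueError as err:
    message = str(err)

valid_op = RecResizeSketch(target_width=100, keep_ratio=False)
```

Failing early this way surfaces a misconfigured pipeline before any image is processed.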

Comment on lines +362 to +364
padded_img = np.zeros((self.tar_h, self.tar_w, 3), dtype=np.uint8)
padded_img[:, :resize_w, :] = resized_img
data['image'] = padded_img
Collaborator

Suggested change
- padded_img = np.zeros((self.tar_h, self.tar_w, 3), dtype=np.uint8)
- padded_img[:, :resize_w, :] = resized_img
- data['image'] = padded_img
+ data['image'] = np.pad(data['image'], ((0, 0), (0, self.tar_w - resize_w), (0, 0)))

- DetResize:
    target_size: [1152, 2048]
    keep_ratio: True
    limit_type: auto
Collaborator

I find this a bit confusing. When I want an image to have a specific resolution (set by target_size) while keeping its aspect ratio, I have to set limit_type to auto. Otherwise, I get completely unexpected output.

Collaborator Author

You don't have to set limit_type to auto in this case. You can just use the default "min". Here auto is used only to match the behavior of ScalePadImage.

Collaborator

If I don't set limit_type to auto, the output is resized so that its shortest side is 736.
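For reference, the behavior the reviewer expects from target_size combined with keep_ratio can be sketched as a "fit inside the target, then pad" computation. This is only an illustration under that assumption: `fit_within` is a hypothetical helper, and whether any limit_type option in DetResize computes its scale this way is exactly what this thread is questioning.

```python
def fit_within(h, w, tar_h, tar_w):
    """Hypothetical helper: largest keep-ratio resize that still fits inside (tar_h, tar_w)."""
    # Use the smaller of the two axis ratios so neither side overshoots the target.
    scale = min(tar_h / h, tar_w / w)
    # Clamp to guard against rounding pushing a side past the target or down to zero.
    resize_h = min(tar_h, max(1, round(h * scale)))
    resize_w = min(tar_w, max(1, round(w * scale)))
    return resize_h, resize_w, scale


# Target taken from the config under review: target_size [1152, 2048] as (height, width).
same_ratio = fit_within(720, 1280, 1152, 2048)   # aspect ratio matches the target
square_img = fit_within(1000, 1000, 1152, 2048)  # height limits the scale; width needs padding
pad_right = 2048 - square_img[1]
```

After this step, the remaining bottom and right margins would be zero-padded as in the padding code reviewed earlier in the thread.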

@hadipash hadipash self-requested a review May 19, 2023 06:59
@SamitHuang SamitHuang changed the title from "Add a unified resize operation for detection and a resize op for recognition inference" to "Add a unified resize operation for detection and a resize op for recognition inference (don't merge)" May 23, 2023
@SamitHuang
Collaborator Author

Under further improvement.

@SamitHuang SamitHuang force-pushed the base branch 2 times, most recently from 32ebb72 to d1cd911 Compare May 24, 2023 04:22
@SamitHuang SamitHuang changed the title from "Add a unified resize operation for detection and a resize op for recognition inference (don't merge)" to "Add a unified resize operation for detection and a resize op for recognition inference" May 24, 2023
@SamitHuang SamitHuang merged commit 4afbea7 into mindspore-lab:main May 24, 2023
4 participants