
[WIP] 5522 random crop port #5555

Open · wants to merge 5 commits into main

Conversation

@lezwon (Contributor) commented Mar 7, 2022

This PR ports the RandomCrop transform to prototype.transforms.
Linked issue: #5522

@facebook-github-bot

Hi @lezwon!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@facebook-github-bot commented Mar 7, 2022

💊 CI failures summary and remediations

As of commit 09f6b04 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

1 failure not recognized by patterns:

Job      | Step                                                       | Action
CircleCI | lint_python_and_config (Lint Python code and config files) | 🔁 rerun

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@pmeier pmeier linked an issue Mar 7, 2022 that may be closed by this pull request
@pmeier pmeier self-requested a review March 7, 2022 08:25
@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@pmeier (Collaborator) left a comment

Thanks @lezwon for this PR! I realize you marked this as [WIP], but I have one question that we should answer before you move forward.

@@ -314,3 +314,52 @@ def resized_crop_image_pil(
) -> PIL.Image.Image:
img = crop_image_pil(img, top, left, height, width)
return resize_image_pil(img, size, interpolation=interpolation)


def random_pad_image_tensor(
@pmeier (Collaborator):

Why do we need this? Shouldn't pad_image_tensor be able to handle this? In general, we don't have kernels for random functions. All randomness should be handled in the transform.
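The split described here — all randomness sampled in the transform, deterministic kernels underneath — can be sketched in plain Python. `get_crop_params` is a hypothetical stand-in for what a `_get_params` method would compute (the stable API exposes similar logic as `RandomCrop.get_params`):

```python
import random

def get_crop_params(image_size, output_size):
    """Sample the top-left corner for a random crop. The deterministic crop
    kernel receives these params; the randomness stays in the transform.

    `image_size` and `output_size` are (height, width) tuples."""
    h, w = image_size
    th, tw = output_size
    if th > h or tw > w:
        raise ValueError("crop size larger than image; pad first")
    top = random.randint(0, h - th)   # inclusive bounds
    left = random.randint(0, w - tw)
    return top, left, th, tw
```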

@lezwon (Author):

Hi @pmeier, sorry, I am a bit new to this repo. The reason I created a new function is that:

  1. pad_image_tensor does not have the same logic as this function. This function takes into account the required output shape and the current image shape. I wasn't sure if this logic should live in forward.
  2. Looking at the current code, I felt that transform-related code should be in _geometry and that there should be separate functions for tensor and PIL inputs.
  3. _get_params in RandomCrop requires the output of this function, so adding this logic within _transform wouldn't work: the params in _transform would not be valid.

Please do provide me with any other approach you have in mind. I could incorporate those changes.

@pmeier (Collaborator):

Sorry, I misjudged the situation. I was not aware that the forward actually modified the image:

if self.padding is not None:
    img = F.pad(img, self.padding, self.fill, self.padding_mode)

_, height, width = F.get_dimensions(img)
# pad the width if needed
if self.pad_if_needed and width < self.size[1]:
    padding = [self.size[1] - width, 0]
    img = F.pad(img, padding, self.fill, self.padding_mode)
# pad the height if needed
if self.pad_if_needed and height < self.size[0]:
    padding = [0, self.size[0] - height]
    img = F.pad(img, padding, self.fill, self.padding_mode)

This makes things more complicated. cc @datumbox for awareness.

I would move this code into _transform. Although the structure is the same for all possible types, we still need to call different pad kernels. That would be a lot easier if we had the Pad transform from #5521 first. This way we could simply substitute pad_image_*(...) with pad(...) where pad is

pad = functools.partial(
    lambda image, padding: Pad(
        padding,
        fill=self.fill,
        padding_mode=self.padding_mode,
    )(image)
)

and not worry about the dispatch. Thoughts?
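As an aside, the pad_if_needed arithmetic in the quoted forward can be isolated into a small pure function. `compute_pad_if_needed` is a hypothetical name; it follows torchvision's convention that a 2-element padding `[a, b]` pads left/right by `a` and top/bottom by `b`:

```python
def compute_pad_if_needed(image_size, output_size):
    """Return the list of paddings the quoted forward would apply when
    pad_if_needed is set. Sizes are (height, width) tuples; each padding is
    [left/right, top/bottom], matching the quoted code."""
    height, width = image_size
    out_h, out_w = output_size
    paddings = []
    if width < out_w:   # pad the width if needed
        paddings.append([out_w - width, 0])
    if height < out_h:  # pad the height if needed
        paddings.append([0, out_h - height])
    return paddings
```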

@lezwon (Author):

Hi @pmeier, I can keep this PR on hold and work on #5521 first, if that helps.

@pmeier (Collaborator) left a comment

sorry I am a bit new to this repo

Don't be. We are super happy that you contribute and we don't expect that you do a deep dive before starting.

I elaborated on my earlier comment inline.

torchvision/prototype/transforms/_geometry.py (outdated, resolved)
Comment on lines 228 to 332
if isinstance(sample, features.Image):
    output = F.random_pad_image_tensor(
        sample,
        output_size=self.size,
        image_size=get_image_dimensions(sample),
        padding=self.padding,
        pad_if_needed=self.pad_if_needed,
        fill=self.fill,
        padding_mode=self.padding_mode,
    )
    sample = features.Image.new_like(sample, output)
elif isinstance(sample, PIL.Image.Image):
    sample = F.random_pad_image_pil(
        sample,
        output_size=self.size,
        image_size=get_image_dimensions(sample),
        padding=self.padding,
        pad_if_needed=self.pad_if_needed,
        fill=self.fill,
        padding_mode=self.padding_mode,
    )
elif is_simple_tensor(sample):
    sample = F.random_pad_image_tensor(
        sample,
        output_size=self.size,
        image_size=get_image_dimensions(sample),
        padding=self.padding,
        pad_if_needed=self.pad_if_needed,
        fill=self.fill,
        padding_mode=self.padding_mode,
    )
@pmeier (Collaborator):

This logic should go into _transform(). Calling super().forward() here will call _transform() with all the "non-container" items that were passed in. That means you don't need to worry about transforming lists, tuples, dictionaries, or the like; the input will be an individual element such as a tensor or a PIL image.
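A toy model of that dispatch, with all names hypothetical and no relation to the actual prototype API: forward() unwraps containers and hands each leaf to _transform(), so a subclass only ever sees a single element.

```python
class Transform:
    """Toy sketch of container dispatch: forward() walks lists, tuples, and
    dicts, and hands each leaf to _transform(). Illustrative only."""

    def _transform(self, inpt, params):
        raise NotImplementedError

    def forward(self, sample, params=None):
        if isinstance(sample, (list, tuple)):
            return type(sample)(self.forward(item, params) for item in sample)
        if isinstance(sample, dict):
            return {k: self.forward(v, params) for k, v in sample.items()}
        return self._transform(sample, params)

class AddOne(Transform):
    # A trivial leaf transform: the subclass never sees a container.
    def _transform(self, inpt, params):
        return inpt + 1
```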

@lezwon (Author):

Noted. Will try to incorporate this.



Successfully merging this pull request may close these issues.

Port transforms.RandomCrop to prototype.transforms
3 participants