-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] 5522 random crop port #5555
base: main
Are you sure you want to change the base?
Conversation
Hi @lezwon! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks! |
💊 CI failures summary and remediationsAs of commit 09f6b04 (more details on the Dr. CI page):
1 failure not recognized by patterns:
This comment was automatically generated by Dr. CI (expand for details).Please report bugs/suggestions to the (internal) Dr. CI Users group. |
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @lezwon for this PR! I realize you marked this as [WIP]
, but I have one question that we should answer before you move forward.
@@ -314,3 +314,52 @@ def resized_crop_image_pil( | |||
) -> PIL.Image.Image: | |||
img = crop_image_pil(img, top, left, height, width) | |||
return resize_image_pil(img, size, interpolation=interpolation) | |||
|
|||
|
|||
def random_pad_image_tensor( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this? Shouldn't pad_image_tensor
be able to handle this? In general, we don't have kernels for random functions. All randomness should be handled in the transform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @pmeier, sorry I am a bit new to this repo. The reason I created a new function is that:
random_pad_image_tensor
does not have the same logic as this function. This function takes into account the output shape required and the current image shape. I wasn't sure if this logic should be inforward
.- Seeing the current code, I felt that transform-related code should be in
_geometry
and that there should be separate functions for tensor and PIL. _get_params
inRandomCrop
requires the output from this function. So adding this logic within_transform
wouldn't work as theparams
in_transform
would not be valid.
Please do provide me with any other approach you have in mind. I could incorporate those changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I misjudged the situation. I was not aware that the forward actually modified the image:
vision/torchvision/transforms/transforms.py
Lines 663 to 674 in b4cb352
if self.padding is not None: | |
img = F.pad(img, self.padding, self.fill, self.padding_mode) | |
_, height, width = F.get_dimensions(img) | |
# pad the width if needed | |
if self.pad_if_needed and width < self.size[1]: | |
padding = [self.size[1] - width, 0] | |
img = F.pad(img, padding, self.fill, self.padding_mode) | |
# pad the height if needed | |
if self.pad_if_needed and height < self.size[0]: | |
padding = [0, self.size[0] - height] | |
img = F.pad(img, padding, self.fill, self.padding_mode) |
This makes things more complicated. cc @datumbox for awareness.
I would move this code into _transform
. Although the structure is the same for all possible types, we still need to call different pad
kernels. That would be a lot easier if we had the Pad
transform from #5521 first. This way we could simply substitute pad_image_*(...)
with pad(...)
where pad
is
pad = functools.partial(
lambda image, padding: Pad(
padding,
fill=self.fill,
padding_mode=self.padding_mode,
)(image)
)
and not worry about the dispatch. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I am a bit new to this repo
Don't be. We are super happy that you contribute and we don't expect that you do a deep dive before starting.
I elaborated on my earlier comment inline.
if isinstance(sample, features.Image): | ||
output = F.random_pad_image_tensor( | ||
sample, | ||
output_size=self.size, | ||
image_size=get_image_dimensions(sample), | ||
padding=self.padding, | ||
pad_if_needed=self.pad_if_needed, | ||
fill=self.fill, | ||
padding_mode=self.padding_mode, | ||
) | ||
sample = features.Image.new_like(sample, output) | ||
elif isinstance(sample, PIL.Image.Image): | ||
sample = F.random_pad_image_pil( | ||
sample, | ||
output_size=self.size, | ||
image_size=get_image_dimensions(sample), | ||
padding=self.padding, | ||
pad_if_needed=self.pad_if_needed, | ||
fill=self.fill, | ||
padding_mode=self.padding_mode, | ||
) | ||
elif is_simple_tensor(sample): | ||
sample = F.random_pad_image_tensor( | ||
sample, | ||
output_size=self.size, | ||
image_size=get_image_dimensions(sample), | ||
padding=self.padding, | ||
pad_if_needed=self.pad_if_needed, | ||
fill=self.fill, | ||
padding_mode=self.padding_mode, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic should go into _transform()
. Calling super.forward()
here will call _transform()
with all the "non-container" items that were passed in. That means, you don't need to worry about transforming lists, tuple, dictionaries or the like. input
will be an individual element such as a tensor or PIL image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted. Will try to incorporate this.
@@ -314,3 +314,52 @@ def resized_crop_image_pil( | |||
) -> PIL.Image.Image: | |||
img = crop_image_pil(img, top, left, height, width) | |||
return resize_image_pil(img, size, interpolation=interpolation) | |||
|
|||
|
|||
def random_pad_image_tensor( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I misjudged the situation. I was not aware that the forward actually modified the image:
vision/torchvision/transforms/transforms.py
Lines 663 to 674 in b4cb352
if self.padding is not None: | |
img = F.pad(img, self.padding, self.fill, self.padding_mode) | |
_, height, width = F.get_dimensions(img) | |
# pad the width if needed | |
if self.pad_if_needed and width < self.size[1]: | |
padding = [self.size[1] - width, 0] | |
img = F.pad(img, padding, self.fill, self.padding_mode) | |
# pad the height if needed | |
if self.pad_if_needed and height < self.size[0]: | |
padding = [0, self.size[0] - height] | |
img = F.pad(img, padding, self.fill, self.padding_mode) |
This makes things more complicated. cc @datumbox for awareness.
I would move this code into _transform
. Although the structure is the same for all possible types, we still need to call different pad
kernels. That would be a lot easier if we had the Pad
transform from #5521 first. This way we could simply substitute pad_image_*(...)
with pad(...)
where pad
is
pad = functools.partial(
lambda image, padding: Pad(
padding,
fill=self.fill,
padding_mode=self.padding_mode,
)(image)
)
and not worry about the dispatch. Thoughts?
c227700
to
09f6b04
Compare
This PR ports the RandomCrop transform to prototype.transforms.
Linked issue: #5522